A good blog post to read first: https://web.mit.edu/6.031/www/sp20/classes/24-sockets-networking/
Ref: https://www.cs.dartmouth.edu/~campbell/cs50/socketprogramming.html
Sockets definition
Sockets are endpoints for sending and receiving data across a network.
Sockets are a fundamental abstraction for process-to-process / inter-process communication (IPC), particularly over networks. They can also be used within the same machine.
A socket is one endpoint of a two-way communication link between two programs running on a network. 💻 ↔️ 🌐 ↔️ 💻
Mainly two types:
- TCP Sockets (Stream Sockets): SOCK_STREAM
- UDP Sockets (Datagram Sockets): SOCK_DGRAM
Address family: AF_INET for IPv4, AF_INET6 for IPv6, or AF_UNIX for local IPC
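The socket type is chosen when the socket is created and the kernel records it with the descriptor. A minimal sketch, assuming a POSIX system: socket_type_roundtrip is a hypothetical helper name (not from any library) that creates an IPv4 socket of the given type and reads the type back with getsockopt and SO_TYPE.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create an IPv4 socket of the requested type, ask the
 * kernel which type it recorded for the descriptor, then close it.
 * Returns the reported type (SOCK_STREAM or SOCK_DGRAM), or -1 on error. */
int socket_type_roundtrip(int type)
{
    int fd = socket(AF_INET, type, 0);   /* 0 = default protocol for this type */
    if (fd < 0)
        return -1;

    int reported = 0;
    socklen_t len = sizeof(reported);
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &reported, &len) < 0) {
        close(fd);
        return -1;
    }
    close(fd);
    return reported;
}
```

Calling it with SOCK_STREAM should give back SOCK_STREAM, and likewise for SOCK_DGRAM. Swapping AF_INET for AF_INET6 or AF_UNIX changes only the address family, not the type.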
TCP/IP Ports
Ports identify the application processes running on a host.
If two applications (e.g., web browser and email client) run on the same PC, both send/receive packets with the same IP address. The Transport layer uses port numbers to differentiate between them.
Well-Known Ports (0-1023)
“Well-known” ports are reserved for common server applications. Clients know servers will be listening at these reserved port numbers.
| Port | Protocol | Description |
|---|---|---|
| 20, 21 | FTP | File Transfer Protocol (data/control) |
| 22 | SSH | Secure Shell |
| 23 | Telnet | Telnet remote login |
| 25 | SMTP | Simple Mail Transfer Protocol |
| 53 | DNS | Domain Name System |
| 80 | HTTP | Hypertext Transfer Protocol |
| 110 | POP3 | Post Office Protocol v3 |
| 143 | IMAP | Internet Message Access Protocol |
| 443 | HTTPS | HTTP Secure (TLS/SSL) |
| 3306 | MySQL | MySQL Database |
| 5432 | PostgreSQL | PostgreSQL Database |
Well-known port numbers are assigned by IANA (Internet Assigned Numbers Authority) - the same group that manages DNS Root and IP addresses.
Ephemeral or Dynamic Ports (1024-65535)
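Client-side port numbers are generated and assigned by the Transport layer from the range above the well-known ports (1024 to 65535). They are typically allocated for short-term use and are called “Ephemeral or Dynamic Ports”. Strictly, IANA splits this range into registered ports (1024-49151), which is where MySQL's 3306 and PostgreSQL's 5432 from the table above belong, and dynamic ports (49152-65535). Operating systems choose their own ephemeral range, for example 32768-60999 by default on Linux.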
When a client initiates a connection:
- Destination port: Well-known port of the server (e.g., 80 for HTTP)
- Source port: Ephemeral port assigned by client’s OS (e.g., 52431)
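The source/destination split can be observed from the client side. A minimal sketch, assuming a POSIX system with a loopback interface: show_ports is a hypothetical helper that connects to a listener inside the same process and reports both ports of the connection. The listener binds to port 0 only so the sketch never collides with a real service; a real server would bind its well-known port.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect a client to a loopback listener and report
 * the connection's ports as seen by the client.
 * Returns 0 on success, -1 on error. */
int show_ports(unsigned short *src_port, unsigned short *dst_port)
{
    struct sockaddr_in srv, local, peer;
    socklen_t len = sizeof(srv);
    int rc = -1;

    int lfd = socket(AF_INET, SOCK_STREAM, 0);   /* server side */
    int cfd = socket(AF_INET, SOCK_STREAM, 0);   /* client side */
    if (lfd < 0 || cfd < 0)
        goto out;

    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port = htons(0);                     /* let the OS pick a free port */
    if (bind(lfd, (struct sockaddr *)&srv, sizeof(srv)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&srv, &len) < 0)
        goto out;

    /* The client never calls bind(): the OS assigns its source port. */
    if (connect(cfd, (struct sockaddr *)&srv, sizeof(srv)) < 0)
        goto out;

    len = sizeof(local);
    if (getsockname(cfd, (struct sockaddr *)&local, &len) < 0)
        goto out;
    len = sizeof(peer);
    if (getpeername(cfd, (struct sockaddr *)&peer, &len) < 0)
        goto out;

    *src_port = ntohs(local.sin_port);           /* ephemeral source port */
    *dst_port = ntohs(peer.sin_port);            /* the listener's port */
    rc = 0;
out:
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return rc;
}
```

getsockname returns the local end of the connection and getpeername the remote end, so the two values are the source and destination ports from the list above.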
How sockets fit in the network stack
How sockets fit in the OS stack
Data flow between wire and app
Established Socket
A socket is created by an application running in a host. The application assigns a transport protocol (TCP or UDP) and source and destination addresses to the socket. It identifies sockets by assigning numbers to them.
Note the web server has two sockets opened: one for each web page it is serving. These sockets are differentiated by the destination port numbers.
[!NOTE]
One host does not assign the socket number on both sides of the communication channel. The socket numbers assigned to each socket are only used by the host that assigned them. In other words, socket number 1 created on one host may be connected to socket number 5 on another host.
[!NOTE]
Based on the Well-Known source port numbers assigned to each socket, we can determine sockets 1 and 2 were created by an HTTP server application and socket 3 was created by an SMTP or email server application.
Socket numbers may not be the same on both sides
This graphic shows a virtual TCP connection between a client and server. Note the socket numbers are not the same on both sides of the channel. Hosts create, close and number their own sockets.
Sockets
TCP
The accept() function creates a new socket from the first connection request for the specified socket_descriptor and returns the file descriptor of the new socket. The file descriptor of this new socket is used in the read() and write() functions to send and receive data to and from the client node.
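To see that accept() hands back a descriptor distinct from the listening one, here is a minimal sketch, assuming a POSIX system with a loopback interface. accept_demo is a hypothetical helper; it plays both client and server in one process, which works because the kernel queues the completed connection before accept() is called.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: set up a loopback listener, connect a client from the
 * same process, accept it, and pass one byte from client to server.
 * On success returns 0 and reports the listening and accepted descriptor
 * numbers plus the byte the server read. Returns -1 on error. */
int accept_demo(int *listen_fd_out, int *conn_fd_out, char *got)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    char c = 'x';
    int rc = -1, connfd = -1;

    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 || cfd < 0)
        goto out;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);                 /* let the OS choose a free port */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
        goto out;

    /* The handshake completes in the kernel, so connect() can return
     * before the server side has called accept(). */
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;

    connfd = accept(lfd, NULL, NULL);         /* new socket for this client */
    if (connfd < 0)
        goto out;

    /* Data flows over the accepted descriptor, not the listening one. */
    if (write(cfd, &c, 1) != 1 || read(connfd, got, 1) != 1)
        goto out;

    *listen_fd_out = lfd;
    *conn_fd_out = connfd;
    rc = 0;
out:
    if (connfd >= 0) close(connfd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return rc;
}
```

The two reported descriptor numbers differ: the listening socket stays open for further connection requests while the accepted socket carries this client's data.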
Ref: https://people.cs.rutgers.edu/~pxk/417/notes/index.html
TCP sockets
The system call to create a socket is int socket(int domain, int type, int protocol);
// AF_INET: IPv4 Internet protocols (AF_INET6 for IPv6)
// SOCK_STREAM: sequenced, reliable, two-way, connection-based byte stream (TCP); use SOCK_DGRAM for UDP
// 0: the default protocol for the given domain and type
int sockfd = socket(AF_INET, SOCK_STREAM, 0);
Only the server needs to bind. The bind system call is int bind(int sockfd, const struct sockaddr *my_addr, socklen_t addrlen);
struct sockaddr_in {
    short sin_family;          // e.g. AF_INET
    unsigned short sin_port;   // e.g. htons(3490)
    struct in_addr sin_addr;   // IPv4 address (s_addr field, network byte order)
    char sin_zero[8];          // zero this if you want to
};
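#include <netinet/in.h>   // struct sockaddr_in, htons, htonl, INADDR_ANY
#include <stdio.h>        // printf
#include <stdlib.h>       // exit
#include <strings.h>      // bzero
#include <sys/socket.h>   // socket, bind

int main(void) {
    struct sockaddr_in my_addr;
    int sockfd;

    // The assignment is parenthesized so sockfd holds the descriptor, not the result of the comparison
    if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        printf("Error while creating the socket\n");
        exit(1);
    }
    bzero(&my_addr, sizeof(my_addr));            // zero structure out
    my_addr.sin_family = AF_INET;                // match the socket() call
    my_addr.sin_port = htons(5100);              // specify port to listen on
    my_addr.sin_addr.s_addr = htonl(INADDR_ANY); // allow the server to accept a client connection on any interface
    if (bind(sockfd, (struct sockaddr *) &my_addr, sizeof(my_addr)) < 0) {
        printf("Error in binding\n");
        exit(1);
    }
    return 0;
}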
- We specify the IP address as INADDR_ANY, which allows the server to accept a client connection on any interface, in case the server host has multiple interfaces
Convert socket to listening socket
- By calling listen, the socket is converted into a listening socket, on which incoming connections from clients will be accepted by the kernel.
- These three steps, socket, bind, and listen, are the normal steps for any TCP server to prepare what we call the listening descriptor (sockfd in our case)
Only the server needs to listen: int listen(int sockfd, int backlog). The backlog argument specifies the maximum number of pending connections the kernel should queue for the socket. listen returns 0 on success and -1 on error.
Only the server can accept incoming client connections: int accept(int sockfd, struct sockaddr *fromaddr, socklen_t *addrlen). It returns a new descriptor for the connection, or -1 on error.
Clients
The client does not need to bind, listen, or accept. All the client needs to do is connect to the server:
int connect(int sockfd, const struct sockaddr *toaddr, socklen_t addrlen)
With TCP sockets, connect is the call that performs the three-way handshake with the server.
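The client side is short enough to fit in one function. A minimal sketch, assuming a POSIX system and an IPv4 address in dotted-decimal form: connect_to is a hypothetical helper name, and inet_pton is used here only to fill in the address.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP connection to ip:port.
 * Returns a connected descriptor, or -1 on error. */
int connect_to(const char *ip, unsigned short port)
{
    struct sockaddr_in toaddr;

    memset(&toaddr, 0, sizeof(toaddr));
    toaddr.sin_family = AF_INET;
    toaddr.sin_port = htons(port);            /* port in network byte order */
    if (inet_pton(AF_INET, ip, &toaddr.sin_addr) != 1)
        return -1;                            /* not a valid IPv4 address */

    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        return -1;

    /* No bind(), listen(), or accept() on the client. On a blocking socket,
     * connect() returns once the three-way handshake has completed. */
    if (connect(sockfd, (struct sockaddr *)&toaddr, sizeof(toaddr)) < 0) {
        close(sockfd);
        return -1;
    }
    return sockfd;
}
```

The returned descriptor can be passed straight to read() and write(); the caller is responsible for closing it.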
Refer to the following code: each call to accept returns a new socket, and the server then forks a child process to handle it.
https://github.com/remidinishanth/distributed_systems/blob/572e1af70522d37d83fadd097534525aa628c4b0/networking/sockets/tcpechoserver.c#L72-L78
To differentiate the parent from the newly forked child, use the return value of fork(): it is 0 in the child and the child's process ID in the parent.
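The fork-per-connection pattern can be sketched in one self-contained function. This is not the code from the linked file; it is a simplified illustration, assuming a POSIX system with a loopback interface. fork_echo_once is a hypothetical helper in which the parent also plays the client so that the whole exchange fits in a single process tree.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: accept one loopback connection, fork a child that
 * echoes one byte back, and return the byte the client received (-1 on error).
 * Error paths skip cleanup to keep the sketch short. */
int fork_echo_once(char sent)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    char reply = 0;

    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 || cfd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));           /* port 0: the OS picks one */
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0 ||
        connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;

    int connfd = accept(lfd, NULL, NULL);     /* new socket for this client */
    if (connfd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;                            /* fork failed */

    if (pid == 0) {
        /* Child: fork() returned 0. It serves the client on connfd and
         * needs neither the listening socket nor the client end. */
        char c;
        close(lfd);
        close(cfd);
        int ok = read(connfd, &c, 1) == 1 && write(connfd, &c, 1) == 1;
        close(connfd);
        _exit(ok ? 0 : 1);
    }

    /* Parent: fork() returned the child's PID. The child owns the
     * connection now, so the parent closes its copy of connfd. */
    close(connfd);
    int ok = write(cfd, &sent, 1) == 1 && read(cfd, &reply, 1) == 1;
    waitpid(pid, NULL, 0);                    /* reap the child */
    close(cfd);
    close(lfd);
    return ok ? (unsigned char)reply : -1;
}
```

In a real server the parent would loop back to accept() instead of acting as the client, and would reap finished children so they do not linger as zombies.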
See the welcome socket
UDP sockets
Our initial client/server application uses TCP as its transport layer. To run it over UDP instead, the following changes are needed:
- Use SOCK_DGRAM instead of SOCK_STREAM when creating the socket.
- UDP is a much smaller protocol than TCP. It is connectionless, so there is no need to connect/listen/accept. Delete or comment out all the code that was used for those functions.
- We will not need any of the lines from newsockfd = accept …. all the way to the line before while (read(…). Since newsockfd is no longer used, reading, writing, and closing should happen on sockfd instead. There is no need for the exit(EXIT_SUCCESS) either.
- Since we don’t open a connection to a certain address/port, we need to supply the server address every time we send a packet of data. read() and write() have no parameters for client and server addresses, so change them to recvfrom() and sendto() respectively. The functions are as follows:
sendto(sockfd, &outbuffer, 1, 0, (struct sockaddr *) &serveraddr, sizeof(serveraddr))
instead of:
write(sockfd, &outbuffer, 1)
and:
recvfrom(sockfd, &inbuffer, 1, 0, (struct sockaddr *) &serveraddr, &serveraddrlength)
instead of:
read(sockfd, &inbuffer, 1)
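Putting the UDP changes together, here is a minimal sketch, assuming a POSIX system with a loopback interface. udp_roundtrip is a hypothetical helper that sends one byte between two UDP sockets in the same process. The names outbuffer, inbuffer, and serveraddr follow the calls above; fromaddr is an extra variable introduced here to hold the sender's address.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: send one byte from a "client" UDP socket to a
 * "server" UDP socket on loopback.
 * Returns the byte the server received, or -1 on error. */
int udp_roundtrip(char outbuffer)
{
    struct sockaddr_in serveraddr, fromaddr;
    socklen_t len = sizeof(serveraddr), fromlen = sizeof(fromaddr);
    char inbuffer = 0;
    int rc = -1;

    int srv = socket(AF_INET, SOCK_DGRAM, 0);   /* SOCK_DGRAM, not SOCK_STREAM */
    int cli = socket(AF_INET, SOCK_DGRAM, 0);
    if (srv < 0 || cli < 0)
        goto out;

    /* The server still binds so that datagrams have somewhere to arrive,
     * but there is no listen(), accept(), or connect(). */
    memset(&serveraddr, 0, sizeof(serveraddr));
    serveraddr.sin_family = AF_INET;
    serveraddr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    serveraddr.sin_port = htons(0);             /* let the OS pick a free port */
    if (bind(srv, (struct sockaddr *)&serveraddr, sizeof(serveraddr)) < 0 ||
        getsockname(srv, (struct sockaddr *)&serveraddr, &len) < 0)
        goto out;

    /* Every send names its destination explicitly. */
    if (sendto(cli, &outbuffer, 1, 0,
               (struct sockaddr *)&serveraddr, sizeof(serveraddr)) != 1)
        goto out;

    /* recvfrom() also reports who sent the datagram (in fromaddr). */
    if (recvfrom(srv, &inbuffer, 1, 0,
                 (struct sockaddr *)&fromaddr, &fromlen) != 1)
        goto out;

    rc = (unsigned char)inbuffer;
out:
    if (cli >= 0) close(cli);
    if (srv >= 0) close(srv);
    return rc;
}
```

The address filled in by recvfrom is what a UDP server would pass to sendto in order to reply, since there is no connected socket to write back on.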