CSCI.4210 Operating Systems Fall, 2009 Class 18
Sockets

The mechanism that application programs use to communicate on a network is the socket. Sockets were first introduced in Unix BSD 4.1 (1982), and the popularity of this operating system among academics made the TCP/IP Internet protocol a standard (although the socket interface can handle other protocols as well). There is a whole series of Unix system calls that deal with sockets. The Win32 APIs that deal with sockets are virtually identical.

Unix socket system calls

A socket is one end of a connection. The system call to create a socket is socket. Here is the function prototype.

   #include <sys/types.h>
   #include <sys/socket.h>

   int socket(int domain, int type, int protocol);

For an Internet socket (the only kind that we will consider in this class), the first argument should be PF_INET, although the older form, AF_INET should also work. There are other domains as well, but we will not discuss them in this course.

The second argument should be SOCK_STREAM to create a TCP socket, or SOCK_DGRAM to create a UDP socket. Recall that the TCP protocol sends a continuous stream of bytes to its application, which is why this is called a stream type.

The third argument should always be zero.

The socket call returns a file descriptor if successful, and a negative number on failure (failure is unlikely if your arguments are correct).
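
As a minimal sketch of the call described above, the helper below wraps socket and checks its return value. The name open_socket is made up for this illustration; it is not part of the sockets API.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* open_socket is a hypothetical helper name, used only for illustration.
   type is SOCK_STREAM for a TCP socket or SOCK_DGRAM for a UDP socket.
   Returns the new descriptor, or -1 after printing the reason for failure. */
int open_socket(int type)
{
    int sock = socket(AF_INET, type, 0);
    if (sock < 0)
        perror("socket");
    return sock;
}
```

A caller would write something like sock = open_socket(SOCK_STREAM) and treat a negative result as fatal.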

Both client processes and server processes use sockets and the socket system call, but the sequence of system calls after creating the socket is different for clients and for servers.

Socket system calls for servers

A server process first creates a socket using the socket system call. After a socket has been created, it must be bound to an address. Recall that this means both an IP address and a port.

In the header file sys/socket.h the following structure is defined.

struct sockaddr {
    u_short sa_family;  /* address family: AF_xxx value */
    char sa_data[14];   /* up to 14 bytes of protocol specific address */
};
The contents of sa_data are interpreted according to the type of address.

For internet addresses, the following structures are defined in the header file netinet/in.h.

struct in_addr {
    u_long s_addr;  /* 32-bit netid/hostid */
                    /* network byte ordered */
};

struct sockaddr_in {
    short sin_family;   /* AF_INET */
    u_short sin_port;   /* 16 bit port number */
                        /* network byte ordered */
    struct in_addr sin_addr;
    char sin_zero[8];   /* unused */
};
By now your eyes have certainly glazed over, but this is not as complicated as it seems. Here is some sample code which you can use to bind a server to port 8080.
1.   int sock, len, retval;
2.   unsigned short port;
3.   struct sockaddr_in server;
4.   struct sockaddr_in from;
5.
6.   port = (unsigned short) 8080; 
7.   sock=socket(AF_INET, SOCK_STREAM, 0);
8.   if (sock < 0) error("Opening socket");
9.   server.sin_family=AF_INET;
10.  server.sin_addr.s_addr=INADDR_ANY; 
11.  server.sin_port=htons(port);  
12.  len=sizeof(server);
13.  retval = bind(sock, (struct sockaddr *)&server, len);
14.  if (retval < 0) error("binding");
This code creates a socket called sock. It then sets the fields of server. The symbolic constant INADDR_ANY in line 10 tells the system to accept connections on any of the internet addresses of the machine on which the process is running.

The function htons() at line 11 stands for host to network short. This is one of four functions which address the problem that different architectures represent data differently. For example, there are two different methods that a computer can use to represent a 32 bit integer. These are called big-endian and little-endian (these terms are borrowed from Gulliver's Travels).

On a big-endian computer, if a 4 byte int is stored at address A, the most significant byte is at byte A and the least significant byte is at byte A+3. On a little endian computer the low order byte is at byte A and the highest order byte is at byte A+3. Big endian computers include the IBM 370, Motorola 68000, and SPARC. Little endian computers include the DEC Vax series and, significantly, Intel processors such as the Pentium.

By convention, the Internet and all of its protocols are big endian. This means that when a process running on a Pentium wants to send data over the network, it has to convert all of its integers (and short ints) from its native language (little endian) to big endian, and when it receives data from the internet which contains integer values, it has to convert them from big endian to little endian.

The htons function takes a short int (16 bit) as an argument (which of course is in host byte order) and returns the value in network byte order. Since the network is big-endian, on a big-endian computer, this function simply returns its argument, but on a little endian computer, it converts its argument to big endian and returns that value.

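This is one of a family of four functions. The other three are htonl (host to network long, 32 bit), ntohs (network to host short) and ntohl (network to host long).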
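
To make the byte ordering concrete, here is a small sketch. The function names host_is_little_endian and port_to_wire are made up for this example; only htons and memcpy are real library calls.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if this machine stores the
   low order byte of an integer at the lowest address. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* look at the byte at the lowest address */
    return first == 1;
}

/* Hypothetical helper: stores the two bytes of a port number,
   in network (big-endian) order, into out[0] and out[1]. */
void port_to_wire(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port);  /* a no-op on a big-endian host */
    memcpy(out, &net, sizeof net);
}
```

Port 8080 is 0x1F90 in hexadecimal, so port_to_wire(8080, out) leaves 0x1F in out[0] and 0x90 in out[1] on any architecture.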

Once a socket is created and bound to an address, a server socket just has to listen for connections. The system call for this is listen which takes two arguments. The first is a socket which is bound to an address, the second is the length of the backlog queue; note that this is not the number of connections that can be accepted, this is the number of connections that can be waiting to be accepted. The second argument should be set to 5.
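
The socket, bind and listen steps can be collected into one function. The following is only a sketch: make_listener and bound_port are hypothetical names, and the SO_REUSEADDR option is an addition that the notes do not discuss (it lets a server be restarted quickly on the same port).

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* make_listener is a hypothetical helper. It returns a listening TCP
   socket bound to the given port on every local address, or -1 on failure.
   Passing port 0 asks the system to pick an unused port. */
int make_listener(unsigned short port)
{
    struct sockaddr_in server;
    int on = 1;
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0) return -1;

    /* Assumption, not from the notes: allow quick restarts on the same port. */
    setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);

    memset(&server, 0, sizeof server);
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl(INADDR_ANY);
    server.sin_port = htons(port);

    if (bind(sock, (struct sockaddr *)&server, sizeof server) < 0 ||
        listen(sock, 5) < 0) {       /* backlog of 5, as suggested above */
        close(sock);
        return -1;
    }
    return sock;
}

/* Hypothetical helper: reports the port a bound socket actually
   received, in host byte order, or 0 on failure. */
unsigned short bound_port(int sock)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    if (getsockname(sock, (struct sockaddr *)&addr, &len) < 0) return 0;
    return ntohs(addr.sin_port);
}
```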

Once a server socket is listening, it should enter an infinite loop to wait for clients to connect. The first (or at least one of the first) statements in this loop is a call to accept. Here is the function prototype.

     #include <sys/types.h>
     #include <sys/socket.h>

     int accept(int s, struct sockaddr *addr, socklen_t *addrlen);
The accept system call takes three arguments. The first is the listening socket, the second is a pointer to a struct sockaddr, and the third is a pointer to a socklen_t which holds the size of that structure when accept is called and the size of the address actually stored when it returns. A call to accept will block until a client connects to the server. This will wake up the server. Accept returns an int, which is another socket. This is confusing, because the server listens on one socket, but when a connection from a client is established, all of the communication is on a different socket, the value of which is returned by accept. The second argument to accept will be set to the address of the client so that the server knows whom it is talking to.

Once a connection has been accepted, both sides can communicate on the new socket. If you wish, you can use read and write since a socket is just a file descriptor.

The preferred call instead of read is
ssize_t recv(int sock, void *buf, size_t len, int flags);
If the last argument is zero, this works exactly like read. The preferred equivalent of write is
ssize_t send(int sock, const void *buf, size_t len, int flags);

Note that unlike a pipe, where one end always reads and the other end always writes, both ends can read from and write to a socket without getting confused.
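
One practical detail when writing to a stream socket is that send may accept fewer bytes than it was asked to send. The sketch below shows one common way to deal with that; send_all is a hypothetical helper name, not a system call.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* send_all is a hypothetical helper. It keeps calling send until
   all len bytes of buf have been handed to the system.
   Returns 0 on success and -1 if any send call fails. */
int send_all(int sock, const char *buf, size_t len)
{
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(sock, buf + sent, len - sent, 0);
        if (n < 0) return -1;
        sent += (size_t)n;       /* advance past the bytes already sent */
    }
    return 0;
}
```

Because a socket is bidirectional, either end of a connection can call send_all and recv on the same descriptor.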

Here is a complete program for a very simple server.

/* This program creates a tcp server process in the 
   internet domain, port is passed in as an arg */
/* to compile on solaris gcc server.c -lnsl -lsocket */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdio.h>
#include <unistd.h> 
#include <string.h>
#include <stdlib.h>
#include <arpa/inet.h>
#define BUFSIZE 1024
char *msg = "I got your message";

void error(const char *msg)
{
    perror(msg);
    exit(1);   /* nonzero status reports the failure */
}

int main(int argc, char *argv[])
{
   int sock, newsock, n, pid;
   socklen_t len, fromlen;
   unsigned short port;
   struct sockaddr_in server;
   struct sockaddr_in from;
   char buffer[BUFSIZE];

   if (argc < 2) {
     fprintf(stderr,"usage %s portnumber\n",argv[0]);
     exit(1);
   }
   port = (unsigned short) atoi(argv[1]);
   sock=socket(AF_INET, SOCK_STREAM, 0);
   if (sock < 0) error("Opening socket");
   memset(&server, 0, sizeof(server)); /* clear the unused padding */
   server.sin_family=AF_INET;
   server.sin_addr.s_addr=INADDR_ANY;
   server.sin_port=htons(port);
   len=sizeof(server);
   if (bind(sock, (struct sockaddr *)&server, len) < 0)
        error("binding socket");
   if (listen(sock,5) < 0) error("listen");
   while (1) {
        fromlen=sizeof(from); /* accept overwrites this value */
        newsock=accept(sock, (struct sockaddr *)&from, &fromlen);
        if (newsock < 0) error("accept");
        printf("A connection has been accepted\n");
        pid = fork();
        if (pid < 0) error("fork");
        if (pid == 0) {   /* the child handles this connection */
            close(sock);  /* the child does not need the listening socket */
            n = recv(newsock,buffer,BUFSIZE-1,0);
            if (n < 1) {
                error("Reading from socket");
            }
            else {
                buffer[n]='\0';
                printf("The message from %s is %s\n",
                  inet_ntoa(from.sin_addr),
                  buffer);
            }
            n = send(newsock, msg, strlen(msg),0);
            if (n < (int) strlen(msg)) error("Writing");
            close(newsock);
            exit(0);
        }
        close(newsock);   /* the parent keeps only the listening socket */
   }
   return 0; /* we never get here */
}

This program creates a TCP server in the Internet domain. It takes one argument, the port number that the server should be bound to. It binds the socket to that port on the local host (i.e. the machine that it is running on), listens for connections, and then enters an infinite loop. Whenever it accepts a new connection from a client, it forks off a new process to handle the connection. The child reads a message from the socket, displays it on the screen, sends a message back to the client, and terminates.

The only new function call in this program is inet_ntoa which takes one argument, a struct in_addr (which simply wraps a 32 bit unsigned int), and returns a string which is the IP address of its argument in dotted decimal form.

A small problem with this sample program is that the child processes become zombies.

The Client

The client creates a socket just like the server. However, instead of binding to an address, it calls the connect system call, which establishes a connection to a server. Here is the function prototype.

#include <sys/types.h>
#include <sys/socket.h>

int  connect(int s, const struct sockaddr *name, socklen_t namelen);

You've probably already forgotten what a struct sockaddr is.

struct sockaddr {
    u_short sa_family;  /* address family: AF_xxx value */
    char sa_data[14];   /* up to 14 bytes of protocol specific address */
};
The contents of sa_data are interpreted according to the type of address. For Internet addresses, use
struct in_addr {
    u_long s_addr;  /* 32-bit netid/hostid */
                    /* network byte ordered */
};

struct sockaddr_in {
    short sin_family;   /* AF_INET */
    u_short sin_port;   /* 16 bit port number */
                        /* network byte ordered */
    struct in_addr sin_addr;
    char sin_zero[8];   /* unused */
};

You need to fill in the port number and the IP address of the server. Usually, you do not know the IP address, but you have the name of the computer. In this case, use the library function gethostbyname. This takes one argument, a character string which is the name of the machine on which the server is running, and it returns a pointer to a struct hostent. This has only one field of interest, h_addr, which is the IP address.

The function gethostbyname() accesses the Internet Domain Name System (DNS) and can potentially trigger a fairly complex chain of events, sending Internet packets all over the world.
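
Newer code often performs the same lookup with getaddrinfo, which is not otherwise covered in these notes. Here is a minimal sketch under that assumption; resolve_ipv4 is a hypothetical helper name.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* resolve_ipv4 is a hypothetical helper. It looks up host and writes
   its first IPv4 address, in dotted decimal form, into out.
   Returns 0 on success and -1 on failure. */
int resolve_ipv4(const char *host, char *out, socklen_t outlen)
{
    struct addrinfo hints, *res;
    struct sockaddr_in *sin;
    int ok;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* IPv4 only, as in the rest of the notes */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, NULL, &hints, &res) != 0) return -1;

    sin = (struct sockaddr_in *)res->ai_addr;
    ok = inet_ntop(AF_INET, &sin->sin_addr, out, outlen) != NULL;
    freeaddrinfo(res);                /* the result list is heap allocated */
    return ok ? 0 : -1;
}
```

A name that is already a dotted decimal address, such as 127.0.0.1, is converted directly without consulting DNS.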

Aside: If you have been paying attention, you might have noticed that an IP address, which is a 32 bit unsigned int, is considered a struct in_addr or a char *. You ought to find this confusing.

Here is some code that you can use in a client to fill in these values. It connects to port 8080 on host quark.cs.rpi.edu.

   int sock, retval;
   unsigned short port;
   struct sockaddr_in server;
   struct hostent *hp;

   port = (unsigned short)8080;
   sock= socket(AF_INET, SOCK_STREAM, 0);
   if (sock < 0) error("Opening socket");
   server.sin_family = AF_INET;
   hp = gethostbyname("quark.cs.rpi.edu");
   if (hp==NULL) error("Unknown host");
   bcopy((char *)hp->h_addr, (char *)&server.sin_addr, hp->h_length);
   server.sin_port = htons(port);
   retval = connect(sock, (struct sockaddr *)&server, sizeof server);
   if (retval < 0) error("Connecting");
The function bcopy stands for bytecopy, and it copies one field to another. You cannot use the more traditional strcpy because this is not a null terminated string.

If the connection is successful, the client can use either read or recv to read from the server (with sock as the first argument), and either write or send to write to the server.

Here is a complete sample client. The name of the server and the port number are passed in as arguments.

/* This program creates a tcp client process in
   the internet domain.    The name of the 
   server machine and the port number are 
   passed to this program as arguments.
*/
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>  /* for strlen */
#include <strings.h> /* for bcopy */
#include <unistd.h>
#include <stdlib.h> /* for atoi */

char *msg = "Hello from the client";

void error(const char *msg)
{
    perror(msg);
    exit(1);   /* nonzero status reports the failure */
}

int main(int argc, char *argv[])
{
   int sock, n;
   unsigned short port;
   struct sockaddr_in server;
   struct hostent *hp;
   char buffer[1200];
   
   if (argc != 3) { 
         printf("Usage: %s server port\n", argv[0]);
         exit(1);
   }
   sock= socket(AF_INET, SOCK_STREAM, 0);
   if (sock < 0) error("Opening socket");

   server.sin_family = AF_INET;
   hp = gethostbyname(argv[1]);
   if (hp==NULL) error("Unknown host");
   bcopy((char *)hp->h_addr, 
         (char *)&server.sin_addr,
          hp->h_length);
   port = (unsigned short)atoi(argv[2]);
   server.sin_port = htons(port);
   if (connect(sock,
         (struct sockaddr *)&server, 
                     sizeof server) < 0)
             error("Connecting");
   n = write(sock, msg, strlen(msg));
   if (n < (int) strlen(msg))   /* the cast keeps -1 from comparing as a huge unsigned value */
             error("Writing to socket");
   n = read(sock, buffer, 1024);
   if (n < 1) error("reading from socket");
   else {
      buffer[n]='\0';
      printf("The message from the server is %s\n",buffer);
   }
   close(sock);
   printf("Client terminating\n");
   return 0;
}

Datagram sockets

Recall that there are two widely used protocols for the transport layer, TCP and UDP. Stream sockets as described above use TCP; the connection is established prior to any data being transmitted, and the TCP software assures that the bytes are reliably delivered to the process in a stream in the order that they were sent.

A socket which uses UDP is called a datagram socket. A datagram is connectionless and unreliable. It is simply a packet sent over the Internet on a best effort basis, with no acknowledgements or other reliability checks. The advantage of UDP is that it is much more efficient, and so for short requests, a time of day server for example, UDP is often used. UDP is also used for distributed file system requests which are typically on a LAN with very high reliability, and where performance is extremely important.
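
Because a datagram can be lost, a process that waits for a reply can wait forever. One way to guard against that is a receive timeout set with the SO_RCVTIMEO socket option, which these notes do not otherwise cover. The sketch below assumes that option is available; recv_with_timeout is a hypothetical helper name.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* recv_with_timeout is a hypothetical helper. It waits at most ms
   milliseconds for data (ms must be greater than 0; a zero timeout
   means wait forever). Returns the number of bytes received,
   -1 on error, or -2 if nothing arrived in time. */
int recv_with_timeout(int sock, char *buf, size_t len, int ms)
{
    struct timeval tv;
    ssize_t n;

    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    if (setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) < 0)
        return -1;

    n = recv(sock, buf, len, 0);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;               /* the timeout expired */
    return (int)n;
}
```

A client could call this after sendto and resend its request when the result is -2.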

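A server for UDP sockets uses the same system calls as the TCP server with the following differences. The second argument to socket is SOCK_DGRAM rather than SOCK_STREAM. The server does not call listen or accept, because there is no connection to wait for. Instead of recv and send, it calls recvfrom, which also reports the address of the sender, and sendto, which takes the address of the recipient as an argument.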

Here is a complete program which creates a UDP server. It listens on port 51717.

/* This program creates a datagram server process 
   in the internet domain.  it listens on port
   51717.   compile like this
   gcc -g -Wall server.udp.c -lnsl -lsocket */
#include<sys/types.h>
#include<sys/socket.h>
#include<netinet/in.h>
#include<netdb.h>
#include<stdio.h>
#include<strings.h>
#include<stdlib.h>
#include<arpa/inet.h> 
#define BUFSIZE 1024

void error(char *);
int main()
{
   int sock, n;
   socklen_t length, fromlen;
   struct sockaddr_in server;
   struct sockaddr_in from;
   char buf[BUFSIZE];

   sock=socket(AF_INET, SOCK_DGRAM, 0);
   if (sock < 0)
       error("Opening socket");
   length = sizeof(server);
   bzero(&server,length);
   server.sin_family=AF_INET;
   server.sin_addr.s_addr=INADDR_ANY;
   server.sin_port=htons(51717);
   if (bind(sock,(struct sockaddr *)&server,length)<0) 
       error("binding"); 
   while (1) {
       fromlen = sizeof(from); /* recvfrom needs the size of from on entry */
       n = recvfrom(sock,buf,BUFSIZE-1,0,(struct sockaddr *)&from,&fromlen);
       if (n < 0) error("recvfrom");
       buf[n]='\0';
       printf("The message from %s is %s\n",
            inet_ntoa((struct in_addr)from.sin_addr), buf);
       n = sendto(sock,"Got your message",16,0,(struct sockaddr *)&from,fromlen);
       if (n < 0) error("sendto");
   }
   return 0;
 }

void error(char *msg)
{
    perror(msg);
    exit(1);   /* nonzero status reports the failure */
}
You could probably write a udp client by yourself, but here it is. The server name and port number are passed in as arguments.

/* Datagram client in the internet domain */
#include<sys/types.h>
#include<sys/socket.h>
#include<netinet/in.h>
#include<arpa/inet.h>
#include<netdb.h>
#include<stdio.h>
#include<strings.h>
#include<stdlib.h>
#define DATA "I love operating systems ..."
#define BUFSIZE 1024 

void error(char *);
int main(int argc, char *argv[])
{
   int sock, n;
   socklen_t length;
   struct in_addr *hostaddr;
   struct sockaddr_in server;
   struct hostent *hp;   /* gethostbyname is declared in netdb.h */
   char buf[BUFSIZE];

   
   if (argc != 3) { 
         printf("Usage: server port\n");
         exit(1);
   }
   sock= socket(AF_INET, SOCK_DGRAM, 0);
   if (sock < 0) error("socket");

   server.sin_family = AF_INET;

   hp = gethostbyname(argv[1]);
   if (hp==0) error("Unknown host");

   hostaddr =(struct in_addr *) hp->h_addr;
   bcopy((char *)hostaddr, 
        (char *)&server.sin_addr,
         hp->h_length);
   server.sin_port = htons(atoi(argv[2]));
   length=sizeof(struct sockaddr_in);
   /* sizeof(DATA)-1 is the length of the string without its terminator */
   n=sendto(sock,DATA,sizeof(DATA)-1,0,(struct sockaddr *)&server,length);
   if (n < 0) error("Sendto");
   /* leave room in buf for the terminating null */
   n = recvfrom(sock,buf,BUFSIZE-1,0,(struct sockaddr *) &server,&length);
   if (n < 0) error ("recvfrom");
   buf[n]='\0';
   printf("The return message was %s\n",buf);
   return 0;
}

void error(char *msg)
{
    perror(msg);
    exit(1);   /* nonzero status reports the failure */
}

Sockets on Windows

Socket programming on Windows uses exactly the same system calls as Unix, with just a few annoying quirks (but you could have guessed that!). The socket system is called Winsock.

In order to use sockets on Windows, the program has to initialize the Winsock dynamic link library WS2_32.DLL. This is done with a call to WSAStartup. This takes two arguments, WORD wVersionRequested and a pointer to a WSADATA structure. Rather than try to explain this, I'll just give you the code. This has to appear in any program which uses Winsock prior to any calls to the socket APIs.

#include <winsock.h>
int retval;
WORD version;
WSADATA stWSAData;
...
version = MAKEWORD(2,2);
retval = WSAStartup(version, &stWSAData);
if (retval != 0) error(...)

Windows has no concept of a file descriptor, so the socket system call returns a value of type SOCKET. The accept call also returns a type SOCKET.
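
If the same source file has to build on both systems, the differences can be hidden behind a few definitions. This is only a sketch: socket_t, BAD_SOCKET, close_socket and net_start are names invented for this example, not part of either API.

```c
#ifdef _WIN32
#include <winsock.h>
typedef SOCKET socket_t;             /* Windows: sockets have their own type */
#define BAD_SOCKET INVALID_SOCKET
#define close_socket(s) closesocket(s)
#else
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
typedef int socket_t;                /* Unix: a socket is a file descriptor */
#define BAD_SOCKET (-1)
#define close_socket(s) close(s)
#endif

/* net_start is a hypothetical helper. It starts Winsock on Windows
   and does nothing elsewhere. Returns 0 on success. */
int net_start(void)
{
#ifdef _WIN32
    WSADATA data;
    return WSAStartup(MAKEWORD(2, 2), &data);
#else
    return 0;
#endif
}
```

A program would call net_start once, then compare the result of socket or accept against BAD_SOCKET instead of testing for a negative number.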

Finally, you need to link to the Winsock library when you build the program. To do this, add wsock32.lib to the list of libraries to link with in the project settings menu.

Here is the server code for a stream socket using winsock.

/* This program creates a tcp server process in the 
   internet domain, port is passed in as an arg */
/* to compile, add wsock32.lib to the list of
   libraries to link with in the project settings menu*/
#include<windows.h>
#include<stdio.h>
#include<winsock.h>

#define BUFSIZE 1024
char *msg = "I got your message";

void error(char *msg)
{
	printf("ERROR, %s, errno is %d\n",msg, WSAGetLastError());
	exit(1);   /* nonzero status reports the failure */
}

int main(int argc, char *argv[])
{
   SOCKET sock, newsock;
   int len, fromlen;
   int n, retval;   /* signed, so that a return value of -1 can be detected */
   unsigned short port;
   struct sockaddr_in server;
   struct sockaddr_in from;
   char buffer[BUFSIZE];
   WORD version;
   WSADATA stWSAData;

   if (argc < 2) {
     fprintf(stderr,"usage %s portnumber\n",argv[0]);
     exit(0);
   }
   version = MAKEWORD(2,2);
   retval =WSAStartup(version,&stWSAData);
   if (retval != 0) error("WSAStartup");
   port = (unsigned short) atoi(argv[1]);
   sock=socket(AF_INET, SOCK_STREAM, 0);
   if (sock == INVALID_SOCKET) error("Opening socket");
   server.sin_family=AF_INET;
   server.sin_addr.s_addr=INADDR_ANY;
   server.sin_port=htons(port);  
   len=sizeof(server);
   if (bind(sock,
           (struct sockaddr *)&server, 
           len) != 0) 
        error("binding socket");
   fromlen=sizeof(from);
   listen(sock,5);
   while (1) {
        newsock=accept(sock,
                  (struct sockaddr *)&from, 
                  &fromlen);
        printf("A connection has been accepted\n");

        /* read at most BUFSIZE-1 bytes to leave room for the null */
        n = recv(newsock,buffer,BUFSIZE-1,0);
        if (n < 1) {
            error("Reading from socket");
        }
        else {
            buffer[n]='\0';
            printf("The message from %s is %s\n",
              inet_ntoa(from.sin_addr),
              buffer);
        }
        n = send(newsock, msg, strlen(msg),0);
        if (n != (int) strlen(msg)) error("Writing");
        closesocket(newsock);
   }
   return 0; /* we never get here */
}

Design of Servers

Possible Server Designs

When would you want to use each?

Iterative server: best if the response from the server is quick, because there is minimal overhead.

Concurrent server with fork: best if there will be few connections but each connection will do extensive reading and writing over an extended period. You have to deal with zombies.

Concurrent server with threads: generally better than a concurrent server with fork, because the overhead of creating a new thread is much less than that of creating a new process.

Server with select: this design is best for a server which wants to listen on many sockets simultaneously but where connections are relatively rare. We will see a good example of this with the Unix internet daemon inetd, described below.

Preforking server: this is a new concept, but it is worth studying because it is probably the best design for servers which have to handle many requests and for which response time is important. For example, file servers or web servers usually use a preforking model. These receive many requests and have to respond very quickly. The overhead associated with creating a new process or even a new thread for each request would be prohibitive if the server gets heavy use.

To do this, the server calls socket, bind and listen exactly as we have seen, and then calls fork several times to create a number of identical processes. Each of these processes then enters its infinite loop and each calls accept.

Recall that when a call to fork creates a new child process, all of the file descriptor information is duplicated. This means that the child process is listening on the same port as the parent.

Each process goes to sleep. What happens when a connection occurs is somewhat system dependent; here is how it works on Berkeley Unix. When a connection arrives, all N processes are awakened. This is because all have been put to sleep on the same wait channel. Exactly one of these will accept the connection (accept will return). The others will go back to sleep.

The code should be written such that each connection is handled concurrently. The process that accepted the connection will read the request and supply the response. Meanwhile, if other connections arrive, another process will accept and handle it. When a particular process completes a request, it closes the socket on which it received and sent the data, and goes back to the accept statement again.

One issue is how many processes to create. If there are too few processes for the number of connections, clients may still be forced to wait if all of the processes are busy handling other connections. However, there is some minimal overhead associated with waking up many processes, and so it is inefficient to create too many processes.

Having several processes call accept on the same socket may not work on other Unix implementations. One solution is to put a lock or some other mutual exclusion primitive around the accept statement so that only one process will be able to accept at any given instant.

The Apache web server uses preforking with an additional twist. It can change the number of processes based on load. It periodically checks to see how many processes are busy. If most are busy, it creates more processes; if most are idle, it can kill some of the processes. The system administrator can set a minimum and maximum number of child processes.

Prethreaded server: this works in much the same way as a preforking server, and has more or less the same advantages and disadvantages. One potential problem is that a fatal exception in one thread, such as a segmentation fault, kills the entire process including all of its threads, whereas in a preforked server it kills only that one process and the other processes can continue. Of course, if the code is well written, this should never happen.
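
The preforking design described above can be sketched as follows. This is an illustration under simplifying assumptions, not production code: spawn_workers, stop_workers, worker_loop and send_greeting are hypothetical names, and no lock is placed around accept.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A trivial example handler: send a fixed reply to the client. */
void send_greeting(int conn)
{
    send(conn, "hello\n", 6, 0);
}

/* Each worker loops forever: accept a connection, serve it, close it. */
static void worker_loop(int listener, void (*handle)(int))
{
    for (;;) {
        int conn = accept(listener, NULL, NULL);
        if (conn < 0) {
            if (errno == EINTR) continue;   /* interrupted, try again */
            _exit(1);                       /* any other error is fatal */
        }
        handle(conn);
        close(conn);
    }
}

/* Forks n children that all call accept on the same listening socket.
   Stores their process ids in pids and returns how many were started. */
int spawn_workers(int listener, int n, pid_t *pids, void (*handle)(int))
{
    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) break;
        if (pid == 0)
            worker_loop(listener, handle);  /* the child never returns */
        pids[started++] = pid;
    }
    return started;
}

/* Terminates the workers and collects them so none become zombies. */
void stop_workers(const pid_t *pids, int n)
{
    for (int i = 0; i < n; i++) {
        kill(pids[i], SIGTERM);
        waitpid(pids[i], NULL, 0);
    }
}
```

The parent calls socket, bind and listen first, then spawn_workers(sock, N, pids, send_greeting); choosing N is the tuning problem discussed above.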

Daemon Processes and the inetd superserver

A daemon is a process that runs in the background and is not associated with a controlling terminal. Typical Unix systems have 20 to 50 daemons running in the background doing various administrative tasks. The Windows equivalent is a service.

Most daemons are started at system initialization. There is a system initialization script that starts them. They generally have superuser privileges.

One of these is the cron daemon, which keeps a table of events in a file such as /etc/crontab. It wakes up once a minute and sees if anything needs to be run.

If a daemon has to output a message, it can't do it directly because it has closed stdin, stdout and stderr. Therefore, messages that would normally be written to standard output or standard error are written to the system log. There is a syslogd daemon which daemons can use for this.

Here is some skeleton code for creating a daemon (modified from Unix Network Programming: The Sockets Networking API Vol 1, third edition, by W. R. Stevens, B. Fenner, and A. M. Rudoff, Addison Wesley, 2004).


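#include <fcntl.h>     /* open, O_RDONLY, O_RDWR */
#include <signal.h>    /* signal, SIGHUP */
#include <stdlib.h>    /* exit */
#include <sys/types.h> /* pid_t */
#include <syslog.h>    /* openlog */
#include <unistd.h>    /* fork, setsid, chdir, close */

#define MAXFD 64  /* assumed upper bound on the descriptors to close */

/* error() is the helper defined in the earlier sample programs */
int daemon_init(const char *pname, int facility)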
{
   int i;
   pid_t pid;

   pid = fork();

   if (pid < 0) error("forking");

   if (pid > 0) exit(0);  // parent process terminates 

   if (setsid() < 0) error("setsid");  // sets a new session id
          //so that shell cannot send a kill signal

   signal(SIGHUP, SIG_IGN);  // ignore the hangup signal

   pid = fork();
   
   if (pid > 0) exit(0);

   //this guarantees that the child is not a session leader
   //and so it cannot obtain a controlling terminal

   chdir("/"); // change to the root directory

   for(i=0;i < MAXFD;i++) close(i);

   open("/dev/null",O_RDONLY);
   open("/dev/null",O_RDWR);
   open("/dev/null",O_RDWR);

   // This guarantees that anything written to stdout or stderr will
   // not cause a seg fault.

   openlog(pname, LOG_PID, facility);

   return 0;
}
   
The inetd Daemon

On a typical Unix system, there could be many servers in existence, waiting for a request. Before 4.3BSD each had a process associated with it. Each daemon took a slot in the process table, but was asleep most of the time.

Examples include ftp, telnet, rlogin, and finger.

These all do pretty much the same thing at startup: create a socket, bind it to a well-known port, and wait for a connection.

The solution is inetd, the internet superserver.

inetd starts, makes itself a daemon, reads /etc/inetd.conf and creates a socket for each service specified in the file.

Each socket is bound appropriately. The port is determined by calling getservbyname with the service-name and protocol fields.

It listens on each socket.

It calls select.

Whenever a connection is received on any of the listening sockets, it wakes up, forks off a child and execs the appropriate process to handle the connection.
