CSCI.4210 Operating Systems
Unix Process System Calls

Now that you know what a process is, we can look at the system calls which create new processes. This module will discuss the Unix system calls related to processes; the next module will discuss the Win32 APIs related to processes.

The fork system call

On classical Unix systems, all processes are created with the fork() system call. This system call creates a new process which is an exact duplicate of the calling process. The process which called fork() is referred to as the parent, and the new process which fork creates is called the child. Both processes, the parent and the child, are runnable and, when run, continue immediately after the fork system call.

Here is the function prototype

     #include <sys/types.h>
     #include <unistd.h>

     pid_t fork(void);

The data type pid_t refers to the type of a process id; POSIX requires it to be a signed integer type, and it is an int on most systems. The return value of fork() is important. In the parent, the value returned from fork() is the process id of the child. In the child, the value returned from fork() is zero. In the case of an error, fork() returns -1 and no child is created.

Here is a simple program which demonstrates this.

#include <unistd.h>
#include <sys/types.h>
#include <stdio.h>
#include <errno.h>
int main()
{
    pid_t pid;
    pid = fork();
    if (pid == 0) 
       printf("I'm the child\n");
    else if (pid > 0) {
       printf("I'm the parent, ");
       printf("child pid is %d\n",pid);
   }
    else { /* pid < 0 */
       perror("Error forking");
       fprintf(stderr,"errno is %d\n",errno);
    }
   return 0;
}

When the running program executes the fork system call at line 8, a new process is created. This child process has exactly the same code as the parent process. In this example there are no other variables, but if there happened to be a variable called x in the parent process with the value 17, there would also be a variable called x in the child process and it would have the value 17. Both the parent and the child will start running at the line after the fork. The only way that the programmer can distinguish whether the code is the parent or the child is by the return value from fork.

This picture shows the layout of processes in memory before and after process 1234 calls fork to create a child process with process id 1235.

A call to fork is unlikely to fail under ordinary circumstances. However, all Unix systems limit both the number of processes which a single user can run and the total number of processes in the process table at one time. If the creation of a new process would cause either of these limits to be exceeded, fork fails: it returns -1, sets errno (typically to EAGAIN), and the child process is not created.

The following are the same for the parent process and the child process

The text segments (code segments)
The values of all variables (except the value returned from fork())
The environment
The process priority
The controlling terminal
The current working directory
Open file descriptors

Note that although the values of all variables are the same, all of the various data segments, including the run time stack are copied, so that there are two instances of each variable, allowing each process to update data independently.

The following are different for the parent process and the child process

The process id
The parent process id
Data on resource allocation. For example, total run time is set to zero in the child, and process start time for the child is set to the current time

Note that every process except the init process has a parent process, so there is a tree of processes with init as the root. (init has pid 1 and is the first user process created at boot time. It runs until the system shuts down.)

Here is what fork does

Reserve swap space for the child's data and stack
Allocate a new pid and kernel proc structure
Initialize the kernel proc structure. Some fields (e.g. user id, group id, signal masks) are copied from the parent, some are set to zero (e.g. cpu usage), and others (e.g. ppid) are set to child-specific values
Allocate address translation maps for the child
Add the child to the set of processes sharing the text region of the program that the parent is executing
Duplicate the parent's data and stack regions
Acquire references to shared resources inherited by the child, such as open files
Initialize the hardware context by copying the parent's registers
Make the child runnable and place it on the scheduler queue
Arrange for the child to return from fork with a value of zero
Return the pid of the child to the parent

Here is an exercise on fork()

The wait system call

Whether the parent or the child will be run first after a fork is undetermined; a situation in which the result depends on the order in which processes happen to run is called a race condition. Thus there are two possible outputs for the fork program above.

I'm the child
I'm the parent, child pid is 22970

is one possible output, the other is

I'm the parent, child pid is 22970
I'm the child

It is possible for the parent to control this by executing a wait() system call. This call causes the parent process to be blocked until a child dies. If the process has no children, a call to wait returns immediately with a value of -1 and errno set to ECHILD.

Here is the function prototype

     #include <sys/types.h>
     #include <sys/wait.h>

     pid_t wait(int *stat_loc);

The return value from wait is the process id of the child that died, or -1 on error.

Here is a short program which demonstrates this.

#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <errno.h>
int main()
{
    pid_t pid, retval;
    int status;
    
    pid = fork();
    if (pid == 0) 
       printf("I'm the child\n");
    else if (pid > 0) {
       retval = wait(&status);
       printf("I'm the parent, ");
       printf("the child %d has died\n",retval);
   }
    else { /* pid < 0 */
       perror("Error forking");
       fprintf(stderr,"errno is %d\n",errno);
    }
   return 0;
}

If the parent happens to run first, it executes the wait system call, which causes the process to block (remember the process state diagram from Tuesday's class). It remains blocked until the child terminates, at which time a signal is sent to the parent which awakens it and it is returned to runnable status. If the child happens to run first and terminate before the parent runs, then the call to wait returns immediately. In either case, the return value is the process id of the child.

A parent of a dying child might want to know how the child died, and the dying child might want to send a message to the parent. Both of these are accomplished using the argument which is passed to wait. Because this argument is a pointer, the system call can set the value that it points to. The least significant byte indicates how the child process died. If the child terminated normally (i.e. the process reached the end of main() or it called exit()), the lowest byte of status will be zero. If the child terminated abnormally (e.g. it was terminated by a memory exception error (segmentation fault) or by the user sending an interrupt signal (Ctrl-C)), the low seven bits of this byte hold the numeric value of the signal that killed it; on traditional Unix implementations the eighth bit is set if a core file was produced.

If the child terminated normally by calling exit(), the child can pass an argument to exit() and this value will be in the second byte of status. For example, if the child called exit(5), the value of status in binary would be
00000000 00000000 00000101 00000000
It would be 00 00 05 00 in hexadecimal.

The C language has some bitwise operations which you may not know. >> is the right shift operator, << is the left shift operator, & is the bitwise and operator, and | is the bitwise or operator. You can use these to check the value of each byte of status. For example, to check whether the lowest order byte is zero, use the bitwise and operator with 0xFF (0x preceding a numeric constant in C indicates that the value is in hexadecimal).

  if ((status & 0xFF) != 0)  
     printf("The child died abnormally");

To examine the value of the second byte, right shift the value 8 bits and then perform a bitwise and with 0xFF.

   int temp;
   ....
   temp = status >> 8; /* right shift */
   temp = temp & 0xFF; 
   printf("exit status was %d\n",temp);

If a process dies before all of its children have terminated, the children become orphans. Since all processes except init have a parent, orphans are adopted by the init process.

Zombies

If a child dies before its parent calls wait(), it is possible that the parent might call wait at some later time, and would want information about the status of the dead child. In this case, the process is not completely removed; some information about it is retained. A process which has terminated but whose parent has not called wait() is called a zombie. Zombies occupy a slot in the process table of the operating system although they do not consume other resources. When you examine the processes on the computer with the ps command, zombies are marked as defunct. A zombie is removed either when the parent calls wait and gets the information about that child, or when the parent dies: the zombie is then adopted by init, which calls wait for the children it adopts.

The exec family of system calls

It should have occurred to you that since fork can only create a copy of the calling process, it is of limited use by itself. The fork call is usually used with another system call, exec, which overwrites the entire process space with a completely new process image. Execution of the new image starts at the beginning. Exec is actually a family of six system calls, the simplest of which is execl. Here is the function prototype

     #include <unistd.h>  /* standard unix header file */

     int execl(const char *path, const char *arg0, ..., const char *argn, char * /*NULL*/);

The first argument, path, should be the pathname of an executable program. The remaining arguments are the arguments to be passed to this program as argv. The argument list is terminated by a null pointer, written (char *) NULL. Here is a short sample program.

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
int main()
{
    pid_t p;
    p=fork();
    if (p == 0)  { /* child */
         execl("/bin/ls", "ls", "-l", (char *) NULL);
         perror("Exec failed"); /* reached only if execl failed */
         return 1;
    }
    else if (p > 0) {
        wait(NULL);
        printf("Child is done\n");
    }
    else {
        perror("Could not fork");
    }
    return 0;
}

This program forks, creating a new process. The image of the child process is overwritten by the image for the command /bin/ls, and this is called with two arguments, ls and -l (recall that by convention, argv[0] is the name of the command). The child then runs ls. When it terminates, the parent is awakened, displays its message, and also terminates.

Any of the exec calls can fail. The obvious cause of failure is that path is not the pathname of an executable file. If any of the exec calls succeeds, there is no return because all of the code for the calling process is overwritten by the new image. If it fails, it returns -1 and sets errno like most other system calls, but there is no need to check for this because it can only return if it failed. This is why there is no if before the perror call; the only way that the program can get to that line is if the call to execl failed.

There are five other system calls in the exec family. They all overwrite the current process with a new image; they differ only in the arguments that they accept and in other minor ways.

     int execv(const char *path, char *const argv[]);

This call is the same as execl except that it takes only two arguments, the second being an argument vector.
Here is a short program which demonstrates the use of execv

     int execle(const char *path, const char *arg0, ..., const char *argn, char * /*NULL*/, char *const envp[]);

Like execl, this call takes a variable number of arguments, but its final argument is a vector which represents the new environment.
By default, the environment of the process which is exec'ed is the same as that of the parent, but this allows the user to change the environment.
Here is a short program which demonstrates the use of execle

     int execve(const char *path, char *const argv[], char *const envp[]);

This is the same as execv except that it passes the environment vector as a third argument.

     int execlp(const char *file, const char *arg0, ..., const char *argn, char * /*NULL*/);

This differs from the above calls in that its first argument is just a filename rather than a path, and the call searches the directories listed in the PATH environment variable for the executable.
Here is a short program which demonstrates the use of execlp

     int execvp(const char *file, char *const argv[]);

This is the same as execlp except that the arguments are passed as a single argument vector.

The shell

The Unix command processor, or shell, is just another process. It reads a command entered by the user and forks off a child process; the child calls exec to execute the command, and the parent waits for the child process to finish.

Here is some very simplistic pseudocode for a shell.

    pid_t pid;
    while (1) {
       GetNextCommand();
       pid = fork();
       if (pid == 0) {
          ExecCommand();      /* returns only if the exec failed */
          PrintErrorMsg();
          exit(1);
       }
       else if (pid > 0) 
          wait(NULL);
       else 
          HandleForkFailure();
    }

Unix allows the user to run a command in the background by appending an ampersand (&) to the command line. When a program runs in the background, it cannot receive user input from the terminal, and the shell prompt is displayed immediately, allowing the user to enter a new command before the background process is completed. What change would you make to the above pseudocode to allow a command to run in the background?

Creating a New Process in Win32

The Win32 API function to create a new process is

BOOL CreateProcess( 
    LPCTSTR lpApplicationName, /* executable program */
    LPTSTR lpCommandLine,      /* command line args  */
    LPSECURITY_ATTRIBUTES lpProcessAttributes,  /* use NULL */
    LPSECURITY_ATTRIBUTES lpThreadAttributes,   /* use NULL */
    BOOL bInheritHandles,  /* does proc inherit parent's open handles */
    DWORD dwCreationFlags, 
    LPVOID lpEnvironment, /* if NULL, use parent environment */
    LPCTSTR lpCurrentDirectory, /* if NULL, use parent curr dir */
    LPSTARTUPINFO lpStartupInfo, 
    LPPROCESS_INFORMATION lpProcessInformation 
); 

typedef struct PROCESS_INFORMATION {
    HANDLE hProcess;
    HANDLE hThread;
    DWORD  dwProcessId;
    DWORD  dwThreadId;
} PROCESS_INFORMATION;

This is more or less equivalent to both fork() and exec() on Unix.

Win32 APIs almost always take a great many more arguments than do the equivalent Unix system calls.

LPCTSTR lpApplicationName This is the path name of the program to execute; this is more or less equivalent to the first argument of the Unix execl call. This can be either a relative or absolute pathname.

LPTSTR lpCommandLine This should normally be NULL, but it provides an opportunity to pass arguments to the process.

LPSECURITY_ATTRIBUTES lpProcessAttributes We will discuss security attributes later in the course. For the moment, this should be NULL

LPSECURITY_ATTRIBUTES lpThreadAttributes See above

BOOL bInheritHandles If TRUE, each inheritable open file handle (or other handle) is also open in the child process

DWORD dwCreationFlags There are lots of flags that can be set. For now, this can be set to zero.

LPVOID lpEnvironment You can change the environment if you wish. If this is NULL, the child process inherits the environment of the parent.

LPCTSTR lpCurrentDirectory You can change the current working directory of the child process. If NULL, the child has the same current working directory as the parent.

LPSTARTUPINFO lpStartupInfo This is a pointer to information about how to render the new process. Since in a Windows environment a new process is typically a new window, this tells the OS where to put the window, the height and width of the window, and other information.

LPPROCESS_INFORMATION lpProcessInformation This is a structure that returns information to the parent about the child, such as the process id and the handle.

This call, like most Win32 APIs, returns TRUE on success and FALSE on failure.

Here is a short piece of sample code that starts a new instance of Notepad.

#include <windows.h>
#include <stdio.h>
#include <string.h>

/* Returns the system's text for the most recent error. The buffer is
   allocated by FormatMessage; a long-running program should release
   it with LocalFree() when it is done with the message. */
char *GetErrorMessage() 
{
    char *ErrMsg = NULL;
    FormatMessage( 
        FORMAT_MESSAGE_ALLOCATE_BUFFER | 
        FORMAT_MESSAGE_FROM_SYSTEM | 
        FORMAT_MESSAGE_IGNORE_INSERTS,
        NULL,
        GetLastError(),
        MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT), /* default language */
        (LPTSTR) &ErrMsg,
        0,
        NULL 
    );
    return ErrMsg;
}

int main()
{
    char commandline[255];
    PROCESS_INFORMATION ProcessInfo;
    STARTUPINFO StartupInfo;
    strcpy(commandline,"notepad");
    GetStartupInfo(&StartupInfo);
    /* CreateProcess returns a nonzero value on success */
    if (CreateProcess (
            "c:\\winnt\\notepad.exe",
            commandline, NULL, NULL,
            FALSE, 0, NULL, NULL, 
            &StartupInfo,
            &ProcessInfo)) {
        printf("Create was successful\n");
        printf("Proc id is %lu\n", ProcessInfo.dwProcessId);
        /* release the handles that were returned to the parent */
        CloseHandle(ProcessInfo.hThread);
        CloseHandle(ProcessInfo.hProcess);
    }
    else {
        printf("error, CreateProcess failed, error number %lu\n",GetLastError());
        printf("%s\n",GetErrorMessage());
    }
    return 0;
}

Pay particular attention to the first argument of CreateProcess. Note that since the backslash is the escape character in a string, the backslash character in a string has to be represented by a double backslash. This doesn't come up much with Unix, but it is a constant issue with Windows, because the backslash is the directory delimiter.

You can just copy my ugly function char *GetErrorMessage(), which returns the text of the error message if the call fails.

Here is a link to the full on-line help page for CreateProcess().

Redirecting Input and Output

On both Unix and Windows, when a new process is created, three I/O streams are automatically created. These are called Standard Input, Standard Output, and Standard Error. Standard Input defaults to the keyboard of the controlling terminal; Standard Output and Standard Error default to the monitor of the controlling terminal. The reason why there are two separate output streams which default to the monitor is that it is sometimes the case that the user would like to redirect the normal output to a file, but would like to have error messages displayed on the terminal (or sent to a different file).

Users can redirect the output of a process from the terminal to a file with the > operator at the shell prompt. For example:
ls -l > outfile
will send the output of the ls -l command to a file called outfile instead of the terminal.

The shell can also redirect standard input from the keyboard to a file with the < character.

Every Unix process has an array of file descriptors associated with it. Recall from the exercise that the open system call returned a file descriptor. A file descriptor is a small non-negative integer. By convention, standard input is file descriptor zero, standard output is file descriptor 1, and standard error is file descriptor 2. When your program opens a file, it is assigned the lowest unused file descriptor.

The various C library functions that write to the terminal (e.g. printf) or read from the keyboard (e.g. scanf, getchar) have code in them which calls the write and read system calls with the file descriptors set to standard output and standard input.

To redirect standard input or standard output to a file from within the program, use the dup or dup2 system calls. These two calls duplicate a file descriptor. The call to dup2 takes two arguments, fd1 and fd2, both file descriptors. The first argument, fd1, must be an already open file descriptor. The call makes fd2 refer to the same file as fd1. If fd2 is open, it is closed first. An example will clarify this.

/* dup2.c - a demo of the dup2 call */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
int main()
{
   int fd, retval;
   /* create temp with read and write permission for the owner */
   fd = open("temp",O_WRONLY | O_CREAT | O_TRUNC, 0200 | 0400); 
   /* fd is probably 3, the lowest unused file descriptor */
   if (fd < 0) {
          perror("Error on opening temp");
          exit(1);
   }
   printf("This line written on the terminal\n");
   fflush(stdout); /* push out any buffered output before redirecting */
   retval = dup2(fd,1); 
   if (retval != 1) {
       perror("Error on dup2");
       exit(1);
   }   
   printf("This line is written to the file temp\n");
   return 0;
}

This program opens a file called temp for writing. It then calls printf, which writes a line to the terminal. The next line is the crucial statement of the program

 retval = dup2(fd,1);

This line makes the file descriptor 1 refer to the file that fd refers to. Since file descriptor 1 is already open (it refers to standard output), it is closed first.

In the printf function, there is a line of code that looks like this
n = write(1,.....
Ordinarily, the file descriptor 1 refers to standard output, but our program has redefined it so that it refers to the file temp. This call to write will thus write to the file instead of to the terminal.

If successful, dup2 returns the value of its second argument. Otherwise it returns -1.

dup is an earlier version of dup2 which takes only one argument, the file descriptor to be duplicated, and makes the lowest unused file descriptor equivalent to it. To use this for redirection, close file descriptor zero or one as appropriate before calling dup.

Here is the man page for dup2

A call to fork creates a new process which is identical (except for the return value from fork) to the old process. The file descriptor table of the parent is also copied to the child, so if the parent has an open file, the child will inherit this open file. Also, if the parent has duplicated a file descriptor, the child will inherit this. The file descriptor table is also preserved across a call to a member of the exec family of system calls, except for any descriptors that have been marked close-on-exec.

This is important for the implementation of a shell. Recall that a shell forks off a new process to execute the command. If the user specifies the output to go to a file rather than to standard output, the shell can use dup2 to redirect standard output to a file, then call exec to execute the command.

Here is the specification for the second programming assignment, a simple shell