Recall that one definition of an operating system is a resource allocator. There are many resources that can be allocated to only one process at a time, and we have seen several operating system features that allow this, such as mutexes, semaphores or file locks.
Sometimes a process has to reserve more than one resource. For example, a process which copies files from one tape to another generally requires two tape drives. A process which deals with databases may need to lock multiple records in a database.
In general, resources allocated to a process are not preemptable; this means that once a resource has been allocated to a process, there is no simple mechanism by which the system can take the resource back unless the process voluntarily gives it up or the system administrator kills the process. This can lead to a situation called deadlock. A set of processes or threads is deadlocked when each process or thread is waiting for a resource that is held by another process or thread to be freed. Here is an example of a situation where deadlock can occur.
```c
Mutex M1, M2;

/* Thread 1 */
while (1) {
    NonCriticalSection();
    Mutex_lock(&M1);
    Mutex_lock(&M2);
    CriticalSection();
    Mutex_unlock(&M2);
    Mutex_unlock(&M1);
}

/* Thread 2 */
while (1) {
    NonCriticalSection();
    Mutex_lock(&M2);
    Mutex_lock(&M1);
    CriticalSection();
    Mutex_unlock(&M1);
    Mutex_unlock(&M2);
}
```

Suppose thread 1 is running and locks M1, but before it can lock M2, it is interrupted. Thread 2 starts running; it locks M2, but when it tries to lock M1, it blocks because M1 is already held by thread 1. Eventually thread 1 starts running again and tries to lock M2, but it blocks because thread 2 holds M2. Both threads are now blocked; each is waiting for an event which will never occur.
Traffic gridlock is an everyday example of a deadlock situation.
In order for deadlock to occur, four conditions must be true.

- Mutual exclusion: each resource is either currently assigned to exactly one process or is available.
- Hold and wait: processes currently holding resources can request additional resources.
- No preemption: once a resource has been granted to a process, it cannot be taken away; it can only be released voluntarily.
- Circular wait: there must be a circular chain of two or more processes, each of which is waiting for a resource held by the next member of the chain.
The dining philosophers problem discussed in an earlier section is a classic example of deadlock. Each philosopher picks up his or her left fork and waits for the right fork to become available, but it never does.
Deadlock can be modeled with a directed graph. In a deadlock graph, vertices represent either processes (circles) or resources (squares). A process which has acquired a resource is shown with an arrow (edge) from the resource to the process. A process which has requested a resource which has not yet been assigned to it is modeled with an arrow from the process to the resource. If these arrows form a cycle, there is deadlock.
The deadlock situation in the above code can be modeled like this.
This graph shows an extremely simple deadlock situation, but it is also possible for a more complex situation to create deadlock. Here is an example of deadlock with four processes and four resources.
There are a number of ways that deadlock can occur in an operating system. We have seen some examples; here are two more.
Solutions to deadlock
There are several ways to address the problem of deadlock in an operating system.
The text refers to this as the Ostrich Algorithm. Just hope that deadlock doesn't happen. In general, this is a reasonable strategy. Deadlock is unlikely to occur very often; a system can run for years without deadlock occurring. If the operating system has a deadlock prevention or detection system in place, this will have a negative impact on performance (slow the system down) because whenever a process or thread requests a resource, the system will have to check whether granting this request could cause a potential deadlock situation.
If deadlock does occur, it may be necessary to bring the system down, or at least manually kill a number of processes; in most situations even that is an acceptable cost for an event that occurs so rarely.
Deadlock detection and recovery
As we saw above, if there is only one instance of each resource, it is possible to detect deadlock by constructing a resource allocation/request graph and checking for cycles. Graph theorists have developed a number of algorithms to detect cycles in a graph. The book discusses one of these. It uses only one data structure: L, a list of nodes.
A cycle detection algorithm
For each node N in the graph, perform the following steps, starting with L empty and all edges unmarked.

1. Add the current node to the end of L. If the node already appears in L, a cycle has been found and the algorithm terminates: there is deadlock.
2. If there are unmarked outgoing edges from the current node, pick one, mark it, follow it to the node it points to, and make that the current node; go back to step 1.
3. If there are no unmarked outgoing edges, this is a dead end. Remove the current node from L, back up to the node that preceded it, and go back to step 2. If the current node is the starting node N, there is no cycle reachable from N.
The algorithm needs to search each node; let's start at node P1. We add P1 to L and follow the only edge to R1, marking that edge. R1 is now the current node so we add that to L, checking to confirm that it is not already in L. We then follow the unmarked edge to P2, marking the edge, and making P2 the current node. We add P2 to L, checking to make sure that it is not already in L, and follow the edge to R2. This makes R2 the current node, so we add it to L, checking to make sure that it is not already there. We are now at a dead end so we back up, making P2 the current node again. There are no more unmarked edges from P2 so we back up yet again, making R1 the current node. There are no more unmarked edges from R1 so we back up yet again, making P1 the current node. Since there are no more unmarked edges from P1 and since this was our starting point, we are through with this node (and all of the nodes visited so far).
We move to the next unvisited node P3, and initialize L to empty. We first follow the unmarked edge to R1, putting R1 on L. Continuing, we make P2 the current node and then R2. Since we are at a dead end, we repeatedly back up until P3 becomes the current node again.
L now contains P3, R1, P2, and R2. P3 is the current node, and it has another unmarked edge to R3. We make R3 the current node, add it to L, follow its edge to P4. We repeat this process, visiting R4, then P5, then R5, then P3. When we visit P3 again we note that it is already on L, so we have detected a cycle, meaning that there is a deadlock situation.
Once deadlock has been detected, it is not clear what the system should do to correct the situation. There are three strategies.

- Preemption: take a resource away from its current owner and give it to another process, if the nature of the resource permits this.
- Rollback: checkpoint processes periodically, and when deadlock is detected, roll a process back to a point before it acquired the disputed resource.
- Killing processes: kill one or more of the processes in the cycle until the deadlock is broken.
Deadlock avoidance
The above solution allowed deadlock to happen, then detected that deadlock had occurred and tried to fix the problem after the fact. Another solution is to avoid deadlock by only granting resources if granting them cannot result in a deadlock situation later. However, this works only if the system knows what requests for resources a process will be making in the future, and this is an unrealistic assumption. The text describes the banker's algorithm, but then points out that it is essentially impossible to implement because of this assumption.
Deadlock Prevention
The difference between deadlock avoidance and deadlock prevention is a little subtle. Deadlock avoidance refers to a strategy where whenever a resource is requested, it is only granted if it cannot result in deadlock. Deadlock prevention strategies involve changing the rules so that processes will not make requests that could result in deadlock.
Here is a simple example of such a strategy. Suppose every possible resource is numbered (easy enough in theory, but often hard in practice), and processes must make their requests in order; that is, they cannot request a resource with a number lower than any of the resources that they have been granted so far. Deadlock cannot occur in this situation.
As an example, consider the dining philosophers problem. Suppose each chopstick is numbered, and philosophers always have to pick up the lower numbered chopstick before the higher numbered one. Philosopher 5 picks up chopstick 4, philosopher 4 picks up chopstick 3, philosopher 3 picks up chopstick 2, and philosopher 2 picks up chopstick 1. Philosopher 1 is hungry and, without this rule, would pick up chopstick 5, causing deadlock. With the lower-number rule in effect, however, he or she has to pick up chopstick 1 first, and it is already in use, so he or she is blocked. Philosopher 5 picks up chopstick 5, eats, and puts both chopsticks down, allowing philosopher 4 to eat. Eventually everyone gets to eat.
An alternative strategy is to require all processes to request all of their resources at once, and either all are granted or none are granted. Like the above strategy, this is conceptually easy but often hard to implement in practice because it assumes that a process knows what resources it will need in advance.
Livelock
There is a variant of deadlock called livelock. This is a situation in which two or more processes continuously change their state in response to changes in the other process(es) without doing any useful work. This is similar to deadlock in that no progress is made but differs in that neither process is blocked or waiting for anything.
A human example of livelock would be two people who meet face-to-face in a corridor and each moves aside to let the other pass, but they end up swaying from side to side without making any progress because they always move the same way at the same time.
Addressing deadlock in real systems
Deadlock is a terrific theoretical problem for graduate students, but none of the solutions discussed above can be implemented in a real world, general purpose operating system. It would be difficult to require a user program to make requests for resources in a certain way or in a certain order. As a result, most operating systems use the ostrich algorithm.
Some specialized systems have deadlock avoidance/prevention mechanisms. For example, many database operations involve locking several records, and this can result in deadlock, so database software often has a deadlock prevention algorithm.
The Unix file locking system lockf has a deadlock detection mechanism built into it. Whenever a process attempts to lock a file or a record of a file, the operating system checks whether that process already holds other file or record locks. If it does, the system uses a graph algorithm similar to the one discussed above to determine whether granting the request would cause deadlock. If it would, the request fails: the lockf system call returns -1 and errno is set to EDEADLK.
Recall that an interrupt is an asynchronous event which can happen at any time. When an interrupt occurs, the processor stops executing instructions in the current running process and executes an interrupt handler function in the kernel. Unix systems have a software interrupt mechanism called signals.
An example of a signal that you are probably familiar with is an interrupt signal which is sent by the user to a running process when the user enters Control-C. The default action of this signal is to kill the process.
A signal is represented as an integer. These integers are assigned symbolic names in the header file signal.h. The interrupt signal has the value 2, but you should use the symbolic name SIGINT.
Every signal has a default action. The default action for SIGINT is to abort the program. A program can modify the default action for most signals, or it can choose to ignore a signal.
The system call which does this has the following function prototype.
void (*signal (int sig, void (*disp)(int)))(int);
This says that the function signal takes two arguments. The first, sig, is the signal number. The second, disp, is a pointer to the handler function, which takes one argument, an integer (the signal number), and returns nothing. The call to signal changes the signal handling function for its first argument from the default to the function given as its second argument, and it returns a pointer to the previous handler.
Here is a simple example.
```c
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

void SigCatcher(int n)
{
    printf("Ha Ha, you can't kill me\n");
    signal(SIGINT, SigCatcher);   /* re-install the handler */
}

int main()
{
    int i;

    signal(SIGINT, SigCatcher);
    for (i = 0; i < 10; i++) {
        sleep(1);
        printf("Just woke up, i is %d\n", i);
    }
    return 0;
}
```

The main function calls signal to change the default action to the function SigCatcher, then enters a loop where it alternately sleeps for one second and displays a message on stdout. Normally, the user could kill this program by hitting Control-C while it was running, but because the signal action has been changed, when the user hits Control-C while this program is running, instead of the program dying, it displays the message Ha Ha, you can't kill me
Notice that the signal handler function itself calls signal. On some Unix systems, once a signal handler has been called, the system resets the handler to the default unless it is explicitly reset again.
Here is a list of the predefined signals on Solaris (there are some slight differences from one Unix system to another).
```c
#define SIGHUP     1   /* hangup */
#define SIGINT     2   /* interrupt (rubout) */
#define SIGQUIT    3   /* quit (ASCII FS) */
#define SIGILL     4   /* illegal instruction (not reset when caught) */
#define SIGTRAP    5   /* trace trap (not reset when caught) */
#define SIGIOT     6   /* IOT instruction */
#define SIGABRT    6   /* used by abort, replace SIGIOT in the future */
#define SIGEMT     7   /* EMT instruction */
#define SIGFPE     8   /* floating point exception */
#define SIGKILL    9   /* kill (cannot be caught or ignored) */
#define SIGBUS    10   /* bus error */
#define SIGSEGV   11   /* segmentation violation */
#define SIGSYS    12   /* bad argument to system call */
#define SIGPIPE   13   /* write on a pipe with no one to read it */
#define SIGALRM   14   /* alarm clock */
#define SIGTERM   15   /* software termination signal from kill */
#define SIGUSR1   16   /* user defined signal 1 */
#define SIGUSR2   17   /* user defined signal 2 */
#define SIGCLD    18   /* child status change */
#define SIGCHLD   18   /* child status change alias (POSIX) */
#define SIGPWR    19   /* power-fail restart */
#define SIGWINCH  20   /* window size change */
#define SIGURG    21   /* urgent socket condition */
#define SIGPOLL   22   /* pollable event occurred */
#define SIGIO     SIGPOLL /* socket I/O possible (SIGPOLL alias) */
#define SIGSTOP   23   /* stop (cannot be caught or ignored) */
#define SIGTSTP   24   /* user stop requested from tty */
#define SIGCONT   25   /* stopped process has been continued */
#define SIGTTIN   26   /* background tty read attempted */
#define SIGTTOU   27   /* background tty write attempted */
#define SIGVTALRM 28   /* virtual timer expired */
#define SIGPROF   29   /* profiling timer expired */
#define SIGXCPU   30   /* exceeded cpu limit */
#define SIGXFSZ   31   /* exceeded file size limit */
#define SIGWAITING 32  /* process's lwps are blocked */
#define SIGLWP    33   /* special signal used by thread library */
#define SIGFREEZE 34   /* special signal used by CPR */
#define SIGTHAW   35   /* special signal used by CPR */
#define SIGCANCEL 36   /* thread cancellation signal used by libthread */
#define SIGLOST   37   /* resource lost (eg, record-lock lost) */
```

Signal 11, SIGSEGV, is the signal that is received when the program causes a segmentation fault (memory exception error). The default action for this is to display the message
Segmentation Fault (core dumped)
You can change the action for this so that it displays a different message, but of course you cannot continue running the program.
Signal 9, SIGKILL, is the kill signal. A program is not allowed to change the signal handler for this signal. Otherwise, a rogue program could change all of its signal handlers so that no one could kill it. To send a kill signal from the shell to a particular process, enter
kill -9 ProcessNumber
Signal 14, SIGALRM, sends an alarm to a process. The default action for SIGALRM is to terminate the program, but this can be changed. The system call

unsigned int alarm(unsigned int sec);

sends a SIGALRM signal to the process after sec seconds. If you have changed the signal handler for SIGALRM, you can arrange for an event to happen after a set period of time.
You can choose to ignore any signal (except SIGKILL and SIGSTOP) by using SIG_IGN as the second argument of signal. You can also reset the signal handler for a particular signal to its default by using SIG_DFL as the second argument to signal.
Killing Zombies
Recall that if a child dies before its parent calls wait, the child becomes a zombie. In some applications, a web server for example, the parent forks off lots of children but doesn't care whether the child is dead or alive. For example, a web server might fork a new process to handle each connection, and each child dies when the client breaks the connection. Such an application is at risk of producing many zombies, and zombies can clog up the process table.
When a child dies, it sends a SIGCHLD signal to its parent. The parent process can prevent zombies from being created by creating a signal handler routine for SIGCHLD which calls wait whenever it receives a SIGCHLD signal. There is no danger that this will cause the parent to block because it would only call wait when it knows that a child has just died.
There are several versions of wait on a Unix system. The system call waitpid has this prototype
```c
#include <sys/types.h>
#include <sys/wait.h>

pid_t waitpid(pid_t pid, int *stat_loc, int options);
```

This functions like wait in that it waits for a child to terminate, but it allows the process to wait for a particular child by setting the first argument to the pid of the child we want to wait for. However, that is not our interest here. If the first argument is set to -1, waitpid waits for any child to terminate, just like wait. More importantly, the third argument can be set to WNOHANG, which causes the function to return immediately if no child has died. It is customary to use this function rather than wait in the signal handler.
Here is some sample code
```c
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

void zombiekiller(int n)
{
    int status;

    while (waitpid(-1, &status, WNOHANG) > 0)
        ;                              /* reap every child that has died */
    signal(SIGCHLD, zombiekiller);     /* re-install the handler */
}

int main()
{
    signal(SIGCHLD, zombiekiller);
    ....
}
```
Note: this topic does not really fit with the other lessons of the week, but you will need it for the exercise.
A second form of redirection is a pipe. A pipe is a connection between two processes in which one process writes data to the pipe and the other reads from the pipe. Thus, it allows one process to pass data to another process.
The Unix system call to create a pipe is
int pipe(int fd[2])
This function takes an array of two ints (file descriptors) as an argument. It creates a pipe, with fd[0] as the read end and fd[1] as the write end. Reading from the pipe and writing to the pipe are done with the read and write calls that you have seen and used before. On some older Unix systems both ends were opened for both reading and writing, but portable code always writes to fd[1] and reads from fd[0].
Pipes only make sense if the process calls fork after creating
the pipe. Each process should close the end of the pipe that
it is not using. Here is a simple example in which a child
sends a message to its parent through a pipe.
```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main()
{
    pid_t pid;
    int retval;
    int fd[2];
    int n;

    retval = pipe(fd);
    if (retval < 0) {
        printf("Pipe failed\n");   /* pipe is unlikely to fail */
        exit(0);
    }
    pid = fork();
    if (pid == 0) {                /* child */
        close(fd[0]);
        n = write(fd[1], "Hello from the child", 20);
        exit(0);
    } else if (pid > 0) {          /* parent */
        char buffer[64];
        close(fd[1]);
        n = read(fd[0], buffer, 63);
        buffer[n] = '\0';
        printf("I got your message: %s\n", buffer);
    }
    return 0;
}
```

There is no need for the parent to wait for the child to finish, because reading from a pipe will block until there is something in the pipe to read. If the parent runs first, it will try to execute the read statement and will immediately block because there is nothing in the pipe. After the child writes a message to the pipe, the parent will wake up.
Pipes have a fixed size (often 4096 bytes) and if a process tries to write to a pipe which is full, the write will block until a process reads some data from the pipe.
Here is a program which combines dup2 and pipe to redirect the output of the ls process to the input of the more process, as would be the case if the user typed

ls | more

at the Unix command line.
```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void error(char *msg)
{
    perror(msg);
    exit(1);
}

int main()
{
    int p[2], retval;

    retval = pipe(p);
    if (retval < 0)
        error("pipe");
    retval = fork();
    if (retval < 0)
        error("forking");
    if (retval == 0) {      /* child */
        dup2(p[1], 1);      /* redirect stdout to pipe */
        close(p[0]);        /* don't permit this process to read from pipe */
        execl("/bin/ls", "ls", "-l", NULL);
        error("Exec of ls");
    }
    /* if we get here, we are the parent */
    dup2(p[0], 0);          /* redirect stdin to pipe */
    close(p[1]);            /* don't permit this process to write to pipe */
    execl("/bin/more", "more", NULL);
    error("Exec of more");
    return 0;
}
```