CSCI.4210 Operating Systems Fall, 2009 Class 14
Input/Output Systems

Design Issues

Devices are divided into two broad categories: block oriented and character oriented (sometimes called stream oriented). Block oriented devices store data in fixed size chunks (blocks), and transfers are done a block at a time. Disk drives and tapes are examples of block devices. Character oriented devices transfer data either as single characters or as a stream of characters with no block structure. Terminals, printers, and network communication ports are examples of character oriented devices.

Block oriented devices tend to be random access, which means that they support a seek operation; i.e. data can be read from anywhere on the device. Both Unix and Win32 allow reads to start at an arbitrary point; it is possible to say "read the ten bytes starting at byte 1000 in the file" without first reading the preceding 1000 bytes. Character oriented devices do not generally support seek operations; the data has to be accessed in the order that it is presented by the device. When the user enters a string of keystrokes at the keyboard, there is no way for a process to read the fifth keystroke without reading the first four keystrokes first.
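For example, here is a small sketch (not part of the original notes) of such a random access read using the Unix lseek and read system calls; the file name is just a placeholder.

/* Read ten bytes starting at byte 1000 of a file, without reading the first
   1000 bytes.  "somefile" is just a placeholder name. */
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    char buf[10];
    int fd = open("somefile", O_RDONLY);

    lseek(fd, 1000, SEEK_SET);    /* position the file offset at byte 1000 */
    read(fd, buf, sizeof buf);    /* read the next ten bytes */
    close(fd);
    return 0;
}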

The job of the I/O system is to read data from a device and get the data into the memory space of the calling process, or to write data from the process memory space to the device. The kernel provides a number of services to accomplish this. One of these, the file system, will be discussed in a different lesson.

The kernel typically provides a number of buffers for I/O operations, particularly for read operations from block devices like a disk drive. Whenever it processes a request for an I/O operation, it assigns one or more buffers. A block of data is read in from the device to the kernel buffer, and then the kernel copies the data from the kernel buffer to the user space memory.

Input performance can be improved if the system assigns multiple buffers to a read operation. As soon as one buffer is filled, the system not only starts transferring the contents of this buffer to the memory in the calling process, but it also starts reading the next block of data, even if the process has not requested this data yet. This makes sense for two reasons. First, most processes read files sequentially, so if they have requested a particular block of data from a file, it is likely that the process will soon be requesting the next block of data. Second, reading sequential blocks of data from a disk is far faster than reading random blocks, so the additional time and overhead needed to read the next block is pretty small. This makes it worthwhile to read this block even though there is a chance that the data will not be needed.

Assigning two buffers to an input process when only one is apparently needed is called double buffering and this is used in most modern operating systems.

If the operating system reads ahead as described above, this means that when a process makes a read system call to get data from a file on a disk, it is likely that in fact the requested data has already been read and is in a kernel buffer. The obvious exception is the first read from a file or other device.

Buffering can also speed up output. When a process makes a call to write, the system first copies the data to be written from user space to a kernel buffer. Copying from one memory location to another takes several orders of magnitude less time than copying from memory to disk. After this copy is finished, the calling process can continue execution while the I/O system copies the data from the kernel buffer to the disk.
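Here is a small sketch (not part of the original notes) illustrating this effect: the write call below normally returns as soon as the data have been copied into a kernel buffer, while fsync forces the buffered data all the way out to the disk. The file name and sizes are arbitrary.

/* write() normally returns as soon as the data have been copied into a kernel
   buffer; fsync() waits until the buffered data have actually reached the disk. */
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    char data[4096] = {0};
    int fd = open("out.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);

    write(fd, data, sizeof data);   /* fast: memory to memory copy into a kernel buffer */
    fsync(fd);                      /* slow: forces the kernel buffer out to the disk */
    close(fd);
    return 0;
}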

Each device has a device controller, a hardware component that performs the physical reads and writes and converts data between its physical representation on the device and bytes in memory. For example, a disk drive controller reads a string of bits off the disk and converts these into bytes. Typically, in addition to the actual data bits, it may read some preamble bits and a checksum. The controller calculates the checksum from the bits that it read and confirms that the data it read are correct. It then transfers these bytes into main memory, either through the kernel, or in the case of DMA, under the control of the DMA controller, which copies the data to the memory space of the process.

Each device also has a device driver. A device driver is a piece of software, usually residing in the kernel, which communicates requests between the kernel I/O system and the device controller. Linux and other newer operating systems have a mechanism by which device drivers can be installed on the fly, i.e. without having to reboot the operating system.

Device drivers communicate with the device controllers through registers. The controller has several registers; often there are four: command (read vs. write), status, data in, and data out. Most operating systems have a standard interface which the kernel uses to communicate with the device driver.
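As a purely hypothetical illustration (the register layout, values, and busy-wait protocol below do not correspond to any real device), a driver for a memory-mapped controller with these four registers might look something like this.

/* A hypothetical memory-mapped controller with the four registers named above. */
#include <stdint.h>

struct dev_regs {
    volatile uint32_t command;   /* what to do, e.g. 1 = read, 2 = write */
    volatile uint32_t status;    /* the controller sets a busy bit while working */
    volatile uint32_t data_in;   /* data from the device to the driver */
    volatile uint32_t data_out;  /* data from the driver to the device */
};

#define STATUS_BUSY 0x1

/* issue a read command and poll until the controller has finished */
static uint32_t dev_read(struct dev_regs *regs)
{
    regs->command = 1;
    while (regs->status & STATUS_BUSY)
        ;                        /* a real driver would block and wait for an interrupt */
    return regs->data_in;
}

int main(void)
{
    static struct dev_regs fake = { 0, 0, 42, 0 };   /* stand-in for a real memory-mapped device */
    return (int)dev_read(&fake);
}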

Hard drives

Hard disks are probably the most important I/O device, and so we will start off with a discussion of how they operate. There are lots of good web sites which describe the basics of hard drives, and many students probably know all this stuff anyway, but you are responsible for reading these two web sites, which together could be called hard drives for dummies.

How Stuff Works: How Hard Disks Work. By the way, Marshall Brain, the author of the How Stuff Works series, is a Rensselaer graduate.

The PC Guide: Hard Disk Drives

This is a computer science course, not a computer engineering course, so we are not particularly concerned with the hardware of the disk drive per se, but rather with the software/hardware interface. Recall that a disk drive is a block oriented device, which means that data are read and written in fixed sized blocks. A typical block size is 512 bytes. A sector contains one block, along with some alignment information.

The software interface to the disk controller will tell the disk controller which track and sector to read from. To do this, the controller has to move the read/write head in or out to the appropriate track. The time that it takes to do this is called the seek time. Once the head is correctly positioned, the controller has to wait for the desired sector to spin around so it is under the head. This time is called the rotational delay.

Seek time is hard to pin down because it involves startup time and some settling time to stop the arm when it is in the correct position. Typical seek times today are 5 to 10 msec. Typical hard disks rotate at 7,200 or 10,000 rpm; at 10,000 rpm that is one revolution every 6 msec, so the average rotational delay will be 3 msec.

The total transfer time can be expressed as

T = b / (rN)

where T is transfer time, b is the number of bytes to be transferred, N is the number of bytes on the track, and r is the rotation speed in revolutions per second. Thus total access time can be expressed as

Ta = Ts + 1/(2r) + b/(rN)

where Ts is the average seek time.

If there are 320 sectors per track, here are some typical values.

Ts        10 msec
1/(2r)     3 msec
b/(rN)     0.02 msec

Essentially all of the access time is in the seek time and rotational delay. There is an obvious point to be made here: once the disk controller has moved the read/write arm to the appropriate track and waited for the sector to come under the head, reading additional blocks on the same track takes very little time. This is why reading more of a file is essentially free if subsequent sectors on the same track hold the file's data in sequence.
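As a check, here is a short program (not part of the original notes) that plugs the numbers above into the formula, assuming a 10,000 rpm drive, 320 sectors of 512 bytes per track, a 10 msec average seek, and a one sector transfer.

/* Plugging the numbers above into Ta = Ts + 1/(2r) + b/(rN). */
#include <stdio.h>

int main(void)
{
    double r  = 10000.0 / 60.0;   /* rotation speed, revolutions per second */
    double N  = 320.0 * 512.0;    /* bytes per track */
    double b  = 512.0;            /* bytes to transfer (one sector) */
    double Ts = 0.010;            /* average seek time in seconds */

    double rot  = 1.0 / (2.0 * r);   /* average rotational delay */
    double xfer = b / (r * N);       /* transfer time */
    double Ta   = Ts + rot + xfer;

    printf("rotational delay: %.2f msec\n", rot * 1000.0);   /* about 3.00 */
    printf("transfer time:    %.2f msec\n", xfer * 1000.0);  /* about 0.02 */
    printf("total access:     %.2f msec\n", Ta * 1000.0);    /* about 13.02 */
    return 0;
}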

Disk scheduling policies

In a typical system, the disk drive will be receiving many requests for reads and writes, and these will have to be queued up. We can look at several strategies for scheduling I/O events on a disk.

First come first served: The obvious default scheduling algorithm is first come, first served; in other words, process the requests in the order that they are received.

Suppose a platter has 128 tracks. To do a simulation of this we will make some simplifying assumptions. First, let's assume that requests for data come in random order. In practice this is not the case; typically files are read sequentially, and as we saw above, operating systems take advantage of this to read ahead. Second, we will ignore any times related to starting and stopping the read/write head. We will assume that it moves at a constant speed, taking 12.8 msec to get from the outermost track to the innermost track or vice versa; in other words, it takes 0.1 msec to move across one track. Third, we will assume that the rotational delay is always 3 msec.

Here is a burst of requests; the start of the burst is time 0. Initially the head is at track 0.

Request    Arrival time (msec)    Track
1           0                      34
2           2                     110
3           5                      77
4           7                      12
5          12                      62
6          18                      31
7          20                     108
8          22                      43

When the disk controller receives the first request, it starts moving toward track 34. It takes 3.4 msec to position itself over track 34, and an additional 3 msec for rotational delay. So the first operation takes 6.4 msec. By this time, two more requests have come in, one for track 110 and one for track 77. Since we process these in order, the controller moves the head from track 34 to track 110. This takes 7.6 msec, plus an additional 3 msec of rotational delay, so operation two takes 10.6 msec and is completed at time 17. But by this time two more requests have come in. The controller processes the third request, moving the head from track 110 to track 77. This takes 3.3 msec, so the entire operation takes 6.3 msec, and completes at time 23.3.

The following table shows how long it would take to complete these eight operations and the end time of each.

Request    Time (msec)    Completed at (msec)
1           6.4             6.4
2          10.6            17.0
3           6.3            23.3
4           9.5            32.8
5           8.0            40.8
6           6.1            46.9
7          10.7            57.6
8           9.5            67.1

It would take a total of 67.1 msec. to complete these eight I/O operations. Can we do better?

Closest track next: It might make sense to always do the closest job next. This would obviously result in shorter seek times. However, even though the seek times might, on average, be shorter, there are two problems. The first is a fairness issue; data on the middle tracks would be accessed far faster than data on the very low numbered or very high numbered tracks. It should be obvious that on a platter with 128 tracks, a sector on or near track 64 is more likely to be near a randomly selected track than is a sector on track 1 or track 127. It might be possible for an operating system to use this unfairness to its advantage by placing often used files on the middle tracks and rarely used files on the outermost and innermost tracks. However, when files are created, the system does not generally know whether or not the file will be heavily used, and so no operating system uses this method.

The second, more serious problem with the closest track next algorithm is the potential for starvation. Recall that the definition of starvation is that the time to honor a particular request is unbounded. A request for an operation on a high numbered track could be blocked indefinitely if requests for low numbered tracks continuously arrived.

Because of these two problems and because the performance of the next algorithm is almost as good and does not have the same problems with fairness or starvation, closest track next is not used.

The elevator algorithm

There are some similarities between a read/write head seeking tracks and requests for an elevator in a skyscraper. In both cases, requests come in to go to a particular floor or track on a more or less random basis and the elevator has to get its passengers to the requested floors as efficiently as possible. Elevators solve this problem by going in one direction as long as there are more requests in that direction, and then going in the other direction as long as there are more requests.

It turns out that this algorithm works well for disk seek operations. At any given instant, the read/write head is either moving in toward the center of the disk or out toward the outside. If it can satisfy a new request by moving in the same direction, it does so, but if it would have to switch directions and there are additional requests that it could satisfy by going in its current direction, it will not satisfy the new request until it turns around.

If the head is moving in the direction of higher track numbers and it is at track 40 and if there is a request for data on track 100, it would not attempt to satisfy a request for data on track 38 until it had satisfied all of the requests which were higher, even though it only has to move two tracks. When the read/write head gets to track 100, it would turn around and go to track 38 if there were no requests for tracks greater than 100.

The elevator analogy clarifies this. If you are on an elevator going down to the first floor, and the elevator is on the 20th floor, and a request comes in to go to the 22nd floor, you would be pretty mad if the elevator went up two flights without first going to the first floor.

Let's do a simulation of the elevator algorithm using the same data. The disk arm starts at position zero and the first request comes in for track 34, so this takes 3.4 msec of seek time and 3.0 msec of rotational delay for a total of 6.4 msec, the same as first come, first served. However, by now two more requests have arrived, one for track 110 and one for track 77. The read-write head is moving up, and so it honors request 3 (track 77) before request 2. This takes 4.3 msec of seek time and 3 msec of rotational delay for a total of 7.3 msec. But by now two more requests have come in. Request 4 is for track 12 and request 5 is for track 62. Although track 62 is closer, the elevator algorithm first goes to track 110. Here is the complete simulation.

Request    Track    Time (msec)    Cumulative time (msec)
1            34       6.4             6.4
3            77       7.3            13.7
2           110       6.3            20.0
7           108       3.2            23.2
5            62       7.6            30.8
8            43       4.9            35.7
6            31       4.2            39.9
4            12       4.9            44.8

Note that it took a total of only 44.8 msec to process the eight operations with the elevator algorithm, in contrast to 67.1 msec with the first come, first served algorithm.
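Here is a short simulation (a sketch, not part of the original notes) that replays the same burst of requests under both policies, using the same assumptions: 0.1 msec of seek time per track, a constant 3 msec rotational delay, and a head that starts at track 0. It reproduces the two tables above.

/* Simulate first come, first served vs. the elevator algorithm on the burst above. */
#include <stdio.h>

#define NREQ 8

struct req { double arrival; int track; int done; };

/* the burst of requests from the table above: arrival time (msec) and track */
static const struct req burst[NREQ] = {
    {0, 34}, {2, 110}, {5, 77}, {7, 12}, {12, 62}, {18, 31}, {20, 108}, {22, 43}
};

static double simulate(int elevator)
{
    struct req r[NREQ];
    double now = 0.0;
    int head = 0, dir = 1, served = 0, i, pick;

    for (i = 0; i < NREQ; i++)
        r[i] = burst[i];

    while (served < NREQ) {
        int pending = 0;
        for (i = 0; i < NREQ; i++)
            if (!r[i].done && r[i].arrival <= now)
                pending++;
        if (pending == 0) {            /* nothing has arrived yet: wait */
            now += 0.1;
            continue;
        }

        pick = -1;
        if (elevator) {
            /* nearest pending request in the current direction, else reverse */
            int best = 1 << 30;
            for (i = 0; i < NREQ; i++) {
                if (r[i].done || r[i].arrival > now)
                    continue;
                int d = (r[i].track - head) * dir;
                if (d >= 0 && d < best) { best = d; pick = i; }
            }
            if (pick < 0) { dir = -dir; continue; }
        } else {
            /* first come, first served: earliest arrival among pending requests */
            for (i = 0; i < NREQ; i++)
                if (!r[i].done && r[i].arrival <= now &&
                    (pick < 0 || r[i].arrival < r[pick].arrival))
                    pick = i;
        }

        int dist = r[pick].track - head;
        if (dist < 0)
            dist = -dist;
        double op = dist * 0.1 + 3.0;        /* seek time plus rotational delay */
        now += op;
        head = r[pick].track;
        r[pick].done = 1;
        served++;
        printf("request %d (track %3d): %5.1f msec, finished at %5.1f\n",
               pick + 1, r[pick].track, op, now);
    }
    return now;
}

int main(void)
{
    double fcfs, elev;
    printf("First come, first served:\n");
    fcfs = simulate(0);
    printf("\nElevator:\n");
    elev = simulate(1);
    printf("\nTotals: %.1f msec (FCFS) vs %.1f msec (elevator)\n", fcfs, elev);
    return 0;
}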

There is no problem with starvation with the elevator algorithm because the time to get to a particular track is always bounded. If the head is at track 40, going up, and a request comes in for track 38, the upper bound would be the number of tracks greater than 40 as the head moved upward plus the number of tracks between the maximum track and 38 as it moved back down. This may be a large number, but it is bounded, so it satisfies our definition of no starvation.

There is still a problem with fairness, but it is much smaller. Some tracks will, on average, receive slower service than other tracks. If this is a matter for concern, a variant of the elevator algorithm called C-SCAN can be used. In this algorithm, the arm always goes in the same direction, from the lowest track request to the highest track request, for example. After processing the request with the highest track number, the arm moves back to the first track or to the track with the lowest numbered request. It then starts moving upward again, processing the requests in track order.

RAID

Performance (i.e. speed) of disk drives has not improved nearly as rapidly as the performance of memory or CPUs, but disk drives have become very inexpensive. This has led to the use of multiple disk drives to improve performance. With multiple disk drives, separate operations can be done in parallel as long as the operations are on separate disks. Even a single operation can be sped up if the data to be retrieved or written can be distributed properly across several disks.

The computer industry has agreed on a standard for transparently distributing a file system across multiple disks. It is known by its acronym RAID which stands for Redundant Array of Independent Disks. There are seven different RAID architectures. They all share some common characteristics.

Here are the seven levels:

RAID Level 0

Level 0 is not technically a RAID scheme because it does not include redundancy; it uses striping only to improve performance. Data are striped, that is, divided up into strips which are distributed in an orderly fashion across all of the disks. Strips can be blocks or sectors or some other unit. Consecutive strips are mapped to consecutive disks. If there are four disks, strip 1 of a file would be on disk 0, strip 2 of the file would be on disk 1, strip 3 of the file would be on disk 2, strip 4 of the file would be on disk 3, strip 5 of the file would be on disk 0, and so on.

A set of logically consecutive strips that maps exactly one member to each disk is referred to as a stripe. Thus, logical strips 1, 2, 3, and 4 of the file constitute a stripe. This is important, because when the file is read (or written), note that strips 1, 2, 3, and 4 can be read in parallel. This can dramatically improve reads and writes of large files to and from disk.
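Here is a small sketch (not part of the original notes) of the mapping just described. Note that it numbers strips and disks from 0, whereas the text above numbers strips from 1.

/* RAID 0 striping: with ndisks disks, logical strip s lives on disk s % ndisks,
   at position s / ndisks on that disk. */
#include <stdio.h>

int main(void)
{
    int ndisks = 4;
    int s;

    for (s = 0; s < 8; s++)
        printf("strip %d -> disk %d, position %d on that disk\n",
               s, s % ndisks, s / ndisks);
    return 0;
}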

This performance improvement can only happen if the entire data pathway is completely parallel; that is, not only do the reads happen in parallel, but each disk has to have its own data line to the controller and from the controller to the system bus and eventually to memory.

Note also that this improvement in performance can only happen if the system makes large data requests. However, as we saw, modern I/O systems generally do make requests larger than the one the process issued. Even if the initial read is for a few bytes, all of which reside on strip 1, most systems would read well ahead into memory buffers on the assumption that the process is likely to request the other strips shortly.

Even without striping, a system with data distributed across multiple disks can show improved performance. Consider a transaction processing system in which there are a large number of requests for small records in a database. If the records are distributed randomly across the disks, several different records can be read at the same time, thus enhancing performance.

RAID level 1

All of the levels of RAID other than 0 incorporate redundancy to preserve data in the event of the loss of data on a disk. One of the most common types of hardware failure in a computer is the head crash of a disk. The read/write head moves very close to the platter, and the platter is spinning very rapidly (typically 7,200 rpm). This requires very fine tolerances. If the head is jarred ever so slightly or if a piece of dust gets between the head and the platter, the platter can get scratched, causing some or all of the data to be lost.

Level 1 is the simplest, but it is not used much. In level 1, data striping is used as in RAID 0, but each logical disk is mapped to two different physical disks so that every disk has a mirror disk with the same data. With our example, there would be eight disks in the array, and strip 1 would be on both disk 0 and disk 4.

This has two advantages. First, a read request can be made from either of the two disks, whichever involves the minimum seek time plus rotational latency. Second, if a disk crashes and all of its data is lost, recovery is trivial because all of the data is on the mirror disk.

Write operations have to be done twice, but this does not usually impose much of a performance penalty because the two writes can be done in parallel. However, the write time is the time of the slower of the two operations.

The principal disadvantage of RAID level 1 is the cost, since it requires twice as much disk space as the logical disk space.

RAID level 1 is often called mirroring for obvious reasons.

RAID level 3

I will discuss RAID level 3 before RAID level 2 because it is simpler to understand and more widely used.

Both RAID level 2 and level 3 use a parallel access technique in which all member disks participate in every I/O operation. Typically the spindles and the read/write heads are synchronized so that all of the heads are at the same place on each platter at all times. Strips consist of very small units, on the order of a single byte or word. Parity information is also stored to provide error detection and/or error correction. If there is a single bit error, the RAID controller can detect and correct the error immediately without other components of the system being aware of the error.

RAID level 3 has a single parity disk. In the event of a disk failure, the data can be reconstructed. Note that every read and write involves all of the data disks and the parity disk, and that the disks are so synchronized that we can have corresponding bits (or bytes or words) on all of the disks. Typically strips are very small, on the order of a single byte. Thus successive bytes of a file are stored on successive disks.

The parity value is calculated by performing an exclusive or operation on the data bits. One way to think about this is that the value on the parity disk is set to either zero or one in order to make the total number of one bits be an even number.

Here is an example. Suppose that a system has four disks for data and a fifth disk for parity. Strips are one byte wide. The table below shows the bits for four, 8-bit bytes and the value of the parity disk. The columns represent different disks, so there are five columns.

Disk 1  Disk 2  Disk 3  Disk 4  Parity
  0       1       0       1       0
  1       1       1       1       0
  0       1       1       1       1
  0       0       0       0       0
  1       0       0       0       1
  0       0       1       1       0
  0       1       0       0       1
  1       0       0       1       0
Thus, the first disk has the byte 01001001, the second disk has the byte 11100010, the third disk has the byte 01100100, and the fourth disk has the byte 11100101. These would typically be four consecutive bytes of a file. Note that the value on the parity disk (00101010) is the result of performing a bitwise exclusive or on the four data bytes.

The parity disk allows for data to be reconstructed in the event of a disk failure. Suppose that the leftmost disk crashes and is completely unreadable. Because of the parity bits, we can reconstruct all of the data on this disk. If you look at the top row, we still have the three data bits 1, 0 and 1, and we know that the parity bit is zero, so we know that the missing value must be zero.
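Here is a small sketch (not part of the original notes) that computes the parity byte for the four data bytes in the table above and then rebuilds the first byte as if that disk had failed.

/* The four data bytes from the table above, one per disk; the parity byte is
   their bitwise exclusive or. */
#include <stdio.h>

int main(void)
{
    unsigned char disk[4] = { 0x49, 0xE2, 0x64, 0xE5 };  /* 01001001, 11100010, 01100100, 11100101 */
    unsigned char parity = disk[0] ^ disk[1] ^ disk[2] ^ disk[3];

    printf("parity byte: 0x%02X\n", parity);             /* 0x2A = 00101010 */

    /* suppose disk 0 fails: rebuild its byte from the survivors and the parity */
    unsigned char rebuilt = parity ^ disk[1] ^ disk[2] ^ disk[3];
    printf("rebuilt disk 0: 0x%02X (original was 0x%02X)\n", rebuilt, disk[0]);
    return 0;
}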

Notice that reads and writes can still be done in parallel. Because there are four disks, four bytes can be read at the same time, thus speeding up the time of reads by a factor of four. In fact, the parity disk is read as well, but is also done in parallel, so there is no time penalty.

There is one parity disk no matter how many data disks there are. If there are 16 data disks, meaning that 16 bytes can be read simultaneously, there is still a single parity disk and if any single disk fails, all of the data can be reconstructed.

RAID level 2

RAID 2 uses a Hamming code. This is a method of storing data redundantly so that errors can be not only detected, but often corrected on the fly. Hamming codes are often used for memory error checking.

Thus, with RAID level 2, if there are four data disks, there would need to be three additional parity disks for a total of seven. This is not widely used, because most disk drives now include error correcting code.

RAID level 4

The remaining levels of RAID use striping and parity, but the strips are relatively large. With RAID level 4, bit-by-bit parity is calculated across corresponding strips on each data drive, and the parity bits are stored on the corresponding strip on the parity drive.

This level incurs a performance penalty with small write operations because each write operation has to write not only the data, but also the corresponding parity strip, and this means that it has to read the old data and the old parity and recalculate the parity. Consider a system with four data disks and a parity disk. In order to write a single byte which happens to be on disk 1, the system first has to read the old data on disk 1 and the corresponding parity byte on disk 5. It then recalculates the parity and rewrites both the data and the parity byte. Thus one logical write operation involves two read operations and two write operations. This only makes sense on systems where most of the write operations involve writing large amounts of data. Even then, since every write has to involve the parity disk, the parity disk can become a bottleneck.
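Here is a sketch (not part of the original notes) of the small-write parity update just described; the byte values are made up for illustration.

/* The RAID 4/5 small write: the new parity can be computed from the old data,
   the new data, and the old parity, without touching the other data disks. */
#include <stdio.h>

static unsigned char update_parity(unsigned char old_data,
                                   unsigned char new_data,
                                   unsigned char old_parity)
{
    return old_parity ^ old_data ^ new_data;
}

int main(void)
{
    unsigned char old_data = 0x49, new_data = 0x3C, old_parity = 0x2A;

    printf("new parity: 0x%02X\n",
           update_parity(old_data, new_data, old_parity));   /* prints 0x5F */
    return 0;
}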


RAID level 5

RAID level 5 is similar to RAID level 4 except that the parity bits are distributed across all of the disks instead of being concentrated on one disk. This overcomes the problem of the bottleneck associated with the fact that every write operation involves the parity disk.

This is a very widely used system. It is very efficient for operations involving large files. For example, suppose that a process has to read a file occupying 20 sectors. Rather than reading each sector sequentially, it can read the sectors four at a time in parallel (if there are four data disks), a fourfold speedup. When it later writes the file, it can write four sectors plus the parity sector at the same time, so again there is a fourfold speedup compared to a single disk (minus the relatively small overhead of calculating the parity bits). Unlike levels 2 and 3, it does not require synchronizing the drives, which is difficult to achieve.

It also has redundancy built in so that if any one disk fails completely, all of its contents can be reconstructed automatically with no loss of data. Of course if two disks fail simultaneously, all of the data is lost.

RAID level 6

RAID level 6 was introduced later. It is for systems which require very high levels of redundancy and availability. It requires the computation of two different parity values for each stripe, and thus it requires two additional disks: a system which requires four data disks would have six disks in all. The advantage of RAID level 6 is that if two disks fail, no data is lost. The disadvantage is that there is a more substantial penalty incurred for small write operations, since every write operation also requires calculating and writing two separate parity values.

Clock Interrupts

All modern operating systems have one or more system clocks. These are I/O devices of sorts (at least according to your text). While they do not read data in or write it out, they periodically issue a clock tick, which causes an interrupt. The tick rate varies widely from one system to another; a typical rate is one tick every 10 to 20 msec.

Like other interrupts, whenever a clock tick occurs, the CPU stops whatever it was doing and jumps to an interrupt handler. We have already encountered some of the things that the clock tick handler has to do.

There are other housekeeping chores that the clock tick interrupt handler has to do.

It has to maintain the time of day clock. Unix systems maintain this value as the number of seconds that have passed since Jan 1, 1970. A signed 32 bit counter overflows in the year 2038 (an unsigned one would last until 2106), and at that time any Unix systems still using 32 bit time values will cease to function correctly (start planning for this now!). Systems need some way of maintaining this value even when the power is off. This is done either by maintaining a separate low-power clock that runs on a battery, or by downloading the current time from the network when the system is booted up.

Most systems have events which are scheduled to occur at a particular time. We have already seen the alarm system call, in which a process can set an alarm to go off in a certain number of seconds. Several of the sample programs have used the sleep system call, which causes a process to be blocked for a number of seconds (or milliseconds on Windows systems). Many Unix systems have a page daemon which is awakened periodically to check whether there are enough empty pages in memory for page faults to be handled efficiently, and if there are not, it will mark pages to be overwritten. Most systems also allow a user to schedule a command to run at a particular time; on Unix systems this is done with cron. The clock interrupt handler has to check for all of these events and, if necessary, call another handler to process an event which is scheduled.
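Here is a small sketch (not part of the original notes) of the alarm call mentioned above: the process asks the kernel for a SIGALRM in three seconds, and it is the clock tick handler that counts down the timer and eventually delivers the signal.

/* Ask for a SIGALRM in three seconds, then block until it arrives. */
#include <signal.h>
#include <unistd.h>

static void handler(int sig)
{
    (void)sig;
    write(1, "alarm went off\n", 15);
}

int main(void)
{
    signal(SIGALRM, handler);
    alarm(3);       /* deliver SIGALRM to this process in 3 seconds */
    pause();        /* block until a signal arrives */
    return 0;
}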

There are benefits to having clock ticks more frequently or less frequently. The obvious advantage of more frequent clock ticks is a more accurate clock. Clearly the system cannot time any event more accurately than the clock tick rate. A common exercise in programming courses is to time programs. In theory, the user can report the time to the nearest millisecond, but in practice, the times are not nearly this accurate because they cannot be more accurate than the clock tick rate. Even that is not particularly accurate, because if there are other, higher priority, interrupts waiting to be processed, it might take the system a few milliseconds to process the clock tick.

The obvious advantage of less frequent clock ticks is less overhead. Processing a clock tick takes time away from application programs.

Graphical User Interfaces

Essentially all modern operating systems use a Graphical User Interface, or GUI. A GUI has a bit mapped monitor, in which each pixel of the monitor can be accessed by the operating system. This is in contrast to a character oriented monitor, which was the norm during the 1980s. Character oriented monitors could only display characters in fixed positions on the screen. These are of course much easier to program.

GUI interfaces typically have four features: windows, icons, menus, and a pointing device (the mouse).

In general, each window is the front end of a process which is running or runnable. Typically, at any given time, one window has the focus, which means that when a keystroke is entered, the value is passed to the underlying process of that window.

The classical GUI interface was developed at Xerox PARC. The first commercial implementation was on the Apple Macintosh in 1984. This was an overwhelming commercial success and received a huge amount of favorable press because up until its release, almost all computing had been text based.

At the same time, Project Athena was launched at MIT (it actually started in 1983). Predominantly funded by IBM and DEC, it developed the prototype for the modern distributed computer network. One of the products of Project Athena was X Windows, which was a GUI for Unix workstations. Today, essentially all Unix workstations, regardless of the operating system, run X Windows. It is now generally referred to as simply X, or X11 to refer to the current version, which is 11.

Microsoft was a latecomer to the GUI world. Their command driven DOS operating system was so dominant in the PC world that they did not feel that they needed to develop GUI capabilities. Microsoft's first two attempts at a GUI interface, Windows 1.0 and Windows 2.0, were commercial failures (I bet you've never heard of them). Windows 3.1, released in 1992, was their first widely successful GUI product, although it was nothing more than a graphical front end for DOS. Windows 95, released in 1995, was a complete reworking of the operating system and was wildly successful.

GUIs, including X and Windows 95 and its successors, are event driven. Event driven programming is somewhat different from the kinds of programs that you have written so far in your courses. The logic is very simple, although the implementation is often messy. An event driven program displays one or more windows, tells the operating system what kinds of events it will respond to, and then has code to handle each of the possible events. Here are some examples of events.

Mouse moves
Mouse button clicks
Pressing a key
Releasing a key
Exposing a window

Typically the main portion of the program simply enters an infinite loop which senses events and for each one, it calls the appropriate function. Here is some simple pseudocode for an event driven program.

main()
{
   DisplayWindow();
   SetUpEventMask(); /* tell the os what events the program
                        will respond to */
   while (true) {
       GetNextEvent();
       ProcessEvent();
   }
}

MouseMoveEventHandler() { ... }
KeyDownEventHandler() { ... }
etc.

It is important to remember that events are asynchronous; they can occur in any order, and usually they are processed in the order that they are received.

A GUI system always has an extensive array of graphics calls, which allow the user to draw lines, text etc. in a window. Graphics on a GUI system is fairly complicated because in general, the programmer cannot make any assumptions about the size of the window. Usually the user can change the size of a window as well as the aspect ratio (the ratio of width to height), and it is up to the programmer to make sure that whatever is being displayed in the window looks reasonable no matter what the dimensions of the window.

Although there are some conceptual similarities between these two windowing systems in that they both use very similar concepts of event driven programming, there are also some major design differences.

One major difference between X and Microsoft Windows programming is that all of the X code runs in user space, completely independent of the operating system, while Microsoft Windows code is an integral part of the operating system. There are thousands of Win32 APIs but fewer than 200 Unix system calls, and the reason for this is that all of the GUI calls are APIs in Windows but not system calls in Unix.

Another major difference between X and Microsoft Windows is that X is designed for a distributed system while Microsoft is not. With X, the window for an application can be running on a different computer than the process itself. To use terms that you will become more familiar with later, there is an X server on a particular machine, which displays windows; and clients, which are processes which may be running on different machines, which establish a connection to a particular X server.

X Windows is much more flexible than the Windows GUI. X allows a user to choose among various window manager programs. A window manager is a program which runs on the X server and is responsible for displaying the windows. There are a number of different window managers, each of which has a slightly different look and feel.

Here is a link to a web site which discusses the concept of a window manager and has links to many of the common ones. Read the introduction. It's short.

Students often ask me why we still use text based I/O in our lower division courses. The answer is that even the simplest hello world program in Windows or X is extremely complex. Although this course will not ask you to actually write a GUI program for either X or Windows, we will discuss how this programming works and show some short samples.

Programming X Windows

It is possible to write an X Windows program at a number of different levels: by making low level Xlib calls directly, by using the Xt Intrinsics, or by using a widget toolkit such as Motif, which is what the example below uses.

Here is a short example of an X Windows program. It displays a window with two pushbuttons, labeled Hello and Quit. When the user pushes the Hello button, a popup window appears with the default appearance for a popup window (three buttons, labeled OK, Cancel, and Help). When the user presses the Quit button, the program terminates.

/* an X demo program.
   It creates a toplevel widget and four additional widgets.
   A RowColumn widget holds the two buttons.
   There are two PushButton widgets, "Hello" and "Quit".   
   Pushing the hello button causes the fourth widget, a
         PopupDialog to appear.
   Pushing the Quit button terminates the program.
   On Solaris, compile like this gcc prog1.c -lXm -lXt
*/
#include <Xm/PushB.h>
#include <Xm/RowColumn.h>
#include <Xm/MessageB.h>
#include <stdlib.h>   /* for exit() */

void message_callback(), quit_callback();
int main(int argc, char *argv[])
{
    Widget       toplevel, manager, hello_button, quit_button;
    XtAppContext app;
    
    toplevel = XtVaAppInitialize(&app,"Demo1",NULL,0,
               &argc, argv, NULL, NULL);

    manager = XtVaCreateWidget("manager", xmRowColumnWidgetClass,
               toplevel,NULL); 

    hello_button = XtVaCreateWidget("Hello",
               xmPushButtonWidgetClass, manager,NULL);
    XtAddCallback(hello_button, XmNactivateCallback, message_callback,
               NULL);

    quit_button = XtVaCreateWidget("Quit",
               xmPushButtonWidgetClass, manager,NULL);
    XtAddCallback(quit_button, XmNactivateCallback, quit_callback, NULL);

    XtManageChild(hello_button);
    XtManageChild(quit_button);
    XtManageChild(manager);
    XtRealizeWidget(toplevel);
    XtAppMainLoop(app);
    return 0; /* we never get here */
}

void message_callback(Widget w, XtPointer client_data, XtPointer call_data)
{
    Widget message_box;
    Arg args[3];
    XmString xmsg;

    char *msg = "You have pushed the Hello Button";

    xmsg = XmStringCreateLocalized(msg);

    XtSetArg(args[0],XmNmessageString,xmsg);
    message_box = XmCreateInformationDialog(w, "info", args, 1);
    XmStringFree(xmsg);
    XtManageChild(message_box);
    XtPopup(XtParent(message_box), XtGrabNone);
}

void quit_callback(Widget w, XtPointer client_data, XtPointer call_data)
{
    exit(0);
}

The function XtAppMainLoop() is the library function which contains an infinite loop to respond to events. Each widget class has predefined the events to which it will respond. The function XtAddCallback adds a callback routine to a particular widget.

Windows GUI Programming

The Microsoft approach to software development is to tightly couple the programming with the operating system. The advantage of this approach is that the user can quickly develop fairly powerful applications which do very sophisticated things without much programming, but the disadvantage is that all portability is lost. Also, the learning curve to write even trivial applications is fairly steep.

Windows programming can be done at two levels: programming in C with the Windows Application Program Interface (API), and programming with the Microsoft Foundation Classes (MFC). The latter is a set of classes which build on the Windows API.

Code using either of these methods does not look much like the code that you have developed so far. For example, there is no concept of a controlling terminal, so you can't use the I/O library functions like printf, scanf (or cin and cout in C++). There is not even a function called main() (at least not in any code that the programmer writes; main is hidden in one of the API calls). The entry point for a windows program is a function named WinMain; this function creates a window and then enters an infinite loop. This loop receives events (called messages in Windows terminology) and acts on them. Events include such things as moving the mouse, hitting a key, depressing a mouse button, releasing a mouse button, or messages from the operating system to resize the window. The actions which are called in response to events are called callbacks. There are about 200 message types; here is a list of the ten most commonly used.

WM_CHAR A character is input from the keyboard
WM_COMMAND The user selects an item from a menu
WM_CREATE A window is created
WM_DESTROY A window is destroyed
WM_LBUTTONDOWN The left mouse button is pressed
WM_LBUTTONUP The left mouse button is released
WM_MOUSEMOVE The mouse pointer is moved within the window
WM_PAINT The window needs repainting (because it was resized
or because part of it had been covered by another
window and has become exposed)
WM_QUIT The application is about to terminate
WM_SIZE A window is resized

The user writes callback code which is invoked by each of the various messages, and this is the essence of windows programming. As events occur, they are put in a queue, and the main loop of the program retrieves these messages and acts on them according to these functions.
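For comparison with the MFC example below, here is a minimal sketch (not part of the original notes) of this structure using the raw Win32 API, assuming an ANSI (non-Unicode) build. WinMain registers a window class, creates a window, and runs the message loop; WndProc is the callback that handles each message.

/* A minimal Win32 API skeleton: register a window class, create a window,
   and dispatch messages to the WndProc callback. */
#include <windows.h>

LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
    switch (msg) {
    case WM_LBUTTONDOWN:          /* left mouse button pressed */
        MessageBox(hwnd, "Mouse click", "Event", MB_OK);
        return 0;
    case WM_DESTROY:              /* window is being destroyed */
        PostQuitMessage(0);
        return 0;
    }
    return DefWindowProc(hwnd, msg, wParam, lParam);   /* default handling */
}

int WINAPI WinMain(HINSTANCE hInst, HINSTANCE hPrev, LPSTR cmdLine, int nShow)
{
    WNDCLASS wc = {0};
    MSG msg;
    HWND hwnd;

    wc.lpfnWndProc   = WndProc;
    wc.hInstance     = hInst;
    wc.lpszClassName = "DemoClass";
    RegisterClass(&wc);

    hwnd = CreateWindow("DemoClass", "Demo", WS_OVERLAPPEDWINDOW,
                        CW_USEDEFAULT, CW_USEDEFAULT, 300, 200,
                        NULL, NULL, hInst, NULL);
    ShowWindow(hwnd, nShow);

    /* the event loop: retrieve each message and dispatch it to WndProc */
    while (GetMessage(&msg, NULL, 0, 0)) {
        TranslateMessage(&msg);
        DispatchMessage(&msg);
    }
    return (int)msg.wParam;
}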

The Microsoft Foundation Classes (MFC)

The Microsoft Foundation Classes are a set of some 200 classes which place an object-oriented wrapper around the Windows APIs. Consistent with the OO philosophy, everything is a class. For example, there is an application class CWinApp and your application is a class derived from this. There are several window classes, such as CFrameWnd, and your windows are classes derived from these. MFC makes heavy use of inheritance, and all classes fit into a class hierarchy. The root of this hierarchy is the class CObject.

Here is a simple (Ha!) program which uses MFCs. Notice that there is no obvious entry point (i.e., no function called main or WinMain) and no obvious flow of control.

// When the user depresses the left mouse button, the
// program draws the coordinates of the pointer location
// on the screen.
// Unfortunately, the window is cleared whenever it
// is resized, covered, or minimized

#include <afxwin.h>
class CMyApp : public CWinApp
{
public:
    virtual BOOL InitInstance();
};

class CMyWindow : public CFrameWnd
{
public:
    CMyWindow();
protected:
    afx_msg void OnLButtonDown(UINT flags, CPoint point);
    DECLARE_MESSAGE_MAP();
};

CMyApp myApp;

BOOL CMyApp::InitInstance()
{
    m_pMainWnd = new CMyWindow;
    m_pMainWnd->ShowWindow(m_nCmdShow);
    m_pMainWnd->UpdateWindow();
    return TRUE;
}

BEGIN_MESSAGE_MAP (CMyWindow, CFrameWnd)
     ON_WM_LBUTTONDOWN()
END_MESSAGE_MAP()

CMyWindow::CMyWindow()
{
    Create(NULL, "My First Window Application");
}

void CMyWindow::OnLButtonDown(UINT flags, CPoint point)
{
    CClientDC dc(this);
    char s[32];
    sprintf(s, "[%d %d]", point.x, point.y);
    dc.TextOut(point.x, point.y, s);
}

I/O in Windows and Unix

Windows

A key component of device management in Windows systems is the Plug and Play manager. Plug and Play (PnP) is a combination of hardware and software support that enables the operating system to recognize and adapt to hardware configuration changes with little or no intervention by a human. New devices can be added (or devices removed) without the awkward configuration changes required by earlier systems. For example, a user can dock a portable computer and use the docking station keyboard, mouse and monitor without making manual configuration changes.

We can contrast this with earlier PC operating systems in which the process of adding a new device was a little tricky. After installing the device driver, it was usually necessary to reboot the system, and sometimes set some dip switches. Occasionally two different devices would attempt to use the same interrupt request line, with disastrous results.

With PnP, when a change in devices occurs, the operating system will automatically determine the resources needed by the device (i.e. I/O ports, IRQs (Interrupt Request Lines), DMA channel, and memory allocation) and make the appropriate assignments. It can even dynamically load the appropriate device driver. If two devices request the same IRQ, the PnP manager will automatically resolve the conflict.

This leads to another feature of modern device management. The old model of a computer with a single bus has been replaced by a much more complex, multiple bus system. Compare Figure 1.6 on page 19 of your text with Figure 1.12 on page 31 of your text. An operating system with PnP will build a device tree at boot time and update it as needed; the Microsoft Device Driver Development Kit help pages include an example of such a tree.

For example, many systems now have a SCSI (Small Computer System Interface) adapter. This is technically a bus rather than just an adapter, since several different peripheral devices can be attached to it at the same time. Its transmission rate is faster than that of the standard serial or parallel ports.

SCSI has been around for a while. A newer external I/O bus standard is the Universal Serial Bus (USB). Like SCSI, it allows numerous (up to 127) peripherals to be connected through one port.

I/O in Unix

In most Unix systems, devices are made to look like files. They are opened with the same system calls and read and written with the same system calls. All of the information about devices is in a directory named /dev. Here are some of the files in a typical linux directory along with a description.
/dev/fd0        Floppy disk
/dev/hda        First IDE disk
/dev/hda2       Second partition of first IDE disk
/dev/hdb        Second IDE disk
/dev/ttyp0      Terminal
/dev/console    Console
/dev/lp1        Parallel printer

It is possible to open a particular device directly and read from it using our standard system calls.

The following snippet of code opens the floppy drive on my linux box and reads the first sector (sectors are 512 bytes). Subsequent calls to read would read subsequent sectors.

int fd = open("/dev/fd0", O_RDONLY);   /* needs <fcntl.h> */

ssize_t n = read(fd, buffer, 512);     /* needs <unistd.h>; buffer must hold at least 512 bytes */

Linux allows users to install a device driver on the fly, without rebooting the system.
