Unix System Programming
- System calls
- Library functions
- Data structures
Unix Versions
- BSD - University of California at Berkeley
- System V - AT&T
- POSIX - IEEE
- X/Open - Group of Vendors
Process ID (PID)
Every process is assigned a unique identifier by the kernel. Typically the PID is a small
positive integer. Historically the range was 0-32767 (the non-negative values of a 16-bit
two's complement integer); many modern systems allow larger values.
getpid()
If you would like your program to find out the PID that has been
assigned when the program is run (the PID of your process):
int getpid(); /* BSD based systems */
pid_t getpid(); /* POSIX, Sys V */
Parent Processes
Every process has a parent process. To find out the process ID of its parent process,
your program needs to call the getppid system call:
int getppid(); /* BSD based systems */
pid_t getppid(); /* POSIX */
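As a minimal sketch of the two calls above, the following helper formats the PID of the calling process and of its parent. The name format_ids is hypothetical and chosen only for illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write "pid=<n> ppid=<n>" for the calling
 * process into buf. Returns what snprintf returns (the length of
 * the formatted text). */
int format_ids(char *buf, size_t len)
{
    pid_t pid = getpid();     /* PID of the calling process */
    pid_t ppid = getppid();   /* PID of its parent */

    /* pid_t has no printf conversion of its own, so cast to long. */
    return snprintf(buf, len, "pid=%ld ppid=%ld", (long)pid, (long)ppid);
}
```

A caller would pass a buffer of a few dozen bytes and print the result with puts().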
User ID (UID)
Every user account is assigned a unique (positive integer) identifier by the system
administrator. A process can find out the UID of the user executing the process
with the getuid system call:
unsigned short getuid(); /* BSD */
uid_t getuid(); /* POSIX */
Set-UID
A process can run with the privileges of the user who owns the program being run rather
than the user running the program. In this situation the process has an effective
user ID that is different from the value returned by getuid. To find out
the effective user ID :
uid_t geteuid(); /* POSIX */
Group ID
Each user is assigned a positive integer group ID (GID) by the system administrator.
Typically there are many users with the same GID. A process can find out the
GID of the user executing the process with the getgid system call:
gid_t getgid(); /* POSIX */
Effective Group ID
There is also an effective GID which determines the group whose
permissions will be assumed by a process.
gid_t getegid(); /* POSIX */
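The four ID calls can be combined to detect whether a process is running with changed privileges. This is a sketch only; the function name runs_with_changed_ids is an assumption, not part of any standard API.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the effective UID or GID differs
 * from the real one (as happens when a set-UID or set-GID program is
 * executed by another user), 0 otherwise. */
int runs_with_changed_ids(void)
{
    uid_t ruid = getuid();    /* real user ID */
    uid_t euid = geteuid();   /* effective user ID */
    gid_t rgid = getgid();    /* real group ID */
    gid_t egid = getegid();   /* effective group ID */

    return (ruid != euid) || (rgid != egid);
}
```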
/etc/passwd
The assignment of UIDs is (typically) kept in the file /etc/passwd
which also contains other information about each account on the
system. For each account the following information is present:
/etc/passwd File Format
- Each field in a record in /etc/passwd is delimited by a ':'.
- A line from /etc/passwd might look like this:
Accessing the password database
There are a number of library functions which can be used to access information from the
password database. This is important for 2 reasons:
- You don't need to read lines from /etc/passwd and split the fields up.
- Most systems today use a distributed password database - which means that reading
/etc/passwd is useless (the information is not there).
passwd Database Access Functions
- lookup by UID:
struct passwd *getpwuid( int uid );
- lookup by login name:
struct passwd *getpwnam( char *name );
- sequential access:
struct passwd *getpwent( void );
#include pwd.h
The include file pwd.h defines prototypes for functions which access the
password database and defines the type struct passwd:
struct passwd {
char *pw_name; /* login name */
char *pw_passwd; /* encrypted pw */
int pw_uid; /* user ID */
int pw_gid; /* group ID */
char *pw_gecos; /* misc. (name)*/
char *pw_dir; /* home directory */
char *pw_shell; /* login shell */
};
Password Database Access Example
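#include <stdio.h>
#include <pwd.h>

int main(void) {
    struct passwd *pwd;

    pwd = getpwnam("hollingd");
    if (pwd == NULL) {               /* no such account */
        fprintf(stderr, "user not found\n");
        return 1;
    }
    printf("Login name: %s\n", pwd->pw_name);
    printf("UID: %d\n", (int)pwd->pw_uid);
    printf("GID: %d\n", (int)pwd->pw_gid);
    printf("Real Name: %s\n", pwd->pw_gecos);
    return 0;
}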
Group Database Access
Group information is kept in /etc/group and is accessed via the
following functions:
- Lookup by GID:
struct group *getgrgid( int gid );
- Lookup by group name:
struct group *getgrnam( char *name );
- Sequential access:
struct group *getgrent( void );
#include grp.h
The include file grp.h defines struct group:
struct group {
char *gr_name; /* group name */
char *gr_passwd; /* encrypted password */
int gr_gid; /* group ID */
char **gr_mem; /* array of ptrs to names */
};
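A group lookup might look like the sketch below, which maps a GID to its group name. The helper name group_name and its return convention are assumptions made for illustration; getgrgid itself returns NULL when the group is not in the database.

```c
#include <grp.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: copy the name of group gid into buf.
 * Returns 0 on success, -1 if the group cannot be found. */
int group_name(gid_t gid, char *buf, size_t len)
{
    struct group *grp = getgrgid(gid);   /* NULL if no such group */

    if (grp == NULL)
        return -1;
    snprintf(buf, len, "%s", grp->gr_name);
    return 0;
}
```

The returned struct group points to static storage, so the name is copied before any other lookup can overwrite it.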
Filenames & Pathnames
Every Unix file (or directory or special file) has a name. The only ASCII characters not
allowed are '\0' and '/', although there are other characters that should be
avoided.
A pathname is a null-terminated string made up of one or more
filenames separated by the '/' character. If a pathname begins with
the '/' character, the pathname is absolute; otherwise the pathname
is relative to the current working directory.
Pathname examples
- The pathname / is called the root directory.
- The pathname /etc/passwd is an absolute pathname which
refers to the file named passwd which is found in the directory /etc.
- The pathname courses/networks is a relative pathname
which refers to the file (or directory) named networks which
is located in the directory courses which is found in the
current working directory.
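The absolute/relative rule reduces to a check of the first character. A small sketch, with is_absolute as a hypothetical helper name:

```c
/* Hypothetical helper: returns 1 if path is an absolute pathname
 * (begins with '/'), 0 if it is relative to the current working
 * directory. path must be a valid null-terminated string. */
int is_absolute(const char *path)
{
    return path[0] == '/';
}
```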
File Descriptors
- A file descriptor is a small integer that is used to identify a file that has been opened for I/O.
- Each process is allowed a fixed number of file descriptors.
- Many programs associate the file descriptors 0, 1 & 2 with the standard input, standard
output and standard error of a process.
File Attributes
Files have many attributes. The include file
sys/stat.h includes a definition of a structure which is filled
in by the stat and fstat system calls.
struct stat {
dev_t st_dev; /* ID of device */
ino_t st_ino; /* inode number */
mode_t st_mode; /* type, access perms. */
nlink_t st_nlink; /* # of hard links */
uid_t st_uid; /* user id */
gid_t st_gid; /* group id */
dev_t st_rdev; /* device type */
off_t st_size; /* total size in bytes */
time_t st_atime; /* time of last access */
time_t st_mtime; /* time of last mod.*/
time_t st_ctime; /* time of last status change */
unsigned long st_blksize; /* blocksize */
unsigned long st_blocks; /* # blocks */
};
st_mode
The st_mode field of a stat structure contains information about the file type, access
permissions, the set-uid and set-gid flags and something called the sticky bit.
There are bitmasks and other constants defined in sys/stat.h which
are used to access the individual bitfields within st_mode.
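As an illustration of those constants, the sketch below uses the S_ISDIR and S_ISREG macros from sys/stat.h to classify a file, and a mask to extract the permission bits. The helper names file_type and permission_bits, and the one-character return codes, are assumptions made for this example.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: 'd' for a directory, '-' for a regular file,
 * '?' for any other type, 'E' if stat fails. */
char file_type(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return 'E';
    if (S_ISDIR(sb.st_mode))
        return 'd';
    if (S_ISREG(sb.st_mode))
        return '-';
    return '?';
}

/* Hypothetical helper: the permission, set-uid, set-gid and sticky
 * bits of path (the low 12 bits of st_mode), or -1 if stat fails. */
int permission_bits(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    return (int)(sb.st_mode & 07777);
}
```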
st_mode - File Access Permissions
File access permissions control which users can access a file. Keep in mind that each
process is running with some effective UID and effective GID - these are used
to determine what operations are permitted for that process on
individual files.
File Access (cont.)
Every file has the attributes st_uid and st_gid which define the user and group ownership
of the file. Each time a process attempts to access a file (either directly
through a system call or indirectly via a library function) the kernel
decides if access is allowed according to the following algorithm:
File Access Algorithm
- If the effective UID of the process is 0 (root) access is allowed.
- If the effective UID of the process matches the st_uid field - access is
determined by the 3 owner bits of the st_mode.
- If the effective UID of the process does not match st_uid, but the effective
GID matches st_gid - access is determined by the group permissions (in
st_mode).
- If neither the effective UID nor the effective GID match, the access
permissions are determined by the permissions for other (in st_mode).
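The four steps above can be sketched as a pure function. This is a simplified model of the algorithm exactly as listed here, not the kernel's code: real kernels also consult supplementary group IDs and apply special rules for execute permission. The name access_allowed and the encoding of want (4 = read, 2 = write, 1 = execute) are assumptions.

```c
#include <sys/types.h>

/* Simplified model of the access check described above.
 * want is a 3-bit mask: 4 = read, 2 = write, 1 = execute.
 * Returns 1 if access is allowed, 0 otherwise. */
int access_allowed(uid_t euid, gid_t egid,
                   uid_t file_uid, gid_t file_gid,
                   mode_t mode, int want)
{
    int bits;

    if (euid == 0)                       /* root: always allowed */
        return 1;
    if (euid == file_uid)
        bits = (int)((mode >> 6) & 7);   /* owner bits */
    else if (egid == file_gid)
        bits = (int)((mode >> 3) & 7);   /* group bits */
    else
        bits = (int)(mode & 7);          /* other bits */

    return (bits & want) == want;
}
```

Note that only one set of bits is ever consulted: an owner whose owner bits deny access is refused even if the group or other bits would allow it.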
File Mode Creation Mask (umask)
Every process has an attribute called the file mode creation mask. This mask is used when a
process creates a new file or directory. The least significant 9 bits of the umask
correspond to the least significant 9 bits of the st_mode field. Each time a file or
directory is created, each bit that is set in the process umask specifies
that the corresponding bit in st_mode should be cleared.
umask example
For example, if a new file is created with a requested mode (permissions) of 0664 and
the process umask is 022, the file will be created with
permissions 0644 (group write disabled).
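The arithmetic behind this example is a bitwise AND with the complement of the mask. A minimal sketch, where apply_umask is a hypothetical name that models the calculation without creating a file:

```c
#include <sys/types.h>

/* Models how the kernel combines a requested mode with the umask:
 * every bit set in mask is cleared in the result. */
mode_t apply_umask(mode_t requested, mode_t mask)
{
    return requested & ~mask;
}
```

With the values above, apply_umask(0664, 022) yields 0644: 0664 is rw-rw-r--, the mask 022 clears the group and other write bits, and only the group write bit was set to begin with.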
Current Working Directory
- Each process has an attribute called the current working
directory. This pathname is used as the starting point for all
relative pathnames (those that do not start with '/').
- From the shell you can find out your current working directory (of
the shell) with the 'pwd' command.
Finding out and changing the current working directory.
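One common way to do both from a program is with the POSIX calls chdir and getcwd. The sketch below assumes these are the calls intended here; the helper name change_dir is hypothetical.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: change the current working directory to dir,
 * then store the resulting absolute pathname in buf.
 * Returns 0 on success, -1 on error (errno is set by the failing call). */
int change_dir(const char *dir, char *buf, size_t len)
{
    if (chdir(dir) == -1)            /* change the working directory */
        return -1;
    if (getcwd(buf, len) == NULL)    /* read it back */
        return -1;
    return 0;
}
```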
I/O System Calls
The standard I/O library includes many functions that provide high-level I/O services to a
process. We will concentrate on the low-level I/O services provided by the
kernel.
System calls typically return a -1 on error and set the global
variable errno to indicate the exact error condition.
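A sketch of this convention, using open as the failing call: the helper try_open (a hypothetical name) returns 0 on success or the errno value on failure. errno is copied immediately because later library calls may overwrite it.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: try to open path read-only.
 * Returns 0 if it could be opened, otherwise the errno value. */
int try_open(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        int saved = errno;   /* save before calling anything else */
        fprintf(stderr, "open %s: %s\n", path, strerror(saved));
        return saved;
    }
    close(fd);
    return 0;
}
```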
The open system call
oflag parameter to open()
oflag is specified by a bitwise OR of the following (defined in fcntl.h):
O_RDONLY Open for read access only
O_WRONLY Open for write access only
O_RDWR Open for read and write
O_NDELAY Do not block
O_APPEND Append to EOF on each write
O_CREAT Create the file if it does not exist
O_TRUNC If the file exists - truncate length to 0
O_EXCL Error if O_CREAT and the file exists
The creat() system call
The creat system call creates a file (if it does not already exist), sets the permissions
according to mode and the process umask, sets the file UID to the effective
UID of the process, the file GID to the effective GID of the process
and returns a file descriptor to the open file.
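The interaction between mode and the umask can be observed with a short sketch. The helper create_with_mask is hypothetical: it sets the umask temporarily, calls creat, and restores the previous mask.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: create path with the requested mode while the
 * process umask is temporarily set to mask.
 * Returns a write-only file descriptor, or -1 on error. */
int create_with_mask(const char *path, mode_t mode, mode_t mask)
{
    mode_t old = umask(mask);    /* umask returns the previous mask */
    int fd = creat(path, mode);

    umask(old);                  /* restore the caller's mask */
    return fd;
}
```

Calling create_with_mask(path, 0664, 022) on a new file should produce permissions 0644, which can be confirmed with fstat.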
creat() continued
The close() system call
- The close system call closes a file descriptor so that it no longer
refers to any file and may be reused.
int close( int fildes );
- close returns 0 on success or -1 on error.
close() side effects
If the file descriptor passed to close() is the last descriptor referring to a particular
open file, the resources associated with the open file are freed (locks are released, and
the file itself is removed if it has already been unlinked).
The read() system call
read() continued.
- Depending on whether or not the file has been set for non-blocking I/O,
read may return fewer than nbytes bytes.
- If the file is set to do non-blocking I/O and there is no data
available, read will return -1 (ERROR) and errno will be set to EAGAIN
(defined in errno.h).
- This is one of many examples where it is important to find out
what the error is instead of giving up and exiting the process!!!
The write() system call
The write system call attempts to write data to an open file:
int write( int fildes, char *buff, unsigned int nbytes);
write() return value
- The number of bytes written is returned, or -1 on error.
- write() can return less than nbytes indicating that the entire
requested amount has not been written.
- It is important to check the return value (this situation is much
more common when writing to a socket than writing to a file).
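A common way to handle short writes is to loop until everything has been written. The sketch below is one such loop; write_all is a hypothetical name, and retrying on EINTR is an added assumption not discussed above.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write until all nbytes of buf
 * have been written. Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t nbytes)
{
    size_t done = 0;

    while (done < nbytes) {
        ssize_t n = write(fd, buf + done, nbytes - done);

        if (n == -1) {
            if (errno == EINTR)   /* interrupted by a signal: retry */
                continue;
            return -1;
        }
        done += (size_t)n;        /* advance past the bytes written */
    }
    return 0;
}
```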
dup() system calls
- The dup system call creates a duplicate of an existing file
descriptor.
int dup( int fildes );
- dup() returns a new file descriptor or -1 on error. The new file descriptor may be used
interchangeably with the old file descriptor.
dup() continued
- Both the original and new file descriptors refer to the same file (or
socket or pipe), and they share locks, file position pointers, access
mode, etc. If one file descriptor is closed the other can continue to
be used.
- The new file descriptor returned by dup is the lowest numbered
available file descriptor.
dup2() system call
The fcntl() system call
fcntl() cmd argument values
These constants are defined in fcntl.h:
F_DUPFD duplicate the file descriptor (like dup)
F_SETFD set the close-on-exec flag to arg
F_GETFD return the value of the close-on-exec flag
F_SETFL set file status flags for the file to arg
O_NDELAY - nonblocking
O_APPEND - all writes append
O_SYNC - synchronous I/O
F_GETFL return the file status flags
There are others ...
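A typical use of F_GETFL and F_SETFL is to switch a descriptor to non-blocking mode without disturbing its other flags. The sketch below uses O_NONBLOCK, the POSIX spelling of the non-blocking flag, in place of O_NDELAY; that substitution and the helper name set_nonblocking are assumptions.

```c
#include <fcntl.h>

/* Hypothetical helper: add the non-blocking flag to fd, keeping the
 * flags that are already set. Returns -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);   /* read the current flags */

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}
```

Reading the flags first matters: passing only O_NONBLOCK to F_SETFL would clear flags such as O_APPEND.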
Unix Signals
- A signal is a notification to a process that an event has occurred.
- Signals can be sent by one process to another (or to itself) or by
the kernel to a process.
- The signal() system call provides a mechanism for user programs to react to signals by
associating a function called a signal handler with a specific signal.
signal() system call
void (*signal (int sig, void (*func)(int)))(int);
Example:
signal(SIGUSR1,myfunc);
This would tell the kernel to call the user defined function myfunc()
whenever the signal SIGUSR1 is received.
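A fuller sketch of this example is shown below. The handler only sets a flag, since very little is safe to do inside a signal handler. The names got_usr1, install_handler and usr1_received are hypothetical.

```c
#include <signal.h>

/* Set by the handler; sig_atomic_t can be written safely from a
 * signal handler. */
static volatile sig_atomic_t got_usr1 = 0;

/* The signal handler: called when SIGUSR1 is delivered. */
static void myfunc(int sig)
{
    (void)sig;        /* unused */
    got_usr1 = 1;
}

/* Hypothetical helper: install myfunc as the handler for SIGUSR1.
 * Returns 0 on success, -1 on error. */
int install_handler(void)
{
    return signal(SIGUSR1, myfunc) == SIG_ERR ? -1 : 0;
}

/* Hypothetical helper: 1 once SIGUSR1 has been handled, else 0. */
int usr1_received(void)
{
    return got_usr1 != 0;
}
```

After install_handler() succeeds, a process can test the handler by sending itself the signal with kill(getpid(), SIGUSR1) and then checking usr1_received().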
Signals
- There are many different signals - each has a name specified in the
include file signal.h.
- The set of signals supported by an operating system varies - we
will concentrate on the set as defined in the POSIX standard.
POSIX Signals
Sources of Signals
kill() restrictions and options
- A signal can be sent via the kill system call only if the effective
UID of the sending process is 0 (root), or if the real or effective UID of the
sender matches the real or saved set-user-ID of the receiving process (the POSIX rule).
- There are various special cases for the pid argument - these are
described in the book (if pid==0 - the signal is sent to all processes
in the sender's process group, etc).
Other sources of signals
- Certain terminal characters generate signals, for example ^C.
- Hardware conditions can generate signals (division by zero, memory violation). These
signals are passed to the process from the kernel.
- Some software related conditions can cause the kernel to send a
signal (for example the receipt of out-of-band data).
POSIX & Signals
- Modern Unix implementations provide more control over signals -
including the ability to block signals.
- The POSIX standard includes a much more powerful mechanism for controlling signals
than the system calls described in the book.
- Although we will need to use signals in some projects - we can
use the facilities described in the book.