CSCI.4210 Operating Systems Fall, 2008, Class 23

CSCI.4210 Operating Systems Fall, 2009 Class 23
Security II

Computer Security is a hot issue these days. The popular press is full of articles about security matters, and corporations and government agencies are devoting much more time, energy and money to security issues than they did just a few years ago.

It is important to keep in mind that the vast majority of security problems are not technical, but social/interpersonal. Most break-ins result not from a skilled hacker decrypting a password file or discovering a new software bug, but from events like someone calling the help desk claiming to be an employee who has forgotten his password, and the help desk giving out or resetting the password for this person.

Many security holes of a technical nature can be attributed to software bloat; every time a new feature is added to the operating system, it introduces new opportunities for security holes. For example, when email was used only for ASCII text, it could not be used to transmit viruses. Once it became possible to email executable files, and mail clients would automatically execute such files, this became a wildly popular way for jerks to wreak all kinds of havoc.

User Authentication

The first step in any kind of security system is authentication, i.e. determining that users are who they say they are. Authentication systems are based on
- Something the user knows, e.g. a password
- Something the user has, e.g. a key
- Something the user is, e.g. biorecognition systems

The most common authentication method is the password. Most users, when left to their own devices, choose common words or names, which makes sense because they are easy to remember. However, this also leads to an obvious weakness: a malicious user can try all of the words in an online dictionary until he gets a hit. As a result, most systems now require that a user's password contain at least one non-alphabetic character.

In the old days, the password file on Unix systems was public. Anyone, even the bad guys, could read it. This was OK because the actual password was encrypted with a one-way encryption scheme. This meant that there was no way to figure out the plaintext password from the encrypted password. However, as computers became faster, it became feasible to get hold of the password file on a system, run all of the words in a dictionary through the encryption scheme, and see if there were matches with the encrypted passwords. Password files on modern systems are now generally private, which makes it more difficult to do this.

Before encrypting a user's password, Unix adds an extra 12 bits, called salt. Whenever users change their password, a new salt is generated and stored as a part of their account information. The salt serves two purposes. First, it makes brute force methods more difficult, because not only do you need to search all of the words in the dictionary, but you need to consider all words in combination with all 2¹² (4096) possible values of the salt.

Second, it makes it highly unlikely that two people who choose the same password would also have the same encrypted password. If the salt were not added, and Alice was able to read the password file, and some other user had chosen the same password as Alice, then Alice could see that some other user had the same encrypted password as she did. She would then know that user's password and could log in as that user.
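Here is a minimal sketch of both points. It assumes a toy hash (64-bit FNV-1a over the salt followed by the password) as a stand-in for the real one-way function; the names toy_hash and dictionary_attack are hypothetical and this hash is not suitable for real passwords. It shows that the same password with two different salts gives two different stored values, and that a dictionary attack has to be repeated for each salt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the real one-way function: 64-bit FNV-1a over the
   salt, then the password. For illustration only; NOT a secure hash. */
uint64_t toy_hash(const char *password, const char *salt)
{
    uint64_t h = 14695981039346656037ULL;        /* FNV-1a offset basis */
    const char *parts[2] = { salt, password };

    for (int i = 0; i < 2; i++) {
        for (const char *p = parts[i]; *p != '\0'; p++) {
            h ^= (unsigned char)*p;
            h *= 1099511628211ULL;               /* FNV-1a prime */
        }
    }
    return h;
}

/* Dictionary attack: hash every candidate word with the victim's salt.
   Returns the index of the matching word, or -1 if none matches. */
int dictionary_attack(uint64_t target, const char *salt,
                      const char *const words[], size_t nwords)
{
    assert(salt != NULL && words != NULL);
    for (size_t i = 0; i < nwords; i++) {
        if (toy_hash(words[i], salt) == target)
            return (int)i;
    }
    return -1;
}
```

If two users both choose "dragon" but have the salts "ab" and "ac", toy_hash returns different values for them, so neither can tell from the password file that the other made the same choice.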

There is a Unix library function crypt, with this prototype:

char *crypt(const char *key, const char *salt);

This takes a password and a salt as arguments, and returns the encrypted version of the password, which is the value stored in the password file. The text points out that this is not technically an encryption algorithm, because it is not possible to decrypt the value in the password file to get the original password and salt (the function is one-way, so an attacker can only guess passwords and compare the results), but everyone calls this encryption anyway.

Modern systems usually allow only a limited number of login attempts (often three) before blocking that account. This effectively foils someone trying to guess a password by repeatedly attempting to log in.

Password systems are still vulnerable. Shoulder surfing, i.e. looking over someone's shoulder as they type their password, is hard to prevent. Network sniffing is commonplace; it is relatively easy for someone to read all of the traffic on a LAN. Utilities like telnet and ftp require a user to type their password and send it over the network unencrypted, and so a network sniffer could get passwords in this way. This is why telnet and ftp have been replaced on most systems by the more cumbersome ssh (secure shell) and scp (secure copy). Both of these encrypt all traffic, including passwords, before sending it over the network. More on these below.

Human failure is still the largest source of stolen passwords. Users often write their password on a piece of paper (or even on a post-it which they put on their monitor), or they share their password with a friend.

Some systems require that users periodically change their password as an additional security precaution. This is controversial because forcing users to change their passwords once a month (or whatever) encourages them to write down their password, which is a larger potential loophole.

Some systems are beginning to require higher levels of security. One system that I am familiar with issues each employee a gizmo called an authenticator. This displays a six digit random number which changes every few minutes. This gizmo is synchronized with the authentication server on the computer network. In order for an employee to be allowed to log in, they need to enter not only their password, but also the number currently showing on the gizmo.

One of the cutting edge authentication mechanisms now is using biometrics for authentication. This includes hand prints, retinal scans or voice prints. Currently these are still too expensive and too unreliable for routine use.

Privileges and access levels

The DOS operating system had no security or protection mechanisms. There was no concept of logging in or any other kind of authentication, and any user with access to the computer could do anything that they wanted, including modifying or deleting any files. Most modern computer systems have protection mechanisms in place to prevent this. In such systems, each object (e.g. files, and devices such as printers) has associated with it a list of privileges, i.e. a list of who can do what with it.

The classical Unix file permissions are a typical example. Each file has an owner and a group associated with it. There are a number of permission bits associated with each file. Each one, when set, gives privileges to an individual or a group. There are three types of permission (read, write, and execute) associated with three categories of users (the owner, the group, and anyone). Thus there are nine permission bits associated with each file, resulting from the three types of permission for the three categories of users. The ls -l command displays these bits like this:

drwxr-xr-x   foo
-rw-------   bar
-rw-rw-r--   myfile.c
-rwxr-x---   a.out

The first character indicates the type of file; d stands for directory, so this says that foo is a directory. The owner has read, write, and execute privileges for foo; write privilege means that the owner can add or delete files in the directory. The group and the world have read and execute privileges for foo. For a directory, read privilege means that someone can list its contents, and execute privilege means that someone can access the files within it.

The file bar is a regular file, indicated by a - in the first column. The owner has read and write privileges for bar; no one else has any privileges. The file myfile.c can be read and written by the owner or the group, and anyone can read it. Note that read privileges include copying privileges. The file a.out can be read, written, and executed by the owner, and read and executed by the group, but others cannot read or execute it.
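The mapping from the nine permission bits to the string that ls -l prints can be sketched in a few lines of C. This is only an illustration; the function name mode_to_string is made up here, and the mode is given as the usual octal number (for example 0750).

```c
#include <assert.h>
#include <string.h>

/* Render the nine permission bits of `mode` (e.g. 0750) the way ls -l
   shows them. `out` needs room for 11 bytes: the type character, the
   nine permission characters, and the terminating NUL. */
void mode_to_string(unsigned mode, int is_dir, char out[11])
{
    static const char letters[] = "rwxrwxrwx";

    assert(out != NULL);
    out[0] = is_dir ? 'd' : '-';
    memcpy(out + 1, "---------", 10);    /* nine dashes plus the NUL */
    for (int i = 0; i < 9; i++) {
        /* Bit 8 is owner-read, bit 0 is other-execute. */
        if (mode & (1u << (8 - i)))
            out[1 + i] = letters[i];
    }
}
```

Calling mode_to_string(0750, 0, buf) fills buf with -rwxr-x---, which is the a.out line in the listing above.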

Operating systems with such protection mechanisms need to allow one or more users to override all privileges. This is important so that a system administrator can kill rogue processes, delete illegal files and so on. This account is called root on Unix systems.

It is a common occurrence that a program which is executable by anyone needs to access privileged information. For example, consider a computer game which maintains a file of the high scores. The file should be world readable, but you certainly don't want just anyone to be able to write to it. The solution is an additional permission flag on the executable file, called the set-user-ID flag. When a program with this flag set runs, rather than having the privileges of the user who is running it, which is usually the case, the process acquires the privileges of the file's owner, often root. Commands that must read or write protected system data, such as passwd, which updates the password file, use this feature.

Needless to say, this is potentially a major security loophole if a process is running as root and the user is able to modify it so that it can do nefarious things.

Recall that on RCS and other systems that are running AFS, the Unix permission bits are ignored. AFS has a more sophisticated permission system in which each directory has an access control list (ACL). Unix has a three by three matrix of permissions, with three types of permission and three categories of users. AFS has more types of permissions, and allows the owner of a file or directory to grant privileges to, or remove them from, other individual users. Privileges are allocated on a directory basis, not an individual file basis. There are four types of privileges for directories: lookup, insert, delete, and administer. There are three types of privileges for files within a directory: read, write, and lock.

Whenever a process attempts to access a resource (file, printer, or whatever), the operating system checks the access control list for that object to make sure that the user has the correct privileges.
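The check itself is simple. The sketch below assumes a made-up in-memory layout (struct acl_entry, with the seven AFS rights as bit flags); it is not how AFS actually stores its ACLs, just an illustration of looking up a user and comparing the rights requested with the rights granted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The seven AFS rights named above, as bit flags (layout is hypothetical). */
enum {
    R_LOOKUP = 1, R_INSERT = 2, R_DELETE = 4, R_ADMIN = 8,
    R_READ = 16, R_WRITE = 32, R_LOCK = 64
};

struct acl_entry {
    const char *user;     /* who the entry applies to */
    unsigned rights;      /* OR of the R_* flags granted */
};

/* Returns 1 if `user` holds every right in `wanted`, 0 otherwise.
   A user with no entry in the list gets no access at all. */
int acl_allows(const struct acl_entry *acl, size_t nentries,
               const char *user, unsigned wanted)
{
    assert(user != NULL);
    for (size_t i = 0; i < nentries; i++) {
        if (strcmp(acl[i].user, user) == 0)
            return (acl[i].rights & wanted) == wanted;
    }
    return 0;
}
```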

An alternative to an ACL is to associate a capability list with each user. This is a list of that user's privileges for each object. In general, capability lists are larger, harder to administer, and more susceptible to tampering, so they are seldom used.

Types of Security Attacks

The generic term malware is used to describe any kind of software which is loaded without the user's knowledge and which has malicious intent.

Trojan Horses

A Trojan horse is malware embedded inside a seemingly legitimate program. Someone may download software from the web that seems to perform as advertised, a game for example, but has code embedded in it that does evil things.

There are other ways besides downloading from the web as well. Early in the course we talked about the PATH environment variable and whether to put the dot at the front of the path, at the end of the path, or not at all. You can write a Trojan horse version of ls and put it in your home directory. If the superuser has dot as the first entry in the path, you can then induce the superuser to run your version from your directory by doing something that would attract attention, such as creating a fork bomb (a program that just calls fork repeatedly).

Viruses

A virus is a set of executable statements that attaches itself to another executable file or replaces it completely. These can potentially insert themselves into any executable code, including kernel code such as interrupt vectors or even the boot sector. In this sense they are similar to biological viruses, which are unable to exist independently but are able to implant themselves in the genome of cells of other organisms.

Viruses have been installed in many different places in the Operating System. Traditionally, they have infected a program which is part of the kernel, but they can be in any executable code.

Your text gives a number of examples.

On a Windows computer, create an executable with the same name as a commonly used utility, only with the suffix .com instead of .exe. The .com format is an early file format that is not much used now, but if the user enters dir in a console window, the system first looks for a file called dir.com, and if it can't find it, it will search for dir.exe. This means that if you have inserted a file called dir.com in one of the directories in the path, the system will execute this instead.

You can also change the target of a shortcut so that it executes your program rather than what it is supposed to point to.

Viruses can also be inserted even deeper into the system. A memory resident virus is in a part of the kernel which is always loaded, such as the interrupt handler vector, or a device driver. A boot sector virus replaces the boot sector of a disk with its own code. The boot sector contains a disk address which the OS jumps to when the system is started, so it is a simple task to change this address so that it jumps to the virus code instead of the normal boot.

In all of these examples, after executing the virus code, the last instruction is a jump to the normal code, so that the user does not suspect anything.

A variant is the macro virus. Many applications, such as spreadsheets, allow the user to write a macro, which is a script consisting of a number of keystrokes or commands, which can then be executed as a single command. These macros are very powerful, and in some cases can run Visual Basic programs. Thus the bad guy can embed such a virus in a spreadsheet or Word document and email it to people. The unsuspecting user opens it, triggering the virus.

There are a number of ways to spread viruses. They can be embedded in executable programs on a website or just emailed to people. Anyone who runs the program then gets infected. Because a virus can do anything, one popular thing for a virus to do now is to locate the user's email address book and send itself to everyone (or selected people) in it.

One of the most common and dangerous uses of viruses now is the botnet. A virus is distributed to many different unsuspecting users, whose infected machines are often called zombies. It sits undetected for a while, and then wakes up and accesses the internet. The zombies can be used to send out oceans of spam or to perform a distributed denial of service attack (DDoS). In the latter example, all of the zombies repeatedly try to connect to a web site with the intention of overloading it so that legitimate users cannot access it. This is one of the most difficult types of security threats to defend against, and because the attacks come from innocent victims, it is difficult to identify the real culprits.

Protecting against security violations

There are a number of tools which system administrators can use to detect and prevent outside attacks. For example, there is a freely downloadable program called SATAN (Security Administrator Tool for Analyzing Networks) which a system administrator can run to detect a large number of known security holes on Unix systems, including the following:

- easy to guess passwords
- unauthorized setuid programs
- unauthorized programs in system directories
- unexpected long running processes
- improper directory protections
- improper protections on password files, device drivers, etc.
- dangerous entries in the program search path
- changes to system programs detected with checksum values
- NFS file systems exported to unprivileged programs

A newer product is OCTAVE (Operationally Critical Threat, Asset, and Vulnerability Evaluation), a suite of tools for risk-based information security assessment and planning.

Virus detection

There are organizations, both commercial and noncommercial, that track viruses in order to detect them. Whenever a new virus is reported, they try to isolate it and find a signature, that is, a fragment of code that will uniquely identify the virus. It is then possible to scan every executable file on a system to see if it has the signature of any known virus.
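At its core a signature scan is a search for a known byte sequence inside a file's contents. Here is a minimal sketch, assuming the file has already been read into memory; the function name find_signature is hypothetical, and real scanners use much faster multi-pattern search algorithms than this simple loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Look for the byte sequence `sig` inside `data`.
   Returns the offset of the first match, or -1 if there is none. */
long find_signature(const unsigned char *data, size_t data_len,
                    const unsigned char *sig, size_t sig_len)
{
    assert(data != NULL && sig != NULL);
    if (sig_len == 0 || sig_len > data_len)
        return -1;
    for (size_t i = 0; i + sig_len <= data_len; i++) {
        if (memcmp(data + i, sig, sig_len) == 0)
            return (long)i;
    }
    return -1;
}
```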

There is continual escalation between the virus creators and the virus detectors. The virus creator can create many versions of the virus, which differ in subtle ways. These are called polymorphic viruses. This means that the virus detection software has to perform a fuzzy search, not only looking for the signature, but also looking for small variants in the signature. This is more time consuming and more likely to result in false positives, that is, legitimate programs that happen to look similar to the virus.

This method cannot detect brand new viruses.

Another method of detecting viruses is to scan all of the executable files on the disk at a time when they are known to be virus free, and calculate a value for each one. This can be as simple as the length of the file, or it can be a checksum or an MD5 hash. Periodically run the virus scan software and see if any of these values no longer match.

One problem with this is that the virus might be able to find the file where these values are stored, and update the appropriate value.
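A sketch of this baseline-and-compare idea follows. It assumes the simplest possible values, the file length plus an additive checksum, in place of MD5; the names struct baseline, record_baseline and is_modified are invented for the example. An additive checksum is easy to fool (swapping two bytes leaves it unchanged), which is one reason real tools use a cryptographic hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values recorded for a file while it is known to be clean. */
struct baseline {
    size_t length;
    uint32_t checksum;
};

/* Additive checksum: a deliberately simple stand-in for MD5. */
uint32_t simple_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i];
    return sum;
}

struct baseline record_baseline(const unsigned char *data, size_t len)
{
    struct baseline b = { len, simple_checksum(data, len) };
    return b;
}

/* Returns 1 if the file no longer matches its recorded baseline. */
int is_modified(const struct baseline *b,
                const unsigned char *data, size_t len)
{
    assert(b != NULL);
    return len != b->length || simple_checksum(data, len) != b->checksum;
}
```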

Virus avoidance

There are some obvious guidelines that users can follow to reduce the risk of getting infected.

First, don't download and install any software unless it comes from a site that you know and trust. This includes plugins, and virus detection software.

Some websites now do digital signing. The software vendor generates a public/private key pair. A hash such as MD5 is calculated for each application and encrypted with the vendor's private key; the result is the digital signature. After the application has been downloaded, the user calculates the MD5 value of the downloaded software, and then uses the vendor's public key to decrypt the attached signature. If the two values are the same, the user knows that the software came from where it was supposed to, and not from a malicious intruder.
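The sign-then-verify steps can be tried out with textbook RSA on tiny numbers. The key below (n = 61 × 53 = 3233, public exponent 17, private exponent 2753) is a standard classroom example, and the digest is just a small integer standing in for an MD5 value; real signatures use keys of thousands of bits and padding schemes, so treat this purely as a sketch. The function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Toy RSA key: n = 61 * 53, e = 17, d = 2753 (17 * 2753 = 1 mod 3120). */
#define RSA_N 3233u
#define RSA_E 17u
#define RSA_D 2753u

/* Modular exponentiation by repeated squaring. */
uint32_t mod_pow(uint32_t base, uint32_t exp, uint32_t mod)
{
    uint64_t result = 1, b = base % mod;

    while (exp > 0) {
        if (exp & 1u)
            result = (result * b) % mod;
        b = (b * b) % mod;
        exp >>= 1;
    }
    return (uint32_t)result;
}

/* Vendor side: "encrypt" the digest with the private exponent. */
uint32_t sign_digest(uint32_t digest)
{
    assert(digest < RSA_N);
    return mod_pow(digest, RSA_D, RSA_N);
}

/* User side: "decrypt" the signature with the public exponent and
   compare it with the digest computed from the downloaded file. */
int verify_signature(uint32_t digest_of_download, uint32_t signature)
{
    return mod_pow(signature, RSA_E, RSA_N) == digest_of_download;
}
```

If even one bit of the download changes, the digest the user computes no longer matches the value recovered from the signature, and verify_signature returns 0.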

Second, don't do your everyday computing on an account that has administrative privileges (almost everyone ignores this one).

Third, don't open email attachments that could have malicious macros in them, such as Word or Excel documents.

Fourth, back up your files often so that if a virus is detected, you can reinstall the OS without losing your stuff.

Encapsulation

Lots of software now runs in a web browser. An example is Java applets. Browsers have built-in protection schemes so that applets run in their own protected space and are not allowed to execute instructions outside of this space. This is called sandboxing. Often a program is given two blocks of memory, one for code and one for data. The code section is not permitted to be modified.

Firewalls

Firewalls have become an almost universal protection tool. A firewall is a process, or often a separate computer, that sits between the world and a network (or between two networks) and only lets certain types of packets pass in and out. A firewall is configurable on a port basis so that you can exclude packets coming to particular ports. There are two basic schemes: a firewall can contain a list of the types of packets it will accept and reject all others, or it can contain a list of the types of packets it will reject and accept all others. Configuring a firewall is a constant tradeoff between the ease of getting work done and the risk of an intrusion.
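The two schemes differ only in the default action. Here is a minimal sketch that filters on destination port alone; the struct and function names are made up for the example, and a real firewall also looks at addresses, protocols, and connection state.

```c
#include <assert.h>
#include <stddef.h>

enum action { REJECT = 0, ACCEPT = 1 };

struct rule {
    unsigned short port;      /* destination port the rule applies to */
    enum action action;
};

struct firewall {
    const struct rule *rules;
    size_t nrules;
    /* REJECT here means "accept only what is listed";
       ACCEPT means "reject only what is listed". */
    enum action default_action;
};

/* Decide what to do with a packet sent to `dest_port`. */
enum action filter_packet(const struct firewall *fw, unsigned short dest_port)
{
    assert(fw != NULL);
    for (size_t i = 0; i < fw->nrules; i++) {
        if (fw->rules[i].port == dest_port)
            return fw->rules[i].action;
    }
    return fw->default_action;
}
```

With a default of REJECT and rules that accept ports 22 and 443, a telnet packet to port 23 is dropped even though nobody wrote a rule about it, which is why this scheme is the safer of the two.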

Firewalls offer good protection against one common sort of intrusion, port scanning. The bad guy will run a program that systematically tests all of the ports on your system to see which ones accept connections, and whenever it finds such a port, he tries to intrude through that port.

Orange Book Security

The Department of Defense is interested in computer security, and it has developed a set of security standards for computer systems called the Trusted Computer System Evaluation Criteria, more commonly known as the Orange Book. The Orange Book has four levels of security, although some of these have sublevels.

Level D - no security at all (MS-DOS, Windows 95/98/Me).

Level C1 - Discretionary Protection
- protected mode kernel
- authenticated user login
- discretionary access control (the owner of an object decides what privileges to give it)
Many Unix systems, particularly early ones, fit into this category.

Level C2 - Controlled Access Protection
- objects must be given minimal access until changed by the owner
- minimal auditing
- objects (files, memory, etc.) must be set to all zeros before being allocated to users
- more fine-grained access control (such as AFS)
The CS Dept system does not meet this criterion because when a new user account is created, the home directory is public by default. Many students in this course could have gotten in trouble if they kept their assignments in their home directory.
Historically, when a C or C++ program used malloc or new to get new memory, this memory was not zeroed out; it simply retained the contents that were already there. This could potentially lead to security problems since an evil or curious user could read this memory. Usually it would contain garbage, but occasionally it might contain interesting text. Level C2 forbids this.

Level B - Mandatory Protection
This level requires all controlled users and all objects (i.e. files) to be assigned a security label, such as unclassified, secret, or top secret. Each user has a security clearance. It requires mandatory access control for all objects and subjects.

Level A - Verified Protection
This level requires a formal model of the protection system, proof that the model is correct, and a demonstration that the implementation conforms to the model. The system has to have a verifiable security design. Very few systems have reached this level.
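The C2 rule about zeroing objects before handing them out has a direct analogue at the level of a single C program. A minimal sketch, with hypothetical helper names: use calloc, which is guaranteed to return zero-filled memory, instead of malloc, and overwrite a buffer before freeing it so that its contents do not linger for the next user of that memory. Note that an optimizing compiler is allowed to drop a memset that comes right before free, so production code uses a platform-specific secure clearing routine for that step.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocate n bytes that are guaranteed to read as zero.
   Unlike malloc, calloc clears the memory it returns. */
void *alloc_zeroed(size_t n)
{
    assert(n > 0);
    return calloc(1, n);
}

/* Overwrite a buffer with zeros before releasing it. */
void free_scrubbed(void *p, size_t n)
{
    if (p == NULL)
        return;
    memset(p, 0, n);
    free(p);
}
```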

ssh and its relatives

Two of the most widely used application layer protocols on the Internet used to be telnet which allowed a user to remotely log into a computer, and ftp, the file transfer protocol, which allowed a user to download files. Both of these are now generally considered to be obsolete because they required the user to enter a password which was sent over the network unencrypted, and this was a major security loophole.

telnet and similar programs such as rlogin (remote login) and rsh (remote shell) have been replaced by ssh, the secure shell, which most of you have been using all semester. This protocol requires that all communication, including passwords, be encrypted. In some ways it is similar to PGP, discussed in the previous lesson.

There is an ssh client and server as you would expect. Here is a typical set of steps that ssh goes through in establishing a connection using the SSH1 protocol.

The client contacts the server. The ssh server typically runs on port 22.
The client and server exchange information about the protocols that they support. This information is in plain text. They agree on an encryption method.
The server sends the client a host key. The client compares this host key to its database. If it has connected with this server before and if the host key is the same, it accepts the connection. Otherwise it asks the user whether to continue or not.
The server sends the client its public key for RSA encryption.
The client generates a random session key, encrypts it with the server public key, encrypts it again with the host key, and sends it to the server.
The server decrypts the session key. This is used to encrypt and decrypt all further communication, using a method such as IDEA. Note that even if the bad guy had been able to read all of this communication through network sniffing, he would be unable to read the session key.
The server prompts the client for a login name and password. These are encrypted by the client using the session key, and sent to the server.
Once authentication has been established, all other communication both ways is encrypted using the session key, which is known to the client and the server, but not to the bad guys.

Protocols like scp (secure copy) work in more or less the same way.

The standard for secure communication for the World Wide Web is the Secure Sockets Layer (SSL), developed by Netscape. This protocol provides data encryption, server authentication, message integrity, and optional client authentication for a TCP/IP connection. SSL is built into all the major browsers and web servers.

In order to activate SSL, a server has to obtain a digital certificate. This is obtained from a trusted Certificate Authority (CA). There are a number of such authorities such as VeriSign. Applicants for certificates have to provide extensive documentation to the CA confirming that they are who they say they are. This prevents a malicious web site from pretending to be a vendor such as Amazon in order to get credit card numbers or other information.

When a client (browser) connects to a server, before any information is exchanged, the server sends an authentication packet containing its digital certificate and other information. The certificate carries the server's public key and is signed with the CA's private key. The client checks that signature using the CA's public key, which is built into the browser, and confirms that the information is correct.

Often a client may also have a digital certificate. This allows a server to confirm that clients are who they say they are.

Once each side has authenticated the other side, they agree on an encryption method, and a session key (exchanged encrypted), and begin communicating. All subsequent communication is encrypted.

Other types of security problems

Even the most secure operating systems are vulnerable to certain types of new technology. It is possible for a person to sit in a car outside your home or office with a device that picks up the electromagnetic signals from your monitor. Certain types of (very expensive) equipment can decipher these signals to reconstruct everything that is on your screen. The Federal Government has a program called TEMPEST to develop defenses against this.

Your text describes a technique called steganography in which a secret message is encoded in an ordinary image. An image which is stored as a matrix of pixel data generally stores the data at a higher resolution than the human eye can detect. Steganography stores the message by overwriting the least significant bit of each color value. If the red value of a particular pixel is supposed to be 100, but it is 101 instead, no human can detect the difference. A color image has three bytes for each pixel, one each for the red, green, and blue values, so a 1024 by 768 image can contain a secret message of up to about 2.3 million bits (1024 × 768 × 3 = 2,359,296).
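Here is a small sketch of the least-significant-bit trick, assuming the image is already available as a flat array of color bytes (reading and writing an actual image format is left out). The function names stego_embed and stego_extract are made up for the example. Each message byte is spread over eight color bytes, one bit per byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hide msg_len bytes in the low bits of `pixels`.
   Needs eight color bytes per message byte.
   Returns 0 on success, -1 if the image is too small. */
int stego_embed(unsigned char *pixels, size_t npixels,
                const unsigned char *msg, size_t msg_len)
{
    assert(pixels != NULL && msg != NULL);
    if (msg_len > npixels / 8)
        return -1;
    for (size_t i = 0; i < msg_len * 8; i++) {
        /* Take the message bits most significant bit first. */
        unsigned bit = (msg[i / 8] >> (7 - i % 8)) & 1u;
        pixels[i] = (unsigned char)((pixels[i] & 0xFEu) | bit);
    }
    return 0;
}

/* Recover msg_len bytes from the low bits of `pixels`. */
void stego_extract(const unsigned char *pixels,
                   unsigned char *msg, size_t msg_len)
{
    for (size_t i = 0; i < msg_len; i++) {
        unsigned char byte = 0;
        for (size_t b = 0; b < 8; b++)
            byte = (unsigned char)((byte << 1) | (pixels[i * 8 + b] & 1u));
        msg[i] = byte;
    }
}
```

No color value ever changes by more than one, which is why the altered image looks identical to the original.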

The Computer Emergency Response Team (CERT)

In 1988 a Cornell grad student released a worm into the Internet. It brought down thousands of Unix computers all over the world. It exploited a buffer overflow bug in the finger daemon (a server which allowed users to look up information about other users on the system). One effect of this was the creation of the Computer Emergency Response Team (CERT) at Carnegie Mellon. This provides a central place to report break-ins, viruses, etc. It is staffed with security experts to track down security breaches.

Security and Authentication on Windows

Windows provides a uniform access control facility that applies to processes, threads, files, semaphores, and other objects. Access control is governed by two entities, an access token associated with each process and a security descriptor associated with each object.

Users are authenticated with a typical password system. The login process creates an access token. This token is inherited by any processes which are created by this initial process. The access token contains the following information.

A Security ID (SID) - unique for each user
Group SIDs - a list of groups to which this user belongs
A list of security related privileges. These are set by the system administrator for each user. A typical user has no privileges.
A default ACL - an initial list of protections applied to objects that the user creates

Whenever an object (file, process, thread, semaphore, etc.) is created, a Security Descriptor is created for it. This contains the owner's SID, a set of flags defining the types of privileges, a System Access Control List (SACL) set by the system, and a Discretionary Access Control List (DACL) which determines which users and groups can access this object. Recall that one of the arguments to the CreateProcess API was a pointer to a security attributes structure. If this is NULL, then the security attributes associated with the new process are the defaults, but it is possible for the user to set different security attributes (within limits).

The two Access Control Lists each consist of a number of Access Control Entries (ACEs). Each of these consists of a user, and the privileges that he/she has. There are two forms of ACE, allow and deny.
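How allow and deny entries combine can be sketched as follows. This is a simplified model, not the Windows API: SIDs are plain integers, rights are a bit mask, and the entries are examined in list order, with a deny entry refusing any requested right that has not already been granted and allow entries accumulating rights until everything requested is covered. All of the names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum ace_type { ACE_ALLOW, ACE_DENY };

struct ace {
    enum ace_type type;
    unsigned sid;       /* user or group SID, simplified to an integer */
    unsigned rights;    /* bit mask of rights this entry covers */
};

/* Returns 1 if a token holding `token_sids` is granted all of `wanted`. */
int dacl_check(const struct ace *dacl, size_t naces,
               const unsigned *token_sids, size_t nsids, unsigned wanted)
{
    unsigned granted = 0;

    assert(token_sids != NULL);
    for (size_t i = 0; i < naces; i++) {
        int applies = 0;
        for (size_t j = 0; j < nsids; j++) {
            if (dacl[i].sid == token_sids[j])
                applies = 1;
        }
        if (!applies)
            continue;
        if (dacl[i].type == ACE_DENY) {
            /* Refuse if it blocks a right we still need. */
            if (dacl[i].rights & wanted & ~granted)
                return 0;
        } else {
            granted |= dacl[i].rights & wanted;
        }
        if (granted == wanted)
            return 1;
    }
    return 0;   /* ran out of entries before everything was granted */
}
```

Because evaluation stops at the first decisive entry, the order of the list matters: putting deny entries first lets an administrator exclude one user from a right that the user's group otherwise has.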

Recall that the NTFS file system has encryption built in; it does this using public key cryptography, and aims to ensure that decrypting the files is extremely difficult without the correct key. This is based on the user's password.

Once a user has specified that a file should be encrypted, the actual process of data encryption and decryption is completely transparent to the user. The user does not need to understand this process. Encryption of files works as follows:

Each file has a unique file encryption key, which is used to encrypt the file's data and later to decrypt it (the default algorithm is AES).
The file encryption key is itself encrypted; it is protected by the user's public key corresponding to the user's EFS certificate.
The file encryption key is also protected by the public key of each additional EFS user that has been authorized to decrypt the file.

To decrypt a file, the file encryption key must first be decrypted. The file encryption key is decrypted when the user has a private key that matches the public key.
