The HTTP protocol is ridiculously simple. A request consists of the request line, followed by zero or more headers, followed by a blank line (that is, two consecutive newline sequences), followed by a body, which is present only in POST requests.
In HTTP/1.0 there are only three types of requests.
GET - requests a document
HEAD - requests only the header info for a document
POST - sends some information
The newer version of HTTP, version 1.1, added two more request types, PUT and DELETE, which are hardly ever used.
Here is a link to an overview of the HTTP headers
The response consists of the response line, zero or more headers, a blank line, and the body.
There are only two important response codes: 200 OK and 404 Not Found.
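The response structure above (status line, headers, blank line, body) is easy to pull apart by hand. Here is a minimal sketch in Python; the sample response and the `parse_response` helper are made up for illustration.

```python
# Split a raw HTTP response into its four parts: the status line,
# the headers, the blank line separator, and the body.
def parse_response(raw: str):
    head, _, body = raw.partition("\r\n\r\n")     # blank line ends the headers
    lines = head.split("\r\n")
    version, code, reason = lines[0].split(" ", 2)  # e.g. HTTP/1.0 200 OK
    headers = dict(line.split(": ", 1) for line in lines[1:])
    return int(code), reason, headers, body

raw = "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
code, reason, headers, body = parse_response(raw)
```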
Do this right now. Log onto a Unix machine and at the prompt, type

telnet www.rpi.edu 80

then type

GET / HTTP/1.0

and hit the Enter key twice.
You should get the reply from the RPI web server, which consists of the headers followed by the HTML code for the RPI home page.
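The telnet session above can be reproduced in a few lines of Python sockets. This is a sketch: the hostname and the / path come from the example, and `build_request` is a helper name of my own invention.

```python
import socket

def build_request(host: str, path: str = "/") -> bytes:
    # Request line, a Host header, then the blank line that ends the headers.
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode()

def http_get(host: str, port: int = 80) -> bytes:
    # Open a TCP connection, send the request, and read until the server closes.
    with socket.create_connection((host, port)) as s:
        s.sendall(build_request(host))
        chunks = []
        while data := s.recv(4096):
            chunks.append(data)
    return b"".join(chunks)

# reply = http_get("www.rpi.edu")   # headers, then the HTML of the home page
```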
If you just want to look at the headers, type

HEAD / HTTP/1.0

Here is a request sent from Firefox. I ran a server on monica at port 8888, then typed
http://monica.cs.rpi.edu:8888
in the browser.
Here is what the browser sent.
GET / HTTP/1.1
Host: monica.cs.rpi.edu:8888
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

A Proxy Server is a network entity that satisfies HTTP requests on behalf of an origin web server.
It caches recent documents. If the document is cached, it returns it immediately. Otherwise it passes the request on to the server. The server sends the document back to the proxy. The proxy stores a copy of the document and passes it on to the requesting client.
Note that it is both a client and a server at the same time.
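The proxy's cache-then-forward behavior can be sketched in a few lines. Everything here is illustrative: `CachingProxy` is a made-up class, and `fetch` stands in for the real HTTP request the proxy would send to the origin server.

```python
# Sketch of a caching proxy's core decision: serve from cache on a hit,
# otherwise act as a client, fetch from the origin server, and keep a copy.
class CachingProxy:
    def __init__(self, fetch):
        self.fetch = fetch        # callable: url -> document (origin fetch)
        self.cache = {}           # url -> cached document
        self.hits = self.misses = 0

    def get(self, url):
        if url in self.cache:             # cache hit: return it immediately
            self.hits += 1
            return self.cache[url]
        self.misses += 1                  # cache miss: pass the request on
        doc = self.fetch(url)             # server sends the document back
        self.cache[url] = doc             # store a copy for future requests
        return doc

proxy = CachingProxy(fetch=lambda url: f"<html>{url}</html>")
proxy.get("http://www.rpi.edu/")   # first request: miss, fetched from server
proxy.get("http://www.rpi.edu/")   # second request: hit, served from cache
```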
In practice, hit rates range from 0.2 to 0.7, thus dramatically reducing network traffic.
To download a file, the client has to locate someone who happens to have that file, and then send an HTTP GET request. The search is the hard part. There have been three models.
Three problems
Each peer knows about its immediate neighbors and establishes a persistent connection to them. When someone wants to make a request, they send it to their immediate neighbors. If one of them can satisfy it, it sends a reply back and the requester then sends a GET request to download the file. Otherwise, each neighbor sends the request on to each of its neighbors, and the process continues.
This process is called query flooding. One result is an exponential growth of packets. If each peer is connected to ten neighbors, the first round results in 10 requests, the second round results in 100 requests, the third round results in 1000 requests, etc.
To squelch this, each request has a TTL (Time to Live) field, set to, say, 7. Each hop decrements it, and when it reaches zero, the request is not propagated any further.
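Query flooding with a TTL can be simulated directly. The tiny line-shaped network below is made up; the point is that each hop decrements the TTL and forwarding stops at zero, so the query only reaches peers within TTL hops.

```python
# Simulate query flooding: a query spreads from `start` to all neighbors,
# decrementing the TTL at each hop, until the TTL runs out.
def flood(graph, start, ttl, seen=None):
    """Return the set of peers the query reaches within `ttl` hops of `start`."""
    seen = {start} if seen is None else seen
    if ttl == 0:
        return seen                      # TTL exhausted: stop propagating
    for neighbor in graph[start]:
        if neighbor not in seen:         # don't re-flood a peer we've reached
            seen.add(neighbor)
            flood(graph, neighbor, ttl - 1, seen)
    return seen

# A small line-shaped network: A - B - C - D
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
```

With TTL 2, a query from A reaches B and C but dies before D.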
Each group leader has connections to other group leaders and they are constantly sharing information.
Kazaa also uses some performance enhancements. It can limit the number of simultaneous uploads. It gives incentives to peers who upload more files. And it supports parallel downloads: if a file is on more than one peer, one peer can upload the first half while the other uploads the second half.
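The parallel-download idea amounts to splitting the file into byte ranges, one per peer that holds a copy. A sketch, assuming inclusive range endpoints as in an HTTP Range header; `split_ranges` is a made-up helper, not Kazaa's actual algorithm.

```python
# Divide a file of `size` bytes into one byte range per uploading peer.
# The last peer absorbs any remainder so every byte is covered exactly once.
def split_ranges(size, peers):
    chunk = size // peers
    ranges = []
    for i in range(peers):
        start = i * chunk
        end = size - 1 if i == peers - 1 else (i + 1) * chunk - 1
        ranges.append((start, end))
    return ranges

# Two peers share a 1000-byte file: one uploads the first half,
# the other uploads the second half.
halves = split_ranges(1000, 2)   # [(0, 499), (500, 999)]
```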