http://
Use HTTP, the Hypertext Transfer Protocol.
hypothetical.ora.com
Contact a computer over the network with the hostname hypothetical.ora.com.
:80
Connect to the computer at port 80. The port number can be any legitimate IP port number: 1 through 65535, inclusive. If the colon and port number are omitted, the port number is assumed to be HTTP's default port number, which is 80.
/
Anything after the hostname and optional port number is regarded as a document path. In this example, the document path is /.
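To make the breakdown above concrete, here is a minimal sketch that splits the example URL into the same four pieces using Python's standard urllib.parse module. The function name split_http_url and the constant DEFAULT_HTTP_PORT are illustrative names, not part of any library.

```python
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80  # HTTP's default port, used when none is given


def split_http_url(url):
    """Return (scheme, hostname, port, document_path) for an http URL."""
    parts = urlsplit(url)
    # parts.port is None when the colon and port number are omitted
    port = parts.port if parts.port is not None else DEFAULT_HTTP_PORT
    # Everything after the hostname and optional port is the document path;
    # an empty path is treated as the root, "/"
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.scheme, parts.hostname, port, path


print(split_http_url("http://hypothetical.ora.com:80/"))
print(split_http_url("http://hypothetical.ora.com/"))
# Both calls print: ('http', 'hypothetical.ora.com', 80, '/')
```

Both URLs yield the same result because the second one falls back to the default port.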
So the browser connects to hypothetical.ora.com on port 80 using the HTTP protocol. The message that the browser sends to the server is:
GET / HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)
Host: hypothetical.ora.com
Connection: Keep-Alive
Let's look at what these lines are saying:
The first line of this request (GET / HTTP/1.1) requests a document at / from the server. HTTP/1.1 is given as the version of the HTTP protocol that the browser uses.
The second line tells the server what kind of documents are accepted by the browser.
The third line indicates that the preferred language is English. This header allows the client to specify a preference for one or more languages, in the event that a server has the same document in multiple languages.
The fourth line indicates that the client understands how to interpret a server response that is compressed with the gzip or deflate algorithm.
In the fifth line, beginning with the string User-Agent, the client identifies itself as Mozilla version 4.0, running on Windows NT. In parentheses it mentions that it is really Microsoft Internet Explorer version 5.01.
The sixth line tells the server what the client thinks the server's hostname is. This header is mandatory in HTTP 1.1, but optional in HTTP 1.0. Since the server may have multiple hostnames, the client indicates which hostname is being requested. In this environment, a web server can have a different document tree for each hostname assigned to it. If the client hasn't specified the server's hostname, the server may be unable to determine which document tree to use.
The seventh line (Connection:) tells the server to keep the TCP connection open until explicitly told to disconnect. Under HTTP 1.1, the default server behavior is to keep the connection open until the client specifies that the connection should be closed. The standard behavior in HTTP 1.0 is to close the connection after the client's request. See the discussion later in this book for details.
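The difference between the two protocol versions can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name keeps_connection_open is hypothetical, and the treatment of Keep-Alive under HTTP 1.0 reflects the common convention rather than anything shown in the request above.

```python
def keeps_connection_open(http_version, headers):
    """Decide whether the TCP connection stays open after this exchange.

    http_version: a string such as "HTTP/1.1" or "HTTP/1.0"
    headers: a dict mapping header names to their values
    """
    # Header names are case-insensitive, so normalize before looking up
    normalized = {name.lower(): value for name, value in headers.items()}
    connection = normalized.get("connection", "")
    tokens = [token.strip().lower() for token in connection.split(",")]

    if http_version == "HTTP/1.1":
        # HTTP 1.1: stay open unless a close is requested
        return "close" not in tokens
    # HTTP 1.0: close unless Keep-Alive was requested (assumed convention)
    return "keep-alive" in tokens


print(keeps_connection_open("HTTP/1.1", {"Connection": "Keep-Alive"}))  # True
print(keeps_connection_open("HTTP/1.1", {"Connection": "close"}))       # False
print(keeps_connection_open("HTTP/1.0", {}))                            # False
```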
Together, these seven lines constitute a request. Lines two through seven are request headers. Each header is discussed in more detail elsewhere in this book.
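As a rough illustration of that structure, the sketch below separates the request line from the request headers. It assumes the lines are terminated by CRLF ("\r\n") and that a blank line ends the headers, as HTTP specifies; the function name parse_request is hypothetical.

```python
REQUEST = (
    "GET / HTTP/1.1\r\n"
    "Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*\r\n"
    "Accept-Language: en-us\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)\r\n"
    "Host: hypothetical.ora.com\r\n"
    "Connection: Keep-Alive\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw request into (method, path, version, headers)."""
    # Everything before the first blank line is the request line plus headers
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    # Line one: method, document path, and protocol version
    method, path, version = lines[0].split(" ", 2)
    # Remaining lines: "Name: value" pairs; names are case-insensitive
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


method, path, version, headers = parse_request(REQUEST)
print(method, path, version)   # GET / HTTP/1.1
print(len(headers))            # 6
print(headers["host"])         # hypothetical.ora.com
```

A server with several hostnames could use the value of headers["host"] to pick the matching document tree.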
Responses
Given a request like the one previously shown, the server looks for the server resource associated with "/" and returns it to the browser, preceding it with header information in its response. The resource associated with the URL depends on how the server is implemented. It could be a static file or it could be dynamically generated. In this case, the server returns:
HTTP/1.1 200 OK
Date: Mon, 06 Dec 1999 20:54:26 GMT
Server: Apache/1.3.6 (Unix)
Last-Modified: Fri, 04 Oct 1996 14:06:11 GMT
ETag: "2f5cd-964-381e1bd6"
Accept-Ranges: bytes
Content-length: 327
Connection: close
Content-type: text/html

Sample Homepage
Welcome
Hi there, this is a simple web page. Granted, it may not be as elegant as some other web pages you've seen on the net, but there are some common qualities:
If you look at this response, you'll see it begins with a series of lines that specify information about the document and about the server itself. After a blank line, it returns the document. Lines 2-9 are called the response header, and the part after the first blank line is called the body or entity, or entity-body. Let's look at the header information:
The first line, HTTP/1.1 200 OK, tells the client what version of the HTTP protocol the server uses. But more importantly, by returning a status code of 200, it says that the document has been found and that the server will transmit it in its response.
The second line indicates the current date on the server. The time is expressed in Greenwich Mean Time (GMT).
The third line tells the client what kind of software the server is running. In this case, the server is Apache version 1.3.6 on Unix.
The fourth line specifies the most recent modification time of the document requested by the client. This modification time is often used for caching purposes, so a browser may not need to request the entire HTML file again if its modification time hasn't changed.
The fifth line indicates an entity tag. This provides the web client with a unique identifier for the server resource. It is highly unlikely for two different server resources to have the same entity tag. This tag provides a powerful mechanism for caching.
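To show how the modification time and entity tag might serve caching, the sketch below reads both values from the start of the response and turns them into the conditional headers a client could send the next time it asks for the same document. If-Modified-Since and If-None-Match are the standard HTTP request headers for this purpose, although they are not mentioned in the text above. The function names are hypothetical, and CRLF line endings are assumed.

```python
RESPONSE_HEAD = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Mon, 06 Dec 1999 20:54:26 GMT\r\n"
    "Server: Apache/1.3.6 (Unix)\r\n"
    "Last-Modified: Fri, 04 Oct 1996 14:06:11 GMT\r\n"
    'ETag: "2f5cd-964-381e1bd6"\r\n'
    "\r\n"
)


def parse_response_head(raw):
    """Return (version, status_code, reason, headers) from a raw response."""
    # The response header ends at the first blank line
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers


def revalidation_headers(headers):
    """Build conditional request headers from a cached response's headers."""
    conditional = {}
    if "last-modified" in headers:
        conditional["If-Modified-Since"] = headers["last-modified"]
    if "etag" in headers:
        conditional["If-None-Match"] = headers["etag"]
    return conditional


version, code, reason, headers = parse_response_head(RESPONSE_HEAD)
print(code, reason)  # 200 OK
for name, value in revalidation_headers(headers).items():
    print(f"{name}: {value}")
```

A server that finds the document unchanged can then answer with a 304 Not Modified status instead of sending the whole document again.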