Uniform Resource Locators (URLs) and HyperText Transfer Protocol (HTTP)
URL:
- A descriptor for a specific Internet resource. A URL can be thought of as a "Location" on the Internet.
A formal definition of a URL would require a tree-like structure, so for this working definition
we instead give several examples of the kinds we will use most frequently in this course.
Each URL begins with a "Protocol", a string followed by a colon. The examples we will see most often are:
- http:
- ftp:
- telnet:
- mailto:
The complexities begin with the next "two" fields, the "Internet Resource Address" (IRA) and the
"Resource Details". In particular, they may be the same. Here are some examples.
- http://www.umsl.edu/~siegel/newcourse/part1/URL-HTTP.html
- ftp://ftp.umsl.edu
- telnet://jinx.umsl.edu
- mailto:jerrold_siegel@umsl.edu
In the first example the Internet Resource Address is //www.umsl.edu and the "Resource Details" is
/~siegel/newcourse/part1/URL-HTTP.html. In the other examples these fields are the same. Note also that the IRA for
mailto: does not begin with "//".
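As a rough illustration of the field breakdown above, Python's standard urllib.parse module can split the four example URLs. The library's vocabulary differs from these notes: it calls the Protocol the "scheme" (without the colon) and reports the Internet Resource Address as "netloc" (without the leading "//"). The helper name describe is made up for this sketch.

```python
from urllib.parse import urlsplit

# The four example URLs from the notes.
examples = [
    "http://www.umsl.edu/~siegel/newcourse/part1/URL-HTTP.html",
    "ftp://ftp.umsl.edu",
    "telnet://jinx.umsl.edu",
    "mailto:jerrold_siegel@umsl.edu",
]

def describe(url):
    """Return (scheme, netloc, path) for a URL (hypothetical helper)."""
    parts = urlsplit(url)
    return (parts.scheme, parts.netloc, parts.path)

for url in examples:
    print(describe(url))
# ('http', 'www.umsl.edu', '/~siegel/newcourse/part1/URL-HTTP.html')
# ('ftp', 'ftp.umsl.edu', '')
# ('telnet', 'jinx.umsl.edu', '')
# ('mailto', '', 'jerrold_siegel@umsl.edu')
```

For mailto: the netloc comes back empty, which matches the observation that its address does not begin with "//".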
Focusing on http:
- The IRA may contain a "Port"; the format is Domainname:port. For example, www.umsl.edu:81 was the IRA of our old CERN server (1995), which we used to run at the same time as
www.umsl.edu, which is, by default, on Port 80.
- The Resource Details may also contain a "Fragment"; the format is Filepath#FragmentName. For example, http://www.umsl.edu/~siegel/newcourse/part1/URL-HTTP.html#bottompart refers to a specific location on this
Web page.
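The Port and Fragment fields can be pulled out the same way. The sketch below again uses urllib.parse; the function name port_and_fragment and its default_port parameter are assumptions made for illustration, with 80 standing in for the http default mentioned above.

```python
from urllib.parse import urlsplit

def port_and_fragment(url, default_port=80):
    """Return (host, port, fragment); fall back to default_port when no port is given."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else default_port
    return parts.hostname, port, parts.fragment

print(port_and_fragment("http://www.umsl.edu:81/"))
# ('www.umsl.edu', 81, '')
print(port_and_fragment(
    "http://www.umsl.edu/~siegel/newcourse/part1/URL-HTTP.html#bottompart"))
# ('www.umsl.edu', 80, 'bottompart')
```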
Finally, the URL can contain a "Query String", which is passed to the server for processing (examples to be provided). The format is
http://Domainname/Filepath?QueryString. Query Strings are encoded using "URL Encoding". Again, we will discuss this in some detail later in the course, but for the moment the following examples begin to tell the story.
- The string hello is passed to the server as is.
- The string hello+world is passed as the string "hello world".
- The string hello=loneliness&goodbye=happiness is passed (and usually decoded) as two name-value pairs.
name    | value
hello   | loneliness
goodbye | happiness
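The three Query String examples can be checked with a short sketch. It assumes Python's urllib.parse as the decoder; an actual Web server may use its own routines, but the decoding rules shown here are the same.

```python
from urllib.parse import parse_qsl, unquote_plus, urlencode

# A plain string passes through unchanged.
print(unquote_plus("hello"))          # hello

# "+" decodes to a space.
print(unquote_plus("hello+world"))    # hello world

# "&" separates pairs and "=" separates each name from its value.
pairs = parse_qsl("hello=loneliness&goodbye=happiness")
print(pairs)  # [('hello', 'loneliness'), ('goodbye', 'happiness')]

# Encoding goes the other way: a space in a value becomes "+".
print(urlencode({"hello": "big world"}))  # hello=big+world
```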
HTTP:
- The default Client/Server communications protocol for the World Wide Web.
HyperText Transfer Protocol comes in two flavors, 1.0 and 1.1. For this introduction, we will focus on 1.0.
HTTP/1.0 is a stateless protocol. That is, after a transaction is completed the connection is
broken and the server does not "remember" what transpired. Thus, from the client's
point of view, if a second request depends upon the results of the first, all
relevant information must be returned to the server (through a Query String, the part of the URL after "?").
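One way to picture statelessness: since the server forgets the first transaction, the client has to attach whatever the second request needs. The sketch below is only an illustration; the host www.example.com, the path /order, and the field names step and name are all hypothetical, and follow_up_url is a made-up helper.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

def follow_up_url(base_url, state):
    """Build a second-request URL that carries the client's state in a Query String."""
    return base_url + "?" + urlencode(state)

# Hypothetical state remembered by the client after a first request.
state = {"step": "2", "name": "hello world"}
url = follow_up_url("http://www.example.com/order", state)
print(url)  # http://www.example.com/order?step=2&name=hello+world

# What a server could recover from that URL alone, with no memory of request one.
recovered = parse_qsl(urlsplit(url).query)
print(recovered)  # [('step', '2'), ('name', 'hello world')]
```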
The Telnet log below is from a simulated web transaction.
louie.umsl.edu> telnet www.umsl.edu 80
Trying 134.124.1.234...
Connected to www.umsl.edu.
Escape character is '^]'.
GET /~siegel/little.html HTTP/1.0
Accept: text/html

HTTP/1.1 200 OK
Server: Netscape-Enterprise/3.6
Date: Sat, 25 Mar 2000 15:39:44 GMT
Content-type: text/html
Link: <http://www.umsl.edu/~siegel/little.html?PageServices>; rel="PageServices"
Last-modified: Sun, 21 Dec 1997 18:13:42 GMT
Content-length: 34
Accept-ranges: bytes
Connection: close

<html><body>Hello!</body></html>
Connection closed by foreign host.
Notes:
- Again, 80 is the default port for the World Wide Web.
- Note the form in which the URL is presented to the Web server: only the path,
/~siegel/little.html, appears on the GET line. Web servers
can be very unforgiving on this point.
- The client side of the dialog ends with a blank line.
- Note that the server returned the "Last-modified:" date. This page was created
the last time this course was offered.
- Since we are using telnet as a "browser," we see the raw HTML.
A real browser would render this as a Web page.
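The telnet session can be approximated in a few lines of Python. This is a minimal sketch, not a full HTTP client: build_request, parse_response, and fetch are names invented here, and the code handles only the simple case in the log (one request, then the server closes the connection). The fetch function is included for completeness and is not called, since whether a given server still answers a bare HTTP/1.0 request on port 80 is not something these notes can promise.

```python
import socket

def build_request(path, accept="text/html"):
    """Request line, one header, then the blank line that ends the client side."""
    lines = [f"GET {path} HTTP/1.0", f"Accept: {accept}", "", ""]
    return "\r\n".join(lines).encode("ascii")

def parse_response(raw):
    """Split raw response bytes into (status line, header dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        # Split on the first colon only; header values may contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body

def fetch(host, path, port=80):
    """Send one request and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request(path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_response(b"".join(chunks))

# Parse a canned reply shaped like the one in the log above.
sample = (b"HTTP/1.1 200 OK\r\n"
          b"Content-type: text/html\r\n"
          b"Content-length: 34\r\n"
          b"\r\n"
          b"<html><body>Hello!</body></html>\r\n")
status, headers, body = parse_response(sample)
print(status)                   # HTTP/1.1 200 OK
print(headers["content-type"])  # text/html
```

Calling build_request("/~siegel/little.html") yields exactly the two lines typed in the telnet session followed by the terminating blank line.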
URL: https://www.umsl.edu/~siegelj/newcourse/part1/URL-HTTP.html
Copyright: Jerrold Siegel for The University of Missouri-St. Louis
Last modified on 06/26/2000 14:29:59