Re: default documents

Micah A. Williams (micah@sequent.uncfsu.edu)
Mon, 22 Apr 96 12:07:45 EDT


In the words of Jakob Faarvang,
>
> How does a robot know what the default document
> (index.html/default.htm/home.html) is called?
>
> I mean, how does it know that, say, http://www.mydomain.com/test/ is
> the same as http://www.mydomain.com/test/index.html or
> http://www.mydomain.com/default.htm ?
>
> - Jakob

My robot, Pioneer, does not make any assumptions as to what a
"default document" may be. My search database therefore has
a few "xxxx/" and "xxxx/index.html" pairs lying around that point to
the same document.
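
To illustrate how such pairs arise, here is a minimal sketch in Python. It is
not Pioneer's actual code: the function names, the in-memory dictionary, and
the page bodies are all hypothetical. It assumes the index is keyed by the
exact URL string, and it finds the pairs afterwards by comparing a hash of
each page body.

```python
import hashlib
from collections import defaultdict


def add_page(index, url, body):
    # Key each entry by the exact URL string that was fetched;
    # no guess is made about what the default document is called.
    index[url] = body


def duplicate_groups(index):
    # Group URLs whose bodies hash to the same digest, and return
    # only the groups that contain more than one URL.
    by_digest = defaultdict(list)
    for url, body in index.items():
        digest = hashlib.sha1(body.encode("utf-8")).hexdigest()
        by_digest[digest].append(url)
    return [sorted(urls) for urls in by_digest.values() if len(urls) > 1]


# Hypothetical page bodies, for illustration only.
index = {}
page = "<html><body>Test page</body></html>"
add_page(index, "http://www.mydomain.com/test/", page)
add_page(index, "http://www.mydomain.com/test/index.html", page)
add_page(index, "http://www.mydomain.com/other.html", "<html>other</html>")

pairs = duplicate_groups(index)
```

Comparing bodies only catches the pairs after both copies have been fetched;
it does not save the second request.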

Now if HTTP required servers to make certain provisions for this,
it would be different.

For example, what if, instead of just sending back the
document, the server sent the document's true location as a
redirect (code 301)? Say the robot decides to get the
document, "http://www.foo.org/info/" ... the server at
foo.org sees that the GET request is in 'short form', consults its internal
configuration, and then returns a redirect with the
Location set to the full absolute location of its
default document, "http://www.foo.org/info/index.html".
This being the case, if the robot already had this URL
indexed, it would be ignored.
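
The exchange described above could be sketched roughly as follows, in Python.
This is only a simulation under stated assumptions: no network is involved,
and DEFAULT_DOCUMENTS, server_respond, and robot_visit are hypothetical names
standing in for the server's internal configuration and the robot's fetch
logic.

```python
HOST = "http://www.foo.org"

# Hypothetical server configuration: directory path -> default document.
DEFAULT_DOCUMENTS = {"/info/": "index.html"}


def server_respond(path):
    # A 'short form' request gets a 301 whose Location header holds
    # the full absolute URL of the default document.
    if path in DEFAULT_DOCUMENTS:
        location = HOST + path + DEFAULT_DOCUMENTS[path]
        return 301, {"Location": location}
    # Anything else is served directly.
    return 200, {}


def robot_visit(url, indexed):
    # Return the URL that was newly indexed, or None if the robot
    # already had it (possibly under its redirected name).
    path = url[len(HOST):]
    status, headers = server_respond(path)
    if status == 301:
        url = headers["Location"]
    if url in indexed:
        return None
    indexed.add(url)
    return url


# The robot already has the full URL, then meets the short form.
indexed = {"http://www.foo.org/info/index.html"}
short_form_result = robot_visit("http://www.foo.org/info/", indexed)
other_result = robot_visit("http://www.foo.org/about.html", indexed)
```

Here short_form_result is None, since the redirect target was already
indexed, so the database ends up with a single entry for the document.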

This may be helpful for robots, but since
so many URLs are published in the 'short form', I wonder
whether handling these redirects so frequently, each one
costing an extra request, would slow down normal browsers.

-Micah

-- 
============================================================================
Micah A. Williams | Computer Science | Fayetteville State University	
micah@sequent.uncfsu.edu | http://sequent.uncfsu.edu/~micah/ 
Bjork WebPage: http://sequent.uncfsu.edu/~micah/bjork.html
Though we do not realize it, we all, in some capacity, work for Keyser Soze. 
============================================================================