I am the father of Scooter, and for the second time I need to rescue my robot
from groundless accusations. Soon I'll have enough material for a book...
Scooter is a regular robot: it follows links, and only follows links. It does
not guess IP addresses, or try out all possible file names (one of my
favorites), or spy on sites to guess the "secret test port", or anything like
that. In this particular instance, I have to insist that Scooter does not
"extract data from a host by connecting to EVERY tcp port". Over 130,000 sites
times 32,000 possible ports would amount to over four billion stupid pings
with not much return!
The Web is large enough; there is no need to invent new and exotic techniques to
access more data. My current estimate BTW is that there are at least 50 million
Web pages (text of some sort) publicly available and indexable (not disallowed
by a robots.txt file), so there is really no lack of raw material.
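For the curious, the whole "follows links, and only follows links" policy fits
in a few lines. This is a rough sketch in modern Python, not Scooter's actual
code; the function names and the "Scooter" user-agent string used for the
robots.txt check are mine for illustration. URLs are discovered only from
anchor tags in pages already fetched, and each host's robots.txt is consulted
before fetching:

```python
from html.parser import HTMLParser
from urllib import robotparser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags -- the ONLY source of new URLs."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute URLs found in anchor tags of a fetched page."""
    parser = LinkExtractor()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.links]


def allowed(url, _parsers={}):
    """Check the host's robots.txt before fetching, caching one parser per host."""
    host = "{0.scheme}://{0.netloc}".format(urlparse(url))
    if host not in _parsers:
        rp = robotparser.RobotFileParser(host + "/robots.txt")
        try:
            rp.read()
        except OSError:
            rp.allow_all = True  # no reachable robots.txt: nothing is excluded
        _parsers[host] = rp
    # "Scooter" here is just the illustrative user-agent token for the check
    return _parsers[host].can_fetch("Scooter", url)
```

No port scanning, no file-name guessing: if a page is not linked from
somewhere, the crawler simply never sees it.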
Could the next person who feels an urge to speak for the Alta Vista robot please
check with me first? Nothing about this project is very secret; all you have to
do is ask.
Cheers,
--Louis