Robots Mailing List Archive by subject
Starting: Wed 00 Jan 1970 - 16:31:48 PDT
Ending: Thu 18 Dec 1997 - 14:33:60 PDT
Messages: 2106
-  "Good Times" hoax
-  "hidden text" vs. META tags for robots/search engines
-  "real-time" spidering by Lycos
-  "What's new" in web pages is not necessarily reliable
-  (no subject)
-  (OTP) RE: Political economy of distributed search (was topical
-  (OTP) RE: Political economy of distributed search (was topical search...)
-  *Help: Writing BOTS*
-  ??: reload problem
-  [ MERCHANTS ] My Sincerest Apologies
-  [1]Announcing NaecSpyr, a n
-  [1]Contact for Intouchgroup
-  [1]RE>[2]RE>[5]RE>Checking
-  [1]RE>[3]RE>[5]RE>Checking
-  [1]RE>[5]RE>Checking Log fi
-  [1]RE>Checking Log files
-  [1]Wobot?
-  [2]Announcing NaecSpyr, a n
-  [2]RE>[2]RE>[5]RE>Checking
-  [2]RE>[3]RE>[5]RE>Checking
-  [2]RE>[5]RE>Checking Lo
-  [2]RE>[5]RE>Checking Log fi
-  [2]RE>Checking Log files
-  [2]Wobot?
-  [3]Announcing NaecSpyr, a n
-  [3]RE>[5]RE>Checking Log fi
-  [3]RE>Checking Log files
-  [3]Wobot?
-  [4]Announcing NaecSpyr, a n
-  [4]RE>[5]RE>Checking Log fi
-  [4]RE>Checking Log files
-  [4]Wobot?
-  [5]RE>[5]RE>Checking Log fi
-  [5]RE>Checking Log files
-  [5]Wobot?
-  [ANNOUNCE] CFP: AAAI-96 WS on Internet-based Information Systems
-  [Fwd: WebCrawler & Excite]
-  [ROBOTS JJA] Lib-WWW-perl5
-  A bad agent?
-  A few copyright notes
-  A modest proposal...<snip> to discourage indexing ?
-  A new robot -- ask for advice
-  A new robot...TOPjobs(tm) USA JOBbot 1.0a
-  About integrated search engines
-  About Mother of All Bulletin Boards
-  about robots.txt content errors
-  Accept:
-  ActiveAgent
-  Activity from 205.252.60.5[0-8]
-  Add This Search Engine to your Results.
-  ADMIN: Archive
-  Admin: how to get off this list
-  Admin: List archive is back
-  ADMIN: mailing list attack :-(
-  ADMIN: Spoofing vs xxx.lanl.gov
-  Admitting the obvious
-  Advice
-  Agent Specification
-  agents ignoring robots.txt
-  algorithms
-  algorithms too
-  Allow/deny robots from major search services
-  alta vista and virtualvin.com
-  Alta Vista getting stale?
-  Alta Vista searches WHAT?!?
-  Altavista indexing password files
-  AltaVista Meta Tag Rumour
-  AltaVista's Index is obsolete
-  AltaVista's Index is obsolete; but what about the other
-  AMDIN: The list is dead
-  An extended verion of the robot exclusion standard
-  An extended version of the Robots...
-  An extended version of the Robots... (fwd)
-  An extended version of the Robots...)
-  an image observer
-  An updated extended standard for robots.txt
-  Announce: ActiveX Search (IFilter) spec/sample
-  ANNOUNCE: Don Norman (Apple) LIVE! 15-May 5PM UK = noon EDT
-  Announcement
-  Announcement and Help Requested
-  Announcing NaecSpyr, a new. . . robot?
-  another dumb robot (possibly)
-  another rare attack
-  Another rating scam!  (And a proposal on how to fix it
-  Another rating scam!  (And a proposal on how to fix it)
-  another suggestion
-  anti-robot regexps
-  Any info on "E-mail America"?
-  any robots/search-engines which index links?
-  Anyone doing a Java-based robot yet?
-  Anyone doing a Java-based robot yet?6
-  Anyone know who owns this one?
-  Apologies || communal bots
-  Apology -- I didn't mean to send that last message to the list
-  articles or URL's on search engines
-  avoiding infinite regress for robots
-  Back of the envelope computations
-  BackRub robot
-  BackRub robot warning
-  Bad agent...A *very* bad agent.
-  Bad robot: WebHopper bounch! Owner: peter@cartes.hut.fi
-  Belated notice of spider article
-  Blackboard for Discussing Domain-specific Robots
-  BOUNCE robots: Admin request
-  BSE-Slurp/0.6
-  Bug in LibWWW perl + Data::Dumper (libwwwperl refs are strange)
-  Bye Bye HyperText: The End of the World (Wide Web) As We Know It!
-  Cache Filler
-  Can I retrieve image map files?
-  Cannot believe it "Morons"
-  cc:Mail SMTPLINK Undeliverable Message
-  changes to robots.txt
-  Checking Log files
-  CitedSites(sm): Citation Indexing of Web Resources
-  Clean up Bots...
-  Client Robot 'Ranjan'
-  Collected information standards
-  Commercial Robot Vendor Recoomendations Request
-  communal bots
-  Comparing robots/search sites
-  Conceptbot spider
-  Contact for Intouchgroup.com
-  Content based robot collectors
-  Content based search engine
-  Copyrights on the web
-  Copyrights, let them be !
-  Crawlers and "dynamic" urls
-  Crawling & DNS issues
-  crawling accident
-  crawling FTP sites
-  Cron <robh@us2> /usr/home/robh/show_robots (fwd)
-  CyberPromo shut down at last!!!
-  Cyclones sign MacLeod
-  databases for spiders
-  Dead account
-  Dearnaley Auto Reply Cannon?
-  default documents
-  Defenses against bad robots
-  Defenses against bad robots)
-  define a page?
-  defining "robot"
-  depth first vs breadth first
-  Description or Abstract?
-  desperately looking for a news searcher
-  Do robots have to follow links ?
-  do robots send HTTP_HOST?
-  Does anyone else consider this irresponsible?
-  Does anyone else consider...
-  Does this count as a robot?
-  Does this..)
-  Domains and HTTP_HOST
-  Dot dot problem...
-  download robot
-  dumb robots and xxx
-  Duplicate docs (was avoiding infinite regress...)
-  Either a spider or a hacker? ww2.allcon.com
-  email address grabber
-  email grabber
-  email grabber)
-  email spider
-  escaped vs unescaped urls
-  Excite Authors?
-  Extracting info from SIG forum archives
-  FAQ again.
-  FAQ?
-  fdsf
-  fetching .map files
-  file retrieval
-  FILEZ
-  Filtering queries on a robot-built database
-  Finding the canonical name for a server
-  For Sale for 12/26!
-  Forsberg
-  forwarded e-mail
-  Frames ? Lycos ?
-  Freely available robot code in C available?
-  FS- Sharks Tkts- Front Row (2nd Deck) Jan 13
-  FS-Jan 7 Row 1 Sec 211
-  Game tonight
-  General Information
-  Get official!
-  Getting a Reply-to: field ...
-  Good HREFs vs Bogus HREFs: 80/20
-  Gopher Protocol Question
-  grammar engines
-  Handling keyword repetitions
-  harvest
-  Harvest question
-  Harvest-like use of spiders
-  Have you used the Microsoft Active-X Internet controls for Visual Basic?  (Or know someone who does?)
-  HEAD
-  help
-  Here is WebWalker
-  Heuristics....
-  hey man gimme a break
-  Hip Crime
-  hipcrime
-  Hipcrime no more
-  hoohoo.cac.washington = bad
-  Horror story
-  HOST:  header
-  How can IR Agents be evaluate ?
-  How do I let spiders in?
-  how do they do that?
-  How frequently should I check /robots.txt?
-  How long to cache robots.txt
-  How long to cache robots.txt for?
-  How to get listed #1 on all search engines (fwd)
-  How to get the document info ?
-  How to...???
-  htaccess
-  HTML Parser
-  HTML query to .ps?
-  HTML query to .ps? ....
-  http directory index request
-  http://HipCrime.com
-  hypermail archive not operational
-  i need a bot!
-  I vote NO (Was: Robot Gripes forum?)
-  I vote NO (Was: Robot Gripes forum?) - I vote YES
-  Identifying identical documents
-  IIS and If-modified-since
-  image map traversal
-  Image Maps
-  implementation fo HEAD response with meta info
-  in-document directive to discourage indexing ?
-  in-document directive..)
-  Indexing a set of URL's
-  indexing intranet-site
-  Indexing two-byte text
-  indexing via redirectors
-  Infinite e-mail loop
-  info for newbie
-  Info on authoring a Web Robot
-  Info on large scale spidering?
-  Info on large scale spidering?)
-  Information about AltaVista and Excite
-  infoseek
-  infoseeks robot is dumb
-  InfoSpiders/0.1
-  Ingrid ready for prelim alpha testing....
-  Inktomi & large sca
-  Inktomi & large scale spidering
-  Inktomi & large scale spidering)
-  inquiry about robots
-  Inter-robot Comms Port
-  Inter-robot communication
-  Inter-robot Communications - Part II
-  interactive generation of URL's
-  Introducing myself
-  Invalid request
-  IROS 97 Call for Papers
-  Is a robot visiting?
-  Is it a robot or a link-updater?
-  Is their a web site...
-  It's not only robots we have to worry about ...
-  itelligent agents
-  Java and robots...
-  java applet sockets
-  Java intelligent agents and compliance?
-  Java Robot
-  Just when you thought it might be interesting to standardize
-  Just when you thought it might be interesting to standardize robots.txt...
-  Keyword indexing
-  keywords in META-element
-  Koen Holtman: Content negotiation draft 04 submitted
-  Lack of support for "If-Modified-Since"
-  Last message
-  Lead Time
-  Library Agents(sm): Library Applications of Intelligent Software Agents
-  libww and robot source for Sequent Dynix/Ptx 4.1.3
-  Limiting robots to top-level page only (via robots.txt)?
-  Links
-  Links (don't bother checking; I've done it for you)
-  Links This Site is about Robots Not Censorship
-  Linux and Robot development...
-  loc(SOIF)
-  Looking for a search engine
-  Looking for a spider
-  Looking for good one
-  Looking for News robot
-  looking for specific bot...
-  Looking for subcontracting spider-programmers
-  Looking for...
-  Lycos
-  lycos patents
-  Lycos unfriendly robot
-  Lycos' HEAD vs. GET
-  Lynx. The one true browser.
-  MacPower
-  MacPower (an apology, I am very sorry)
-  Magic, Intelligence, and search engines
-  Mail failure
-  Mail robot?
-  Mailing list
-  make people use ROBOTS.txt?
-  Matching the user-agent in /robots.txt
-  McKinley -- 100% error rate
-  McKinley robot
-  McKinley Spider hit us hard
-  MD5 in HTTP headers - where?
-  Merry Christmas, HipXmas-SantaSpam!
-  Merry Christmas, spidie-boyz&bottie-girlz!
-  message to USSA House of Representatives
-  message to USSA Senate
-  Meta refresh tags
-  Meta Tag Article
-  meta tag implementation
-  META tag standards, search accuracy
-  Meta Tags
-  Meta Tags only on home page ?
-  Meta-seach engines
-  Microsoft Tripoli Web Search Beta now available
-  mini-robot
-  MOMSpider problem.  Broken Pipe
-  Money Spider WWW Robot for Windows
-  More dangers of spiders...
-  More Robot Talk
-  More ways to spam search engines?
-  More with the Cypherpunk antics
-  multiple copies
-  must be something in the water
-  nastygram for xxx.lanl.gov
-  nastygram from xxx.lanl.gov
-  nastygram from xxx.lanl.gov)
-  NaughtyRobot
-  NCSA Net Access_log  Analysis Tool for Win95
-  Need help again.
-  Need help on Search Engine accuracy test.
-  net agents)
-  NetJet
-  Netscape Catalog Server
-  Netscape Catalog Server:  An Eval
-  netscape spec for RDM
-  Netscape-Catalog-Robot
-  New engine on the loose?
-  New Robot Announcement
-  New robot turned loose on an unsuspecting public... and a DNS question
-  New Robot???
-  New Site
-  New URL's from Equity Int'll Webcenter
-  Newbie question
-  News Clipper for newsgroups - Windows
-  non 2nn repsonses on robots.txt
-  not a robot
-  Not so Friendly Robot - Teleport
-  Notification protocol?
-  Offline Agents for UNIX
-  On the subject of abuse/pro-activeness
-  Patents?
-  PERL Compilers & Interpretive Tools
-  Perl Spiders
-  PHP stops robots
-  please add my site
-  Please Help ME!!
-  Please take Uninvited Email discussion elsewhere
-  pointers for a novice?
-  Polite Request #2 to be Removed form List
-  Political economy of distributed search (was topical
-  Political economy of distributed search (was topical search
-  Political economy of distributed search (was topical search...)
-  Possible MSIIS bug?
-  Possible robot?
-  Possible robots.txt addition
-  Possible robots.txt addition (fwd)
-  Possible robots.txt addition - did I say that?
-  Prasad Wagle: Webhackers: Java servlets and agents
-  Preferred access time
-  Preliminary robot.faq (Please Send Questions or Comments)
-  privacy, courtesy, protection
-  Private Investigator Lists
-  Problem with your Index
-  Project Aristotle(sm)
-  Proposed URLs that robots should search
-  Proxies
-  PS
-  PS)
-  psycho at xxx.lanl.gov
-  Public Access Nodes / Copywrited Nodes
-  Q: Cooperation of robots
-  Q: meta name="robots" content="noindex" ?
-  Q: size of the web in bytes, comprehensive list
-  Quakebots
-  Question about Robot.txt
-  Quick--who knows listproc?
-  Quiz playing robots ?
-  RCPT
-  re Email Grabber
-  RE: "Good Times" hoax
-  RE: ActiveAgent and E-Mail Spam
-  RE: alta vista and virtualvin.com
-  RE: Alta Vista searches WHAT?!?
-  RE: Altavista indexing password files
-  RE: AltaVista's Index is obsolete; but what about the other
-  RE: AltaVista's Index is obsolete; but what about the others
-  RE: any robots/search-engines which index links?
-  RE: Anyone know who owns this one?
-  RE: avoiding infinite regress for robots
-  RE: changes to robots.txt
-  RE: Client Robot 'Ranjan'
-  RE: copyright, etc.
-  RE: Copyrights on the web
-  RE: Copyrights, let them be !
-  RE: crawling FTP sites
-  RE: databases for spiders
-  RE: email grabber
-  RE: email spider
-  RE: How can IR Agents be evaluate ?
-  RE: How to get listed #1 on all search engines (fwd)
-  RE: How to get the document info ?
-  RE: http directory index request
-  RE: Indexing two-byte text
-  RE: indexing via redirectors
-  RE: Inter-robot Communications - Part II
-  RE: Introducing myself
-  RE: Is a robot visiting?
-  RE: Java Robot
-  RE: Looking for...
-  RE: Mailing List
-  RE: make people use ROBOTS.txt?
-  RE: nastygram from xxx.lanl.gov
-  RE: Netscape-Catalog-Robot
-  RE: Notification protocol?
-  RE: On the subject of abuse/pro-activeness
-  RE: Preferred access time
-  RE: Recursion
-  RE: RFC, draft 1
-  RE: Robot Databases
-  RE: robots.txt unavailability
-  RE: robots.txt usage
-  RE: Scalpers (SJPD does crack down)
-  RE: Search Engine
-  RE: Server name in /robots.txt
-  RE: shot clock?!....
-  RE: Should I index all ...
-  RE: Tagging a document with  language
-  RE: The Internet Archive robot
-  RE: The Internet Archive robot (fwd)
-  RE: To: ????  Robot
-  RE: VB. page grabber...
-  RE: verify URL
-  RE: Web pages being served from an SQL database
-  RE: WebAnalyzer - introduction
-  RE: What to rate limit/lock on, name or IP address?
-  Re[2]: Anyone doing a Java-based robot yet?
-  Re[2]: Harvest question
-  Re[2]: Lycos' HEAD vs. GET
-  Re[2]: robots on an intranet
-  Re[2]: SpamBots
-  Re[2]: verify URL
-  Re[3]: SpamBots
-  Really fast searching
-  Recherche de documentation sur les agents intelligents ou Robots.
-  Recursion
-  Referencing dynamic pages
-  Regexp Library Cook-off
-  ReHowto...???
-  Remember Canseco.....
-  Report of the Distributed Indexing/Searching Workshop
-  Req for ADMIM: How to sunsubscribe?
-  Request for Source code in C for Robots
-  Requesting info on database engines
-  Responsible behavior, Robots vs. humans, URL botany...
-  Returned mail: Can't create output: Error 0
-  Returned mail: Host unknown (Name server: webcrawler: host not found)
-  Returned mail: Service unavailableHELP AGAIN HELP AGAIN!
-  Returned mail: Service unavailableHELP HELP!
-  Returned mail: User unknown
-  RFC, draft 1
-  Robo-phopbic Mailing list
-  robot ?
-  robot algorithm ?
-  robot authentication parameters
-  Robot books
-  Robot Databases
-  robot defined
-  robot definition
-  Robot Exclusion Standard Revisited
-  Robot Exclusion Standard Revisited (LONG)
-  Robot exclustion for for non-'unix file' hierarchy
-  Robot for Sun
-  Robot Gripes forum? (Was: Anyone know who owns this one?)
-  Robot logic?
-  robot meta tags
-  Robot Mirror with Username/Password feature
-  Robot on the Rampage
-  Robot Research
-  robot source code
-  Robot Specifications.
-  Robot to collect web pages per site
-  robot to get specific info only?
-  robot vaiable list
-  Robot's Book.
-  Robot-HTML Web Page?
-  robot.polite
-  robot?
-  robots & copyright law
-  Robots / source availability?
-  robots and cookies
-  Robots and search engines technical information.
-  Robots available for Intranet applications
-  Robots in the client?
-  Robots not Frames savy
-  robots on an intranet
-  robots on an intranet (replies to list...)
-  robots source code in C
-  robots that index comments
-  robots, what else!
-  robots.txt
-  robots.txt (A *little* off the subject)
-  robots.txt , authors of robots , webmasters ....
-  robots.txt buffer question.
-  robots.txt changes how often?
-  robots.txt extensions
-  robots.txt syntax
-  robots.txt syntax)
-  robots.txt unavailability
-  robots.txt usage
-  robots.txt: allow directive
-  robots: lycos's t-rex: strange behaviour
-  roverbot - perhaps the worst robot yet
-  Safe Methods
-  Search accuracy
-  Search Engine
-  Search Engine article
-  Search Engine end-users
-  Search Engine Tutorial for Web Developers
-  Seeing is Believing: Candidate Web Resources for Information Visualization
-  Server Indexing -- Helping a Robot Out
-  Server name in /robots.txt
-  Server name in /robots.txt)
-  Server name in /robots.txt]
-  servers that don't return a 404 for "not found"
-  Servers vs Agents
-  SetEnv a problem
-  Should I index all ...
-  Showbiz search engine
-  Simple load robot
-  Site Announcement
-  Small robot needed
-  Smart Agent help
-  Social Responsibilities  (was Safe Methods)
-  sockets in PERL
-  Somebody is turning 23!
-  Something that would be handy
-  Sorry!
-  Source code
-  Spam Software Sought
-  spam? (fwd)
-  SpamBots
-  specialized searches
-  Specific searches
-  Standard
-  Standard?
-  Standard?)
-  Standard}
-  stingy yahoo server?
-  stingy yahoo server?]
-  Stop 'bots using apache, etc. or php?
-  Stupid robots cache DNS and not IMS
-  Suggestion to help robots and sites coexist a little better
-  Tagging a document with  language
-  Tagging a document with language
-  technical descripton
-  Test server for robot development?
-  test. please ignore.
-  test; please ignore
-  Thanks!
-  That wacky Wobot
-  The "Robot and Search Engine FAQ"
-  The Big Picture(sm): Visual Browsing in Web and non-Web Databases
-  The End of The World (Wide Web) / Part II
-  The Internet Archive robot
-  The Internet Archive robot (fwd)
-  The Internet Archive robot)
-  The Letter To End All Letters
-  The Metacrawler, Reborn
-  the POST myth... a web admin's opinions..
-  The Robot And Search Engine FAQ
-  The robots mailing list at WebCrawler
-  Tim Freeman
-  To: ????  Robot
-  To: ????  Robot]
-  To: ???? Robot
-  Too Many Admins (TMA) !!!
-  Topic drift (archive robot, copyright...)
-  Topic-specific robots
-  topical search tool -- help?!
-  Try robot...
-  tryme
-  Tutorial Proposal for WWW95
-  un-subcribe
-  UN/LINK protocol is standardized!  wasn't that quick!
-  UN/LINK protocol is standardized! wasn't that quick!
-  Unfriendly Lycos , again ...
-  Unfriendly robot
-  Unfriendly robot at 192.115.187.2
-  Unfriendly robot at 205.177.10.2
-  Unfriendly robot at 205.252.60.50
-  Unfriendly robot owner identified!
-  unix robot
-  unknown robot
-  Unregistered MIME types?
-  unscribe
-  unsubscibe
-  UNSUBSCRIBE ROBOTS
-  Unsubscribing from Robots (was "your mail")
-  Unusual request - sorry!
-  Up to date list of Robots
-  Updating Robots
-  url locating
-  URL measurement studies?
-  Use of robots.txt to "check status"?
-  User-Agent
-  user-agent in Java
-  USER_AGENT and Apache 1.2
-  USER_AGENT spoofing
-  Vacation wars
-  VB and robot development
-  VB. page grabber...
-  verify URL
-  Virtual (was: RFC, draft 1)
-  Wanted: Web Robot code - C/Perl
-  Washington again !!!
-  Washington again !!!)
-  We need robot information
-  We need to Shut down Roenick
-  we should help spiders and not say NO!
-  Web pages being served from an SQL database
-  Web Robot
-  Web Robots
-  Web robots and gopher space -- two separate worlds
-  Web spaces of strange topology. Where?
-  web topology
-  WebAnalyzer - introduction
-  WebCrawler & Excite
-  Webfetch
-  Welcome to cypherpunks
-  What Is wwweb
-  What is your favorite search engine - a survey
-  What to rate limit/lock on, name or IP address?
-  White House/PARC "Leveraging Cyberspace"
-  Who sets standards (was Server name in /robots.txt)
-  who/what uses robots.txt
-  WININET caching
-  Wobot?
-  word spam
-  WRITERS WANTED (re-post)
-  www.kollar.com/robots.html
-  www.pl?
-  wwwbot.pl problem
-  xxx.lanl.gov - The thread continues....
-  xxx.lanl.gov a real threat?
-  xxx.lanl.gov/robots.txt
-  yet another robot
-  yet another robot, volume 2
-  your mail
Archived on: Sun Aug 17 1997 - 19:13:25 PDT
This archive was generated by hypermail 1.02.