Robots Mailing List Archive by subject
Starting: Wed 00 Jan 1970 - 16:31:48 PDT
Ending: Thu 18 Dec 1997 - 14:33:60 PDT
Messages: 2106
- "Good Times" hoax
- "hidden text" vs. META tags for robots/search engines
- "real-time" spidering by Lycos
- "What's new" in web pages is not necessarily reliable
- (no subject)
- (OTP) RE: Political economy of distributed search (was topical
- (OTP) RE: Political economy of distributed search (was topical search...)
- *Help: Writing BOTS*
- ??: reload problem
- [ MERCHANTS ] My Sincerest Apologies
- [1]Announcing NaecSpyr, a n
- [1]Contact for Intouchgroup
- [1]RE>[2]RE>[5]RE>Checking
- [1]RE>[3]RE>[5]RE>Checking
- [1]RE>[5]RE>Checking Log fi
- [1]RE>Checking Log files
- [1]Wobot?
- [2]Announcing NaecSpyr, a n
- [2]RE>[2]RE>[5]RE>Checking
- [2]RE>[3]RE>[5]RE>Checking
- [2]RE>[5]RE>Checking Lo
- [2]RE>[5]RE>Checking Log fi
- [2]RE>Checking Log files
- [2]Wobot?
- [3]Announcing NaecSpyr, a n
- [3]RE>[5]RE>Checking Log fi
- [3]RE>Checking Log files
- [3]Wobot?
- [4]Announcing NaecSpyr, a n
- [4]RE>[5]RE>Checking Log fi
- [4]RE>Checking Log files
- [4]Wobot?
- [5]RE>[5]RE>Checking Log fi
- [5]RE>Checking Log files
- [5]Wobot?
- [ANNOUNCE] CFP: AAAI-96 WS on Internet-based Information Systems
- [Fwd: WebCrawler & Excite]
- [ROBOTS JJA] Lib-WWW-perl5
- A bad agent?
- A few copyright notes
- A modest proposal...<snip> to discourage indexing ?
- A new robot -- ask for advice
- A new robot...TOPjobs(tm) USA JOBbot 1.0a
- About integrated search engines
- About Mother of All Bulletin Boards
- about robots.txt content errors
- Accept:
- ActiveAgent
- Activity from 205.252.60.5[0-8]
- Add This Search Engine to your Results.
- ADMIN: Archive
- Admin: how to get off this list
- Admin: List archive is back
- ADMIN: mailing list attack :-(
- ADMIN: Spoofing vs xxx.lanl.gov
- Admitting the obvious
- Advice
- Agent Specification
- agents ignoring robots.txt
- algorithms
- algorithms too
- Allow/deny robots from major search services
- alta vista and virtualvin.com
- Alta Vista getting stale?
- Alta Vista searches WHAT?!?
- Altavista indexing password files
- AltaVista Meta Tag Rumour
- AltaVista's Index is obsolete
- AltaVista's Index is obsolete; but what about the other
- AMDIN: The list is dead
- An extended verion of the robot exclusion standard
- An extended version of the Robots...
- An extended version of the Robots... (fwd)
- An extended version of the Robots...)
- an image observer
- An updated extended standard for robots.txt
- Announce: ActiveX Search (IFilter) spec/sample
- ANNOUNCE: Don Norman (Apple) LIVE! 15-May 5PM UK = noon EDT
- Announcement
- Announcement and Help Requested
- Announcing NaecSpyr, a new. . . robot?
- another dumb robot (possibly)
- another rare attack
- Another rating scam! (And a proposal on how to fix it
- Another rating scam! (And a proposal on how to fix it)
- another suggestion
- anti-robot regexps
- Any info on "E-mail America"?
- any robots/search-engines which index links?
- Anyone doing a Java-based robot yet?
- Anyone know who owns this one?
- Apologies || communal bots
- Apology -- I didn't mean to send that last message to the list
- articles or URL's on search engines
- avoiding infinite regress for robots
- Back of the envelope computations
- BackRub robot
- BackRub robot warning
- Bad agent...A *very* bad agent.
- Bad robot: WebHopper bounch! Owner: peter@cartes.hut.fi
- Belated notice of spider article
- Blackboard for Discussing Domain-specific Robots
- BOUNCE robots: Admin request
- BSE-Slurp/0.6
- Bug in LibWWW perl + Data::Dumper (libwwwperl refs are strange)
- Bye Bye HyperText: The End of the World (Wide Web) As We Know It!
- Cache Filler
- Can I retrieve image map files?
- Cannot believe it "Morons"
- cc:Mail SMTPLINK Undeliverable Message
- changes to robots.txt
- Checking Log files
- CitedSites(sm): Citation Indexing of Web Resources
- Clean up Bots...
- Client Robot 'Ranjan'
- Collected information standards
- Commercial Robot Vendor Recoomendations Request
- communal bots
- Comparing robots/search sites
- Conceptbot spider
- Contact for Intouchgroup.com
- Content based robot collectors
- Content based search engine
- Copyrights on the web
- Copyrights, let them be !
- Crawlers and "dynamic" urls
- Crawling & DNS issues
- crawling accident
- crawling FTP sites
- Cron <robh@us2> /usr/home/robh/show_robots (fwd)
- CyberPromo shut down at last!!!
- Cyclones sign MacLeod
- databases for spiders
- Dead account
- Dearnaley Auto Reply Cannon?
- default documents
- Defenses against bad robots
- Defenses against bad robots)
- define a page?
- defining "robot"
- depth first vs breadth first
- Description or Abstract?
- desperately looking for a news searcher
- Do robots have to follow links ?
- do robots send HTTP_HOST?
- Does anyone else consider this irresponsible?
- Does anyone else consider...
- Does this count as a robot?
- Does this..)
- Domains and HTTP_HOST
- Dot dot problem...
- download robot
- dumb robots and xxx
- Duplicate docs (was avoiding infinite regress...)
- Either a spider or a hacker? ww2.allcon.com
- email address grabber
- email grabber
- email grabber)
- email spider
- escaped vs unescaped urls
- Excite Authors?
- Extracting info from SIG forum archives
- FAQ again.
- FAQ?
- fdsf
- fetching .map files
- file retrieval
- FILEZ
- Filtering queries on a robot-built database
- Finding the canonical name for a server
- For Sale for 12/26!
- Forsberg
- forwarded e-mail
- Frames ? Lycos ?
- Freely available robot code in C available?
- FS- Sharks Tkts- Front Row (2nd Deck) Jan 13
- FS-Jan 7 Row 1 Sec 211
- Game tonight
- General Information
- Get official!
- Getting a Reply-to: field ...
- Good HREFs vs Bogus HREFs: 80/20
- Gopher Protocol Question
- grammar engines
- Handling keyword repetitions
- harvest
- Harvest question
- Harvest-like use of spiders
- Have you used the Microsoft Active-X Internet controls for Visual Basic? (Or know someone who does?)
- HEAD
- help
- Here is WebWalker
- Heuristics....
- hey man gimme a break
- Hip Crime
- hipcrime
- Hipcrime no more
- hoohoo.cac.washington = bad
- Horror story
- HOST: header
- How can IR Agents be evaluate ?
- How do I let spiders in?
- how do they do that?
- How frequently should I check /robots.txt?
- How long to cache robots.txt
- How long to cache robots.txt for?
- How to get listed #1 on all search engines (fwd)
- How to get the document info ?
- How to...???
- htaccess
- HTML Parser
- HTML query to .ps?
- HTML query to .ps? ....
- http directory index request
- http://HipCrime.com
- hypermail archive not operational
- i need a bot!
- I vote NO (Was: Robot Gripes forum?)
- I vote NO (Was: Robot Gripes forum?) - I vote YES
- Identifying identical documents
- IIS and If-modified-since
- image map traversal
- Image Maps
- implementation fo HEAD response with meta info
- in-document directive to discourage indexing ?
- in-document directive..)
- Indexing a set of URL's
- indexing intranet-site
- Indexing two-byte text
- indexing via redirectors
- Infinite e-mail loop
- info for newbie
- Info on authoring a Web Robot
- Info on large scale spidering?
- Info on large scale spidering?)
- Information about AltaVista and Excite
- infoseek
- infoseeks robot is dumb
- InfoSpiders/0.1
- Ingrid ready for prelim alpha testing....
- Inktomi & large sca
- Inktomi & large scale spidering
- Inktomi & large scale spidering)
- inquiry about robots
- Inter-robot Comms Port
- Inter-robot communication
- Inter-robot Communications - Part II
- interactive generation of URL's
- Introducing myself
- Invalid request
- IROS 97 Call for Papers
- Is a robot visiting?
- Is it a robot or a link-updater?
- Is their a web site...
- It's not only robots we have to worry about ...
- itelligent agents
- Java and robots...
- java applet sockets
- Java intelligent agents and compliance?
- Java Robot
- Just when you thought it might be interesting to standardize
- Just when you thought it might be interesting to standardize robots.txt...
- Keyword indexing
- keywords in META-element
- Koen Holtman: Content negotiation draft 04 submitted
- Lack of support for "If-Modified-Since"
- Last message
- Lead Time
- Library Agents(sm): Library Applications of Intelligent Software Agents
- libww and robot source for Sequent Dynix/Ptx 4.1.3
- Limiting robots to top-level page only (via robots.txt)?
- Links
- Links (don't bother checking; I've done it for you)
- Links This Site is about Robots Not Censorship
- Linux and Robot development...
- loc(SOIF)
- Looking for a search engine
- Looking for a spider
- Looking for good one
- Looking for News robot
- looking for specific bot...
- Looking for subcontracting spider-programmers
- Looking for...
- Lycos
- lycos patents
- Lycos unfriendly robot
- Lycos' HEAD vs. GET
- Lynx. The one true browser.
- MacPower
- MacPower (an apology, I am very sorry)
- Magic, Intelligence, and search engines
- Mail failure
- Mail robot?
- Mailing list
- make people use ROBOTS.txt?
- Matching the user-agent in /robots.txt
- McKinley -- 100% error rate
- McKinley robot
- McKinley Spider hit us hard
- MD5 in HTTP headers - where?
- Merry Christmas, HipXmas-SantaSpam!
- Merry Christmas, spidie-boyz&bottie-girlz!
- message to USSA House of Representatives
- message to USSA Senate
- Meta refresh tags
- Meta Tag Article
- meta tag implementation
- META tag standards, search accuracy
- Meta Tags
- Meta Tags only on home page ?
- Meta-seach engines
- Microsoft Tripoli Web Search Beta now available
- mini-robot
- MOMSpider problem. Broken Pipe
- Money Spider WWW Robot for Windows
- More dangers of spiders...
- More Robot Talk
- More ways to spam search engines?
- More with the Cypherpunk antics
- multiple copies
- must be something in the water
- nastygram for xxx.lanl.gov
- nastygram from xxx.lanl.gov
- nastygram from xxx.lanl.gov)
- NaughtyRobot
- NCSA Net Access_log Analysis Tool for Win95
- Need help again.
- Need help on Search Engine accuracy test.
- net agents)
- NetJet
- Netscape Catalog Server
- Netscape Catalog Server: An Eval
- netscape spec for RDM
- Netscape-Catalog-Robot
- New engine on the loose?
- New Robot Announcement
- New robot turned loose on an unsuspecting public... and a DNS question
- New Robot???
- New Site
- New URL's from Equity Int'll Webcenter
- Newbie question
- News Clipper for newsgroups - Windows
- non 2nn repsonses on robots.txt
- not a robot
- Not so Friendly Robot - Teleport
- Notification protocol?
- Offline Agents for UNIX
- On the subject of abuse/pro-activeness
- Patents?
- PERL Compilers & Interpretive Tools
- Perl Spiders
- PHP stops robots
- please add my site
- Please Help ME!!
- Please take Uninvited Email discussion elsewhere
- pointers for a novice?
- Polite Request #2 to be Removed form List
- Political economy of distributed search (was topical
- Political economy of distributed search (was topical search
- Political economy of distributed search (was topical search...)
- Possible MSIIS bug?
- Possible robot?
- Possible robots.txt addition
- Possible robots.txt addition (fwd)
- Possible robots.txt addition - did I say that?
- Prasad Wagle: Webhackers: Java servlets and agents
- Preferred access time
- Preliminary robot.faq (Please Send Questions or Comments)
- privacy, courtesy, protection
- Private Investigator Lists
- Problem with your Index
- Project Aristotle(sm)
- Proposed URLs that robots should search
- Proxies
- PS
- PS)
- psycho at xxx.lanl.gov
- Public Access Nodes / Copywrited Nodes
- Q: Cooperation of robots
- Q: meta name="robots" content="noindex" ?
- Q: size of the web in bytes, comprehensive list
- Quakebots
- Question about Robot.txt
- Quick--who knows listproc?
- Quiz playing robots ?
- RCPT
- re Email Grabber
- RE: "Good Times" hoax
- RE: ActiveAgent and E-Mail Spam
- RE: alta vista and virtualvin.com
- RE: Alta Vista searches WHAT?!?
- RE: Altavista indexing password files
- RE: AltaVista's Index is obsolete; but what about the other
- RE: AltaVista's Index is obsolete; but what about the others
- RE: any robots/search-engines which index links?
- RE: Anyone know who owns this one?
- RE: avoiding infinite regress for robots
- RE: changes to robots.txt
- RE: Client Robot 'Ranjan'
- RE: copyright, etc.
- RE: Copyrights on the web
- RE: Copyrights, let them be !
- RE: crawling FTP sites
- RE: databases for spiders
- RE: email grabber
- RE: email spider
- RE: How can IR Agents be evaluate ?
- RE: How to get listed #1 on all search engines (fwd)
- RE: How to get the document info ?
- RE: http directory index request
- RE: Indexing two-byte text
- RE: indexing via redirectors
- RE: Inter-robot Communications - Part II
- RE: Introducing myself
- RE: Is a robot visiting?
- RE: Java Robot
- RE: Looking for...
- RE: Mailing List
- RE: make people use ROBOTS.txt?
- RE: nastygram from xxx.lanl.gov
- RE: Netscape-Catalog-Robot
- RE: Notification protocol?
- RE: On the subject of abuse/pro-activeness
- RE: Preferred access time
- RE: Recursion
- RE: RFC, draft 1
- RE: Robot Databases
- RE: robots.txt unavailability
- RE: robots.txt usage
- RE: Scalpers (SJPD does crack down)
- RE: Search Engine
- RE: Server name in /robots.txt
- RE: shot clock?!....
- RE: Should I index all ...
- RE: Tagging a document with language
- RE: The Internet Archive robot
- RE: The Internet Archive robot (fwd)
- RE: To: ???? Robot
- RE: VB. page grabber...
- RE: verify URL
- RE: Web pages being served from an SQL database
- RE: WebAnalyzer - introduction
- RE: What to rate limit/lock on, name or IP address?
- Re[2]: Anyone doing a Java-based robot yet?
- Re[2]: Harvest question
- Re[2]: Lycos' HEAD vs. GET
- Re[2]: robots on an intranet
- Re[2]: SpamBots
- Re[2]: verify URL
- Re[3]: SpamBots
- Really fast searching
- Recherche de documentation sur les agents intelligents ou Robots.
- Recursion
- Referencing dynamic pages
- Regexp Library Cook-off
- ReHowto...???
- Remember Canseco.....
- Report of the Distributed Indexing/Searching Workshop
- Req for ADMIM: How to sunsubscribe?
- Request for Source code in C for Robots
- Requesting info on database engines
- Responsible behavior, Robots vs. humans, URL botany...
- Returned mail: Can't create output: Error 0
- Returned mail: Host unknown (Name server: webcrawler: host not found)
- Returned mail: Service unavailableHELP AGAIN HELP AGAIN!
- Returned mail: Service unavailableHELP HELP!
- Returned mail: User unknown
- RFC, draft 1
- Robo-phopbic Mailing list
- robot ?
- robot algorithm ?
- robot authentication parameters
- Robot books
- Robot Databases
- robot defined
- robot definition
- Robot Exclusion Standard Revisited
- Robot Exclusion Standard Revisited (LONG)
- Robot exclustion for for non-'unix file' hierarchy
- Robot for Sun
- Robot Gripes forum? (Was: Anyone know who owns this one?)
- Robot logic?
- robot meta tags
- Robot Mirror with Username/Password feature
- Robot on the Rampage
- Robot Research
- robot source code
- Robot Specifications.
- Robot to collect web pages per site
- robot to get specific info only?
- robot vaiable list
- Robot's Book.
- Robot-HTML Web Page?
- robot.polite
- robot?
- robots & copyright law
- Robots / source availability?
- robots and cookies
- Robots and search engines technical information.
- Robots available for Intranet applications
- Robots in the client?
- Robots not Frames savy
- robots on an intranet
- robots on an intranet (replies to list...)
- robots source code in C
- robots that index comments
- robots, what else!
- robots.txt
- robots.txt (A *little* off the subject)
- robots.txt , authors of robots , webmasters ....
- robots.txt buffer question.
- robots.txt changes how often?
- robots.txt extensions
- robots.txt syntax
- robots.txt syntax)
- robots.txt unavailability
- robots.txt usage
- robots.txt: allow directive
- robots: lycos's t-rex: strange behaviour
- roverbot - perhaps the worst robot yet
- Safe Methods
- Search accuracy
- Search Engine
- Search Engine article
- Search Engine end-users
- Search Engine Tutorial for Web Developers
- Seeing is Believing: Candidate Web Resources for Information Visualization
- Server Indexing -- Helping a Robot Out
- Server name in /robots.txt
- Server name in /robots.txt)
- Server name in /robots.txt]
- servers that don't return a 404 for "not found"
- Servers vs Agents
- SetEnv a problem
- Should I index all ...
- Showbiz search engine
- Simple load robot
- Site Announcement
- Small robot needed
- Smart Agent help
- Social Responsibilities (was Safe Methods)
- sockets in PERL
- Somebody is turning 23!
- Something that would be handy
- Sorry!
- Source code
- Spam Software Sought
- spam? (fwd)
- SpamBots
- specialized searches
- Specific searches
- Standard
- Standard?
- Standard?)
- Standard}
- stingy yahoo server?
- stingy yahoo server?]
- Stop 'bots using apache, etc. or php?
- Stupid robots cache DNS and not IMS
- Suggestion to help robots and sites coexist a little better
- Tagging a document with language
- technical descripton
- Test server for robot development?
- test. please ignore.
- test; please ignore
- Thanks!
- That wacky Wobot
- The "Robot and Search Engine FAQ"
- The Big Picture(sm): Visual Browsing in Web and non-Web Databases
- The End of The World (Wide Web) / Part II
- The Internet Archive robot
- The Internet Archive robot (fwd)
- The Internet Archive robot)
- The Letter To End All Letters
- The Metacrawler, Reborn
- the POST myth... a web admin's opinions..
- The Robot And Search Engine FAQ
- The robots mailing list at WebCrawler
- Tim Freeman
- To: ???? Robot
- To: ???? Robot]
- Too Many Admins (TMA) !!!
- Topic drift (archive robot, copyright...)
- Topic-specific robots
- topical search tool -- help?!
- Try robot...
- tryme
- Tutorial Proposal for WWW95
- un-subcribe
- UN/LINK protocol is standardized! wasn't that quick!
- Unfriendly Lycos , again ...
- Unfriendly robot
- Unfriendly robot at 192.115.187.2
- Unfriendly robot at 205.177.10.2
- Unfriendly robot at 205.252.60.50
- Unfriendly robot owner identified!
- unix robot
- unknown robot
- Unregistered MIME types?
- unscribe
- unsubscibe
- UNSUBSCRIBE ROBOTS
- Unsubscribing from Robots (was "your mail")
- Unusual request - sorry!
- Up to date list of Robots
- Updating Robots
- url locating
- URL measurement studies?
- Use of robots.txt to "check status"?
- User-Agent
- user-agent in Java
- USER_AGENT and Apache 1.2
- USER_AGENT spoofing
- Vacation wars
- VB and robot development
- VB. page grabber...
- verify URL
- Virtual (was: RFC, draft 1)
- Wanted: Web Robot code - C/Perl
- Washington again !!!
- Washington again !!!)
- We need robot information
- We need to Shut down Roenick
- we should help spiders and not say NO!
- Web pages being served from an SQL database
- Web Robot
- Web Robots
- Web robots and gopher space -- two separate worlds
- Web spaces of strange topology. Where?
- web topology
- WebAnalyzer - introduction
- WebCrawler & Excite
- Webfetch
- Welcome to cypherpunks
- What Is wwweb
- What is your favorite search engine - a survey
- What to rate limit/lock on, name or IP address?
- White House/PARC "Leveraging Cyberspace"
- Who sets standards (was Server name in /robots.txt)
- who/what uses robots.txt
- WININET caching
- Wobot?
- word spam
- WRITERS WANTED (re-post)
- www.kollar.com/robots.html
- www.pl?
- wwwbot.pl problem
- xxx.lanl.gov - The thread continues....
- xxx.lanl.gov a real threat?
- xxx.lanl.gov/robots.txt
- yet another robot
- yet another robot, volume 2
- your mail
Last message date: Thu 18 Dec 1997 - 14:33:60 PDT
Archived on: Sun Aug 17 1997 - 19:13:25 PDT
This archive was generated by hypermail 1.02.