I suppose the v7 regexps are those used by grep (there is far less to
them than to Perl's). What sh uses is a simple globbing syntax that is
easy to cover, e.g.:
`*' - matches 0 or more characters
`?' - matches exactly 1 character
`[...]' - introduces a character range, regexp-style
`\' - escapes the next character
So, the string "*" would match any string (even empty ones), "?*"
would match strings with 1 or more characters, "*abc" would match
strings ending with "abc", whereas "abc*" would match strings
beginning with "abc". "[a-z]*" matches a string beginning with a
lower-case letter, and "\**" matches a string beginning with an
asterisk. This is quite logical, and not too hard to implement. You
use this style in your examples.
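
To show that the four rules above really are not hard to implement,
here is a minimal Python sketch that translates such a glob into a
regexp and matches the whole string against it. The names
glob_to_regex and glob_match are made up for this illustration, and
the handling of an unterminated `[' (treated as a literal character)
is my own assumption, not something sh specifies here.

```python
import re


def glob_to_regex(pattern):
    """Translate a shell-style glob into a Python regexp string."""
    out = []
    i = 0
    n = len(pattern)
    while i < n:
        c = pattern[i]
        i += 1
        if c == "*":
            out.append(".*")            # 0 or more characters
        elif c == "?":
            out.append(".")             # exactly 1 character
        elif c == "\\" and i < n:
            out.append(re.escape(pattern[i]))  # next character, literally
            i += 1
        elif c == "[":
            # Look for the closing bracket; a ']' placed first is a member.
            j = pattern.find("]", i + 1)
            if j == -1:
                out.append(re.escape(c))  # assumption: unterminated '[' is literal
            else:
                body = pattern[i:j].replace("\\", "\\\\")
                out.append("[" + body + "]")
                i = j + 1
        else:
            out.append(re.escape(c))    # ordinary character
    return "".join(out)


def glob_match(pattern, text):
    """Return True if the whole of text matches the glob pattern."""
    return re.fullmatch(glob_to_regex(pattern), text, re.DOTALL) is not None


if __name__ == "__main__":
    for pat, s in [("*", ""), ("?*", ""), ("*abc", "xyzabc"),
                   ("abc*", "abcdef"), ("[a-z]*", "Hello"), ("\\**", "*star")]:
        print(repr(pat), repr(s), glob_match(pat, s))
```

Note that a glob always has to match the whole string, which is why the
sketch uses re.fullmatch rather than a search.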
The grep-style regexps are more powerful, but more complex, and take
more fuss to implement and (for the uninitiated) to use. This is a
partial specification:
`*' - matches 0 or more occurrences of the preceding character
`+' - matches 1 or more occurrences of the preceding character
`?' - matches 0 or 1 occurrences of the preceding character
`^' - matches the beginning of line
`$' - matches the end of line
`(' and `)' - delimit a group, saving the matched text in a register
...etc.
So "abc" would match any string containing "abc" anywhere, just like
"^.*abc.*$" (the first form is much faster too). "^abc" matches a
string beginning with "abc", whereas "abc$" matches a string ending
with "abc". "^(abc)+" matches a string beginning with 1 or more
occurrences of "abc". Etc.
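
The examples above can be tried out with a short Python sketch. Python's
re module is only a stand-in here: its syntax is Perl-like rather than
grep's own, but it agrees with the operators listed above. The helper
name matches is made up for this illustration.

```python
import re


def matches(regex, text):
    """Return True if regex matches anywhere in text (unanchored search)."""
    return re.search(regex, text) is not None


if __name__ == "__main__":
    print(matches("abc", "xxabcxx"))         # "abc" anywhere in the string
    print(matches("^.*abc.*$", "xxabcxx"))   # the same test, spelled out
    print(matches("^abc", "abcdef"))         # string beginning with "abc"
    print(matches("abc$", "xyzabc"))         # string ending with "abc"

    # The parentheses save the text of the last repetition in register 1.
    m = re.search("^(abc)+", "abcabcxyz")
    print(m.group(0))   # abcabc
    print(m.group(1))   # abc
```

Unlike a glob, a regexp is searched for anywhere in the string unless it
is anchored with `^' or `$'.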
Perl has an even more powerful regexp syntax than this.
I would like robots.txt to use the normal shell-style globbing syntax,
since it is much simpler and faster to use.
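
As a rough illustration of how little a robot would need, here is a
Python sketch that checks a URL path against a list of glob patterns.
Everything in it is hypothetical: the function name is_disallowed, the
sample patterns, and the choice to match a pattern against the whole
path are my assumptions, not part of any robots.txt specification.
Python's fnmatch module is used as a ready-made globber; it handles
`*', `?' and `[...]' but does not treat `\' as an escape character.

```python
from fnmatch import fnmatchcase


def is_disallowed(path, patterns):
    """Return True if path matches any of the glob patterns (case-sensitive)."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


if __name__ == "__main__":
    # Hypothetical patterns, as they might appear on Disallow lines.
    patterns = ["/cgi-bin/*", "*.gif", "/tmp/?"]
    for path in ["/cgi-bin/search", "/images/logo.gif", "/tmp/a", "/index.html"]:
        print(path, is_disallowed(path, patterns))
```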
-- 
Hrvoje Niksic <hniksic@srce.hr> | Student at FER Zagreb, Croatia
--------------------------------+--------------------------------
Contrary to popular belief, Unix is user friendly.  It just happens
to be selective about who it makes friends with.