Re: Spiderspace
Ed Carp writes:
> ... I was under the impression that the only documents that most web crawlers
> will search are documents that are link-accessible. Are you saying that this
> isn't true? Are you saying that Alta-Vista will search EVERYTHING that's
> publicly accessible, whether by anonymous FTP or web?
Ah, but suppose it hits a site whose top-level directory *does*
contain an "index" page, but whose server *doesn't* recognize that
index page name. Then when you hit the site you (probably) get one
of those server-generated indices. Those things generally show
*everything* in the directory (except files blocked by the server
configuration, usually stuff like emacs temp files), so the crawler
has a link to every file in there, and so there you go...
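
To make that concrete, here's a rough sketch in Python of the decision a server makes when asked for a directory. Everything in it is made up for illustration: the function name, the list of recognized index names, and the hidden-file patterns are assumptions, not any particular server's actual configuration.

```python
import fnmatch

# Hypothetical server settings, for illustration only.
RECOGNIZED_INDEX_NAMES = ["index.html"]   # index page names the server knows about
HIDDEN_PATTERNS = ["*~", "#*#"]           # e.g. emacs backup and autosave files


def respond_to_directory_request(files,
                                 index_names=RECOGNIZED_INDEX_NAMES,
                                 hidden_patterns=HIDDEN_PATTERNS):
    """Return (kind, links) for a request that names a directory.

    If a recognized index page exists, only that page is served.
    Otherwise the server generates a listing that links to every
    file not matched by a hidden pattern.
    """
    for name in index_names:
        if name in files:
            return ("index_page", [name])

    visible = [
        f for f in files
        if not any(fnmatch.fnmatch(f, pattern) for pattern in hidden_patterns)
    ]
    return ("generated_listing", sorted(visible))


# The author wrote "index.htm", but the server only recognizes "index.html".
directory = ["index.htm", "private-notes.txt", "draft.txt~", "report.ps"]
kind, links = respond_to_directory_request(directory)
print(kind, links)
```

With the misnamed index page, the result is a generated listing whose links include private-notes.txt and report.ps, even though nothing on the site's own pages points at them. A crawler that only follows links will still find them.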
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| Nobody's going to listen to you if you just | Mike McNally ([email protected]) |
| stand there and flap your arms like a fish. | Tivoli Systems, Austin TX |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~