
Re: Spiderspace



At 7:22 PM 1/16/96, Mike McNally wrote:
>Ed Carp writes:
> > ... I was under the impression that the only documents that most web
> > crawlers will search are documents that are link-accessible.  Are you
> > saying that this isn't true?  Are you saying that Alta-Vista will search
> > EVERYTHING that's publicly accessible, whether by anonymous FTP or web?
>
>Ah, but if it hits a site that's set up with a top-level directory
>which *does* contain an "index" page but whose server *doesn't*
>recognize the index page name, then when you hit the site you
>(probably) get one of those server-generated indices.  Those things
>generally have *everything* in the directory visible (except those
>files blocked by the server configuration, usually stuff like emacs
>temp files), and so there you go...

What I've found are a lot of files which are sitting in directories. I'm
not sure I have the terminology down perfectly, but the Alta Vista search
reveals a link, I click on it, and I'm in the fairly common "Web access to
a file system" situation, where I can click on files, directories, move up
and down the directory structure, etc. The files are not "Web documents" in
the sense of having been prepared for the Web (with fancy fonts, pictures,
etc.), but they are certainly fully accessible via the Web.
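
To make the mechanism concrete, here is a rough sketch of how a crawler might walk one of those server-generated index pages. It is only an illustration, not a claim about how Alta Vista actually works: the sample listing, the example.com URL, and the names IndexLinkParser and list_directory are all made up for the example, and real listings vary from server to server.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IndexLinkParser(HTMLParser):
    """Collect href targets from anchor tags in a directory-listing page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def list_directory(base_url, html):
    """Split a listing page's links into subdirectory URLs and file URLs."""
    parser = IndexLinkParser()
    parser.feed(html)
    dirs, files = [], []
    for href in parser.hrefs:
        # Skip column-sort links and links that climb out of the directory.
        if href.startswith(("?", "/", "..")):
            continue
        url = urljoin(base_url, href)
        # Assumption: the server marks subdirectories with a trailing slash.
        (dirs if href.endswith("/") else files).append(url)
    return dirs, files


# Hypothetical listing, loosely modeled on a typical auto-generated index.
SAMPLE = """<html><body><h1>Index of /pub</h1>
<a href="?C=N;O=D">Name</a>
<a href="/">Parent Directory</a>
<a href="notes.txt">notes.txt</a>
<a href="archive/">archive/</a>
</body></html>"""

dirs, files = list_directory("http://example.com/pub/", SAMPLE)
print("directories:", dirs)
print("files:", files)
```

A crawler that queues every entry in dirs and repeats the process would end up visiting everything the server is willing to list, whether or not anyone ever wrote a page linking to it.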

--Tim

We got computers, we're tapping phone lines, we know that that ain't allowed.
---------:---------:---------:---------:---------:---------:---------:----
Timothy C. May              | Crypto Anarchy: encryption, digital money,
[email protected]  408-728-0152 | anonymous networks, digital pseudonyms, zero
W.A.S.T.E.: Corralitos, CA  | knowledge, reputations, information markets,
Higher Power: 2^756839 - 1  | black markets, collapse of governments.
"National borders aren't even speed bumps on the information superhighway."