[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
a decentralised dejanews
As you'll know if you read some of the previous eternity verbiage I've
generated over the last few days, the design is evolving as ideas are
bounced around.
To recap briefly:
The eternity implementation at the moment is performing the function
of a cache of USENET, which also `reformats' messages to extract and
deliver the content as web documents.
The cache has no replacement policy. Planned replacement policies
where least accessed documents, and/or the document accompanied by
the smallest ecash payment.
Obvious problem:
The cache will fill up and material will be discarded when the
eternity document store grows too large for one machine to cache.
Solution:
My last thoughts on how to fix this was to have collaboration
between the eternity servers: they ask each for documents they
don't have. They act like a distributed raid file server with
redundancy.
Clearly with the "least accessed" replacement policy they'll all
likely cache similar material, and documents will still fall off the
cache.
Ecash payment for cacheing might work reasonably well, if you buy as
much redundancy as you wish by including cash for several servers.
Or perhaps a random cache replacment policy is the simplest to keep
even spread of availability of documents.
k of n secret sharing could be used at some overhead so that no one
site is responsible for serving a given document.
However it is fairly complex to design and implement such a system
which can provide full underdisturbed service with nodes dynamically
leaving and joining, and it is also difficult to protect fully
against denial of service attacks.
Recap on old idea of using dejanews and altavista
You will recall that earlier discussions talked about using dejanews
or altavista as archives to fetch documents from. I rejected this
initial idea because altavista and dejanews take steps to reduce
their disk space usage, by stripping out UU and radix-64 encoded
documents. Now we can fool them easily enough by waging a running
battle of increasingly sophisticated textual stego on our side vs
playing catch up with cleverer eternity document spotters on their
part. We might even win long term if we public key encrypted the
documents for specific eternity servers. Kind of messy. Old
content will be disappearing if they go back and filter out old with
the new filters.
However is another important reason not to use altavista or dejanews:
it's not decentralised. If something sufficiently hot gets
published they'll both be taken off line long enough to figure out
how to disable access to eternity. Even if they can't do it
ultimately because of sufficiently advanced public key text
steganography, it'll disrupt the service.
Viewing the situation more modularly:
The eternity service with a distributed raid file server formed from
the respective caches has one problem: of acceptability. Each
server is actually holding eternity material, it tries to get away
with this by claiming that it is a cache. However each server is
serving three potentially separable functions:
- document cache
- node in a distributed news archival service, but only for
messages which look like eternity articles
- a view on the database to allow it to be viewed interactively as
web documents (fetching and translating decoding eternity
documents to serve as web pages)
The first function -- a document cache could probably just as well
be served by a standard proxy cache. Not a big issue, it's easy
either way.
The second function makes it hard to claim you are `just cacheing
USENET' -- you're also creating a distributed archive of a
very particular subset of it.
And the third function, viewing the database, is somewhat like a
specialised search engine. The search engine functionality is 100%
replicatable at practically zero storage overhead, and pretty good
as a non-censorable element of the system. 100% redundancy of
service each node has nothing that can't be replaced.
A distributed deja-news replacement
So after separating the functionality it becomes apparent that we
could have a another model. Actually try to start a distributed
deja-news replacement, an complete eternal record of _everything_
that _ever_ gets posted to news. Dejanews is kind of doing that,
except that it's not complete -- they are stripping out uuencoded
files, etc.
Also this would be distributed, and harder to coerce.
So what are the storage requirements? What does dejanews have in
their machine room? (Huge raid server? How many Gigs?).
I'm wondering if 1000 academic nodes, small and large ISP nodes, and
indivduals each contributing 1Gb each could out perform deja-news.
That'd be 1Tb. How long would 1Tb last at the being consumed at a
rate of a full USENET feed? As disks get cheaper, next two years or
so, people can upgrade disks to 10Gb disks or whatever is cheap at
the time.
You would want some redundancy, more than redundancy than dejanews
needs in their raid fileserver, because nodes may come and go, and
the number of nodes could fluctuate.
On top of distributed news archive, build an eternity service
So with the eternal news archive, you now have everything you need
to easily build an eternity server.
How practical is this?
How much interest is there likely to be in creating a full USENET
archive as a distributed net based effort. It's clearly got vastly
larger storage requirments than storing the full set of eternity
articles posted to USENET. However it has wider uses, and has lots
of innocuous reasons to exist. I guess it's quite an interesting
project in it's own right. You could start with just some
newsgroups, and build up as a way to boot strap it. But it may be
harder to generate enough interest to deploy this than to deploy
enough eternity servers to do build a distributed archive of just
eternity articles.
How soon could it happen?
Who knows. What do people reckon the interest would be in a
distributed complete USENET archive for it's own sake. For purposes
of the archive cancels would not be applied. Indeed archive the
cancels too, and the control messages for everyone to see for
perpetuity who did and said what.
Best way forward?
Perhaps it will be more realistic to build a distributed archive for
eternity documents only first. If it is designed as a separate service,
then it could be the same software to do either, just restrict it to
alt.anonymous.messages, or just for alt.anonymous.messages passing a
filter program.
Comments?
Adam
--
Have *you* exported RSA today? --> http://www.dcs.ex.ac.uk/~aba/rsa/
print pack"C*",split/\D+/,`echo "16iII*o\U@{$/=$z;[(pop,pop,unpack"H*",<>
)]}\EsMsKsN0[lN*1lK[d2%Sa2/d0<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<J]dsJxp"|dc`