
Re: Statistical analysis of anonymous databases

To: [email protected] (Clay Olbon II)
Subject: Re: Statistical analysis of anonymous databases
From: Adam Shostack <[email protected]>
Date: Wed, 29 May 1996 13:44:13 -0500 (EST)
Cc: [email protected]
In-Reply-To: <v01540b02add1fc6e4658@[193.239.225.200]> from "Clay Olbon II" at May 29, 96 09:58:55 am
Sender: [email protected]


One solution to this is to have a database that 'generalizes' its
answers as it provides them.  For example, rather than returning:

Clay Olbon, 32, m, left handed, cholesterol 350, bp 200/160, 5'9", 175#

it would return:

fooblat martin, 25-35, m, left handed, cholest. 300-400, 5.5-6ft, heavy.

Researchers could then name the fields they need in detail and get
ranges for the rest.  Thus, if I'm very concerned about the correlation
between age and weight, I could get those two fields very specifically
and nothing else.
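
A minimal sketch of that filter, to make the idea concrete.  Everything
here is an assumption for illustration: the Record fields, the bucket
widths, and the function names are made up, and the buckets are aligned
to multiples of the width (30-40 rather than 25-35).  The real name is
never returned.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical patient record; field names are illustrative only.
    name: str
    age: int
    sex: str
    handed: str
    cholesterol: int
    height_in: int  # height in inches
    weight_lb: int  # weight in pounds


# Assumed bucket widths for each numeric field.
WIDTHS = {"age": 10, "cholesterol": 100, "height_in": 6, "weight_lb": 50}


def bucket(value, width):
    """Return the half-open range (low, high) of size `width` containing value."""
    low = (value // width) * width
    return (low, low + width)


def generalize(rec, detailed=()):
    """Return a dict view of rec.

    Fields listed in `detailed` are returned exactly; every other numeric
    field is coarsened to a range.  The name is always left out.
    """
    out = {"sex": rec.sex, "handed": rec.handed}
    for field, width in WIDTHS.items():
        value = getattr(rec, field)
        out[field] = value if field in detailed else bucket(value, width)
    return out
```

A researcher interested in age versus weight would call
generalize(rec, detailed=("age", "weight_lb")) and see exact values for
those two fields and ranges for cholesterol and height.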

The generalization filter could be written to allow only N queries at
a given level of detail, so that the more detail you want in one
area, the more you give up in others.
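
One way to read that, sketched under assumptions: each researcher gets
a fixed number of "detail points", and every field requested at full
precision costs one point.  The class name, the cost rule, and the use
of PermissionError are all hypothetical choices, not a description of
any existing system.

```python
class DetailBudget:
    """Tracks how much exact detail one researcher may still request."""

    def __init__(self, points):
        # Total number of exact-field requests allowed (the "N" above).
        self.remaining = points

    def charge(self, detailed_fields):
        """Deduct one point per distinct exact field; refuse if over budget.

        Returns the number of points left after the charge.
        """
        cost = len(set(detailed_fields))
        if cost > self.remaining:
            raise PermissionError(
                f"query needs {cost} detail points, only {self.remaining} left"
            )
        self.remaining -= cost
        return self.remaining
```

With a budget of 3, asking for exact age and weight leaves 1 point, so
a later query can have at most one exact field; everything else comes
back generalized.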

There could be a review committee (this is the way hospitals & medical
research work) to review requests for more specific data.

Doctors like having names, so you could generate arbitrary names for
patients, or use a syllable generator to come up with pronounceable
nonsense.
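
A rough sketch of such a syllable generator.  The letter lists and
function names are arbitrary assumptions.  The names are drawn at
random rather than derived from the real name, so the database would
have to store the mapping (or not keep one at all).

```python
import random

# Assumed building blocks: consonant + vowel + optional final consonant.
ONSETS = ["b", "d", "f", "g", "k", "l", "m", "n", "p", "r", "s", "t", "v", "z"]
VOWELS = ["a", "e", "i", "o", "u"]
CODAS = ["", "", "n", "r", "s", "t", "l"]  # empty entries make open syllables common


def syllable(rng):
    """Build one pronounceable syllable of 2 or 3 letters."""
    return rng.choice(ONSETS) + rng.choice(VOWELS) + rng.choice(CODAS)


def pseudonym(rng=None, syllables=2):
    """Return a nonsense 'first last' name built from random syllables."""
    if rng is None:
        # OS-backed randomness, so names cannot be replayed from a seed.
        rng = random.SystemRandom()
    first = "".join(syllable(rng) for _ in range(syllables))
    last = "".join(syllable(rng) for _ in range(syllables))
    return f"{first} {last}"
```

Passing a seeded random.Random instance gives repeatable names for
testing; the default is unpredictable.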


Adam

Clay Olbon II wrote:

| In medical research (this particular application - there are others I am
| sure) it is desirable to have a large database of individual medical
| histories available to search for correlations, risk factors, etc.  The
| problem, of course, is that many individuals want their medical histories
| kept private.  It is therefore necessary to maintain a database that is not
| traceable back to individuals.  An additional requirement is that people
| must be able to add additional information to their records as it becomes
| available.  The researcher who initially posed the question suggested
| adding random data to "encrypt anonymity".
| 

-- 
"It is seldom that liberty of any kind is lost all at once."
					               -Hume

