
Re: Anonymous questionnaires





Lucky Green asks how to:

1. Correlate my answers to the answers of my partner.

2. Verify that I have indeed sent in a filled out questionnaire (and send
   me a check for participating).

3. Allow a supervisory agency, such as the U.S. Department of Health and
   Human Services, to verify that the researchers did not just make up all
   the data - that is to allow an audit.

4. Protect my privacy by making it impossible to correlate my name to the
   answers given.


The following is a complicated and impractical solution (but it was a fun
exercise):

First, assume everybody participating in the study is on the Net and is  
crypto savvy. :-)

Each participant generates a new public-key pair for the study.  


The supervisory agency generates a new public-key pair and gives a copy of  
the public key to each participant.  They do not give a copy to the  
researchers.  


The researchers generate a new public-key pair and give a copy of the  
public key to the supervisory agency and each participant.

Finally, each participant generates a symmetric key, blinds it, and has  
the supervisory agency sign the blinded symmetric key.

Ok, assume Bob and Alice are a couple participating in the study.  Bob and
Alice each get a copy of the questionnaire, the researchers' public key,
and the supervisory agency's public key.  They each generate and blind a
symmetric key and have it signed by the supervisory agency.
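The blinding step can be sketched roughly as follows.  This is only a toy
illustration, assuming a Chaum-style RSA blind signature (the scheme is not
named above) and assuming the agency signs a hash of the symmetric key
rather than the key itself.  The key size, the constants P, Q, E, and all
function names are made up for the example and are far too weak for real
use.

```python
import hashlib
import math
import secrets
from typing import Tuple

# Toy RSA key for the supervisory agency (assumption: illustrative only,
# far too small for real use).  Both factors are Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def key_to_int(sym_key: bytes) -> int:
    """Map the symmetric key to an integer below N by hashing it."""
    return int.from_bytes(hashlib.sha256(sym_key).digest(), "big") % N


def blind(m: int) -> Tuple[int, int]:
    """Participant: hide m behind a random blinding factor r."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            break
    return (m * pow(r, E, N)) % N, r


def agency_sign(blinded: int) -> int:
    """Agency: sign the blinded value without learning m."""
    return pow(blinded, D, N)


def unblind(signed_blinded: int, r: int) -> int:
    """Participant: strip the blinding factor, leaving a signature on m."""
    return (signed_blinded * pow(r, -1, N)) % N


def verify(m: int, sig: int) -> bool:
    """Anyone holding the agency's public key (N, E) can check this."""
    return pow(sig, E, N) == m


sym_key = secrets.token_bytes(16)          # the participant's symmetric key
m = key_to_int(sym_key)
blinded, r = blind(m)                      # sent to the agency
signature = unblind(agency_sign(blinded), r)
print(verify(m, signature))
```

The agency only ever sees the blinded value, so when the unblinded key and
signature show up later it can check the signature but cannot tell which
signing request it came from.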

Bob fills in his copy of the questionnaire and then signs an MD5 hash of
his completed questionnaire.  Alice does the same.  Bob gives his signed
hash value to Alice and Alice gives her signed hash value to Bob.  Bob
appends Alice's signed hash value to the end of his completed
questionnaire.  Alice appends Bob's signed hash value to the end of her
completed questionnaire.  Neither sees the other's completed questionnaire.
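A minimal sketch of the hash exchange, with the signing of each hash left
out for brevity.  The "Partner-Hash:" trailer, the sample answers, and the
function names are assumptions made for the example, not part of any real
format.  (MD5 is used because the protocol names it; it is no longer
considered collision resistant.)

```python
import hashlib


def questionnaire_hash(completed: str) -> str:
    """MD5 hash of a completed questionnaire, as a hex string."""
    return hashlib.md5(completed.encode("utf-8")).hexdigest()


def append_partner_hash(completed: str, partner_hash: str) -> str:
    """Append the partner's hash value to one's own questionnaire."""
    return completed + "\nPartner-Hash: " + partner_hash + "\n"


# Hypothetical answers; each partner only ever sees the other's hash.
bob_answers = "Q1: yes\nQ2: no\n"
alice_answers = "Q1: no\nQ2: no\n"

bob_hash = questionnaire_hash(bob_answers)
alice_hash = questionnaire_hash(alice_answers)

bob_submission = append_partner_hash(bob_answers, alice_hash)
alice_submission = append_partner_hash(alice_answers, bob_hash)
```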

Bob now signs his questionnaire with his private key.  Alice signs her
questionnaire with her private key.

Bob encrypts his (now signed) questionnaire and his public key with his
symmetric key.  He next encrypts the signed (and now unblinded) symmetric
key with the supervisory agency's public key.  Finally, he encrypts those
items, along with a cleartext copy of the completed and signed
questionnaire, with the researchers' public key and e-mails the result to
the researchers using a chain of anonymous remailers.  :-)

Alice does the same.

Ok, the researchers receive an anonymous e-mail message from somebody (call
him Ted) that is encrypted with their public key generated specifically
for this study.  They decrypt the message and get four items:  Ted's
completed and signed questionnaire, Ted's encrypted and signed
questionnaire, Ted's encrypted public key, and Ted's encrypted and signed
symmetric key.


Since Ted's public key is encrypted with his symmetric key and the
symmetric key is encrypted with the agency's public key, the researchers
cannot read these items.  Without Ted's public key, they also cannot
verify the signature on the cleartext copy of the questionnaire.  However,
they check that everything appears to conform to the requirements of the
test, so they credit Ted with completing the questionnaire and e-mail him
(via the encrypted reply block) an IOU signed with the researchers'
private key.  More on the IOU later.

The researchers collect all the anonymous replies and send them as a group
to the supervisory agency.  The supervisory agency decrypts all the
encrypted symmetric keys using its private key, validates the signatures
on those keys, then uses the symmetric keys to decrypt the participants'
public keys and encrypted questionnaires.  Since the symmetric keys were
blinded when the supervisory agency signed them, the agency does not have
enough information to be able to determine which participant completed
which questionnaire.  All the agency can do is verify that the
questionnaires were completed by people who had symmetric keys signed by
the agency.  Since the questionnaires were e-mailed to the researchers via
anonymous remailers, the researchers can't collude with the supervisory
agency to determine who completed which questionnaire.

The agency sends the decrypted public keys and questionnaires back to the
researchers.  The purpose of the signed symmetric keys was to help prove
to the agency that the researchers did not fabricate the study results.
This is not perfect:  the researchers could have pretended to be all of the
participants and could have filled out all of the questionnaires.  However,
if they did that, they would be unable to produce any real participants
if they were ever challenged.

The researchers use the decrypted public keys and the signed MD5 hashes to
group the questionnaires into related pairs.  The researchers can compare
the decrypted questionnaires sent back from the agency with the plaintext
copies received from the participants to verify that the supervisory
agency did not replace any of the real questionnaires with bogus ones.

The researchers can now analyze the questionnaire data, but they don't know
which participant filled out which questionnaire.  However, the researchers
do know which questionnaire is paired with which other questionnaire.
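The pairing step might look something like the sketch below.  It matches
submissions on the MD5 hashes alone and skips verifying the signatures on
those hashes with the participants' public keys, which the researchers
would also do.  The Submission record and its field names are hypothetical,
and the hash is assumed to cover the questionnaire body before the
partner's hash was appended.

```python
import hashlib
from typing import List, NamedTuple, Tuple


class Submission(NamedTuple):
    body: str          # completed questionnaire, without the trailer
    partner_hash: str  # MD5 hex digest supplied by the partner


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def pair_submissions(subs: List[Submission]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j), i < j, whose hashes point at each other."""
    by_hash = {md5_hex(s.body): i for i, s in enumerate(subs)}
    pairs = []
    for i, s in enumerate(subs):
        j = by_hash.get(s.partner_hash)
        # Require the reference to be mutual, and report each pair once.
        if j is not None and j > i and subs[j].partner_hash == md5_hex(s.body):
            pairs.append((i, j))
    return pairs


# Hypothetical anonymous submissions from two couples, in arrival order.
bob, alice = "Q1: yes\nQ2: no\n", "Q1: no\nQ2: no\n"
carol, dave = "Q1: yes\nQ2: yes\n", "Q1: no\nQ2: yes\n"
subs = [
    Submission(bob, md5_hex(alice)),
    Submission(carol, md5_hex(dave)),
    Submission(alice, md5_hex(bob)),
    Submission(dave, md5_hex(carol)),
]
pairs = pair_submissions(subs)
```

Nothing in the pairing uses a name or address, only the contents of the
anonymous submissions themselves.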


More on the IOU:

How does a participant redeem the IOU without revealing information which
could allow the researchers or the supervisory agency to pair them up with
their completed questionnaire?

Well, the IOU is really a blinded message sent to the researchers in the
anonymous message along with the other stuff.  If the researchers are
satisfied with the plaintext questionnaire, they will sign the blinded IOU
and send it back via the encrypted reply block.  The participant unblinds
the signed IOU.  The participant can now redeem the IOU offline without
giving anyone any information other than the fact the person was a
participant in the study.

Of course, if there was real anonymous digital cash, there would be no  
need to use an IOU.


How to prevent a totally fabricated study:

As mentioned above, the researchers could fabricate the entire study by  
pretending to be all of the participants, getting known symmetric keys  
signed and so forth.  How can the supervisory agency determine the  
difference between a real anonymous participant and a bogus anonymous  
participant?

It is at this point that we have to step out of cyberspace and back into  
the real world.  Ideally, the supervisory agency needs to determine two  
things: 


    1) All of the participants were real people.

    2) None of the participants colluded with the researchers.


Requirement 1 can be satisfied by having the supervisory agency redeem the
IOUs using money they escrowed on behalf of the researchers.  When the
participant comes in to redeem the IOU (or snail mails it in), the
supervisory agency can check the ID (driver's license, SS#, whatever) of
the participant, verify the signature on the IOU, and hand over (or mail)
the check.  The signed IOU will not give the agency the ability to
determine which questionnaire the participant filled out.

I know of no way to enforce requirement 2 without violating the anonymity
of the participants.  The researchers could hire a bunch of people to
redeem bogus (but correctly signed) IOUs, fooling the supervisory agency.
The only way I can think of to prevent participant/researcher collusion is
to have independent auditors standing over the participants while they
fill out the questionnaires.  Not what Lucky Green had in mind, I'm sure.

So anyways, there it is, a complex and impractical solution that still  
doesn't solve all the problems.  Oh well.  Time to go back and work at my  
real job.


[email protected]