
Re: ID of anonymous posters via word analysis?



Just from reading this list, I think it's fairly clear that word
analysis could be used to identify posters.  Reread a few posts on the
cypherpunks list.  Note who spells out "government" and who abbreviates
to "gov't".  Some people consistently use one or the other.  Count who
uses "though" and who uses "tho".  Also look at who refers to "anonymous
posters" and who talks about "nyms".  I think you will notice some
definite patterns.  Other possible word preferences:
cypherpunks/c-punks
cryptography/crypto
cipher/encryption
America/USA
England/UK
baud/bps
DigiCash/digital cash
Internet/"the net"
information/info
Mail/E-mail/Net-mail

Just looking at the above list, I'm sure some of you will realize how
much you favor certain terms; others probably favor them without
noticing.  Sorting by subject is possible too.  Notice that there is only a
certain group of users who consistently discuss DCnets.  Another group
consistently mentions the IRS, and taxes.  A different group typically
discusses anonymity and anonymous postings.  Others tend to avoid
certain topics.  Think about your own postings and realize what topics
interest you most.  
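
As a rough illustration of sorting by subject, the sketch below counts
how often each author mentions a few topics.  The topic names, the
keyword lists, and the function name topic_profile are all assumptions
made up for this example; a real analysis would need far better
keyword lists and a real mail archive as input.

```python
from collections import Counter, defaultdict

# Hypothetical topic keywords, loosely based on the subjects named above.
TOPICS = {
    "dcnets": {"dcnet", "dcnets", "dc-net", "dc-nets"},
    "taxes": {"irs", "tax", "taxes"},
    "anonymity": {"anonymity", "anonymous", "nym", "nyms"},
}

def topic_profile(posts):
    """Map each author to a Counter of topic keyword hits.

    posts is an iterable of (author, text) pairs.
    """
    profile = defaultdict(Counter)
    for author, text in posts:
        # Crude tokenization: lowercase, split on whitespace, trim punctuation.
        words = [w.strip('.,;:!?"()') for w in text.lower().split()]
        for topic, keywords in TOPICS.items():
            profile[author][topic] += sum(w in keywords for w in words)
    return profile

posts = [
    ("alice", "The IRS wants more taxes."),
    ("bob", "DCnets protect anonymous nyms."),
    ("alice", "Taxes again."),
]
profile = topic_profile(posts)
```

An author whose hits cluster in one topic and are absent from another
gives a coarse subject profile, which could be combined with word-level
habits.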

I don't think it would be too hard to establish a "text fingerprint" of
people based on what words they use.  Maybe when I have some time I'll
write a program to do it and see how many different patterns/styles I
can identify.
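
To make the "text fingerprint" idea concrete, here is a minimal sketch
(not the program mentioned above, just one possible approach).  It
assumes a small hand-picked table of long/short word pairs taken from
the examples in this post, records what fraction of the time a text
uses the short form, and compares two texts by the average difference.
The names VARIANTS, fingerprint, and distance are invented for the
example.

```python
import re
from collections import Counter

# Hypothetical (long form, short form) pairs drawn from the examples above.
VARIANTS = {
    "government": ("government", "gov't"),
    "though": ("though", "tho"),
    "cryptography": ("cryptography", "crypto"),
    "information": ("information", "info"),
}

# Words made of letters, optionally joined by an apostrophe or hyphen.
TOKEN = re.compile(r"[a-z]+(?:['-][a-z]+)*")

def fingerprint(text):
    """Return, per pair, the fraction of uses that are the short form.

    A pair that never appears in the text maps to None.
    """
    counts = Counter(TOKEN.findall(text.lower()))
    profile = {}
    for key, (long_form, short_form) in VARIANTS.items():
        total = counts[long_form] + counts[short_form]
        profile[key] = None if total == 0 else counts[short_form] / total
    return profile

def distance(a, b):
    """Mean absolute difference over pairs that both texts actually use."""
    shared = [k for k in a if a[k] is not None and b[k] is not None]
    if not shared:
        return None  # nothing to compare
    return sum(abs(a[k] - b[k]) for k in shared) / len(shared)

a = fingerprint("The gov't reads crypto lists, tho the gov't denies it.")
b = fingerprint("The government studies cryptography, though information is scarce.")
```

Here distance(a, b) is 1.0 (the two texts disagree on every pair they
share) and distance(a, a) is 0.0.  With only a handful of pairs and
short texts the numbers mean little; the point is only that consistent
habits turn into measurable features.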

P.S. Also note the variations in text markings to express emphasis. 
Note who CAPITALIZES, *stars* _underscores_ or Capitalizes The First
Letters.
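
Emphasis habits can be counted in much the same way.  The sketch below
assumes three simple regular expressions for all-caps words, *starred*
spans, and _underscored_ spans; the patterns and the function name
emphasis_counts are guesses for illustration.  The caps pattern will
also count acronyms such as IRS, and Capitalizing The First Letters is
left out because it is hard to tell apart from names and titles.

```python
import re

# Hypothetical patterns for three emphasis styles.
PATTERNS = {
    # Three or more capital letters in a row (also matches acronyms).
    "caps": re.compile(r"\b[A-Z]{3,}\b"),
    # *text* on one line, not glued to a neighboring word character.
    "stars": re.compile(r"(?<!\w)\*\w[^*\n]*\*(?!\w)"),
    # _text_ on one line, not glued to a neighboring word character.
    "underscores": re.compile(r"(?<!\w)_[^_\s][^_\n]*_(?!\w)"),
}

def emphasis_counts(text):
    """Count how many times each emphasis style appears in the text."""
    return {name: len(p.findall(text)) for name, p in PATTERNS.items()}

sample = emphasis_counts("I REALLY mean it, *really*, and _truly_ so.")
plain = emphasis_counts("no emphasis here")
```

For the sample sentence each style is counted once, and the plain
sentence gives zero for all three.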