
Style Analysis



-----BEGIN PGP SIGNED MESSAGE-----
 
- -> Back when [email protected] said....
 
[Stuff deleted, no value judgment implied]
 
The researchers analyzed the frequency distribution of words
found in the works of Shakespeare, and compared it with that of
other writers of the day.  I don't recall the results of the
project, but that kind of research would have implications for
anonymous postings.
 
It is not too difficult to see how certain spelling errors, word
frequency (how often do you say 'I' :-), choice of wording, and the
working vocabulary of an individual could allow you to
identify an anonymous poster.  This would be particularly easy if the
individual also posted under their real name.
 
[Stuff deleted, no value judgment implied]
 
This brings up the subject of how one can post without
leaving an "ASCII fingerprint".  I suspect the use of a spelling
checker and grammar checker would help.  Perhaps running
your text through a language converter (say English to French),
then back, would remove many identifying characteristics.
 
 
Jim Pinson                     Galapagos Islands
PGP key available by finger    [email protected]
 
 
- -> to which I reply:
 
It seems to me that software to "filter" a message, removing
anomalies, standardizing punctuation, replacing words
over 5 letters with more standard words, etc., would have a kind of
utility.  I particularly like the two-sweep translation program
idea.  If enough people used this software it would become
meaningless to attempt this kind of analysis, which looks to
be straightforward enough to give even the persistent
investigator a "gut feel" for the identity of an otherwise
anonymous poster.
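
To make the filter idea concrete, here is a minimal Python sketch,
not a finished tool.  It only collapses repeated punctuation and
whitespace and swaps a few words for plainer ones.  The names
normalize() and SUBSTITUTIONS, and the tiny word table itself, are
assumptions made up for illustration; a usable filter would need a
far larger table and would still leave grammar habits untouched.

```python
import re

# Hypothetical substitution table, for illustration only.
SUBSTITUTIONS = {
    "utilize": "use",
    "commence": "begin",
    "purchase": "buy",
}


def normalize(text):
    """Strip a few easily spotted quirks from a message."""
    # Collapse runs of dots ("..", "....") into a single period.
    text = re.sub(r"\.{2,}", ".", text)
    # Collapse repeated "!" or "?" into a single mark.
    text = re.sub(r"([!?])\1+", r"\1", text)
    # Collapse runs of spaces and tabs into one space.
    text = re.sub(r"[ \t]+", " ", text)

    def swap(match):
        # Replace a listed word, keeping a leading capital if present.
        word = match.group(0)
        plain = SUBSTITUTIONS.get(word.lower())
        if plain is None:
            return word
        return plain.capitalize() if word[0].isupper() else plain

    text = re.sub(r"[A-Za-z]+", swap, text)
    return text.strip()


if __name__ == "__main__":
    print(normalize("I  will utilize this..  etc.."))
```

Note that this does nothing about word frequency or sentence
structure, which is presumably where the translate-and-back idea
would have to do the work.
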
It seems that the most solid basis for this kind of message
analysis is non-standard use of grammar, spelling, and
punctuation. I, for example, use too many commas.
Anyone have any information on what factors identify
posters?  Is it just word frequency analysis or...?  It would
be easy enough to correct that.
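
As a rough illustration of the word frequency part of the question,
the sketch below builds a relative word frequency profile for each
text and compares two profiles by cosine similarity.  This is only
one simple way to do it, chosen here as an assumption; it is not
the method used in the Shakespeare study, and the function names
profile() and similarity() are made up for this example.

```python
import math
import re
from collections import Counter


def profile(text):
    """Map each word to its share of all words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {}
    counts = Counter(words)
    total = len(words)
    return {word: n / total for word, n in counts.items()}


def similarity(p, q):
    """Cosine similarity of two profiles: 0.0 (no overlap) to 1.0."""
    dot = sum(share * q.get(word, 0.0) for word, share in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


if __name__ == "__main__":
    known = profile("I think, I suppose, that I would go")
    anon_a = profile("I think that I would")
    anon_b = profile("the cat sat on the mat")
    # A higher score means the anonymous text looks more like the
    # known author's text, on this one crude measure.
    print(similarity(known, anon_a), similarity(known, anon_b))
```

With texts this short the scores mean very little; the point is
only that the comparison is mechanical once each poster has a
reasonable body of text to draw on.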
 
- -uni- (Dark)
 
 
 
-----BEGIN PGP SIGNATURE-----
Version: 2.3
 
iQCVAgUBLNGcxxibHbaiMfO5AQFVIwP+JsuNvRmE1WlFZ7wxvIybg1bTa0FO5/N7
4XrHQ0On1avtoFDjPAmA7dqgrHHscz8LiwYEx1eXx/exOPmZkA2sCg5/AVo61zv6
iBjsqd3o5IgV9L+uXmzl2+OBJ0zpdTyNxiV7VzrKjJqKVlzZgCqbYCB8tN5cOpFj
M3FnGQZfSsg=
=a1Hf
-----END PGP SIGNATURE-----