
Newsgroups: sci.crypt,alt.security,alt.privacy
From: [email protected] (Bruce Schneier)
Subject: "Interesting Stuff" Checkers at the NSA
Message-ID: <[email protected]>
Organization: Chinet - Public Access UNIX
Date: Thu, 19 May 1994 17:40:15 GMT

This is from a flyer that NSA people have been distributing:

     NATIONAL SECURITY AGENCY --  TECHNOLOGY TRANSFER

     Information Sorting and Retrieval by Language or Topic

     Description:  This technique is an extremely simple, fast,
     completely general method of sorting and retrieving machine-
     readable text according to language and/or topic.  The
     method is totally independent of the particular languages or
     topics of interest, and relies for guidance solely upon
     exemplars (e.g., existing documents, fragments, etc.)
     provided by the user.  It employs no dictionaries, keywords,
     stoplists, stemmings, syntax, semantics, or grammar;
     nevertheless, it is capable of distinguishing among closely
     related topics (previously considered inseparable) in any
     language, and it can do so even in text containing a great
     many errors (typically 10 - 15% of all characters).  The
     technique can be quickly implemented in software on any
     computer system, from microprocessor to supercomputer, and
     can easily be implemented in inexpensive hardware as well. 
     It is directly scalable to very large data sets (millions of
     documents).

     Commercial Application:

          Language and topic-independent sorting and retrieval of
          documents satisfying dynamic criteria defined only by
          existing documents.

          Clustering of topically related documents, with no
          prior knowledge of the languages or topics that may be
          present.  If desired, this activity can automatically
          generate document selectors.

          Specialized sorting tasks, such as identification of
          duplicate or near-duplicate documents in a large set.

     National Security Agency
     Research and Technology Group - R
     Office of Research and Technology Applications (ORTA)
     9800 Savage Road
     Fort George G. Meade, MD  20755-6000
     (301) 688-0606
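
The flyer does not say how the technique works. As a purely illustrative
sketch, and not a description of the NSA's method, the Python below compares
character n-gram frequency profiles. That is one well-known approach that
also needs no dictionaries, stoplists, or grammar, and that tends to tolerate
scattered character errors. The function names, the trigram size, and the
cosine score are all assumptions chosen for this example.

```python
from collections import Counter
from math import sqrt


def ngram_profile(text, n=3):
    """Count overlapping character n-grams (n=3 is an arbitrary choice)."""
    # Lowercase and collapse whitespace so layout differences do not matter.
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count profiles."""
    # Counter returns 0 for n-grams it has not seen, so b[gram] is safe.
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(document, exemplars, n=3):
    """Return the label of the exemplar text most similar to document.

    exemplars is a hypothetical mapping of label -> example text; it stands
    in for the user-provided exemplar documents the flyer mentions.
    """
    doc = ngram_profile(document, n)
    scores = {
        label: cosine(doc, ngram_profile(text, n))
        for label, text in exemplars.items()
    }
    return max(scores, key=scores.get)
```

Given a dictionary of labeled exemplar texts, best_match(document, exemplars)
returns the label whose profile is closest. The same cosine score could be
thresholded to flag near-duplicate documents, though a suitable threshold
would have to be found empirically.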


If this is the stuff they're giving out to the public, I can only
imagine what they're keeping for themselves.

Bruce

**************************************************************************
* Bruce Schneier
* Counterpane Systems         For a good prime, call 391581 * 2^216193 - 1
* [email protected]
**************************************************************************