[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Islands in the Net



   From: [email protected] (Timothy C. May)

   Language is an example we ought to look at more closely, as both of us
   have noted. In contrast to the "data structures" we love so much,
   natural language is a way of creating a more fluid data structure, a
   more nuanced statement.

The version of language though, that I was referring to were formal
languages, the stuff of DFA's (deterministic finite automata) and
push-down automata.  The advantage here is entirely in their
formality, in that precise interpretations of a formal language can be
made.  A great benefit derives from the explicit formulation of the
semantic scope of particular representation.  A formal language _can_
"mean exactly what I want it to mean, neither less nor more."

The social process of creating these interpretations ("meanings") and
getting everyone to agree upon them, however, can be tortuous.  We in
the ASCII world all agree that the number 65 represents the capital
letter 'A', but the letter 'A' is a further abstraction, albeit
universally shared in the literate world.  Interpretations of data
structures almost universally share this trait; they are reductions of
one abstraction to another.

Two major problems about compatibility can be framed in terms of
formal languages: the need for well-formed data structures and
the coexistence of multiple data structures.

The formal language notion of recognition is merely an algorithm for
set membership, the set being called the "language".  "Is this string
of symbols a member of the language or not?"  Is layman's terms, the
problem is with data corruptions.  While everyone knows data
corruption is a problem, deciding what data is corrupt and what is not
is sometimes difficult; witness the habitual arguments between client
and server writers about whose implementation is wrong.  Even fairly
clear standards like RFC-822 (mail) leave wide holes in
interpretation.

The second problem is less immediately pressing and ultimately more
important.  Given a string of bits, what exactly _does_ it refer to?
One can pass it through all the recognizers one has, but it may still
not be uniquely determined as being a particular kind of data.
Compatibility between data of different types will be of vital
importance to achieve systemic robustness.

Any set of languages, though, can be made compatible by prepending a
common language which acts as a dynamic type specifier.  Unix has
the beginnings of this with its "#!" syntax for picking the
interpreter of an executable.  The problem with the Unix version
of this is that a particular interpretation binary is specified, not
an actual language specification.

   Natural language is often misinterpreted, hence the value of data
   structures. For example, I'm glad my financial accounting at my stock
   broker is handled with robust data structues, but I'm also glad to be
   able to communicate my goals and desires in a natural language.

Well, there's someone somewhere who understands both the formal
language and the natural language; it can be either oneself or an
intermediary.  Now the formal language may be quite flexible and
understandable and admit synonyms, but the contextual nature of human
languages mitigates against their strict interpretation.

One of the real-life characteristics of natural language which isn't
present in computer systems is a way of correcting misunderstandings.
If one person misunderstands another, further conversation can ensue.
If the computer interprets a command differently than the commander
intended, disaster can ensue.  Suppose I want to delete some data and
then I change my mind:

E: Computer, please get rid of this old correspondence.
C: OK, boss, all done.
E: No wait, I need one particular series of those back.
C: Sorry, all gone.
E: What do you mean, "all gone".
C: I destroyed them utterly.
E: Why?
C: You asked.

This stuff has been a theme in SF humor forever.  I find it highly
ironic that the computer industry, so steeped in SF themes, hasn't
thought more about how to alleviate this problem.

As a very basic example, consider the issue of data persistence.  No
standard operating system has at a deep level the notion of "backed-up
data".  The replication and redundancy could take many forms,
including tape, network disk, or data haven.  This particular issue is
going to be an obstacle for the widespread deployment of digital cash.
When a disk crash (hard or soft) means that you lose fungible money,
either the problem gets fixed or the system doesn't propagate.

   What's the common theme? Agents. Chunks of code which also have local
   processing power (brains, knowledge).

I don't think that agents have any relation to the problem of mapping
natural languages to formal languages.  Perhaps you mean something
else by this reference.

   Someone sent me private e-mail on this "Islands in the Net" topic, and
   talked about "payloads of data carrying their own instructions," in
   reference to the Telescript model of agents. (I wish he'd post his
   comments here!) This approach, also typified in some object-oriented
   approaches, seems to be the direction to go.

   > If steel were like software, there would be a knob on each beam that
   > allowed you to change, for example, the balance between hardness and
   > toughness.  Knobs mean random knob-twiddling.

   Actually, such "dynamic buildings" are becoming more common, I hear.

Now add knobs to the thermal expansion coefficients, the densities and
masses, the rates of oxidation, the stress-strain matrix elements,
etc.  If materials engineering were like software, we'd have _both_
nanotechnology and everybody living in trees because they didn't crash
so often.

   But the effect is to increase the "state space" which must be tested,
   and we are led to "testability" and "provable correctness" of
   programs, two interesting areas of programming. So far we've seen
   little application of these ideas to Cypherpunks interests.

Not unexpectedly, since these apply to all software, not just
cryptography software.

   > The more specific inspiration for the general form of the remailer
   > syntax is Jon Bentley's theme of "Little Languages".  

   I'm hopeful that the recent interest in TCL, Safe-TCL [...]

The "little" in little languages might be taken to mean "Not Turing
Complete".  His expository language, as I recall, is the language of
floating point numbers, which, alternately, is the question "how do
you write down a mantissa and an exponent."  Another little language
would be email addresses -- still not completely standardized,
although blessedly mostly so.


   We "locally clear" (approximately the same as "readable on its face")
   cash and commercial paper because of an assumption that forgery is
   difficult and unlikely. When forgery becomes common in some area,
   merchants carry lists of suspected numbers, IDs, etc., and the
   "readable on its face" criterion erodes.

These two are not the same at all!  "Readable on its face" means that
you can actually determine _entirely from the front side of the
document_ what the instrument says.  If there is an inclusion by
reference, then it's not readable on its face.  If there is a
condition external to the instrument, such as a condition of services
rendered, then it's not readable on its face, since some event
external to the instrument determines its value.

"Readable on its face" just means that one knows what is said, _not_
whether one believes it or not.  Those actions which turn the note
into a lie are called "conversions" as a group, and forgery is just
one form of conversion.  (Stealing a note is another.)

With a naive implementation of Chaum's blind signature, all you have
is a string of bits that can be verified only with some public key.
Nowhere in the bits themselves is there an explicit representation of
how much the bill is worth, what currency it's denominated in, when it
expires, who issued it, etc.  These signatures alone are not facially
readable.


   We need to find a way to get back to exploring the various nifty
   systems that are being described in the crypto papers, but which lack
   any real implementation. 

Fandom and enthusiasm will only carry so far in prototyping.  One of
the reasons that the remailers have attracted such interest is that
they do something proximately useful.  The questions of reliability
and utility that are mentioned here really are key to getting more
people trying out stuff.

Eric