[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

DIAL: directed information assembly line




over the past few months I've been intermittently posting some
snippets and fragments of ideas on a new information processing system.
I've codified most of what I was talking about into a semi-formal
protocol description. 

I think this has tremendous potential for
wide applications, and I suspect many of these ideas below are
already being used in many diverse contexts but have not been
unified into a single specification (which I think will increase their
value and use signficantly). I suspect something like the following
is actually going to be a natural, inevitable evolution of the current 
"information infrastructure" and cyberspace-- i.e. something like
the following is going to evolve whether I personally work on it or not,
or whether future designers see this particular essay or not.

anyway, I will try to incorporate comments from correspondents into
future revisions. I would love to put together a group of people
interested in continuing to develop this although that's probably
premature at this point. thanks to everyone who has (unwittingly) 
contributed to this so far based on responding to my earlier essays.

===




DIAL

= introduction
= terminology: flits, floutes & bloxes
= fault tolerance
= tracing
= time estimation
= common bloxes
= implementation
= examples



introduction
==

The industrial revolution was driven by specific technologies.
Similarly, the "information revolution" is now in full pace 
using a different array of new tools. However, it is unlikely
that all information processing tools have been invented yet.
This paper is a preliminary sketch of one such potentially 
significant tool called DIAL.

DIAL stands for "directed information assembly line". This 
document contains a description of this novel information processing
architecture. This proposal outlines the idea and contrasts
it with existing systems, showing how it highlights certain
aspects of data processing that are not specifically addressed in,
but seem to be implied by, other existing tools. 

The system can be viewed variously as
a programming environment, a monitoring system, a revision
control system, a quality assurance and quality control
mechanism, a fault tolerant computing system, 
a workflow (re)engineering technology, a company
intranet routing algorithm, an operating system
based on virtual reality, etc.


terminology
==

DIAL is something like a "dataflow" oriented system. This document
will not describe any specific implementation of the DIAL concepts
(such as giving a language syntax) but instead focus on its 
abstract properties which can be implemented in multiple ways.

There are 3 basic components in the DIAL universe:

"flit" - a flit is a unit of information that has an associated
DIAL state. The information can change over time. One aspect of state
is "location". "flit" stands for "fleeting bit", i.e. a piece of 
data that can "move".  The number of binary bits allocated to a
flit may vary per flit or over time.

"floute" - a floute is a "flit route" or a path that a flit can
take. Conceptually flits move through the flouts. A flout can
be implemented in various ways, such as "last in, first out", 
"first in, last out", a pool of data, etc.

"blox" - a blox is a "black box" or a component that changes the
information content of a flit. Bloxes are connected to floutes.

The DIAL system is recursive in that any of the 
3 basic objects can be contained or encoded
in the 3 objects. For example, flits can contain flits, bloxes
can contain further floutes and bloxes, etc.

Conceptually, a DIAL system is a directed graph, with
nodes called "bloxes", edges called "floutes", and a superimposed
set of things called "flits" that can, over time, "move through" the 
network.

In many cases flits have a natural analogy to messages being
passed through the system.

Bloxes contain a single internal state, like a regular automaton.
They can send requests, and respond to, memory bloxes.

DIAL is unlike a programming language in that "time" is considered
a key property of what it models. Many languages handle the concept
of time implicitly through the use of variables. But programming
languages have a computation-centric view of processing, such that
programs are seen as directing and operating on data. In DIAL, 
data is seen as flowing through components, a data-centric view. 


fault tolerance
==

To be implemented correctly, the state of a DIAL system at
any given time is incorruptable.  All operations have total
integrity.  The system is designed to coordinate unreliable
subprocesses and must itself be reliable.

In DIAL, a blox is roughly analogous to some kind of process.
The process may or not be entirely computational. The blox
models an unreliable process. The process
is activated when a flit approaches the blox from a 
connecting floute, at which point the flit "enters" the 
blox. DIAL handles the protocol of informing the blox (or rather,
the process represented by the blox) of the
presence of the flit. DIAL keeps track of all flits that are 
currently being processed by bloxes. The blox should process
the flit and push the flit into some other floute which signals
it has successfully operated on the flit.  Combinations of
flits at inputs can be processed, and multiple outputs are supported.

The DIAL system allows processing time limit rules to be associated 
with flits, floutes, and bloxes, such as a floute assigning time
limits to flits that move through it, time limits associated with
all flits going through particular bloxes, or time limits attached
to flits. (The system will have a precedence to these rules.)
The motion of flits through floutes is handled by the DIAL system
and is incorruptable. Flits can "pile up" in floutes if not processed
by connecting bloxes as rapidly as they accumulate. All bloxes may have
different amounts of processing times on incoming flits.

When a blox fails to "return" a flit in the time limit, the DIAL
system can be programmed to automatically take particular
countermeasures.  The countermeasure programs are associated with
flits in the same way the expiration time is (via the flit, a 
floute, or a blox, and having precedence rules).

- DIAL can ask the blox, "have you heard of this flit". The blox
can reply, (1) "yes, I am still working on it", or (2) "no, I have not
heard of it". The rules can specify possibilities such as
resubmitting the flit, cancelling the flit processing
and redirecting it elsewhere, propagating other flits into floutes 
(which might represent message(s) sent to "failure controllers"), etc.

- The blox may reply, "the flit has corrupted the blox". This may
happen in systems without transaction integrity. Again rules
can automatically be followed to try to "clean up" the system by
propagating new flits to particular floutes or possibly resubmit 
the flit.

- The blox may not reply. Again, rules for countermeasures can
be programmed into the DIAL system.


tracing
==

All flits have unique IDs that can be traced. At any time a query
can be sent to the DIAL system, "where is so-and-so flit?" and
the system will describe the exact location of the flit. Flits
can never vanish, even when bloxes fail to operate correctly. 
Every flit has a history as well. The flit may 
contain different information at different times. The system allows 
some number of earlier information states of the flit to be accessed.
Information about the past flow-path of the flit and each associated
change in contents (prior and subsequent to entry and exit of a blox)
is available. This could be called a "replay" feature.

In a query of a DIAL system, it may actually reply, "the flit was deleted", 
although its earlier states would still be accessable. Another possible
response is, "the flit moved out of this DIAL system", but again
some number of its penultimate states, while it was still "inside",
would still be accessable. Again, customizable rules determine how
much history is available.

General queries such as "locate all type [y] flits that have not moved
within [x] time period" are supported. Past histories of the flits
that have moved through flouts and bloxes are also available.
The system can support some degree of "global or local rollbacks"
in which prior processing flows are reset, redirected, restarted,
etc.  An ability to locate components based on traffic is supported,
such as floutes where current flit queue lengths are of some
size, etc.

The system allows the assignment of arbitrary version numbers with
particular flit states that are also allowed in queries.


time estimation
==

An implementer of a DIAL system might support specialized queries
called "time estimates". A flit is passed into a system with a special
flag that indicates processing time should be estimated but results
should not be computed. The flit flows as far through the system
as possible and records time estimates as it passes the bloxes.
When it finally emerges, cumulative statistics on the time estimates
that would be associated with an actual processing of the flit
are available to the requester.


Common bloxes
==

- extracter/combiner

Bloxes to extract flits, floutes, or bloxes encoded in flits
are available, as well as to create flits that encode any
of the same objects.

- warehouse

The DIAL system can support a flit warehouse in which all flits are
stored when they are not being propagated elsewhere through the
system. The warehouse is a blox that responds to flit queries
in the form, "move flit [x] into floute [y]".  (The motion of
the flits into and out of the warehouse resembles the checkin
and checkout task of RCS software.)

- create bloxes

In a static DIAL system, all bloxes and floutes are predetermined
and fixed.  In a dynamic system, the bloxes and floutes may 
change over time. This is accomplished by feeding special flits
into bloxes that can create other bloxes and floutes as
their result. Other bloxes can connect them in specified ways.
The dynamic system is far more complex and is reserved for
specialized situations.

- rerouter

A special rerouter blox is useful for dealing with new versions
of other bloxes. A frequent problem that arises with new versions
of software (i.e. a new blox) is that it is incompatible or has
bugs. The rerouter blox is capable of rerouting a request to a
new version of a blox to an older version when problems arise
in the processing.

- tester

Often results output from bloxes are to be tested for consistency
to ensure the bloxes are functioning properly and not returning
spurious results.

- tester/comparer/rerouter

Another useful blox for fault tolerance can take the results of
two other bloxes and compare them for discrepancies. Flits output
from a new version of a blox could be compared with the flits from
the previous version, and automatic actions be taken on any
discrepancy (such as passing through the old version while flagging
the exception). The comparer allows the system designer to more
elegantly deal with regular "upgrades". In fact it is the embodiment
of automated regression testing.

- isolater

Often input of flits is batched into groups, and some flit in the
group may cause problems in the processing, but further detail in
revealing the exact "bad" flit is not available.  The isolater
can automate this process in an algorithm that resembles the 
classic linear search. The input batch into a blox is consecutively split 
into halves by the isolater until the smallest erroneous pieces are 
isolated and hilighted.

- global versioner

Often a system with many bloxes has new versions of the bloxes, and
there is need in globally switching control to new bloxes. This
can be accomplished with the use of a special global versioner
blox.  It can keep track of the different versions of all bloxes
in a given DIAL configuration, and switch between configurations.

- bad blox isolater

Combining many of the previous bloxes, a special system that
automatically isolates new versions of bloxes that fail to 
be backward and/or forward compatible is possible. This is a 
direct implementation of the software development pipeline.


implementation
==

The DIAL system should be implemented graphically and visually.
A corresponding language specification to describe a DIAL network is 
possible and desirable, but a visual or graphical interface to any operation 
should always be possible; likewise a representation of all 
DIAL states should be supported. Ideally, the DIAL universe could be
visualized in a 3d virtual reality, and a person could physically
"grasp" and manipulate all the basic objects (flits, floutes, bloxes).
The one-to-one visual representation of DIAL and its states is a
key aspect of its accessability and usefulness.

There should be in principle no memory limits on the core
DIAL entities: flits, floutes, or bloxes. Limits in implementations
such as "a maximum of 50 queued flits per floute is supported" are
antithetical to the basic design principles. However there may be
various memory limits associated with particular uses of the
entities that model a particular application. 

The DIAL system internally must have many protections that maintain
its internal consistency at all times to prevent corruption of
its state. However, unreliability of all components it models
is allowed (and specifically designed for). 
The states of components are allowed to be 
inconsistent to some degree, such that a state like "blox went 
down" might arise at a random time. Typically in implementations,
the "last known state" of a blox would be tracked, along with 
protocols to retrieve the most recent state and allow resetting
or other manipulations of the state.

Recent new technologies such as the Web and Java may be excellent
environments for implementing aspects of the DIAL specification.
Recent widespread interest and developments in intranets, workflow 
analysis, and reengineering may be very tangibly furthered
through the introduction and use of DIAL systems.


examples
==

1. a DIAL system can be used to model the workflow of a company.
Individual flits can be thought of as documents, and bloxes
are the operations that transform the documents. The history
mechanisms are identical to revision control. Floutes represent
interactions or communication paths between people or departments.

This is the primary "information assembly line" application 
of the DIAL concept. In this system, the "paperwork" cannot be lost
because of the inherent properties of DIAL. In fact it is in these
sitations that history, tracing, and "timeout" mechanisms are most 
valuable. The estimation feature gives an approximation of "how
long will this take when submitted".
 
2. a DIAL system could reflect the internal state of packets
on a network. Individual computers, routers, etc. are seen as bloxes,
and messages are the flits, and floutes are the network routes.
The DIAL tracing features are especially useful in this context.

3. a DIAL system could represent a large, complex software project.
Individual bloxes are the components in the program. The 
versioning capabilities deal with the development pipeline.
Regression testing and the process of moving to new versions is
explicitly built into the system.

4. the DIAL system might represent data being sent over the
World Wide Web. 

5. the DIAL system could represent a distributed computing system
in which the bloxes are spread out over the Internet (the floutes),
and socket communication is used to transport flits.

6. In many debugging situations, "bad data" is detected without any 
knowledge of its history and complex measures are employed to try
to deduce the prior processes that led to it. The history mechanism in 
DIAL allows a user to trace prior states of the flit and find the 
exact point or blox where the corruption occured. Also, the capability of 
putting a "rider" on a flit that detects when it is modified in 
some way is possible.

7. There is a natural correspondence between every computer language
and the DIAL system. Subroutines or arithmetic operations 
are like bloxes, and parameters are passed in floutes. 
Variables are the contents of specific "floutes"
at various points in processing.