Colloquium at Stanford
The Unfinished Revolution

Date: Tue, 18 Jan 2000 17:49:47 -0800


From:   Eric Armstrong
eric.armstrong@eng.sun.com

To:     unrev2
unrev-II@onelist.com

Subject:   Towards a DKR

The workshop has made it abundantly clear that mankind is faced with deep, significant problems. It is equally clear that the "Dynamic Knowledge Repository" is the only significant place where technology can be leveraged to improve mankind's *ability* to solve complex problems.

We have already greatly expanded our communication and publication technologies, with the result that we have an explosion of information. What is needed is a better way to harness, interact with, and utilize that knowledge base. Hence the importance of a "Dynamic Knowledge Repository".

But what does such a thing look like? How does it work? How will it make a difference?


General Requirements

I suspect that the system will look like a combination of email and a hypertext archive (OHS, anyone?) but with some unusual aspects. To understand what those aspects have to be, and why they are needed, I think it would help to start with a few observations about weaknesses in our current information systems, which are:

Superfluous

Much of what gets published in the world is superfluous, excess noise. A lot of what counts towards the "information explosion" is episodes of Jerry Springer, copies of People magazine, etc. Needless to say, it is important to be able to remove superfluous information from a knowledge repository.

Redundant

Where real information exists, it is frequently presented a dozen different ways. A journal article on the subject of nutrition, an exploratory article in Newsweek, and a one-page digest in Prevention magazine may all have the same fundamental information as their foundation. Each is presented differently, but a single information model underlies them all. Ideally, that model would be the foundation for a knowledge repository.


Non-Reductive

Email lists provide a great opportunity for discussion and exploration of a subject area. Archives provide a rich mine of background and information pertaining to the subject. But such systems are totally "additive". Every contribution, including summaries of the conversation up till now, becomes an addition to the system.

A knowledge system, though, will need to be "reducible" in the sense that a digest of previous posts should subsume those posts. Those posts should still be available for investigating the accuracy of the reduction, or to get more background information, but they should no longer have a life of their own -- they should be tucked "under" the reduction, so as to simplify the view presented by the archives.

[Note: There are three ways for that reduction to happen, depending on how the repository functions. More on that later.]
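
As a rough illustration of what "tucking posts under a reduction" could mean in practice, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Post` and `Archive` names, the `subsumed_by` field, and the sample text are hypothetical, and it covers only the simplest case, where a person writes the digest by hand.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Post:
    """One contribution to the archive (hypothetical structure)."""
    post_id: int
    author: str
    text: str
    subsumed_by: Optional[int] = None  # id of the digest covering this post


@dataclass
class Archive:
    posts: List[Post] = field(default_factory=list)

    def add(self, author: str, text: str) -> Post:
        post = Post(post_id=len(self.posts) + 1, author=author, text=text)
        self.posts.append(post)
        return post

    def reduce(self, author: str, digest_text: str, covered_ids: List[int]) -> Post:
        """Add a digest and tuck the covered posts under it (nothing is deleted)."""
        digest = self.add(author, digest_text)
        for post in self.posts:
            if post.post_id in covered_ids:
                post.subsumed_by = digest.post_id
        return digest

    def top_level_view(self) -> List[Post]:
        """The simplified view: only posts not tucked under a digest."""
        return [p for p in self.posts if p.subsumed_by is None]

    def under(self, digest_id: int) -> List[Post]:
        """The original posts behind a digest, kept for checking its accuracy."""
        return [p for p in self.posts if p.subsumed_by == digest_id]


# Made-up sample content, purely for illustration.
archive = Archive()
first = archive.add("alice", "First observation about topic X.")
second = archive.add("bob", "A second, overlapping observation about topic X.")
digest = archive.reduce("carol", "Summary of the two observations on topic X.",
                        [first.post_id, second.post_id])
```

The design choice worth noticing is that a reduction changes only what the default view shows. The original posts remain reachable through `under()`, so the accuracy of the digest can still be checked.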

The important thing about these observations is that the goal is to create a dynamic *knowledge* system, not merely another *information* system. To do that, at least two other characteristics need to be considered:

Abstractable

Like reducibility, abstractability is a desirable characteristic of the system. It should function in a similar way, in the sense that an abstract representation subsumes specific instances. The difference is that the specific items don't disappear from the view, as they do after a reduction. Instead, the specifics remain in view, allowing investigators to operate at either more general or more specific levels of detail.

Negatable

An important aspect of any knowledge system is that, frequently, what we *think* to be true turns out to be dead wrong. A good knowledge system must have the capability to reduce the knowledge base, not by deletion, but by marking a theory or fact as invalid, and attaching the data to support that conclusion. After all, the conclusion may itself be wrong. It must be possible to hide negated precepts so we can focus on what we "know", but it must at the same time be possible to revisit and possibly revive them.
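
One way to picture negation without deletion is sketched below in Python. The `Claim` and `ClaimStore` names and the sample statements are hypothetical, chosen only to show the idea: a negated claim is hidden from the list of what we "know", the supporting data stays attached, and the claim can later be revived.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Claim:
    statement: str
    negated: bool = False
    history: List[str] = field(default_factory=list)  # data attached to each change


class ClaimStore:
    def __init__(self) -> None:
        self._claims: Dict[str, Claim] = {}

    def add(self, key: str, statement: str) -> None:
        self._claims[key] = Claim(statement)

    def get(self, key: str) -> Claim:
        return self._claims[key]

    def negate(self, key: str, evidence: str) -> None:
        """Mark a claim invalid and attach the data supporting that conclusion."""
        claim = self._claims[key]
        claim.negated = True
        claim.history.append(f"negated: {evidence}")

    def revive(self, key: str, evidence: str) -> None:
        """Undo a negation; the earlier evidence stays on record."""
        claim = self._claims[key]
        claim.negated = False
        claim.history.append(f"revived: {evidence}")

    def known(self) -> List[str]:
        """What we currently 'know': negated claims are hidden, not deleted."""
        return [c.statement for c in self._claims.values() if not c.negated]

    def negated(self) -> List[Claim]:
        return [c for c in self._claims.values() if c.negated]


# Made-up claims, purely for illustration.
store = ClaimStore()
store.add("c1", "Claim one")
store.add("c2", "Claim two")
store.negate("c2", "measurement M contradicts it")
```

Calling `store.revive("c2", ...)` later would put the claim back into `known()` while keeping both the negation and the revival in its history.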


Knowledge Mathematics

Those are general characteristics for a Dynamic Knowledge Repository. Given those characteristics, it seems likely that the ideal solution would make use of some kind of "abstract knowledge mathematics". Like symbolic logic, a knowledge mathematics would make possible a concise expression of information and simplify the process of abstraction, reduction, and negation. However, while symbolic logic deals with only a few very simple relationships (true, false, if..then), an "abstract knowledge mathematics" requires a seemingly infinite variety of relationships. For example, attempting to model the human nutrition system requires the ability to state relationships like these: improves, requires, enables, is required for, is enabled by, causes, hinders, prevents, manifests as, etc. Such a system would need to be extensible as well, since it is unlikely that everything which needs to be expressed could be anticipated.
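
To make the idea of an extensible set of relationships a little more concrete, here is a small Python sketch. It is not a knowledge mathematics, only a toy vocabulary of named relations stored as subject-relation-object statements. The `RelationRegistry` and `KnowledgeBase` names are hypothetical, and the subjects "A", "B", and "C" are placeholders rather than real nutrition facts.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class Statement(NamedTuple):
    subject: str
    relation: str
    obj: str


class RelationRegistry:
    """Extensible vocabulary of relation names, each with an optional inverse."""

    def __init__(self) -> None:
        self._inverse: Dict[str, Optional[str]] = {}

    def define(self, name: str, inverse: Optional[str] = None) -> None:
        self._inverse[name] = inverse
        if inverse is not None:
            self._inverse[inverse] = name

    def is_defined(self, name: str) -> bool:
        return name in self._inverse

    def inverse_of(self, name: str) -> Optional[str]:
        return self._inverse.get(name)


class KnowledgeBase:
    def __init__(self, registry: RelationRegistry) -> None:
        self.registry = registry
        self.statements: Set[Statement] = set()

    def state(self, subject: str, relation: str, obj: str) -> None:
        """Record a statement, plus its inverse when the relation has one."""
        if not self.registry.is_defined(relation):
            raise ValueError(f"unknown relation: {relation!r}")
        self.statements.add(Statement(subject, relation, obj))
        inverse = self.registry.inverse_of(relation)
        if inverse is not None:
            self.statements.add(Statement(obj, inverse, subject))

    def about(self, subject: str) -> List[Statement]:
        return sorted(s for s in self.statements if s.subject == subject)


registry = RelationRegistry()
registry.define("requires", inverse="is required for")
registry.define("enables", inverse="is enabled by")
registry.define("hinders")  # new relations can be added at any time

kb = KnowledgeBase(registry)
kb.state("A", "requires", "B")
kb.state("C", "hinders", "A")
```

Pairing "requires" with "is required for" means a statement entered in one direction can be found from either end, which is one small piece of what a richer formalism would have to provide.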

If an "abstract knowledge mathematics" were created, it would likely be very helpful for creating a knowledge repository. One advantage of such a system would be the relative ease with which people who speak different languages could interact with it.

On the other hand, the attempt to create a knowledge mathematics should probably operate in parallel with the building of the first knowledge repository. The problem is too complex, and the problems we face too urgent, to make the solution wait. In the ideal scenario, we would solve some of the most urgent problems in a natural-language system, as the abstract knowledge system is developed. The natural-language system would serve as the model for the abstract system, and provide something to check its operation against. Subsequent problems could then be attacked in the abstract system.

NOTE:

The way in which reduction and abstraction work in such a system would depend on whether the system uses an abstract knowledge mathematics. If so, the system might be capable of automatic reductions or possibly human-directed reductions where the computer takes care of the details. At the very least, it should be possible to do automated verification of reductions and abstractions that are offered to the system. In a natural-language system, on the other hand, such operations must necessarily be manual -- which raises the necessity for competing reductions and abstractions, with arguments for and against tallied with each, until at last some resolution is achieved.

Specific Requirements

The preceding thoughts discuss general requirements for an ideal knowledge repository. Each is a guideline. After all, even email archives are giving us tremendous new tools to augment our thinking -- as are email lists like this one, which put us in touch with concepts and ideas from thinkers around the globe. So anything we put together is likely to be of *some* help. Let's get a version 1.0 out, and start a list of things to do in version 2.0.

To get started, it seems helpful to attack a specific problem. Having a real problem to solve provides a concrete referent when building the system, provides a real-world validity test, and hopefully helps to solve a real problem. Even if the system doesn't make much of an impact on its initial problem (because its users are system designers rather than domain experts), *attempting* to solve the problem motivates the system extensions that are really needed.

I'll propose the energy problem as a starting point because (a) that seems to be the most pressing problem we have seen to date and (b) if we can't run the computers there's not a hell of a lot else that we are going to be able to do with our technology.

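In looking at a complex problem like the energy crisis, a number of subsystems suggest themselves:

Problem Definition
Data (time-stamped)
Tactical Possibilities
Strategic Alternatives
Proposed Models (model construction kit)
Feedback
Design Decisions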

The remainder of this post takes a short look at each subsystem.

Problem Definition

This is a summary view of the problem, outlining its fundamental characteristics and egregious consequences. The problem statement subsumes, and is built on, the fundamental data that identifies the problem. As the data changes, the summary needs to be periodically updated as well. The change in the problem statement over time therefore shows whether or not we are making progress.

Data (time-stamped)

For the energy problem, the data concerns who is using how much, and how much is being generated. The data changes over time, so accumulated data must be time-stamped.
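
A minimal sketch of time-stamped data in Python follows. The `Observation` and `TimeSeriesStore` names, the "used" and "generated" labels, and all of the numbers are made up for illustration. The point is only that every figure carries the date it was measured, so the repository can answer "what did we know as of a given date?"

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Observation:
    measured_on: date
    quantity: str   # e.g. "used" or "generated" (hypothetical labels)
    source: str     # who is using or generating
    value: float    # in whatever unit the repository standardizes on


class TimeSeriesStore:
    def __init__(self) -> None:
        self._observations: List[Observation] = []

    def record(self, observation: Observation) -> None:
        self._observations.append(observation)

    def latest(self, quantity: str, source: str, as_of: date) -> Optional[Observation]:
        """Most recent matching observation on or before `as_of`, if any."""
        candidates = [
            o for o in self._observations
            if o.quantity == quantity and o.source == source and o.measured_on <= as_of
        ]
        return max(candidates, key=lambda o: o.measured_on, default=None)


# Made-up figures, purely for illustration.
data = TimeSeriesStore()
data.record(Observation(date(1999, 1, 1), "used", "region-a", 100.0))
data.record(Observation(date(2000, 1, 1), "used", "region-a", 120.0))

mid_1999 = data.latest("used", "region-a", as_of=date(1999, 6, 30))
early_2000 = data.latest("used", "region-a", as_of=date(2000, 1, 18))
```

Because older observations are never overwritten, comparing `mid_1999` with `early_2000` gives the kind of change over time that the problem definition is supposed to track.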

Tactical Possibilities

If we drove less, telecommuted more, lived closer to work, used public transportation more, built better solar panels -- each of these is a tactic that may serve to ameliorate the problem. A tactic, almost by definition, tends to be a one-dimensional "solution" to a problem. The knowledge repository must be able to track such approaches, providing whatever data is available on the potential gain.

Strategic Alternatives

A strategy combines multiple tactical approaches in a unified way. Or at least it pretends to be unified. A strategy might also consist of a single intervention which indirectly motivates the tactical attacks. For example, "doubling gasoline taxes" is a possible strategy which increases telecommuting and motivates people to live closer to work, reduce driving, and take public transportation.

Strategic obstacles and methods for circumventing them must also be identified. For example, a political system in which policy-making is dominated by profit-seeking corporations may have a serious problem in summoning the political will to implement the only strategy which stands to make a difference. If so, that obstacle must be identified as the fundamental, irremediable difficulty that it is, so that discussion can focus on how to modify the political system so that change *becomes* possible. (Or possibly an alternative can be found? We should be so lucky.)

Proposed Models (model construction kit)

Strategic visions are extremely difficult to quantify. Most political discussions therefore revolve around spurious logic and dysfunctional rhetoric. A true "knowledge repository" needs something better, however.

One possibility is the capability of supporting strategic alternatives with one or more "proposed models" that show how each alternative would work.

The system should make model-building easier by providing something in the nature of a "Model Construction Kit". In such a system, you would drag and drop sources, sinks, processes in a 3-dimensional framework that you could rotate and view. You would construct flow lines and feedback loops by dragging lines between the objects.
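
Leaving the drag-and-drop interface aside, the underlying model could be as simple as named levels connected by flows. The Python sketch below is one assumption about what such a kit might build behind the scenes. The `Flow` and `Model` names, the "reserve" and "consumed" nodes, and the 25% rate are all hypothetical. The rate function reads the current levels, which is where a feedback loop comes from.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Flow:
    source: str
    sink: str
    # Amount moved per step, computed from the current levels (enables feedback).
    rate: Callable[[Dict[str, float]], float]


class Model:
    def __init__(self, levels: Dict[str, float]) -> None:
        self.levels = dict(levels)
        self.flows: List[Flow] = []

    def connect(self, source: str, sink: str,
                rate: Callable[[Dict[str, float]], float]) -> None:
        """The programmatic equivalent of dragging a line between two objects."""
        self.flows.append(Flow(source, sink, rate))

    def step(self) -> None:
        # All rates are computed from the same snapshot, then applied in order.
        snapshot = dict(self.levels)
        for flow in self.flows:
            amount = max(0.0, min(flow.rate(snapshot), self.levels[flow.source]))
            self.levels[flow.source] -= amount
            self.levels[flow.sink] += amount

    def run(self, steps: int) -> List[Dict[str, float]]:
        history = [dict(self.levels)]
        for _ in range(steps):
            self.step()
            history.append(dict(self.levels))
        return history


# Made-up model: consumption is proportional to what remains in the reserve,
# so the flow slows down as the reserve shrinks (a simple feedback loop).
model = Model({"reserve": 100.0, "consumed": 0.0})
model.connect("reserve", "consumed", lambda levels: 0.25 * levels["reserve"])
history = model.run(2)
```

A conceptual model could start with rates that are rough guesses and replace them with measured relationships later, staying runnable at each stage.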

A system similar to the one used by the Education Objects Economy might be useful here. A Java application might be used to run the models, or perhaps the model would itself be a Java applet that could be run by another investigator.

NOTE:

As in the EOE system, "attribution" is liable to be the most profound motivator for contributions to the knowledge repository. That makes automatic, accurate attribution a vital part of the system.

Some models would be conceptual. Others would be quantitative. It should also be possible for a model to be totally conceptual and migrate to a quantitative version in stages, being "operational" in some respect every step of the way.

With such a system, arguments would be more oriented towards the validity of the model, or of the data it is based on. Competing models could be presented, and decisions made on something more than fundamental instinct about the "rightness" of a given strategy.

Feedback

The best argument for or against a given model will be based on real-world feedback. If the actual result of an intervention is as predicted by the model, then the model will have at least some degree of validity. Although different models may predict different long-term effects, at least those which are incorrect in the long term can be negated.
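
The feedback step can be sketched as a comparison between each model's prediction and what was actually observed. In the Python below, `CandidateModel`, `apply_feedback`, the tolerance, and the two toy models are all hypothetical. How close a prediction must be to count as "as predicted" is a judgment the sketch simply takes as a parameter.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CandidateModel:
    name: str
    predict: Callable[[float], float]  # intervention size -> predicted effect
    negated: bool = False
    notes: List[str] = field(default_factory=list)


def apply_feedback(models: List[CandidateModel], intervention: float,
                   observed: float, tolerance: float) -> List[CandidateModel]:
    """Negate models whose prediction misses the observation; return the rest."""
    for model in models:
        predicted = model.predict(intervention)
        if abs(predicted - observed) > tolerance:
            model.negated = True
            model.notes.append(
                f"intervention {intervention}: predicted {predicted}, observed {observed}"
            )
    return [m for m in models if not m.negated]


# Two made-up competing models and one made-up observation.
models = [
    CandidateModel("model-a", lambda x: 2.0 * x),
    CandidateModel("model-b", lambda x: 5.0 * x),
]
survivors = apply_feedback(models, intervention=10.0, observed=21.0, tolerance=3.0)
```

The negated model is kept, along with the observation that counted against it, so the negation itself can be questioned later.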

Design Decisions

As in any project, the construction of a model entails a number of design decisions. The ability to track those decisions would also provide a lot of benefit. Most models, for example, need to be simplified in order to be understandable at all. Ideally, the simplifications that were chosen would be recorded in a design journal, along with the reasoning. When the model is questioned at a later date, the original assumptions can be reexamined. If they still seem to be valid, the model can justifiably remain unchanged. If it turns out that an assumption was in error, however, the need for change is clear.

[Note: I've had both kinds of experiences. In some cases the design journal justifies the decision. In others, it points out the long-forgotten reasons, making it possible to easily consent to changes that would otherwise be regarded with deep suspicion.]
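
A design journal of this kind needs very little machinery. The Python sketch below assumes a hypothetical `JournalEntry` record and a `needs_revisit` check. The sample decisions and assumptions are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class JournalEntry:
    decided_on: date
    decision: str
    reasoning: str
    assumptions: List[str]


class DesignJournal:
    def __init__(self) -> None:
        self.entries: List[JournalEntry] = []

    def record(self, entry: JournalEntry) -> None:
        self.entries.append(entry)

    def needs_revisit(self, still_valid: Dict[str, bool]) -> List[JournalEntry]:
        """Entries resting on at least one assumption now judged invalid.

        Assumptions missing from `still_valid` are treated as still holding.
        """
        return [
            entry for entry in self.entries
            if any(not still_valid.get(a, True) for a in entry.assumptions)
        ]


# Invented entries, purely for illustration.
journal = DesignJournal()
journal.record(JournalEntry(
    date(2000, 1, 18),
    decision="Treat all regions as one aggregate consumer",
    reasoning="Keeps the first model small enough to understand",
    assumptions=["regional differences are minor"],
))
journal.record(JournalEntry(
    date(2000, 1, 18),
    decision="Ignore seasonal variation",
    reasoning="Yearly totals are enough for a first pass",
    assumptions=["yearly data is available"],
))

to_revisit = journal.needs_revisit({"regional differences are minor": False})
```

When a model is questioned, the entries returned by `needs_revisit` are the ones where change is justified. The rest carry their original reasoning as a defense.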


Summary

A "Dynamic Knowledge Repository" of some kind is critical to augment mankind's ability to solve urgent, complex problems. It is the one area where technology can be brought to bear on that issue, itself an urgent, complex problem, as Doug has made us vividly aware. An ideal system will be reducible, abstractable, and negatable. It will also track and attribute contributions in the areas of problem definition, data tracking, tactical possibilities, strategic alternatives, model building, feedback, and design decisions.

Sincerely,

Eric

Eric Armstrong