Eric Armstrong
eric.armstrong@eng.sun.com



Date: Tue, 04 Apr 2000 18:33:25 -0700

From:   Eric Armstrong
eric.armstrong@eng.sun.com
Reply-To: unrev-II@egroups.com

To:     unrev2 unrev-II@onelist.com
Subject:   Knowledge Representation

Jack Park wrote:

So, why not let's enumerate the problems of knowledge representation we want to solve, then start from there.

Thinking more about this.

The goal at this stage of computational development is to *augment* human reasoning, not replace it. So, while I applaud efforts aimed at teaching machines to understand "tree" and "apple", there are simply too many nouns in the world to make that approach useful any time soon. Even if we *do* reach the point where machines can understand everything, I'm not sure I care. If the machine doesn't make *me* smarter, then I am fundamentally dependent on it in ways I don't like.

So fundamentally, I want the machinery to act as a tool, as an enabler that helps make me smarter -- one that relieves me of much of the repetitive labor. Along those lines, it doesn't much matter to me if the machines understand the nouns, verbs, adjectives, and adverbs themselves. The most I will ever want to ask of it is that it understands when a word is a noun, and when it is an adjective. At that point I may be able to get it to help identify redundancies and contradictions.

Even before we get to grammatical reasoning, I think there is a layer of machine-assisted reasoning that we can implement in the near term.

When I think about what I do in the design process, it really looks very much like a logical system. There are alternatives (or), aggregations (and), implications (therefore), and negations (not). There may also be syllogistic inferences (a->b & b->c & a => c), as well as contradictions (a & ~a).
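
As a rough illustration of how little machinery those checks need, here is a minimal Python sketch. It assumes the statements are reduced to bare true/false symbols, and the helper names (implies, always_true, never_true) are made up for this example. It tests a formula against every combination of truth values, which is only practical for a handful of symbols.

```python
from itertools import product

def implies(p, q):
    """Material implication: p -> q is false only when p is true and q is false."""
    return (not p) or q

def always_true(formula, n_vars):
    """True if the formula holds for every assignment of truth values."""
    return all(formula(*values) for values in product([False, True], repeat=n_vars))

def never_true(formula, n_vars):
    """True if no assignment of truth values satisfies the formula."""
    return not any(formula(*values) for values in product([False, True], repeat=n_vars))

# Syllogistic inference: (a->b & b->c & a) => c
syllogism = lambda a, b, c: implies(implies(a, b) and implies(b, c) and a, c)

# Contradiction: a & ~a
contradiction = lambda a: a and not a

syllogism_holds = always_true(syllogism, 3)
contradiction_found = never_true(contradiction, 1)
print(syllogism_holds, contradiction_found)
```

The machine never needs to know what a, b, and c stand for; it only manipulates the symbols.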

There is also a lot of reasoning by analogy, as one of the speakers in Saturday's seminar mentioned. I'm not sure if the system can help us with that, but it would be nice if it could. To reason by analogy now, for a moment, I recall that most people hated outlines in school. The reason? Writing them on paper made them difficult to change. That meant you had to get it all organized in your head first, and most people don't do that.

On the other hand, when people try outlines on a computer, they quickly come to realize that they can easily rearrange things to *build* the organization as they go along. The difference is like night and day. The outline becomes a tool that helps them get organized, instead of an additional impediment.

I am thinking that a tool that assists us with our own reasoning can have a similar kind of benefit. Although we are not much used to dealing with assertions, negations, and implications in our usual discourse, perhaps a tool that really helped us in that area would change our way of reasoning.

In the design process, for example, we start with a problem we are trying to solve -- or possibly several. We go from there to a collection (and) of features -- and we often revise our problem statement in the process. Each feature suggests a set (or) of alternative implementations. Individual implementations and combinations of them imply additional implementation details -- which may in turn suggest new features that are easily derived from the implementation, which may once again cause a reformulation or refinement of the problem statement.

The multiple feedback loops and the need to track implications from one document to the next are tasks we leave up to individual designers. Their numbers are limited, due to the difficulty of integrating large volumes of information and maintaining all the "mental links" necessary to do the job properly. But what if there were a tool that assisted us in that process? Might it be possible for an average programmer-jock to perform like a super-designer?

Now, for the design process, the "documents" consist of problem statements, functional requirements, functional specs, data structures, design specs, help systems, user guides, and other documentation. When engaged in the reasoning process, though, the mind does not stay rigidly fixed within a single document space -- it happily jumps to the nearest logical node, regardless of the space it is in. So, while thinking about part of the functional spec, the mind may leap to related ideas in the design space, the data structure space, or the requirements space.

That observation imposes several requirements on the system:

  1. It must be possible to capture associated ideas easily and naturally, without having to "change context" to another document to do so.

  2. It must be possible to tag the ideas as "design", "functional spec", or whatever -- and it must be possible to do so after the fact, rather than requiring the author to accurately predict the correct category in advance.

  3. It must be possible to collect all ideas (nodes) of a single type to create a "document". Collecting all of the data structure notes, for example, produces at least the initial version of the data structure document.

  4. Since design decisions will typically allow for multiple possibilities, it must be possible to limit the collection of nodes to those that correspond to some other document. For example, it must be possible to collect the data structure notes corresponding to version 12 of Jim's proposed functional spec, which selects some set of the proposed features for implementation.
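
The four requirements above can be sketched as a very small data model. This is only one possible shape, written in Python under stated assumptions: the Node class, its field names, and the identifiers "jim-spec-v11" and "jim-spec-v12" are all hypothetical, invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One captured idea. Tags and links start empty and can be added later."""
    text: str
    tags: set = field(default_factory=set)        # e.g. {"design"}
    relates_to: set = field(default_factory=set)  # ids of other documents or versions

def tag(node, label):
    """Requirement 2: categorize a node after the fact."""
    node.tags.add(label)

def collect(nodes, label, relates_to=None):
    """Requirements 3 and 4: gather nodes of one type, optionally
    limited to those linked to a given document version."""
    return [
        n for n in nodes
        if label in n.tags and (relates_to is None or relates_to in n.relates_to)
    ]

# Requirement 1: ideas are captured in one place, whatever "document" they belong to.
notes = [
    Node("Use a tree of nodes", relates_to={"jim-spec-v12"}),
    Node("Users can drag items to reorder them"),
    Node("Store links in a hash table", relates_to={"jim-spec-v11"}),
]
tag(notes[0], "data structure")
tag(notes[1], "functional spec")
tag(notes[2], "data structure")

data_structure_doc = collect(notes, "data structure")
for_v12 = collect(notes, "data structure", relates_to="jim-spec-v12")
print([n.text for n in for_v12])
```

A "document" here is just the result of a query, so the same node can appear in several documents without being copied.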

In summary, I'm seeing mental processes that can be abstracted, and tools that can be constructed to improve them, without any sense of "knowledge processing" on the part of the machine. If the machine is simply a tool for manipulating symbols, and the humans are responsible for interpreting the meaning, that is fine with me. (In fact, I find that preferable.)

Part of the mental process consists of asking questions, adducing alternatives, evaluating the alternatives, and choosing an answer. Those are the kinds of functions that IBIS provides.

But I'm also seeing a need for making strategic proposals that combine a number of alternatives. For example, the question "how do we solve the energy problem" has multiple "alternatives" like "raise prices, use public transportation, improve insulation, make everybody walk, build smaller cars, and catch the wind". Clearly, no one alternative is sufficient. A policy proposal will select several of them, and show how they work together to address the problem at many levels.

I'm also seeing the need to categorize information, as mentioned above. But most importantly, we need to improve our conflict, collision, and contradiction detection. For example, I believe we are *still* subsidizing the tobacco industry, while at the same time suing them and spending millions of dollars on non-smoking campaigns. Is that nuts, or what?

Similarly, we may set up functional requirements for a system such that:

  a. It is small.

  b. It is fast.

  c. It does everything.

But a+b=>~c, and a+c=>~b, while b+c=>~a. So this design is nuts! We need systems that will help us figure these things out sooner in the design process. [Note: Here I've indulged my personal taste for "+" as "and". Unfortunately, that means "^" or "," has to mean "or", and there isn't a lot of widespread agreement about that. I know that "*" is usually "and" and "+" is usually "or", but I detest that convention. Sorry.]
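
The small/fast/everything conflict can be found mechanically. The following Python sketch is only an illustration, not a proposed design: it assumes each requirement is a plain true/false symbol and simply tries all eight combinations against the three implications.

```python
from itertools import product

# The three implications, with a = small, b = fast, c = does everything.
rules = [
    lambda a, b, c: not (a and b) or not c,  # a+b => ~c
    lambda a, b, c: not (a and c) or not b,  # a+c => ~b
    lambda a, b, c: not (b and c) or not a,  # b+c => ~a
]

def consistent_choices(rules):
    """Return every (a, b, c) assignment that violates none of the rules."""
    return [
        values for values in product([False, True], repeat=3)
        if all(rule(*values) for rule in rules)
    ]

choices = consistent_choices(rules)
all_three_possible = (True, True, True) in choices
print(len(choices), all_three_possible)
```

Each of the three rules excludes the same single case, the one where a, b, and c all hold, so seven of the eight combinations survive and the design that demands all three does not.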

At some point, too, the system probably has to allow for quantitative thinking, as well as the qualitative thinking outlined so far. So it should be possible to say "smaller than x", "faster than y", "with features a, b, c, d, and e." "Contradictions" then come in shades of gray, with options of relaxing requirements or eliminating features.
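
One way to picture those shades of gray is a brute-force search over feature subsets. The Python sketch below is an assumption-laden toy: the feature costs and the two limits are invented numbers, and it assumes sizes and latencies simply add up.

```python
from itertools import combinations

# Hypothetical costs per feature: (size in KB, added latency in ms).
features = {"a": (30, 2), "b": (50, 5), "c": (40, 4)}

def feasible_sets(features, max_size, max_latency):
    """List feature subsets that are 'smaller than' max_size and
    'faster than' max_latency, largest subsets first."""
    names = sorted(features)
    result = []
    for count in range(len(names), 0, -1):
        for combo in combinations(names, count):
            size = sum(features[n][0] for n in combo)
            latency = sum(features[n][1] for n in combo)
            if size < max_size and latency < max_latency:
                result.append(combo)
    return result

options = feasible_sets(features, max_size=100, max_latency=10)
print(options)
```

With these made-up numbers the full set of features does not fit, but every pair does, so the "contradiction" becomes a choice: drop one feature or relax a limit.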

Similarly, policy proposals often revolve around quantitative issues. One proposal may be "n billion for anti-smoking ads, m billion for cancer research, x billion for hospitalization, and a nickel ninety-eight for tobacco subsidies". Another proposal might be: "Improve the omega-3 fatty acids in our national diet, since rats who get them in high doses can't be given cancer by any means we've been able to find, regardless of the amounts of carcinogens we administer". (Sorry. Wrong soapbox.)

The final comment I'll make is that reduction is an important component of the system. It has to be. When the same question comes into an email list for the 400th time, it must be joined at the hip to other variants of the question, all of which are answered by a set of responses, in the order in which they were found to be most helpful by readers.
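
A minimal Python sketch of that kind of reduction follows. Everything in it is hypothetical: the Faq class, its method names, and above all the normalize function, which uses a deliberately crude rule (same set of words means same question) that a real system would need to improve on.

```python
from collections import defaultdict

def normalize(question):
    """Crude canonical form: lowercase, drop punctuation, sort the distinct words."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in question.lower())
    return " ".join(sorted(set(cleaned.split())))

class Faq:
    def __init__(self):
        self.variants = defaultdict(list)                   # key -> wordings seen
        self.votes = defaultdict(lambda: defaultdict(int))  # key -> answer -> votes

    def ask(self, question):
        """Join a new wording to the variants it matches; return the shared key."""
        key = normalize(question)
        self.variants[key].append(question)
        return key

    def mark_helpful(self, key, answer):
        """Record that a reader found this answer helpful."""
        self.votes[key][answer] += 1

    def answers(self, key):
        """Answers for a question, most helpful first."""
        ranked = sorted(self.votes[key].items(), key=lambda item: -item[1])
        return [answer for answer, _ in ranked]

faq = Faq()
first = faq.ask("How do I unsubscribe?")
second = faq.ask("how do i UNSUBSCRIBE")
faq.mark_helpful(first, "Send a blank message to the list's unsubscribe address.")
faq.mark_helpful(first, "Ask the list owner.")
faq.mark_helpful(second, "Send a blank message to the list's unsubscribe address.")
print(faq.answers(first))
```

Because both wordings reduce to the same key, votes cast under either one rank the same shared set of answers.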

'nuff said, for now.

Sincerely,



Eric Armstrong
eric.armstrong@eng.sun.com