Kurtz-Fernhout Software


Memorandum

Date: Wed, 05 Apr 2000 07:40:58 -0400

From:   Paul Fernhout <pdfernhout@kurtz-fernhout.com>
Reply-To: unrev-II@egroups.com

To:     unrev-II@egroups.com

Subject:   Knowledge Representation
XML limits


Eric Armstrong wrote:

Ignoring XML and anything remotely like it. What would you use?

The short answer is to at least start with William Kent's ROSE/STAR model defined in his book "Data & Reality" implemented in whatever programming language...

http://www.bkent.net/

http://home.earthlink.net/~billkent/catalogmain.htm#TOPICS

...IN SEMANTIC MODELLING

For something even easier, just start with some tuple-space system...

http://www.cs.yale.edu/Linda/linda.html

...or like IBM's T-spaces....

http://www.almaden.ibm.com/cs/TSpaces/

Gelernter of Linda is the same guy who is involved with LifeStreams.
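
To make the tuple-space idea concrete, here is a minimal single-process sketch in Python. It is not the Linda or T Spaces API: the class name, the method names (borrowed loosely from Linda's out/rd/in operations), and the use of None as a wildcard are all assumptions for illustration, and unlike a real tuple space nothing here blocks or is shared between processes.

```python
class TupleSpace:
    """A toy tuple space: a shared bag of tuples matched by pattern."""

    def __init__(self):
        self._tuples = []

    def out(self, *fields):
        """Add a tuple to the space."""
        self._tuples.append(tuple(fields))

    @staticmethod
    def _matches(pattern, candidate):
        # None in a pattern is a wildcard; every other field must be equal.
        return len(pattern) == len(candidate) and all(
            p is None or p == c for p, c in zip(pattern, candidate)
        )

    def rd(self, *pattern):
        """Return the first matching tuple without removing it, or None."""
        for candidate in self._tuples:
            if self._matches(pattern, candidate):
                return candidate
        return None

    def in_(self, *pattern):
        """Return and remove the first matching tuple, or None."""
        found = self.rd(*pattern)
        if found is not None:
            self._tuples.remove(found)
        return found


space = TupleSpace()
space.out("topic", "XML", "transmission format")
space.out("topic", "Linda", "tuple space")

# Ask for any "topic" tuple about Linda, whatever its third field is.
print(space.rd("topic", "Linda", None))
```

The appeal for a repository is that producers and consumers of knowledge only need to agree on tuple shapes, not on each other's interfaces.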

The slightly longer answer is that any sort of knowledge representation system is a sort of AI, even if it is designed primarily to augment rather than to act independently.

One needs to do something like create a system with multiple levels of knowledge about knowledge, where a representation at any level can be used to observe, browse, or modify the representation at a lower level (introspection, self reflection?).

For a shallow example, one might have a set of knowledge structures in a DKR which define a GUI for interacting with the repository, but you might then use that GUI to rework the underlying knowledge structures to define a new GUI.

A deeper example entails knowledge structures that define search and summary techniques which produce new knowledge. However, at the same time the lower levels need to generate the behavior of the higher levels, but in a compartmentalized way so that a compartment at the upper level can operate on a different compartment of the lower level than the one that sustains it.

On a practical basis, an implementation of something like this was VisualWorks Smalltalk's (unfortunately killed) "firewall" concept, where a set of development tools could be used to develop new development tools, those tools being behind a "firewall" in a logical (not network) sense.

My longer point is that the knowledge management / representation problem is a deep one. XML doesn't address it in a serious way, and the hype confuses the subject by making it sound as if XML does. However, sorry, as an alternative (not to XML, but to lots of hard work mucking in these K.R. fields) I have no easy answer.

There are person-millennia of AI and cognitive science work at my feet, and I can't really point to much of it as being immediately useful. Much of it is mainly useful for understanding the difficulty of the problem. I'd think Hofstadter's work on Fluid Concepts and Creative Analogies is a great source of ideas and a good example of the better stuff.

http://www.psych.indiana.edu/cogsci/hofstadter.html

We have to accept that much knowledge representation work (creating formats and content) will need to be done on whatever platform we choose. Further, to be practically useful, we need to pick some problems that can be addressed relatively easily, but ideally in such a way that the approach can later be revised and expanded.

I think the decision for tools to build with needs to be based more on:

Squeak, Python, and Common Lisp (less so) are interesting choices. I'm starting to think Squeak might be the best choice for prototyping (for me) given that it is completely cross-platform and open. Its cross-platform GUI does the best job of addressing the DKR design requirement of shareable screens.

... The point is to create a DKR/OHS flexible enough to deal with this issue of representations changing over time as users' needs change.

How?

This is a question AI workers and Knowledge Management workers spend years studying, and more years working on, only to produce solutions that are incomplete. There is no easy answer yet (if ever).

However, the mainstream AI community is starting to improve in this area. For example, in a talk last year Marvin Minsky went on at length about the need for multiple representational strategies for problem solving. He argued the human mind may perceive problems using five or six strategies (e.g. geometrical reasoning, formal logic, heuristic rules of thumb, pattern recognition, semantic networks, others) and continuously picks the best one at the time to progress in thinking.

Maybe what we need is an overview of the AI and knowledge management fields and how each area or major problem/topic would affect a DKR/OHS.

Also, what will evolve over time for an OHS/DKR project is a set of useful code that can manipulate data structures that are related to knowledge representation. We might also wish to have a survey of such existing code.

And to take things further, why invent XML when one could instead just use LISP to the same effect (and with fewer bytes)? Lisp is used all the time to define representations such as:

(user (ID 100001) (name "Grampa Muenster") (address "13 Mockingbird Lane"))

If we're only talking about data representations, this could easily be done in XML. One is not required to use DTD-validation when parsing an XML structure. So one can easily add other name/value pairs, without being constrained by a DTD. The only time a DTD comes into play is when you *want* to enforce restrictions. And there are times when you want that.

True. However, as time goes on, any restrictions will become obsolete. One needs a representational system that can adapt to user needs. And while XML could be a part of that solution, the important issues go beyond that -- to standards creation and revision and communication, and to coin a phrase "data upgrading".
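
One way to picture "data upgrading" is as a chain of small steps, each moving a record from one version of a representation to the next. The sketch below is only an illustration in Python: the version numbers, the field changes, and the function names are all made up for the example, not taken from any existing system.

```python
# Hypothetical upgrade steps: each maps a record at version N to version N + 1.
def v1_to_v2(record):
    # Version 2 (assumed) splits the single "name" field in two.
    given, _, family = record["name"].partition(" ")
    upgraded = {k: v for k, v in record.items() if k != "name"}
    upgraded.update(given_name=given, family_name=family, version=2)
    return upgraded


def v2_to_v3(record):
    # Version 3 (assumed) adds an "email" field with an empty default.
    upgraded = dict(record)
    upgraded.setdefault("email", "")
    upgraded["version"] = 3
    return upgraded


UPGRADES = {1: v1_to_v2, 2: v2_to_v3}
LATEST_VERSION = 3


def upgrade(record):
    """Apply upgrade steps in order until the record is at the latest version."""
    while record.get("version", 1) < LATEST_VERSION:
        record = UPGRADES[record.get("version", 1)](record)
    return record


old = {"version": 1, "ID": 100001, "name": "Grampa Muenster",
       "address": "13 Mockingbird Lane"}
new = upgrade(old)
print(new)
```

The hard part, of course, is not the mechanism but agreeing on what each revision of the standard means and communicating it, which is where the real work lies.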

Lisp can easily parse this, since it is a valid LISP list, and we can define LISP programs with related data that validate such representations. That is what AI programmers have been doing for decades.
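
To show how little machinery this takes even outside Lisp, here is a rough Python sketch that reads the list above and validates it in code. The function names and the choice of required fields are assumptions for illustration. The reader is deliberately minimal: it does not handle escapes inside strings, it raises IndexError on unbalanced input, and it does not distinguish symbols from quoted strings once parsed.

```python
import re

# Tokens: parentheses, double-quoted strings, or runs of other characters.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')


def parse(text):
    """Parse one s-expression into nested Python lists, strings, and ints."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing parenthesis
            return items
        if tok.startswith('"'):
            return tok[1:-1]
        return int(tok) if tok.isdigit() else tok

    return read()


def validate_user(expr, required=("ID", "name")):
    """Check that expr is (user (field value) ...) with the required fields."""
    if not isinstance(expr, list) or not expr or expr[0] != "user":
        return False
    pairs = expr[1:]
    if not all(isinstance(p, list) and len(p) == 2 and isinstance(p[0], str)
               for p in pairs):
        return False
    names = {p[0] for p in pairs}
    return all(r in names for r in required)


record = parse('(user (ID 100001) (name "Grampa Muenster") '
               '(address "13 Mockingbird Lane"))')
print(record, validate_user(record))
```

Because the validation rules are ordinary data (here, a tuple of required field names), they can be revised as easily as the records themselves.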

Certainly. Validating the structure is always possible in the program, with Lisp, XML, or any other data representation. DTD-validation is only a convenience that permits taking that burden off the programmer, when it is desirable. There is also an intermediate ground. A DTD can be written so that any number of pairs of name and value elements can occur, so what you see is <name>ID</name><value>100001</value><name>address</name>... etc. The DTD can then ensure that the name/value pair restriction is never violated, but the values can be anything we want. Again, validation is not required, so even this restriction does not have to be enforced.
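
As a sketch of that intermediate ground, the Python below reads such name/value pairs with the standard library's ElementTree parser, which does not validate against a DTD, and enforces the pairing rule in the program instead. The element names (user, name, value), the DTD fragment in the comment, and the function name are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# A DTD for this shape might look like the following (assumed, not enforced):
#   <!ELEMENT user (name, value)*>
#   <!ELEMENT name (#PCDATA)>
#   <!ELEMENT value (#PCDATA)>
DOC = """
<user>
  <name>ID</name><value>100001</value>
  <name>name</name><value>Grampa Muenster</value>
  <name>address</name><value>13 Mockingbird Lane</value>
</user>
"""


def read_pairs(xml_text):
    """Return a dict built from name/value pairs, checking the pairing in code."""
    children = list(ET.fromstring(xml_text))
    if len(children) % 2 != 0:
        raise ValueError("name/value elements must come in pairs")
    pairs = {}
    for name_el, value_el in zip(children[0::2], children[1::2]):
        if name_el.tag != "name" or value_el.tag != "value":
            raise ValueError("expected a <name> followed by a <value>")
        pairs[name_el.text] = value_el.text
    return pairs


user = read_pairs(DOC)
print(user)
```

Note that every value comes back as text; deciding that 100001 is a number is again left to the program.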

But DTD validation only gets you so far. And I believe with XML DTDs there are some limits to the nature of the structures you can define (which SGML does not have).

Any library (for Lisp, Smalltalk, etc.) can define a DTD-like syntax (or something better) to easily define limits on valid representations.

The deeper issue is that rather than focus on ways to limit representations (DTDs) we need to focus on ways to transform, extend, and simplify representations as needed (sort of along the multi-level approach I mentioned earlier).

However, in my opinion, the investment in learning and using these other systems (Smalltalk, Lisp) is worth it.

No argument there. Maybe some of the flexibility you desire comes from the ease of manipulation in those languages, rather than from the data structures themselves?

Yes. That is a big part of the issue. Any DKR/OHS will need to be more than a bunch of passive data in a database. It will need many programs to do things to that data to make new data (search, format, summarize, repackage, interpret, transfer, upgrade, etc.).

A more important issue than data transmission format (the one XML tries to address) is to build a robust platform for doing those algorithmic things. As a first cut one picks a language and does something. As a deeper approach, one tries to represent the knowledge and algorithms in a way abstract enough to be ideally programming-language neutral, or at least programming-language retargetable (generating whatever code in whatever language as needed).
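
As a toy illustration of "retargetable", the sketch below keeps a small computation as a neutral nested structure and generates both Python and Lisp source text from it. The tuple encoding, the two operator names, and the function names are assumptions made up for this example; a real system would need far richer structures than arithmetic.

```python
# A neutral representation: (operator, left, right), with names or numbers
# at the leaves. Only two hypothetical operators are supported here.
SYMBOLS = {"add": "+", "mul": "*"}


def to_python(expr):
    """Render the expression as Python source text (infix)."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return f"({to_python(left)} {SYMBOLS[op]} {to_python(right)})"
    return str(expr)


def to_lisp(expr):
    """Render the same expression as Lisp source text (prefix)."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return f"({SYMBOLS[op]} {to_lisp(left)} {to_lisp(right)})"
    return str(expr)


rule = ("add", ("mul", "x", 2), 1)
print(to_python(rule))
print(to_lisp(rule))
```

The same structure could just as well be browsed, summarized, or upgraded by other programs, which is the point of not tying the knowledge to any one language.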

Sincerely,



Paul Fernhout
Kurtz-Fernhout Software


Developers of custom software and educational simulations
Creators of the Garden with Insight(TM) garden simulator
http://www.kurtz-fernhout.com