Eric Armstrong
eric.armstrong@eng.sun.com



Memorandum

Date: Thu, 04 May 2000 21:39:42 -0700

From:   Eric Armstrong
eric.armstrong@eng.sun.com
Reply-To: unrev-II@egroups.com

To:     unrev2 unrev-II@egroups.com

Subject:   DKR Project Meeting at SRI: 4 May 2000


Major News

Lee Iverson put forward the results of some hard work and deep thinking he's been doing. If we are going to base this thing on XML, he said, then we get a lot of useful mechanisms for manipulating Document Object Model (DOM) trees (basically, the tree-structured form of a document).

From there, he argued, it seems that what we really need is something along the lines of a *Distributed* DOM, or DDOM. Casting the problem in those terms immediately implies the need for node versioning, attribution tracking, and most of the other requirements previously identified for the system (while at the same time providing the mechanisms that can be used to fulfill those requirements).
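To make "node versioning" and "attribution tracking" concrete, here is a minimal sketch of what a single DDOM node might carry. Nothing in it comes from Lee's proposal: the class names, the fields, and the append-only history are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class NodeVersion:
    """One immutable revision of a node's content, with its author (hypothetical)."""
    number: int
    author: str
    text: str


@dataclass
class Node:
    """A tree node that keeps every revision instead of overwriting (hypothetical)."""
    node_id: str
    versions: List[NodeVersion] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)

    def edit(self, author: str, text: str) -> NodeVersion:
        # Append a new revision; earlier revisions stay available.
        version = NodeVersion(len(self.versions) + 1, author, text)
        self.versions.append(version)
        return version

    def current(self) -> Optional[NodeVersion]:
        # The latest revision, or None if the node has never been edited.
        return self.versions[-1] if self.versions else None


para = Node("p1")
para.edit("lee", "We need a DOM.")
para.edit("eric", "We need a Distributed DOM.")
```

Each edit records who made it, and the earlier text remains reachable through `para.versions`.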

[[At about this point, you could hear the sound of jaws dropping. Casting the problem as "We need a DDOM" was such a brilliantly concise focus on the fundamental issues that it almost appears obvious in retrospect. But I suspect that the solution to that statement of the problem will be applicable in a wide variety of problem domains.]]

During the discussion of the DDOM concept, Eugene pointed out that identifying the fundamental XML structure was the core issue. Given that, we can translate or transcode other data formats to it. [[The difference is one of persistence. If data starts out in form X, and I translate it to form Y, then forever after it is in form Y. But if I transcode it to form Y, then it stays in form X and is transcoded to form Y whenever I need it.]]
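The translate/transcode distinction can be sketched in a few lines. The converter and both store classes below are hypothetical names invented for this illustration, and the "normal form" is reduced to a single wrapping element.

```python
from typing import Dict
from xml.sax.saxutils import escape


def to_normal_form(source: str) -> str:
    """Hypothetical converter: wrap raw text in a <para> element."""
    return "<para>" + escape(source) + "</para>"


class TranslatedStore:
    """Translate: convert once on import and keep only the new form."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def add(self, key: str, source: str) -> None:
        # The original form is discarded after conversion.
        self._docs[key] = to_normal_form(source)

    def get(self, key: str) -> str:
        return self._docs[key]


class TranscodedStore:
    """Transcode: keep the original and convert every time it is read."""

    def __init__(self) -> None:
        self._sources: Dict[str, str] = {}

    def add(self, key: str, source: str) -> None:
        # The original form is what gets persisted.
        self._sources[key] = source

    def get(self, key: str) -> str:
        return to_normal_form(self._sources[key])

    def original(self, key: str) -> str:
        return self._sources[key]
```

Both stores return the same thing from `get`, but only the transcoding store can still hand back the data in its original form.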

At that point, I observed that Augment data could reasonably be transcoded into the "normal form" used in the repository. That would work because, unlike HTML pages, information segments in the Augment database have fixed IDs -- so adding or modifying data in the repository would not shuffle IDs around. (An innocuous edit to an HTML page, on the other hand, could turn previously created links into gibberish. The links would "succeed", but be pointing to the wrong paragraphs.)
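A small illustration of why fixed IDs matter. The paragraph texts and the IDs "a1" through "a4" are made up for the example; they are not Augment's actual identifier scheme.

```python
# Position-based addressing, as with a link to "the second paragraph" of a page.
paragraphs = ["Intro", "Design", "Schedule"]
link_by_position = 1  # meant to point at "Design"

# ID-based addressing: each segment keeps the ID it was created with.
segments = {"a1": "Intro", "a2": "Design", "a3": "Schedule"}
display_order = ["a1", "a2", "a3"]
link_by_id = "a2"  # meant to point at "Design"

# An innocuous edit: add a new paragraph at the top of the document.
paragraphs.insert(0, "Preface")
segments["a4"] = "Preface"
display_order.insert(0, "a4")

# The positional link still "succeeds" but now lands on the wrong paragraph.
positional_target = paragraphs[link_by_position]
# The ID link is unaffected by the change in ordering.
id_target = segments[link_by_id]

print(positional_target)  # Intro
print(id_target)          # Design
```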

[[On the way out the door, Lee observed that the whole DDOM could well be built on a Gnutella foundation in order to eliminate the browser/server distinction. A person's repository would then act as a server as well as a client application. That is exactly how the ideal system must operate, so that prospect was rather exciting, as well.]]

[[Of course, as Lee commented, we would want to "wrap" the Gnutella interface in a layer of function calls, so that we could replace Gnutella with a different system merely by rewriting the function library (rather than modifying the system). But that is straightforward system-design stuff.]]
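The wrapping idea might look roughly like the sketch below. All of the names (`NodeTransport`, `publish`, `fetch`, `Repository`) are hypothetical, and no real Gnutella calls appear: an in-memory class stands in for the network layer, and a Gnutella-backed class would implement the same two methods.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class NodeTransport(ABC):
    """The only interface the rest of the system is allowed to see."""

    @abstractmethod
    def publish(self, node_id: str, content: str) -> None:
        """Make a node available to other repositories."""

    @abstractmethod
    def fetch(self, node_id: str) -> Optional[str]:
        """Return a node's content, or None if it cannot be found."""


class InMemoryTransport(NodeTransport):
    """Stand-in for the network layer, used here so the sketch can run."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def publish(self, node_id: str, content: str) -> None:
        self._store[node_id] = content

    def fetch(self, node_id: str) -> Optional[str]:
        return self._store.get(node_id)


class Repository:
    """System code depends on NodeTransport, never on a concrete network."""

    def __init__(self, transport: NodeTransport) -> None:
        self._transport = transport

    def save(self, node_id: str, content: str) -> None:
        self._transport.publish(node_id, content)

    def load(self, node_id: str) -> Optional[str]:
        return self._transport.fetch(node_id)


repo = Repository(InMemoryTransport())
repo.save("n1", "<para>Hello</para>")
```

Swapping the underlying system then means writing another `NodeTransport` subclass and passing it to `Repository`; the repository code itself does not change.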


Other News

It turns out that Lee has been given the go-ahead to work on this project, so he has been able to devote some quality time to it. [[Yay!]]

He has now installed Zope as well as Slashdot. Unlike Slashdot, he found Zope a breeze to install. The system appears highly usable, as well. So it is a strong candidate for our next discussion tool.

Lee mentioned that the Zope folk are also at work on a "Portal Tool Kit" that would include newsgroups, mail lists, and the Wiki collaborative editing system (in which anyone can edit your web page). All of which makes it an interesting vehicle for use and/or hacking.

[[The only other viable candidate for that honor, it would appear, is the JavaCorporate suite of collaboration tools.

(It seems reasonable to rule out Slashdot, PHP/Slashdot, and Arsdigita based on language and installation issues.)]]

[[Note that where Zope uses Python and acts as its own server, JavaCorporate uses Java and uses the Apache server.

Which design is more modular and hackable remains to be seen.]]

[[In either case, it occurs to me now, we have good options for processing XML structures. The W3C DOM standard puts the whole DOM into memory at once -- fine for documents, but pretty horrible when your "DOM" is in fact a disk-resident database. Python has a (badly named) EasySAX model that lets you instantiate only the subtree you need in memory. (SAX is the Simple API for XML. It's another way of processing XML. But "SmallDOM" would have been a better name.) Meanwhile, Java has a JDOM model that lets you do the same thing.]]
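As an illustration of instantiating only the subtree you need, here is a short sketch using `xml.dom.pulldom` from Python's standard library. It stands in for the approach described above and is not the EasySAX or JDOM code itself. The sample document, the element name `node`, and the helper `load_subtree` are assumptions made for the example.

```python
from xml.dom import pulldom

DOCUMENT = """<repository>
  <node id="n1"><title>First</title></node>
  <node id="n2"><title>Second</title></node>
  <node id="n3"><title>Third</title></node>
</repository>"""


def load_subtree(xml_text, node_id):
    """Return the <node> element with the given id as a DOM subtree, or None."""
    events = pulldom.parseString(xml_text)
    for event, node in events:
        if (event == pulldom.START_ELEMENT
                and node.tagName == "node"
                and node.getAttribute("id") == node_id):
            # Build DOM children for this one element only; the rest of the
            # document is streamed past without being expanded into a tree.
            events.expandNode(node)
            return node
    return None


subtree = load_subtree(DOCUMENT, "n2")
title = subtree.getElementsByTagName("title")[0].firstChild.data
print(title)  # Second
```

The parser streams through the document as events, and only the matching element is expanded into a DOM tree, so memory use follows the size of the subtree requested and not the size of the whole repository.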

Sincerely,



Eric Armstrong
eric.armstrong@eng.sun.com