Eric Armstrong


Date: Mon, 05 Jun 2000 20:32:54 -0700

From:   Eric Armstrong

To:     unrev2

Subject:   Collaborative Documents Requirements, v0.8

Requirements for a collaborative-document system. (Approx. equivalent to Open HyperDocument System.)


This is a lengthy document aimed at adducing the requirements for a subset of an eventual Dynamic Knowledge Repository (DKR). The subset described is for a collaborative document system, which Doug describes as an "Open HyperDocument System" (OHS). The goal of this document is to show how such a system fits into a DKR framework, detail its requirements, and point to a couple of extensions that move it in the direction of a full DKR.

This document has the following sections:


0.8 Added "Quotable" section and expanded "Categorizable."
0.7 "Categorizable" section expanded, "Relational" requirement added under "DKR Requirements"
0.6 "Reusable" requirement added after "Hierarchical"
0.5 "Partitionable" requirement added under "general system rqmts"
0.4 Use Case Scenarios, April 23, 2000 (not shown as v0.4)
0.3 Formatting, two additions (no reference available)
0.2 Refinements, March 6, 2000
0.1 Initial Version, January 1, 2000

Long-Range Goals

A fully functional DKR will need to manage many different kinds of things:

It is likely, too, that different kinds of problems will require information to be organized in fundamentally different ways. For example, a DKR devoted to the energy problem might have major headings for the problem statement, real world data, tactical possibilities, strategic alternatives, and predictive models. On the other hand, a DKR devoted to building the next-generation DKR might have sections for requirements, design, implementation, testing, bug reports, suggestions, schedules, and future plans.

Since the general outline of a DKR seems to depend on the problem domain it is targeted for, it seems reasonable to focus attention on the elements they have in common.

This set of requirements will focus on what is perhaps the major common feature: Documents -- in particular, Collaborative Documents, and the need to interact via email to construct them.

Other important areas that will need attention include the integration of multimedia objects (including animations, simulations, audio, video, and the like) as well as the critical functions of abstract knowledge representation, inference engines, model-building functions, and the integration of other executable programs. But here, we'll focus on Collaborative Documents.


A wide variety of email and forum-based discussions occur on a host of topics every day. In each of these discussions, important information frequently surfaces, but that information is hard to capture where you need it.

Document production systems, on the other hand, simplify the task of creating complex documents but make it hard to gather and integrate feedback.

For example, the DKR discussions have identified several possible starting points for such a system. That kind of feedback occurs naturally in an email system, as opposed to a document production system, but each of the pointers was buried in a separate email. It required a lengthy search to gather them together (below), and the list may not even be complete!

To act as a foundation for a DKR, a Collaborative Document System (CDS?) needs to combine the best features of:

Starting Points

In the DKR discussion, we've seen pointers to several possible starting points for such a system. Those are contained in the References post, in the Bootstrap section. (The many possible starting points listed in the post desperately need short synopses and evaluations.)

General Characteristics

The lengthy list of starting points, the difficulty of creating it, and the rapidity with which it goes out of date, combine to suggest several obvious requirements for the system: It needs to be composed of information nodes that are hierarchical, mailable, linkable, and evaluable (more on those subjects in a moment).

Each of those requirements leads in turn to other requirements. The major requirements are listed here and explained below:

General Functional Requirements

General Systemic Requirements:

DKR Requirements

The next three sections discuss those requirements in greater detail. Following that, there are three shorter sections:

General Functional Requirements

These are the general requirements for how the system must operate, to be effective.


This document, like the list of starting points mentioned earlier, is heavily hierarchical in nature -- as are most technical documents. These facts further underscore the need for a hierarchical system.

For example, this email message should exist in outline form. It should be easy to add and remove entries to various sections: for example, the list of starting points given above.

However, the hierarchy should function using XML-style "entity references" that copy the target contents into the displayed document, "inline". That permits multiple references to the same node. The result is effectively a lattice of information nodes, where any one view of it is hierarchical.


To be strictly correct, the underlying data structure will be a directed graph. In reality, it will be bidirectional, and it will typically turn out to have cyclic loops. Although it would be nice to avoid that, it is probably unavoidable.

The "network" nature of the graph results from the property that allows a document-segment (node or tree) to be used in multiple places. In each "document" that makes such an access, however, the view is hierarchical. The hierarchy is a view of the graph, and a "document" is really a structured collection of nodes from the data base.

Unlike HTML, where references to other documents occur only with links, references to other nodes and trees in this system will typically occur as "includes". The effect of the inclusions will be to make the material appear inline, as though it were part of the original document.
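As a rough illustration of "a hierarchy is a view of the graph", here is a minimal Python sketch. The Node class, the render function, and the sample node ids are all hypothetical names invented for this example, not part of any existing design. Includes are expanded inline, one node is shared by two documents, and a guard stops expansion if the graph turns out to be cyclic.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical information node: some text plus ids of included nodes."""
    node_id: str
    text: str
    includes: list = field(default_factory=list)


def render(store, root_id, depth=0, path=()):
    """Expand includes inline, producing one hierarchical view of the graph."""
    if root_id in path:
        # The graph may contain cycles; stop instead of recursing forever.
        return ["  " * depth + f"[cycle: {root_id}]"]
    node = store[root_id]
    lines = ["  " * depth + node.text]
    for child_id in node.includes:
        lines.extend(render(store, child_id, depth + 1, path + (root_id,)))
    return lines


store = {
    "reqs": Node("reqs", "Requirements", ["shared"]),
    "design": Node("design", "Design", ["shared"]),
    # The same node is included by both documents above.
    "shared": Node("shared", "Nodes must be versioned"),
}
reqs_view = render(store, "reqs")
design_view = render(store, "design")
```

Each view is a plain indented outline, even though the "shared" node lives in two documents at once.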


Although "hard" links to objects will be needed at times, in most cases the link to the "Requirements Document" should be a "soft" link -- that is, an indirect link that points to the latest version. That means never having to worry about looking at an old version of the spec.


Each node in the hierarchy needs to be versioned, so that previous information is available. In addition, the task of displaying differences becomes essentially trivial.


It must be possible to "publish" the whole document or sections of it by "posting" it. It must also be possible to create replies for individual sections, and then "post" them all at one time.


At a minimum, every node in the system has two hierarchies descending from it. One is a list of content nodes that comprise the hierarchical document. The other is a list of reviewer comments. (Some comments will be specific to the information in that node, others will be intended as general comments for that section of the document.)

Other sub-element lists may be found to be desirable in the future, so the system should be "open-ended" in allowing other sublists to be added, identified, and accessed.


Rather than using a central "repository", the system should employ the major strengths of email systems, namely: fast access on local systems and the robust nature of the system as a result of having redundant copies on many different systems. The system will be more space intensive than email systems, but storage costs are dropping precipitously, and future technologies paint an even brighter picture.


To mitigate the short-term need for storage space, it should be possible to set individual storage policies. For example, a user will most likely not want to keep previous versions of any documents they are not personally involved in authoring.

It must also be possible to add names to the authoring list. Name removal should probably be limited to the original author. For those cases when the original author is no longer part of the system, it should be possible to make a copy of the document and name a new primary author.


When a new version of a document arrives, differences are highlighted. Old-version information becomes accessible through links (if saved). Differences are always against the last version that was visited. If a section of the document was never visited, the most recent version of that section is displayed on the first visit. If several iterations have taken place since the last visit, the cumulative differences are shown. (Again, node-versioning makes this user-friendly feature fairly trivial.)
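The difference rules above might be sketched along these lines in Python. VersionedNode and its methods are hypothetical names, and the standard difflib module stands in for whatever comparison the real system would use. The point is only that once every version is kept, "cumulative differences since the last visit" is a comparison of two stored versions.

```python
import difflib


class VersionedNode:
    """Hypothetical node that keeps every version of its text."""

    def __init__(self, text):
        self.versions = [text]
        self.last_visited = {}  # reader name -> index of the version last seen

    def update(self, text):
        self.versions.append(text)

    def visit(self, reader):
        """Return changes since the reader's last visit, then mark as visited."""
        latest = len(self.versions) - 1
        seen = self.last_visited.get(reader)
        self.last_visited[reader] = latest
        if seen is None:
            # Never visited: show the most recent version, with no diff marks.
            return [self.versions[latest]]
        old = self.versions[seen].splitlines()
        new = self.versions[latest].splitlines()
        # Cumulative diff: compare against the version last seen,
        # skipping over any intermediate versions.
        return [line for line in difflib.ndiff(old, new)
                if line.startswith(("+ ", "- "))]


node = VersionedNode("alpha\nbeta")
first = node.visit("reader")      # first visit: latest text only
node.update("alpha\ngamma")
node.update("alpha\ngamma\ndelta")
changes = node.visit("reader")    # two iterations, shown as one set of changes
```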

Starting Points

XMLTreeDiff at IBM Alphaworks (Lars Martin)


Clearly support for web links is desirable, as shown by the links to the various possible starting points in the References post. [Note: Each of those should be evaluated against this requirements list, and used to modify these requirements.]

Indirect links are needed, both to link to a list of related nodes, and to link to the latest version of a node.


It must be possible to categorize nodes (and possibly links). For IBIS-style discussions, for example, node types include (at a minimum) question, alternative, pro, con, endorsement, and decision.

For material that is included "in line" in the original document, typing implies the ability to choose which kinds of linked-information to include. For example, in addition to the current version, one might choose to display previous versions and/or all commentary.

For material that is displayed in separate windows, typing allows the secondary windows to automatically display material of a given type. (For example, in Rod Welch's "contract alignment" example, the secondary window might automatically display the meeting minutes that are linked to particular phrases in a contract. Lines might be automatically drawn from sections of the minutes to sections of the contract. Other links in the documents, however, would be ignored.)

The Traction system probably presents the most clearly thought-out and well-implemented approach to Categories. In that system, categories are implemented as lists. When a category is applied to a node, the node acquires a link to the list, and also becomes a member of it. The fact that nodes are members of category lists allows efficient searches. The fact that each node links to the categories it belongs to allows all of the node's categories to be displayed in a list (to the right of the paragraph, in Traction, in a light blue color).

In Traction, categories can also be hierarchical. The colon convention is used to separate categories, as in "logic:assert" or "logic:deny". Categories can also be changed in that system. In the demo that Chris Nuzum was kind enough to give me, he used the example of "ToDo" changing to "Feature:Scheduled" and "Bug:Open". When you invoke the change operation, all of the nodes currently marked "ToDo" are listed, and flagged as "subject to the change". You can then uncheck any nodes the change does not apply to before performing the operation. Then, when you change the remaining "ToDo" nodes, the list is all set to carry out the change.

In addition to those features, Traction realized that the impact of changes could be large, so they included an *audit trail* for every change. When a node is recategorized, the date, time, and author of the change are recorded. It may also be possible to undo such changes, though I'm not sure. But the important point is that changes in such a system can generate a significant amount of confusion. The audit trail makes it possible to see what happened. It would also be helpful to identify folks you would rather not have messing around in your data base.
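To make the category mechanics concrete, here is a small Python sketch. It is not Traction's implementation, which I have not seen; CategoryIndex and every method name are assumptions made up for illustration. It shows the two-way membership (category to nodes, node to categories), colon-separated hierarchy, a bulk change with "unchecked" exclusions, and an audit record for each change.

```python
from datetime import datetime, timezone


class CategoryIndex:
    """Hypothetical category store: each category is a list of member node ids."""

    def __init__(self):
        self.members = {}     # category name -> set of node ids
        self.categories = {}  # node id -> set of category names
        self.audit = []       # (timestamp, author, node id, old, new)

    def apply(self, node_id, category):
        self.members.setdefault(category, set()).add(node_id)
        self.categories.setdefault(node_id, set()).add(category)

    def under(self, prefix):
        """Nodes in a category or in any colon-separated subcategory of it."""
        found = set()
        for name, nodes in self.members.items():
            if name == prefix or name.startswith(prefix + ":"):
                found |= nodes
        return found

    def recategorize(self, old, new, author, exclude=frozenset()):
        """Move every node in `old` to `new`, except those the user unchecked."""
        for node_id in sorted(self.members.get(old, set()) - exclude):
            self.members[old].discard(node_id)
            self.categories[node_id].discard(old)
            self.apply(node_id, new)
            stamp = datetime.now(timezone.utc)
            self.audit.append((stamp, author, node_id, old, new))


index = CategoryIndex()
for node_id in ("n1", "n2", "n3"):
    index.apply(node_id, "ToDo")
# "n3" was unchecked, so it keeps its old category.
index.recategorize("ToDo", "Bug:Open", "eric", exclude=frozenset({"n3"}))
```

Whether undo belongs here is an open question. The audit list at least records enough to reverse a change by hand.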

To summarize, then, the requirements for the proper handling of categories, are:


It should be possible to construct an initial design document using queries of the form "give me all design notes corresponding to the features we decided to implement in the current version of the functional specification."


The many possible starting points in the References list highlight the need for evaluability. It should be possible, not only to reply with a comment on any item in those lists, but also to add an evaluation, much as online booksellers keep evaluations for books. That feature is arguably their greatest contribution to ecommerce, and the DKR should make use of it. It should also be possible to order list items using relative evaluations. That lets the most promising starting point float to the top of the list.

Not all lists should be ordered by evaluation, however. For example, the sequence of requirements has been chosen to provide the most natural "bridge" from one to the next. So evaluation-ordering must be an option.

Ideally, it should also be possible to "weight" an evaluation, perhaps by adding a "yay" or "nay" to an existing evaluation.

When displaying an evaluation, where evaluators can choose a value from 1..5, it might make sense to display the average, the number of evaluations, and the distribution. A distribution like

10 2 1 2 10

...for example, would show a highly polarized response, even though the "average" was 3.
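The arithmetic behind that example is easy to check. A short Python sketch follows (the function name summarize is made up for illustration), treating the distribution as counts of ratings 1 through 5:

```python
def summarize(distribution):
    """Return (average, count) for counts of ratings 1..len(distribution)."""
    count = sum(distribution)
    if count == 0:
        raise ValueError("no evaluations yet")
    total = sum(rating * n for rating, n in enumerate(distribution, start=1))
    return total / count, count


distribution = [10, 2, 1, 2, 10]
average, count = summarize(distribution)
print(average, count, distribution)
```

Here the weighted total is 10 + 4 + 3 + 8 + 50 = 75 over 25 evaluations, so the average is 3.0, even though only one evaluator actually chose 3.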

Starting Points


The system must increase the ability of multiple people, working collaboratively, to generate up to date and accurate revisions.

For any given document, there are several classes of interaction:

The first group consists of people who receive the document and do nothing else with it. (Just trying to be complete here.) The second group consists of people who send back comments on different sections. That feedback will typically be used in future versions.

The 3rd group consists of people who suggest an alternative wording or organization. Those "suggestions" take the form of a modified copy of the original. One of the document authors may then agree to use that formulation in place of the original, or may simply keep it as commentary.

The 4th group consists of the fully-collaborative authoring group. The original author must be able to add other individuals to the document, or to subsections of it. (An author registered for a given node has authoring privileges throughout the hierarchy anchored at that node.)


Every information node that is created should be automatically attributed to its author. When a new version of a node is created, all of the people who sent comments should be contained in a "reviewer" list. When a suggestion is accepted, the author of the suggested node should go into a "contributor" list in the parent node and be added to the "author" list for the current node. It should be possible to identify all of the reviewers, contributors, and authors for the whole document and for each section of it.


In addition to being able to add commentary to existing documents, the user must be able to easily quote from existing documents when creating new ones.

Internally, the quotations will appear as a link (for example, using the w3c XInclude specification). But the quoted material will appear "inline" in the new document. The link, in this case, will be a "hard link". That is, when newer versions of the text are created, the link will not point to them, but will instead point to the original version. The fact that newer versions exist, however, will be reflected in the display (explained next).

When displayed, quoted material will be automatically attributed, and followed by a link to the original source node, in its original context. If that material has changed, that link will be flagged as "older", and a link to the newer version will also be presented. (The document's author(s) will then have the option of using the newer version in place of the original.)

Design Note:

If the system is truly a network (a node can exist in multiple contexts), then the pointer must point not only to the node, but also to its parent context, so that the link goes to the document the node was quoted from. On the other hand, if the system is not really a network (but only appears to be one through the action of inclusion operations like quoting), then the system must be prepared to handle "pointers to pointers". In other words, if the node appeared in document A, and it was quoted in document B, then when constructing Document C, quoting the same text from document B will construct a link (pointer) in C to the pointer (virtual node?) in B that points to A. The "context" of the node, in that case, must be B, and not A.


When new versions of a document are created, material would be included by pointing to it, keeping attributions intact. The system must accelerate that process. It should be possible to start a new document in one of two ways:

General Systemic Requirements

These are requirements for the system as a whole.


The system must be "open" in the sense that a user is not constrained to using a particular editor, email system, or central server. The specifications for interaction with the system should be freely available, along with a reference implementation to use as a basis. As much as possible, conformance with existing standards (XML, XHTML, HTTP, email) is desirable. (The tricky decisions, of course, will be between required features and standard protocols that don't support them.)


The server and client systems that implement the DKR must also be fully *extensible*. In other words, the same characteristics of hierarchy, versioning, and revisability (use of most recent version) that apply to the documents must apply to the system itself.

That extensibility can be accomplished with a "dispatch table" that names the class to use for each kind of object that needs to be created. In conjunction with open sourcing, that architecture allows a user to extend (subclass) an existing class and then use the extended version in place of the original. In addition, upgrades can occur dynamically, while the system is in operation, while allowing for modular downgrades when extensions don't work out.
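A dispatch table of that kind can be very small. The Python sketch below is only an illustration; the class names and the "infnode" key are hypothetical. It shows a stock class, a user extension that subclasses it, a swap made while the program is running, and the downgrade path.

```python
class InfoNode:
    """Stock class shipped with the system (hypothetical)."""

    def describe(self):
        return "plain information node"


class AuditedInfoNode(InfoNode):
    """A user-supplied extension that subclasses the stock class."""

    def describe(self):
        return "audited " + super().describe()


# The dispatch table names the class to use for each kind of object.
dispatch = {"infnode": InfoNode}


def create(kind):
    """Create an object of the requested kind via the dispatch table."""
    return dispatch[kind]()


before = create("infnode").describe()
previous = dispatch["infnode"]
dispatch["infnode"] = AuditedInfoNode   # upgrade while the system is running
after = create("infnode").describe()
dispatch["infnode"] = previous          # downgrade if the extension fails
restored = create("infnode").describe()
```

Objects created before the swap keep their old class, so a real system would also need a policy for existing instances.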

Starting Points


Security in such a system becomes an issue, unfortunately. The system should employ whatever mechanisms exist or can be constructed to help prevent trojan horse attacks, back door attacks, and other security breaches in an open source system.

For example, Christine Peterson described Apache's process as having something like 45 reviewers, 3 of whom recommend the inclusion and none of whom object, before new code is added to the system.


Email is fundamentally the right interface for such a system, because information comes to you, the information is organized into threads, and you can edit/reply from within the same application you use to view the information.

(Email's major weaknesses stem from the fact that even though the interface is appropriate, the underlying data structures are not. But the hierarchy inherent in the specified system will rectify those flaws, eliminating the redundancy inherent in email responses and allowing for thread-summaries.)

However, the factor that makes email central to one's daily activities is the wide variety of inputs you receive. Email is inherently "project neutral". You get email on every topic under the sun, including personal and professional interests. It represents "one stop shopping" for your information needs. (The Web, on the other hand, provides nicer storefronts, but you have to go visit the store to find what you want.)

In a sense, the "firewall" requirement is in itself a partition. In an organization like the Stanford Research Institute (SRI), for example, there is a need to create a project-specific partition, so that only other members of the project team ever see that information. On the other hand, there is a wide area of shared expertise (computer expertise, management expertise, administrative expertise) that can be shared among all members of the organization.

In a similar vein, the "email interface model" implies the need for multiple partitions -- one for each project or interest area, for example. The degree to which you "cross-fertilize" between the partitions should then be up to you.

Looking Ahead: Some DKR Requirements

These additional requirements begin to move the system towards a DKR.


With respect to security, there is also the issue of "firewall" capability. The DKR must allow professionals in many different organizations to contribute and share knowledge. That knowledge may largely be in the form of published papers and the means to locate and access them, but it represents a high degree of inter-organizational co-operation, at the level of the individual professional.

The DKR will also be handy for individual projects, though. The mechanisms will support collaborative designs and "on demand" education as to corporate procedures, for example. But that information must remain *inside* the firewall, inaccessible to competitors.

In the ideal scenario, it will also be possible to "publish" information stored in the inner repository at strategic times, rather like publishing a technical paper that gives the design of the system. But until then, the firewall must remain intact.


It must be possible to add *relations* as first-class objects in the system, where a "first class" object is one that can be observed and manipulated like any other node in the system. Such relations will make it possible to link nodes in interesting ways, make it possible to add new connections over time, and allow for some forms of automated reasoning (or at least, "reasoning assistance"). In conjunction with categories, the addition of relations is likely to be the most important step in converting the system into a true DKR, of the kind that Jack Park describes.

Relations should work much like categories, with the capacity for adding and changing relations, while keeping an audit trail of the modifications. However, while categories apply to single nodes, relations relate pairs of nodes, at a minimum, or possibly multiple nodes at one time. As Dewain Delp observed, the repository of information nodes in the system is more properly described as a "network", rather than a "hierarchy", because a single node may be simultaneously part of several document structures. (Even though any one view will most probably (and valuably) be hierarchical.) With the advent of relations, the system is immediately and obviously a true network.

An equivalence relation, for example, could be used to relate a new question to an existing thread. The sender of the question, now alerted to the equivalence relation, can then readily inspect the answers that have previously been given. (There are likely to be several answers in the system. By giving high marks to the answer(s) that were found to be most helpful, the best answers "float to the top" in an organic, evolving FAQ.)

Another useful relation is "implies". The ability to add implications to the system lets the user create connections between nodes. The inverse of that relation (implied by) allows a user to trace back the raison d'etre for a given node. In a software design network, implications allow functional requirements to be linked to each other and to design requirements, which can then be linked to specifications, and from there to code. If "not" is introduced at any stage (as in, "we can't do this") then the proposal under attack can be traced back to its roots -- with alternatives available at each stage. If the design proposal is invalid, for example, perhaps one of the design alternatives that has been discussed will be usable. Failing that, the functional requirement can be reconsidered, etc.
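The "trace back" idea can be sketched in a few lines of Python. The Relation class, the trace_back function, and the node ids below are hypothetical, chosen only to mirror the requirements-to-code chain just described. Relations are stored as ordinary objects, and the inverse of "implies" is followed from a node back to its roots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """Hypothetical first-class relation between two node ids."""
    kind: str
    source: str
    target: str


def trace_back(relations, node_id, kind="implies"):
    """Follow the inverse of `kind` from node_id back toward its roots."""
    chain, frontier, seen = [], [node_id], {node_id}
    while frontier:
        current = frontier.pop(0)
        for rel in relations:
            if rel.kind != kind or rel.target != current:
                continue
            if rel.source not in seen:  # guard against cyclic relations
                seen.add(rel.source)
                chain.append(rel.source)
                frontier.append(rel.source)
    return chain


relations = [
    Relation("implies", "functional-req", "design-req"),
    Relation("implies", "design-req", "spec"),
    Relation("implies", "spec", "code"),
]
roots = trace_back(relations, "code")
```

Starting from "code", the result lists "spec", then "design-req", then "functional-req": the stages at which an alternative could be considered.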

The ability to add relations will provide the kind of "alignment" that Rod Welch talks about -- the ability to thread document sections together so that, for example, a section of a contract can be threaded back to the email discussions that prompted it, making it easier to ensure that the final contract accurately reflects the desired goals.

Although users can add relations at will, it makes sense for the system to come with a "starter set" of standard relations that everyone uses by convention. That initial set can come from the fields of logic, mathematics, and abstract reasoning:

Didactic (DKR)

Eventually, the system must become a *teaching* tool. It must follow the concept of "Education on Demand", intelligently supplying the user with the information needed, and educating that user, whatever their initial background. (Within reasonable limits.)

Outline of Operational Requirements

This is an outline of functional operations for the system:

Data Structure Requirements

Each node in the system should be able to track the following information:

Data Structures, v0.0


Eric Armstrong

This document represents a preliminary outline of the data structures for the collaborative document system known as OHS -- Open Hyperdocument System. The data structures described here are represented in XML, since that is most likely the format we will use for interchange.


Since we have not yet worked through the use cases, this document is necessarily speculative. It also needs to be checked against the requirements document, since I have undoubtedly overlooked a few things.


Work through use cases.

Check against requirements document.


The data structure design has two primary goals. The first is to serve as a "normal form" for documents that come from multiple sources, such as:

The second goal is to make the design of the data structures as simple as possible. Real simplicity is obviously impossible, however. The large number of interacting requirements makes it unattainable. What is attainable, though, is regularity. It is hoped that the entire system can be built up from a small number of "atomic" nodes, each of which has a regular, consistent structure. With that in mind, we start by defining a "template" for nodes in the system.

Basic Node Template

Here is the template for the basic nodes in the system (explanations follow):






The nodetype value is one of several predefined possibilities that are defined for the system. The initial set of node types consists of:

VIRNODE (virtual node)

Since the system must support versioning, some mechanism is needed for indirect linking. In some cases, of course, a link might be created that points directly to an existing node. But in cases where the most recent version of a node is the target of the link, the link must point to a "virtual" node, which contains a link to the real information. When pursuing a link, the system will automatically step through virtual nodes to retrieve the real information.
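Stepping through virtual nodes might look something like the Python sketch below. The dictionary layout, the "target" key, and the node ids are assumptions made for illustration; only the VIRNODE and INFNODE type names come from this document. A link to a real node acts as a "hard" link, while a link to a virtual node follows whatever that node currently points at.

```python
def resolve(store, node_id):
    """Step through virtual nodes until a real node is reached."""
    seen = set()
    node = store[node_id]
    while node["type"] == "VIRNODE":
        if node_id in seen:
            raise ValueError(f"virtual-node loop at {node_id}")
        seen.add(node_id)
        node_id = node["target"]
        node = store[node_id]
    return node


store = {
    "req-v1": {"type": "INFNODE", "content": "Requirements, v0.7"},
    "req-v2": {"type": "INFNODE", "content": "Requirements, v0.8"},
    # The virtual node is retargeted whenever a new version is created.
    "req-latest": {"type": "VIRNODE", "target": "req-v2"},
    # A virtual node may itself point at another virtual node.
    "req-alias": {"type": "VIRNODE", "target": "req-latest"},
}
latest = resolve(store, "req-alias")["content"]
pinned = resolve(store, "req-v1")["content"]
```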


It is likely that there will be a large number of indirect links of this type. The result, if printed, will be a system that defies comprehension by the human mind. Computers, on the other hand, will be able to handle the complexity quite easily. While computers will be unable to comprehend the content of the nodes in the foreseeable future, they will be able to present that content to human users in a way that makes the information interactively usable. The result is a man/machine symbiosis intended to augment human thinking in ways that will increase our ability to collaboratively investigate and solve complex problems.


INFNODE (information node)

Information nodes are the primary content carriers in the system. Every paragraph and heading in a document converts to a single information node in the system, under the principle that (in theory, at least) each paragraph expresses one main idea.


STRLIST (structure node)

This node serves as the list header for a node's substructure. For a heading in a document like this, the heading is represented by an INFNODE. The text of the heading is the heading's CONTENT. One of the LISTS under that heading contains the introductory paragraphs and subheadings that are directly under that heading. The head of that (substructure) list is a STRLIST. (The separation between CONTENT, DATA, and LISTS will be discussed in the next section, along with the need for distinguishing them.) [TBD: Do the contents of the STRLIST need to be STRNODES, or can they be INFNODES?? (STRNODES may be desirable. See discussion of styles in the next to last section.)]


RTGNODE (rating node)

A rating node captures the sum of evaluations under it. Its DATA segment contains the average of ratings below it, as well as the number of entries. For the list header, the CONTENT section is empty. For entries in the rating list, the CONTENT section could contain a single-paragraph explanation for the rating. Deeper discussions would require a structure list. The CONTENT section could then be used for a one-paragraph summary. [TBD: Alternative: The INFNODE adds RATING_AVERAGE and RATING_NUMBER elements to its DATA section. RTGNODEs are then used to add a rating to a node. The DATA section of a RTGNODE then contains a RATING element, only.]


CATNODE (category node)

A category node defines an information category in the system. For an IBIS-style conversation, for example, categories include question, alternative, argument:for, and argument:against. Since a node can be categorized multiple ways, one of a node's LISTS delineates the categories to which it belongs. That list will consist of pointers to category nodes (or possibly the virtual place holder for that node). Similarly, the category node will contain a list of pointers to nodes that belong to that category, allowing for fast search and compare operations.


CATLIST (category list)

The category list is the header for the list of categories an INFNODE belongs to. That list contains pointers to CATNODES. [TBD: Is this nodetype necessary, or can it be eliminated by using INFNODEs for category definitions and using CATNODE as the list header?? What will the pointers look like, and do they need a separate node type??]


RELNODE / RELLIST (relation node/list)

The ability to define relations in the system requires a node that encapsulates the relation. Since a node can be part of multiple relationships, a list is required. A RELNODE will be very similar to a CATNODE, except that some relationships are one-way. That implies a need for two LISTS under a RELNODE -- a "from" list, and a "to" list. [Details TBD]


STYNODE (style node) [TBD]

This is a highly provisional concept. But when we investigate the concept of "structure" under a node, we begin to see that the subheadings and lists under a heading, for example, can really use some rudimentary style information. In a very real sense, such "style" attributes capture important information. When a numbered list is used, for example, it indicates a seriality: item #3 follows item #2, which follows #1. A bullet list indicates a "parallel" construction, where the items can be considered in any order. Although such information is typically thought of as "formatting", it also represents valuable information in the system. (In addition, the ability to serve as a normal form for HTML/xHTML documents and DocBook articles may well require some kind of style-related capabilities.)


The next section discusses the reason for segmenting the node into content, data, and lists sections.

Structure of the Basic Node

The structure of the basic node has been divided into 3 sections: CONTENT, DATA, and LISTS. This section explains why. But first, a word about the basic node attributes.


At a bare minimum, a node needs an ID attribute so that other nodes can link to it unambiguously: <nodetype id="identifying-value">. The ID value must be unique within the system. Note, however, that it is not globally unique. Since we intend to implement a distributed document object model (DDOM), that value will be shared across every system that has a copy of the node.


Element attributes are reserved for invisible, non-extensible aspects of nodes. The ID attribute, for example, is one which is used internally in the system, but is never exposed to the user, nor can the user add new attributes. Information which is represented in some visible form, which can be manipulated by the user, or which can be added by the user, is encoded in the DATA section of a node.
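As a speculative illustration of that split, the Python sketch below builds one node with the standard xml.etree.ElementTree module. The make_node helper, the AUTHOR element, and the sample id are hypothetical. Only the id attribute and the CONTENT, DATA, and LISTS section names are taken from this document. The internal id goes in an attribute, while user-visible values become child elements of DATA.

```python
import xml.etree.ElementTree as ET


def make_node(nodetype, node_id, content, data):
    """Build a node element with CONTENT, DATA, and LISTS sections."""
    node = ET.Element(nodetype, {"id": node_id})
    ET.SubElement(node, "CONTENT").text = content
    data_el = ET.SubElement(node, "DATA")
    for name, value in data.items():
        # User-visible, extensible values live in DATA, not in attributes.
        ET.SubElement(data_el, name).text = str(value)
    ET.SubElement(node, "LISTS")
    return node


node = make_node("INFNODE", "n42", "Each node is versioned.", {"AUTHOR": "eric"})
xml_text = ET.tostring(node, encoding="unicode")
print(xml_text)
```

Adding a new kind of user data then means adding an element under DATA, with no change to the set of attributes.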

Rationale for the Three-Part Structure (Content/Data/Lists)

Now at last, we get to the discussion of the node's structure. The need for the separation into content and structure arises from a major deficiency in XML, for our purposes. While XML is wonderful in many ways, it has no mechanism for distinguishing "tags which convey style information" from "tags which represent structure". For example, consider the following hypothetical segment of document markup:

<h2>An <i>Important</i> Heading
   <p>An introductory paragraph</p>
   <h3>A SubHeading
      <p>A paragraph under the subheading</p>
   </h3>
</h2>


This segment represents the kind of thing you would like to do with XML. However, XML's validation mechanisms don't allow you to restrict an XML document to a form like this.

The things to note about this format are:

It is the last observation in that list which is impossible to describe in XML. There is no way to say "text and style tags are allowed up until the first structure tag is seen, and none thereafter". There is not, in fact, any way to distinguish style tags from structure tags at all.

The inability to make such restrictions means that the "mixed content model" (text plus tags) which can be defined in XML would allow text and style tags to occur between the paragraph element (<p>...</p>) and the <h3> element, for example. Since that text would not be enclosed in any structure, it would be completely ambiguous. There is no way for a program to know what to do with it. Although programs could be required to detect such errors, that requirement defeats the automatic verification that constitutes one of XML's major advantages.
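The ambiguity can be seen with any XML parser. The sketch below uses Python's standard ElementTree module, which is a non-validating parser, so it illustrates only that the stray text is well-formed XML and ends up attached to no structure. The sample markup and the stray sentence are made up for the illustration.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<h2>An <i>Important</i> Heading"
    "<p>An introductory paragraph</p>"
    "stray text that belongs to no structure"
    "<h3>A SubHeading<p>A paragraph under the subheading</p></h3>"
    "</h2>"
)

# The parser accepts the stray text without complaint. ElementTree
# exposes it as the "tail" of the preceding <p> element, so it is
# neither part of the heading text nor inside any paragraph.
first_para = doc.find("p")
stray = first_para.tail
heading_start = doc.text
```

A program walking this tree has to decide for itself whether the tail text is an error, a continuation of the heading, or an unmarked paragraph.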

DocBook Solution

The solution to the problem is to introduce another layer of structure: a CONTENT layer. That is the solution used by DocBook when defining its section elements. In DocBook, a section element contains a title element, like this:


<SECT1>
   <TITLE>...title text here...</TITLE>
</SECT1>

Rather than having the text of the section heading belong to the section element directly, DocBook introduces the TITLE tag to hold it. That addition distinguishes the content of the heading from the structure under it, and allows the structure to be automatically validated for correctness.

The cost, however, is one of program complexity -- especially with respect to editors. The added element prevents a "natural" mapping of the structure to a display. Without that added element, you could simply display the XML as a tree and allow it to be edited. The result would be an outline-version of the document, with various inline elements like bold and italic adding a bit of style.

However, the addition of the extra element prevents any such natural mapping. The editor must now understand that the content of a <SECT1> element is, in reality, in its TITLE element. And when the text is edited, it must know to store the change in the TITLE element, rather than as part of the SECT1 element.

The reason for taking the trouble to explain all this is to observe that if a mechanism similar to XML were found or constructed that allowed structure tags to be distinguished from content tags, then editors could operate on the documents much more naturally and easily, without having to understand specific semantics like SECT1/TITLE.


We could always define a type attribute: <tag type="style"> or <tag type="struct">. We could then require that implementing programs verify that after the first node of type struct is seen, no more text or style tags are allowed. We could define the type so that it defaults to "style". That would prevent it from having to be specified as part of <b> and <i> tags, for example. However, that still shifts the burden of validation to every program that attempts to interact with the system, rather than allowing XML mechanisms to do it automatically.
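To make the cost concrete, here is a minimal sketch of the check that every implementing program would have to carry if the type attribute were used. The function name check_style_before_struct is hypothetical; the "style"/"struct" values and the default of "style" follow the paragraph above.

```python
import xml.etree.ElementTree as ET


def check_style_before_struct(elem):
    """Return True if no text or style tag follows the first struct child."""
    seen_struct = False
    for child in elem:
        kind = child.get("type", "style")  # type defaults to "style"
        if kind == "struct":
            seen_struct = True
            # Apply the same rule inside the nested structure.
            if not check_style_before_struct(child):
                return False
            # Text directly after a structure tag is not allowed.
            if child.tail and child.tail.strip():
                return False
        elif seen_struct:
            # A style tag appearing after a structure tag.
            return False
    return True


good = ET.fromstring(
    '<h2 type="struct">An <i>Important</i> Heading'
    '<p type="struct">Intro</p>'
    '<h3 type="struct">Sub<p type="struct">Para</p></h3></h2>'
)
bad = ET.fromstring(
    '<h2 type="struct">Heading<p type="struct">Intro</p>'
    'stray <b>text</b><h3 type="struct">Sub</h3></h2>'
)
good_ok = check_style_before_struct(good)
bad_ok = check_style_before_struct(bad)
```

The check is short, but it is exactly the kind of hand-written validation that a schema-level mechanism would make unnecessary.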

Content Element

An unstated assumption in the foregoing analysis is that the CONTENT section will be able to contain various style elements like bold and italic (<b>, <i>). Such tags are important because they impart information, as well as formatting that makes the text more readable.


In some cases, italics means italics. For example, in this sentence italic formatting is added for emphasis, so the text reads as you would hear it if I spoke it. In other cases, italics is a way of imparting information. For example, the first use of a new term is typically italicized, in order to indicate that the current paragraph supplies a definition. In such cases, it undoubtedly makes sense to use XML's capacity to define new tags. For example, the <gls> tag might be defined to highlight glossary terms in those paragraphs where a definition (or part of a definition) is supplied. [TBD: Mechanisms by which new tags are added to the system, mechanisms for defaulting their rendering (e.g. "gls" => italics), and mechanisms for customizing the rendering in the client browser.]

Elements like bold and italics are defined in xHTML's Document Type Definition (DTD) as inline tags. That definition forms the basis for the "mixed content" (text and tags) definition of CONTENT, as it would be defined in a DTD for the system:

(PCDATA | inline)*

[ToDo: Insert the full list of inline tags from the slide showing HTML style vs. structure tags]
[ToDo: Compare with the list of inline tags defined in the xHTML DTD.]


"PCDATA" means "parsed character data". In other words, it is text -- data that is going to be read and parsed (inspected) for tags and other entities, unlike pure "CDATA", which is never parsed. (CDATA is like HTML's "preformatted" element -- <pre> -- only more so: nothing inside of a CDATA section is ever interpreted. So while <b> in an HTML <pre> element would cause the text to be bold, in an XML CDATA section it causes nothing, and would simply be displayed as "<b>".)

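The difference between parsed text and a CDATA section is easy to observe with a parser. The sketch below uses Python's standard ElementTree module; the element name "content" and the sample sentence are assumptions chosen for the illustration.

```python
import xml.etree.ElementTree as ET

# Parsed character data: the <b> tag is recognized as an element.
parsed = ET.fromstring("<content>Some <b>bold</b> text</content>")

# CDATA section: the same characters are kept as literal text.
unparsed = ET.fromstring(
    "<content><![CDATA[Some <b>bold</b> text]]></content>"
)

parsed_children = [child.tag for child in parsed]
unparsed_children = [child.tag for child in unparsed]
unparsed_text = unparsed.text
```

In the first case the tree contains a b element; in the second it contains no child elements at all, only the literal characters.
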
Data Element

The DATA element provides a location for storing the data associated with a node. For a rating node, for example, data elements would include the <average> and <number> elements, containing the average rating value and the number of ratings it was constructed from. (More sophisticated systems might include ratings of individuals, which would then weight the results of their evaluations when constructing the "average". That is left for a future exercise.)
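A sketch of how a rating node's DATA section might be updated when a new rating arrives is shown below. The <average> and <number> elements come from the text; the enclosing element names (ratingnode, data) and the function add_rating are assumptions, and the unweighted running average is simply the most basic choice.

```python
import xml.etree.ElementTree as ET

rating = ET.fromstring(
    '<ratingnode id="r1"><data>'
    "<average>4.0</average><number>3</number>"
    "</data></ratingnode>"
)


def add_rating(node, value):
    """Fold one new rating into the stored average and count."""
    avg_el = node.find("data/average")
    num_el = node.find("data/number")
    count = int(num_el.text)
    # Recover the running total, add the new value, and re-average.
    total = float(avg_el.text) * count + value
    count += 1
    num_el.text = str(count)
    avg_el.text = str(total / count)


add_rating(rating, 2.0)
new_average = float(rating.find("data/average").text)
new_number = int(rating.find("data/number").text)
```

Storing the count alongside the average is what allows the average to be updated without keeping every individual rating.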

Data for a virtual node could include links to the previous and following versions of the document. That information needs to be stored somewhere -- the virtual node seems to be the natural place to put it.

[TBD: Are CONTENT and DATA both required, or can one be eliminated?? (For some reason, in my original notes it seemed necessary to have both. At the moment, I don't see the justification for that, but I'm loath to remove the distinction until I'm 100% sure it's not necessary.)]

Lists Element

The lists element contains all of the sublists associated with a node. Those lists include:

Style Considerations

Having opened the Pandora's box of style considerations with inline tags in CONTENT elements, it makes sense to consider the potential for defining styles for the nodes that constitute substructure. As previously noted, there is some level at which "style" constitutes information: whether a list of items is ordered or unordered (bulleted), or whether it is a list of headings or a list of plain paragraphs. On the other hand, there is also a sense in which structure-style is definitely format-related. Format considerations include the size of the font to use, the particular bullet for an unordered list, or the kind of enumeration for an ordered list (numbers, alphabetics, roman numerals, etc.).

There would seem to be a good case for preserving the information-content of styles in the system, while allowing the actual formatting to be determined when the information is displayed or printed. The primary reason for allowing the actual format to be determined at "run time" (when displayed or printed) is the fact that the system as presently conceived is far more dynamic than the document systems we are used to. In an HTML document, for example, an H2 element is placed under an H1, and there it sits. The author can declare it "H2", because it sits under an "H1" -- the levels are static.

In the proposed system, however, "levels" are much more dynamic. One document may include whole sections of another document. The heading may now be at level 3 or level 4 in the new context, instead of at level 2. The ability to reuse nodes therefore defines a "dynamic context" for a node, rather than the static context defined by a traditional document. That being the case, it is unwise to specify the format for a node as, for example, h2. Instead, the node can be identified simply as a heading. When actually displayed, the heading can be displayed using the font and size appropriate for its current context.
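The idea of a dynamic context can be sketched by deriving a heading's level from its position at display time instead of storing it. The Heading class, the render_levels function, and the sample document titles are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Heading:
    """A node identified only as a heading; no level is stored on it."""
    text: str
    children: List["Heading"] = field(default_factory=list)


def render_levels(node, level=1):
    """Return (level, text) pairs, computing each level from the context."""
    out = [(level, node.text)]
    for child in node.children:
        out.extend(render_levels(child, level + 1))
    return out


# One section, reused unchanged in two different documents.
shared = Heading("Installation", [Heading("Prerequisites")])
doc_a = Heading("User Guide", [shared])
doc_b = Heading("Handbook", [Heading("Part One", [shared])])

levels_a = render_levels(doc_a)
levels_b = render_levels(doc_b)
```

The same "Installation" node comes out at level 2 in the first document and level 3 in the second, which is why a fixed h2 marking on the node itself would be wrong in at least one of them.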


The ability to dynamically dictate display format implies some form of stylesheet capability. A stylesheet would let you specify the font sizes for headings at different levels, the bullet types for unordered lists at different levels, and the numbering style for ordered lists at different levels. So, for example, a bullet list might use dots at the first level, and diamonds for bullet entries under it. Numbered lists might use numbers at the first level, then lowercase alphabetics, then lowercase roman numerals. [TBD: How/where are stylesheets defined and used??]
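Since the stylesheet mechanism is still TBD, the following is only a sketch of the lookup it implies: a table keyed by list type and nesting level. The STYLESHEET structure, the style names, and the rule that deeper levels reuse the last entry are all assumptions.

```python
# Hypothetical stylesheet: one entry per nesting level, first level first.
STYLESHEET = {
    "bullet": ["dot", "diamond"],
    "ordered": ["decimal", "lower-alpha", "lower-roman"],
}


def marker_style(list_type, level):
    """Pick the marker style for a 1-based nesting level.

    Levels deeper than the stylesheet defines reuse the last entry.
    """
    styles = STYLESHEET[list_type]
    return styles[min(level, len(styles)) - 1]


first_bullet = marker_style("bullet", 1)
second_bullet = marker_style("bullet", 2)
third_ordered = marker_style("ordered", 3)
deep_ordered = marker_style("ordered", 5)
```

Because the level is passed in at display time, the same list node can be rendered with different markers depending on where it is included.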

[TBD: Which of several possible methods to use for encoding the style information, using some combination of new element types, new list types, and new attributes. One possibility is to use STRNODE as the header for a list of INFNODES. The STRNODE can then encode the type of entries in that list: paragraphs, ordered list, bullet list, or heading. In keeping with the principle that information which affects the display is encoded as data, that information would be stored in the DATA section of the STRNODE. The STRLIST, in turn, would consist of a list of STRNODES.]

Links and Inclusions

It is clear that the system needs to take advantage of XML's ability to "include" referenced nodes inline, rather than merely linking to them. XML's XInclude mechanism can be used for that purpose.
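As one illustration of inclusion, Python's standard library offers basic XInclude processing in xml.etree.ElementInclude. In the sketch below, the "node:n7" href scheme, the in-memory REPOSITORY, and the element names are assumptions; a real system would resolve references against its own node store.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory "repository" standing in for real node storage.
REPOSITORY = {
    "node:n7": '<infnode id="n7"><content>Shared paragraph</content></infnode>',
}


def load_node(href, parse, encoding=None):
    """Loader callback for ElementInclude: resolve an href to an element."""
    return ET.fromstring(REPOSITORY[href])


doc = ET.fromstring(
    '<document xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="node:n7"/>'
    "</document>"
)

# Replace each xi:include child with the element the loader returns.
ElementInclude.include(doc, loader=load_node)
included_ids = [child.get("id") for child in doc]
```

After processing, the referenced node appears inline in the including document, as opposed to a link that the reader would have to follow.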

[TBD: It may be that the ability to specify inclusion vs. linking eliminates the need for typed links. Or it may be that the category information contained in the target of a link is sufficient for typing a link. Or it may be that typed links are really needed. For example, a typed link could distinguish a document pointer from a pointer to an author's home page. Typed links could be color coded or shifted around on the page. In addition, using typed links could give the user control over the display. Links of one type might show up like HTML links. Another type (say "comments") could be displayed inline, as though they were part of the original document. Or they could be made to appear in a separate window that was automatically synchronized with the content window. Such dynamic, interactive control would be enabled by typed links.]


Adding a concept of structural style to the system starts a path towards the slippery slope that HTML struggled with. At the outset, it contained only lists and headings and such. But the need for tables soon became apparent. (A need for frames, forms, and a dozen other things also seemed necessary for HTML, but one hopes that the system can avoid those complexities. It has enough of its own.)

Like hierarchical structures, tables are familiar ways of organizing information. Any system that hopes to improve mankind's ability to deal with complexity must make allowance for tables, in one form or another. Interestingly, the capabilities of the system defined thus far allow a useful table facility to be implemented as a display mechanism, without adding any new structuring information to the system.

The existence of categories provides a wonderful opportunity to define dynamic tables. In essence, every column in the table represents a category. The first column in the table is a list of information nodes. Cells in the remaining columns are filled with the nodes that are linked-to or in a relation with the first-column nodes. [TBD: Can links be used for this?? If so, typed links are needed to distinguish simple references from links that participate in the table. Or is it better to use relations??]

Using categories, lists, and relations (or links) would allow tables to be constructed "on the fly", from information stored in the repository. In essence, such a table would be a "view" of the system. [TBD: How to enshrine that view as a "document", and transmit it to others so they can see the organization you perceive.]

Using category-based tables would make it possible, for example, to view source code and comments on that code side by side. Similarly, documents could be viewed next to the comments and ratings they received. [TBD: The information could be stored in separately scrolling windows that synchronized with each other as needed. On the other hand, a single-window system would allow for the kind of color-coded background and threading that Ted Nelson displayed at the colloquium. In that vision, he had different background colors for blocks of text, with a "thread" (a line) of that color linking related items. The first node in your document might have a gray background, for example. The 3 comments it received would be adjacent to it, also with a gray background. A gray line would then link the two blocks. The next node in the document might be in red, etc.]
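A table built "on the fly" from categories and relations might be assembled roughly as follows. Whether links or relations should carry this information is still TBD, so the sketch simply assumes category-labelled relation triples; the node IDs, sample texts, and the build_table function are all hypothetical.

```python
# Hypothetical repository contents: node text keyed by node ID.
nodes = {
    "n1": "def parse(): ...",
    "c1": "Needs error handling",
    "r1": "4/5",
}

# Hypothetical relations as (source ID, category, target ID) triples.
relations = [
    ("n1", "comments", "c1"),
    ("n1", "ratings", "r1"),
]


def build_table(row_ids, columns):
    """Build one row per node: the node itself, then one cell per category."""
    table = []
    for row_id in row_ids:
        row = [nodes[row_id]]
        for category in columns:
            # Each cell holds every node related to the row node
            # under this column's category.
            cell = [nodes[dst] for src, cat, dst in relations
                    if src == row_id and cat == category]
            row.append(cell)
        table.append(row)
    return table


view = build_table(["n1"], ["comments", "ratings"])
```

Nothing table-specific is stored in the repository here; the table exists only as a view computed from the nodes and their relations.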


Use Cases & Scenarios

After the initial version of the data/object structures has been nailed down, they need to be run through a series of use case scenarios, with the data manipulations defined for each. The goal of the process will be to refine the data structures, looking for weaknesses or necessary reorganizations. [Note: Some scenarios may need to be tabled as unsuitable for the initial system.]

General Scenarios

Specific Use Cases

Future: Using an Abstract Knowledge Representation

A hierarchical system is created from only two relationships:

If progress is made in the pursuit of abstract knowledge representations, it may be that the whole of the collaborative document system will migrate into a knowledge representation, using those two relationships. The document management system would then be a subset of a much larger knowledge management repository. One wonders what such a system will look like after it begins to be extended with thousands of additional relationships. It boggles the mind.

Sincerely,
Eric Armstrong