Eugene Eric Kim


Date: Thu, 04 May 2000 01:23:31 -0700 (PDT)

From:   Eugene Eric Kim


Subject:   Research Narrative for OHS/DKR

I mentioned at the first meeting that I had been designing some tools to help me with my research.

While we've decided to target our initial efforts on software development, I thought I'd try to put together a brief narrative on how the OHS could be applied to scholarly research. At the very least, it will reveal my personal perspective on how I envision the OHS; at the most, it may help others picture what the OHS might be and how it could be used.

Below is a rough, poorly written narrative of how a researcher might use the OHS. Later, I'll also take a quick stab at doing a very brief narrative for software development and for managing e-mail.



Eugene Eric Kim
"Writer's block is a fancy term made up by whiners so they
can have an excuse to drink alcohol." --Steve Martin

My work as a researcher depends on storing and managing information and knowledge. Raw information mostly takes the form of primary source documents and information on those documents: text and Word documents, e-mail, bibliographic information, etc. "Knowledge" in my work generally takes the form of summaries and my own personal commentary. Most of my notes consist of summaries of articles, books, and lectures, with plenty of my own annotation.

For about three years now, I have been designing a set of tools to help manage this information. When I was first exposed to Doug's work, I was struck by the parallels in concept (noting that his work preceded mine by a good 40 years and was far more sophisticated than my own). Some of his ideas have helped me understand my own requirements better, while others have shown me how certain additional features can add a whole new dimension to my vision of an information manager. Below is a narrative of how I envision the Open Hyperdocument System (OHS) affecting my work.

A Simple PIM Extension

One of my simplest needs is storing bibliographic information, contact information, and scheduling information. It's important to note that there are many, many tools that already address these specific needs. However, the OHS allows me to do things with this raw, structured data that none of the existing systems, to my knowledge, support.

As a very simple example, if I'm viewing my calendar and see that I have a meeting with Joe Schmoe on Tuesday, I should be able to click on his name to view his contact information. With back-links, I can also do the reverse: I can find every place where Joe Schmoe appears in my calendar.

The same thing can be done with bibliographic information. If I click on an author's name, I should be able to get his or her contact information. Conversely, thanks to back-links, I can get an author's list of publications.
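The back-link idea in the calendar and bibliography examples can be sketched in a few lines. This is only an illustration under stated assumptions, not a description of how the OHS is or would be implemented: the `LinkStore` class, its method names, and the `calendar:` / `contact:` / `book:` identifiers are all hypothetical.

```python
from collections import defaultdict


class LinkStore:
    """Hypothetical store that records each link in both directions."""

    def __init__(self):
        self._forward = defaultdict(set)   # source -> targets
        self._backward = defaultdict(set)  # target -> sources (the back-links)

    def link(self, source, target):
        # Recording the reverse entry at link time is what makes
        # back-links available without scanning every document.
        self._forward[source].add(target)
        self._backward[target].add(source)

    def links_from(self, source):
        return sorted(self._forward.get(source, ()))

    def back_links(self, target):
        return sorted(self._backward.get(target, ()))


store = LinkStore()
store.link("calendar:2000-05-09 meeting", "contact:Joe Schmoe")
store.link("calendar:2000-05-16 lunch", "contact:Joe Schmoe")
store.link("book:Example Title", "contact:Jane Author")

# Forward: from a calendar entry to the person's contact record.
who = store.links_from("calendar:2000-05-09 meeting")

# Backward: every item that mentions Joe Schmoe.
joes_items = store.back_links("contact:Joe Schmoe")
```

The same store answers the bibliographic question: `store.back_links("contact:Jane Author")` returns the publications linked to that author.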

One of the things that freelance writers need to know is which publishers to approach about articles or books. With a combination of bibliographic and contact information stored in the OHS, I can look at a list of book publishers and, with one click, see a list of all of the books that they have published and that are currently stored in my database, possibly filtered based on category or some other field. From this data, I may see that a certain publisher has printed a number of books that I admire, and with that publisher's contact information immediately at my disposal, I can quickly send a proposal.

A feature that I automatically get by using the OHS is versioning. When I reschedule an appointment in a paper calendar, I cross out the old appointment, note the new appointment, and draw an arrow between the two. When I cancel an appointment in a paper calendar, I simply cross out the appointment. Most electronic scheduling systems do not support versioning, which is a real loss. With versioning, I can see how often I rescheduled or cancelled appointments, and I can mine this data for more interesting information (e.g. How often did I reschedule Monday appointments? How often did I reschedule meetings with Joe Schmoe? How often did I have to reschedule Monday meetings with Joe Schmoe?).
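To make the versioning questions concrete, here is a rough sketch of an appointment that keeps every past version instead of overwriting it. The `Appointment` class and the `count_reschedules` helper are hypothetical names chosen for illustration, and the dates are made-up sample data; nothing here reflects an actual OHS interface.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Appointment:
    person: str
    # Every version is kept in order; None records a cancellation.
    versions: list = field(default_factory=list)

    def schedule(self, when):
        self.versions.append(when)

    def cancel(self):
        self.versions.append(None)

    @property
    def current(self):
        return self.versions[-1] if self.versions else None

    def reschedules(self):
        """Return (old_date, new_date) pairs for every move to a new date."""
        pairs = []
        for old, new in zip(self.versions, self.versions[1:]):
            if old is not None and new is not None and old != new:
                pairs.append((old, new))
        return pairs


def count_reschedules(appointments, person=None, weekday=None):
    """Count reschedules, optionally filtered by person and by the
    weekday of the original date (Monday is 0)."""
    total = 0
    for appt in appointments:
        if person is not None and appt.person != person:
            continue
        for old, _new in appt.reschedules():
            if weekday is None or old.weekday() == weekday:
                total += 1
    return total


meeting = Appointment("Joe Schmoe")
meeting.schedule(date(2000, 5, 8))    # a Monday
meeting.schedule(date(2000, 5, 10))   # moved to Wednesday
meeting.schedule(date(2000, 5, 15))   # moved again, to the next Monday

lunch = Appointment("Jane Doe")
lunch.schedule(date(2000, 5, 9))
lunch.cancel()

appointments = [meeting, lunch]
all_moves = count_reschedules(appointments)
monday_moves_with_joe = count_reschedules(
    appointments, person="Joe Schmoe", weekday=0
)
```

Because old versions are never discarded, the "crossed-out" entries of a paper calendar remain available for exactly this kind of query.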

Some of these features, such as hyperlinking between fields, are supported by existing tools, and other features, such as back-linking, could easily be added. However, the abstractions of the OHS offer a degree of extensibility and customizability that is not possible with current systems.

Structured Notes

I take a lot of notes, most of which take the form of summaries of books, articles, or lectures intermingled with my own annotations.

One of the things that I noticed in the process of designing my own system was that there is some degree of structure in my notes, and if I had a good tool for easily adding this structural information to my notes, some interesting things could be done with that information.

My notes are mostly in outline form. I found that outlines alone were not adequate for distinguishing summaries from my own annotations. For example, a lecturer or author may make a point and then provide some commentary on that point, in which case I would write down that commentary as an indented subpoint of the original point. However, I may disagree with that commentary, and may want to make a note of that immediately. So in addition to creating another subpoint, I differentiate that note by either using a different color ink, or by marking that note with some distinguishing symbol.

Often, I need to correct my summary information (which in reality is my interpretation of what the source is saying rather than a "pure" summary) and/or my annotations. However, I don't want to lose my original interpretation/notes. If I look at some notes taken 10 years ago in a field which is now one of my specialties, and notice that I initially completely misunderstood certain concepts, that knowledge can help me in a number of ways. For example, it may help me better teach those concepts to others. Once again, versioning allows me to retain all this information -- keeping notes current and accurate, while retaining the older and possibly inaccurate information.

One type of annotation that I make is a "followup" annotation, a note to myself to find out more about a certain point. If this type of annotation is clearly marked up, I can obtain a list of all items I need to follow up on. This information can also be linked to my calendar or to-do list.
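The outline structure described above, with summaries, my own annotations, and follow-up notes kept distinct, might be represented roughly as follows. This is a sketch only: the `Note` class, the three `kind` labels, and the sample note text are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    text: str
    kind: str = "summary"   # hypothetical labels: "summary", "annotation", "followup"
    children: list = field(default_factory=list)

    def add(self, text, kind="summary"):
        """Attach an indented subpoint and return it."""
        child = Note(text, kind)
        self.children.append(child)
        return child

    def walk(self):
        """Visit this note and all subpoints in outline order."""
        yield self
        for child in self.children:
            yield from child.walk()


def follow_ups(root):
    """Collect every note marked as a follow-up item."""
    return [note.text for note in root.walk() if note.kind == "followup"]


lecture = Note("Lecture notes (sample)")
point = lecture.add("Speaker's main point")
point.add("Speaker's own commentary on the point")
point.add("I disagree with this commentary", kind="annotation")
point.add("Find the original paper behind this claim", kind="followup")
lecture.add("Check the date of the cited study", kind="followup")

todo = follow_ups(lecture)
```

The `kind` field plays the role of the different ink color or distinguishing symbol: once it is recorded explicitly, the follow-up list falls out of a simple traversal.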

Another kind of structure I find in my notes is references to people. If these people are in my DKR in other areas, I can automatically interlink all of this information with the OHS. With this additional information, I can click on people's names and find out their contact information, their publications (with links to the ones that I have read and summarized), and every reference to that person in other publications and lectures summarized in my system, with more detailed information just a click away.

Yet another type of data I find in my notes is quotations. Once again, if these quotes are clearly identified and attributed, then when I click on a person's name, I can automatically get a list of that person's notable quotes as well.


One of the most important needs of a scholar is the ability to cite the source of a particular nugget of knowledge. The OHS's support for addressability gives me tremendous power and accountability in this area. This accountability is especially important for academics, because it can help prevent inadvertent plagiarism (e.g. forgetting to footnote a point).

The OHS supports granular addressability, which allows me to link data points in separate documents. This creates an almost ideal footnoting system. For example, if I click on a "footnote" (however a footnote in a traditional paper-published article is represented in this system), I can not only get a reference, I can go to my notes from that reference and see the exact data point for that footnote.

Addressability also allows me to reference primary source information that I have in electronic form. For example, I can link to the exact paragraph in an e-mail message from which I derived certain information.
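Granular addressability can be illustrated with a very small sketch in which every paragraph of a stored document gets its own address. The address form `doc_id#paragraph_number`, the `Repository` class, and the sample message are assumptions for the sake of the example; the OHS may address content quite differently.

```python
class Repository:
    """Hypothetical document store with paragraph-level addresses."""

    def __init__(self):
        self._docs = {}

    def add_document(self, doc_id, text):
        # Paragraphs are separated by blank lines and numbered from 1.
        paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
        self._docs[doc_id] = paragraphs

    def resolve(self, address):
        """Return the paragraph named by 'doc_id#paragraph_number'."""
        doc_id, _, number = address.rpartition("#")
        return self._docs[doc_id][int(number) - 1]


repo = Repository()
repo.add_document(
    "email-from-joe",
    "Hi Eugene,\n\nThe meeting moved to Tuesday.\n\nRegards,\nJoe",
)

# A footnote carries the address of the exact paragraph it relies on.
footnote = {"claim": "The meeting was moved.", "source": "email-from-joe#2"}
evidence = repo.resolve(footnote["source"])
```

Following the footnote leads to the supporting paragraph itself rather than to the top of the source document.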

Using the OHS

The beauty of the OHS, in my mind, is its extensibility. I can add new types of information to the system, and have that information automatically inherit all of the benefits of the OHS: versioning, addressability, back-links, annotations, etc.

The OHS needs to be able to deal with both structured and unstructured data. When I say structured data, I mean data that specifies type in some machine-parseable way. For example, e-mail is structured data, because you can parse it for the sender's e-mail address, the date and time that the e-mail was sent, and any URLs included in it. Once that e-mail is in the OHS, you can add even more structured data, such as links to other documents in the OHS. Some of these links can be added automatically. For example, I can link the sender's name with all the other information I have on that sender in my OHS. Now, when I click on a person's name, I can also browse all of the e-mail that person has sent.
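As a rough illustration of what "machine-parseable" means for e-mail, the sketch below pulls the sender, the date, and any URLs out of a raw message using Python's standard `email` package. The sample message, the `extract_fields` function, and the simple URL pattern are all assumptions made for the example; a real system would need more careful URL handling.

```python
import re
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# Made-up sample message for illustration.
RAW = """\
From: Joe Schmoe <joe@example.com>
Date: Thu, 04 May 2000 01:23:31 -0700
Subject: Meeting notes

The notes are at http://example.com/notes and http://example.com/agenda.
"""

# Deliberately simple pattern: a scheme followed by non-space characters.
URL_PATTERN = re.compile(r"https?://[^\s<>\"]+")


def extract_fields(raw):
    """Return the structured fields that can be parsed from one message."""
    msg = message_from_string(raw)
    name, address = parseaddr(msg["From"])
    sent = parsedate_to_datetime(msg["Date"])
    body = msg.get_payload()
    # Strip sentence punctuation that the simple pattern picks up.
    urls = [u.rstrip(".,;:!?)") for u in URL_PATTERN.findall(body)]
    return {
        "sender_name": name,
        "sender_address": address,
        "sent": sent,
        "urls": urls,
    }


fields = extract_fields(RAW)
```

Once the sender's name and address are available as separate fields, linking the message to that person's existing records becomes a lookup rather than a manual step.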

Building in support for new types of information means defining a schema (probably in the form of an XML DTD), teaching the editor how to best deal with that schema, and teaching the back-end system what to do with data of that type.

