Colloquium at Stanford
The Unfinished Revolution

Memorandum


Date: Thu, 09 Mar 2000 15:10:46 -0800

From:   Eric Armstrong
eric.armstrong@eng.sun.com
Reply-To: unrev-II@onelist.com

To:     unrev-II@onelist.com

Subject:   Knowledge Repository War Stories...

Valuable info, indeed! [from Jon Winters on March 7, 2000]

You are quite right that "horror stories" track important information about what can go wrong, and how. The fellow from Citigroup who spoke in Session 9 mentioned several important Murphyisms to be alert for.

QUESTION: What is HyperNews? Do you have a link to it? It may be very close to what I am proposing this evening. If so, I could have used a DKR much earlier! (Typically, I find that things that sound promising don't turn out to be as good as hoped, but there are always exceptions...)

The issue you mentioned where people ask a question that has already been answered is very, very important. It creates some interesting requirements for the system. There are a couple of related issues along the same lines, as well. First, the question you raised:

Use Case:

Person asks a question that has already been answered.

System Action:

  1. It is important that the question is *not* entered into the repository until it has been refined sufficiently to be a unique question. (Or it should be possible to delete it later, as well as revise it.)

  2. Ideally, the system will be "friendly" enough for people to find things. Most NewsGroups aren't that friendly, because of the large volume of frequently redundant material (see point #1) and the inability to group things together. So the ability to "grow" the organization of the data by adding categories and moving or copying items to them is paramount.

  3. Even with the best system, a new person will frequently ask a question "in the wrong way" -- using the wrong terminology, etc. That makes automated detection a difficult task. What's needed is a way that makes it tres simple for another user of the system to respond "answered in X", where X is a link to the appropriate node. That answer might have the effect of adding the question as an "alternative form" to the original, or simply prevent it from being added to the database, unless and until it is redefined and resubmitted as a new question that no one responds to in that way.
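The three points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a proposed design: the names `Repository`, `QuestionNode`, `submit`, `answered_in`, and `accept` are all hypothetical. A submitted question waits in a pending area; any user can mark it "answered in X", which attaches it to node X as an alternative form; only a question nobody redirects gets committed as a new node.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionNode:
    """A committed, unique question (hypothetical structure)."""
    node_id: int
    text: str
    alternative_forms: list = field(default_factory=list)


class Repository:
    def __init__(self):
        self.nodes = {}     # committed questions, keyed by node id
        self.pending = {}   # submitted questions not yet in the repository
        self._next_id = 1

    def submit(self, text):
        """Hold a new question outside the repository; return its ticket."""
        ticket = self._next_id
        self._next_id += 1
        self.pending[ticket] = text
        return ticket

    def answered_in(self, ticket, node_id):
        """Another user responds 'answered in X', where X is node_id.

        The pending question becomes an alternative form of node X
        instead of a new entry.
        """
        node = self.nodes[node_id]        # fail before touching pending
        node.alternative_forms.append(self.pending.pop(ticket))
        return node

    def accept(self, ticket):
        """Nobody redirected the question, so commit it as a new node."""
        node = QuestionNode(node_id=ticket, text=self.pending.pop(ticket))
        self.nodes[ticket] = node
        return node


repo = Repository()
first = repo.accept(repo.submit("Anyone know any good optometrists?"))
dup = repo.submit("Can someone recommend an eye doctor?")
repo.answered_in(dup, first.node_id)
```

The design choice here is that the duplicate is decided by a person, not by text matching, which is the point made in item 3: the second question shares almost no words with the first.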

Related Issue: UPDATING INFORMATION

I recently sent out a query to the company-wide mailing list: "Anyone know any good optometrists?" I got back about 10 replies. One of them was from a fellow who saved the 10 replies *he* got when he asked the same question a few months earlier. To continue his good work, I merged my responses with his. The results were instructive:

There was a 30% overlap between the two lists. Three people were kind enough to repeat the same recommendation they had given earlier. The other 7 were new -- from new people in the organization or because they had come across new information.
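For concreteness, the merge can be written as a small set computation. This is a sketch only: the labels `rec01` through `rec17` are placeholders standing in for the actual recommendations, which are not reproduced here, and the function name `merge_replies` is made up. It also assumes each reply names exactly one recommendation and that both lists held exactly 10.

```python
def merge_replies(earlier, later):
    """Merge two reply lists; report what was repeated and what was new."""
    earlier_set, later_set = set(earlier), set(later)
    repeated = earlier_set & later_set          # given both times
    new = later_set - earlier_set               # only in the later round
    merged = sorted(earlier_set | later_set)    # combined list, no duplicates
    overlap = len(repeated) / len(later_set)    # share of later replies repeated
    return merged, repeated, new, overlap


# Placeholder data: 10 earlier replies, 10 later replies, 3 in common.
earlier = [f"rec{i:02d}" for i in range(1, 11)]
later = [f"rec{i:02d}" for i in range(8, 18)]

merged, repeated, new, overlap = merge_replies(earlier, later)
```

With these placeholder lists, 3 of the 10 later replies repeat earlier ones and 7 are new. Exact string matching is the weak point, of course: it only works if the same recommendation is always written the same way.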

That raised some interesting questions:

  1. How are questions updated with new answers? It is repeated queries that spark new additions. What should happen, ideally, is that new answers are linked to the original question, rather than to the duplicate (which should not be entered). Ideally, duplicates are eliminated there, as well. But again, that needs to be a human process, since Jeffrey Archer in one response might be J. Archer in another, and that might be hard to distinguish from his brother, K. Archer, in a third.

  2. How are answers updated? If J. Archer changes his address from Cupertino to Pleasanton, that's important information, and the correction should replace the original.

  3. How are invalid and out of date responses removed? If J. Archer retired, how/when is that response removed from the repository?
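One way to picture the three operations above is the following minimal Python sketch. Everything in it is an assumption made for illustration: the class `AnswerList`, its methods, the string keys, and the `same_as` argument by which a person (not the system) declares that a new response duplicates an existing one. Retirement is recorded as a date and the entry is hidden, which is just one possible answer to the "how/when" question.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Answer:
    name: str
    location: str
    retired_on: Optional[date] = None   # None means still valid


class AnswerList:
    """Answers attached to one original question (hypothetical structure)."""

    def __init__(self):
        self.answers = {}

    def add(self, key, name, location, same_as=None):
        """Add an answer, unless a human says it duplicates entry same_as."""
        if same_as is not None:
            return self.answers[same_as]      # keep the original, add nothing
        self.answers[key] = Answer(name, location)
        return self.answers[key]

    def update_location(self, key, new_location):
        """The correction replaces the original value."""
        self.answers[key].location = new_location

    def retire(self, key, when):
        """Mark an answer as out of date instead of deleting it outright."""
        self.answers[key].retired_on = when

    def current(self):
        """Answers that are still valid."""
        return [a for a in self.answers.values() if a.retired_on is None]


answers = AnswerList()
answers.add("j-archer", "Jeffrey Archer", "Cupertino")
# A person confirms that "J. Archer" is the same entry, so nothing is added.
answers.add("j-archer-2", "J. Archer", "Cupertino", same_as="j-archer")
answers.update_location("j-archer", "Pleasanton")
```

Nothing here decides on its own that two names refer to the same person; the `same_as` key has to come from someone who knows.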

All of these issues apply to how a system is designed, as well as to how it is used, since change is the norm rather than the exception.

They're important issues to solve. Even after we figure out what to do, figuring out how to do it is hard...

Sincerely,


Eric Armstrong
eric.armstrong@eng.sun.com