THE WELCH COMPANY
440 Davis Court #1602
San Francisco, CA 94111-2496
415 781 5700


S U M M A R Y


DIARY: March 31, 2000 09:19 AM Friday; Rod Welch

Called John Hogden at LANL about SDS review.

1...Summary/Objective
2...LANL Currently Focused on Major Initiatives Until Next Monday
3...Speech Recognition for Linking Documents Interesting Challenge
......Convert all of the documents to ascii, not PDF
4...Speech Pattern Mapping, Recognition Supports Subject Management
5...Mathematics to Assess Alignment that Reduces Meaning Drift



CONTACTS 
0201 - Los Alamos National Laboratory       505 665 0134
020101 - Mr. John Hogden
020102 - hogden@lanl.gov
0202 - Los Alamos National Laboratory       505 665 0134
020201 - Mr. Cliff Joslyn; 505 667 9096
020202 - joslyn@lanl.gov
020203 - Computer Research Group (CIC-3)

SUBJECTS
Demo SDS via Web Site             ~
Locate Decision Maker
Subject Indexing
Alliances on Development
Customer Requests for Information
Los Alamos National Lab
Customer Requests for Information
PDF Pictures Not Efficient for Com Metrics KM Takes Too Much Time
Xerox Project to Organize the Record that Aids Retrieval, Telecon Joh

1311 -    ..
1312 - Summary/Objective
1313 -
131301 - Follow up ref SDS 18 0000, ref SDS 17 0000.
131302 -
131303 - Cliff's team has a big push to finish some work early next week.
131304 - John will mention that I called.  John's work on the Xerox project
131305 - sounds pretty challenging.  His expertise in speech pattern matching
131306 - might be helpful for advancing SDS technology to build and apply
131307 - ontologies which would reduce the time required to create relevant
131308 - links in realtime, i.e., day-to-day, so that Knowledge Management can
131309 - become practical reality for a wider base of people. ref SDS 0 9797
131310 -
131311 - Submitted ref DIT 1 0001 to John linked to this record, and to POIMS
131312 - on the role of SDS records in giving meaning to documents. ref OF 1
131313 - 0859  Also linked Corps of Engineers on the benefits of SDS.
131314 - ref DRP 7 6172
131318 -
131319 -
1314 -
1315 -
1316 - Discussion
1317 -
131701 -  ..
131702 - LANL Currently Focused on Major Initiatives Until Next Monday
131703 - Speech Recognition for Linking Documents Interesting Challenge
131704 -
131705 - John said Cliff is really busy now on some work that needs to be
131706 - completed by Monday, 000403.
131707 -
131708 - This supports Cliff's letter, ref DRP 6 0001 responding to my letter,
131709 - ref DIP 5 0001, on 000323. ref SDS 18 0001
131710 -
131711 - John will mention to Cliff that I called.
131712 -
131713 - John advised that their group is also working with Xerox on a big
131714 - project to link documents.
131722 -
131723 - John explained his work on mapping the meaning of speech from indexing
131724 - recordings.  This work was reviewed in the lead article of the March
131725 - issue for ACM Tech.
131726 -
131727 -      [On 020408 John advised that funding for this project has been
131728 -      withdrawn. ref SDS 24 0001
131729 -
131730 - John asked if I reviewed the scope of his work on the web...
131731 -
131732 -
131733 -            http://www.c3.lanl.gov/cic3/teams/knowledge/
131734 -
131735 -
131736 -
131737 - ...where John is listed under...
131738 -
131739 -
131740 -                http://www.c3.lanl.gov/~hogden/
131741 -
131742 - ..
131743 - John is...
131744 -
131745 -      ...a staff member in the Pattern Recognition and Distributed
131746 -      Knowledge Systems team of Computer Research and Applications
131747 -
131748 -      ...primary research interest is recognizing patterns in time
131749 -      series -- particularly speech.  I am currently the principal
131750 -      investigator on a Speaker Verification project and also on a
131751 -      Speech Compression project. In the upcoming months, I will do
131752 -      some joint research with XEROX PARC on Natural Language
131753 -      Processing through a Cooperative Research and Development
131754 -      Agreement (CRADA). Previous projects include Speech Recognition
131755 -      and Anomaly/Fraud detection. My approach to all of these research
131756 -      areas is centered around Maximum Likelihood Continuity Mapping
131757 -      (MALCOM), an algorithm developed here at LANL for speech
131758 -      processing. More details about my work and the work of others can
131759 -      be found in various references.
131760 -
131761 -  ..
131762 - John asked if my system "automatically" creates links in documents, as
131763 - shown in my correspondence, for example the letter to John on 000315.
131764 - ref DIP 3 0001
131765 -
131766 - I explained that...
131767 -
131768 -      SDS automatically chains or links records and segments, but does
131769 -      not link specific text sequences, as set out in the record on
131770 -      000316, ref SDS 17 0733, because that requires judgement about
131771 -      meaning and causation.
131772 -
131773 -      SDS technology aids the mind in formulating a desire to create a
131774 -      useful link, and it aids execution of the mind's desire to make a
131775 -      link.  When people write or read information, or see a picture,
131776 -      the mind instantly makes connections with a lot of prior related
131777 -      information, going back minutes, hours, days and years.  SDS
131778 -      technology makes it easy for people to hardwire instantaneous
131779 -      recognition of connections by first quickly finding sources, and
131780 -      then instantly establishing the link, pretty much as a matter of
131781 -      volition.
131782 -
131783 -      Links in correspondence use this second method, of helping the
131784 -      user create links as a matter of volition, i.e., computer aided
131785 -      thinking, rather than having the software decide what links to make.
131786 - ..
131787 - John indicated the Xerox project is to find a way to
131788 - automatically link a bunch of documents that are not already linked.
131789 -
131790 -     [On 020408 John advised that funding for this project has been
131791 -     withdrawn. ref SDS 24 0001
131792 -
131793 - Speech pattern mapping and matching might help this objective.
131794 -
131795 -     [On 000623 Jack Park describes architecture for DKR that seems to
131796 -     include this objective. ref SDS 23 5475
131797 - ..
131798 - Criteria in historical documents that are somewhat amenable to
131799 - automatic linking are..
131800 -
131801 -      Author
131802 -      Organization
131803 -      Document IDs
131804 -      Dates
131805 -      Subjects
131806 -      References
131807 -
131808 -
131809 - Hardest to link automatically because they are expressed differently
131810 - are...
131811 -
131812 -      Subjects
131813 -      References
131814 -
131815 - ...where any particular document might have 10 - 100 different
131816 - subjects, and an equal number of references.
131817 -
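As a rough illustration of why dates and references are the more tractable criteria once documents are in ascii, the sketch below pulls both out of plain text with regular expressions.  The patterns are assumptions inferred only from citations visible in this record (for example "ref SDS 18 0001" and the six-digit date "000323"); they are not a specification of SDS formats, and they do not handle a reference that wraps across two lines.

```python
import re

# Assumed pattern, inferred from citations in this record only:
# "ref", a 2-3 letter record type, a record number, and a 4-character anchor.
REF_PATTERN = re.compile(r"\bref\s+([A-Z]{2,3})\s+(\d+)\s+([0-9A-Z]{4})\b")

# Assumed pattern: six-digit YYMMDD dates such as 000323.
DATE_PATTERN = re.compile(r"\b(\d{2})(\d{2})(\d{2})\b")


def extract_references(text):
    """Return (record_type, record_number, anchor) tuples found in text."""
    return [(kind, int(num), anchor)
            for kind, num, anchor in REF_PATTERN.findall(text)]


def extract_dates(text):
    """Return YYMMDD strings whose month and day fields are plausible."""
    dates = []
    for yy, mm, dd in DATE_PATTERN.findall(text):
        if 1 <= int(mm) <= 12 and 1 <= int(dd) <= 31:
            dates.append(yy + mm + dd)
    return dates


# Sample sentence taken from this record.
sample = (
    "This supports Cliff's letter, ref DRP 6 0001 responding to my letter, "
    "ref DIP 5 0001, on 000323. ref SDS 18 0001"
)
references = extract_references(sample)
dates = extract_dates(sample)
```

Any six-digit number with plausible month and day fields is treated as a date here, so a real inventory would need a stricter rule.  Subjects have no comparable surface pattern, which is consistent with the point above that they are the hardest to link automatically.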
131818 - Even for authors and organizations, it is difficult to create
131819 - meaningful links, because "Tom Sawyer" may have written about widgets
131820 - in one letter, and his prior letter was on glottenstop.  More
131821 - difficult is where he writes about delays in submitting design specs
131822 - for a new "widget" the firm plans to market next year, and that
131823 - failure to perform will cause a $10M loss, so on and so forth.  Tom's
131824 - prior letter on widgets explained how to position existing and
131825 - available widgets to give the best appearance at the Christmas party;
131826 - completely unrelated.  The letter before that on widgets was on
131827 - donating the existing stock to Bangladesh, again unrelated to either
131828 - of the first two letters.  People have a lot of trouble figuring out
131829 - how to create a technology for parsing context anywhere close to
131830 - how humans instantly "understand" based on parallel processing of an
131831 - entire life experience.  Humans do not rely simply on what they see in
131832 - the document; they infer and expand meaning using induction, explained
131833 - in POIMS. ref OF 1 0561
131834 -
131835 - John's algorithms from speech pattern recognition, ref SDS 0 2475,
131836 - might be able to make some general assumptions about relatedness, and
131837 - create links.
131838 -
131839 - How will it work...
131840 -
131841 -      One way would be to identify meaning of a document and go search
131842 -      for other documents that seem to pertain to the same subject
131843 -      based on context.  An algorithm like Landauer's LSA could be
131844 -      helpful.
131845 -
131846 -      But which words in the document will be used to determine the
131847 -      search?
131848 -
131849 -      Possibly particular subjects have been, or can be, identified
131850 -      that establish which of the many alternate meanings that can be
131851 -      drawn are needed to support current work.  This would greatly
131852 -      simplify the problem.
131853 -
131854 -      Documents, large and small, have many, many associations.
131855 -
131856 -      When a document is found to link, how will the link be displayed?
131857 -
131858 -          A table of links could be arbitrarily inserted at the top of
131859 -          the document.
131860 -
131861 -          This is not a lot of help for context related links.  You
131862 -          might split the para and insert a link listing.
131863 -
131864 -          Alternatively, if you want to link a word of text, which
131865 -          one, or which string of words?
131866 -
131867 -          Again, input from end users on what they need is helpful.
131868 -
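The first approach, searching for other documents that seem to pertain to the same subject, can be sketched in a few lines.  This is not Landauer's LSA, which builds a reduced term-document space; it is a much simpler stand-in, plain word-count cosine similarity, shown only to make the ranking step concrete.  The letters and the query are hypothetical, loosely modeled on the widget example in this record.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count alphabetic words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query, documents):
    """Return (name, score) pairs sorted from most to least similar."""
    q = term_counts(query)
    scored = [(name, cosine_similarity(q, term_counts(body)))
              for name, body in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical letters; names and wording are illustrative only.
letters = {
    "specs": "delays in submitting design specs for the new widget",
    "party": "position the widgets for the christmas party",
    "budget": "quarterly travel budget for the sales office",
}
ranking = rank_candidates("widget design specs are late", letters)
```

Exact word matching of this kind treats "widget" and "widgets" as different terms and knows nothing about context, which is the weakness the questions above are driving at: the score says which words overlap, not which meaning is needed for current work.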
131869 - ..
131870 - Another method would be more labor intensive, but provides a
131871 - much higher quality product.
131872 -
131873 -      Convert all of the documents to ascii, not PDF
131874 -
131875 -         Note:  most of these large scale doc projects convert to PDF,
131876 -         as reported for USACE on 990510. ref SDS 6 B3P5  This is not
131877 -         an effective method of creating links to accomplish LANL's
131878 -         objective.  Pictures are ineffective for daily management that
131879 -         needs a history of cause and effect, per review on 940609,
131880 -         Kissinger. ref SDS 1 4238
131881 -
131882 -      Use SDS and people to decide what to link and where based on
131883 -      human judgement of relevance.
131884 -
131885 -      Use SDS to automatically pop every document into HTML for display
131886 -      in a web browser, which is where the links can be helpful.
131887 -
131888 -
131889 - A third idea, and the most practical, is to...
131890 -
131891 -      Convert the documents to ascii, and create an inventory according
131892 -      to the criteria above, ref SDS 0 0918, for quickly finding and
131893 -      opening relevant documents.
131895 -
131896 -      This provides a resource people can check for support of current
131897 -      work, and they can then create links that are relevant, as
131898 -      needed.
131899 -
131900 -      In other words, provide the environment to create links as
131901 -      needed, but skip actually "linking" stuff, since the chance that
131902 -      automatic links will actually be useful is not very great, or at
131903 -      least it seems so from experience working with a linked
131904 -      environment the past 15 years.
131905 -
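A minimal sketch of the inventory idea follows, assuming each document has already been converted to ascii and tagged with the criteria listed earlier in this record.  The field names, file names, and sample values are hypothetical.  The inventory only helps a person find and open candidate documents; it creates no links, in keeping with the third idea above.

```python
from collections import defaultdict

# Criteria listed in this record; the field names themselves are assumptions.
CRITERIA = ("author", "organization", "document_id",
            "date", "subjects", "references")


def build_inventory(documents):
    """Map (criterion, value) pairs to the set of document names carrying them."""
    inventory = defaultdict(set)
    for name, fields in documents.items():
        for criterion in CRITERIA:
            value = fields.get(criterion)
            if value is None:
                continue
            # Subjects and references may hold many values per document.
            values = value if isinstance(value, (list, tuple, set)) else [value]
            for item in values:
                inventory[(criterion, str(item).lower())].add(name)
    return inventory


def find(inventory, **wanted):
    """Return sorted names of documents matching every requested criterion."""
    result = None
    for criterion, value in wanted.items():
        matches = inventory.get((criterion, str(value).lower()), set())
        result = set(matches) if result is None else result & matches
    return sorted(result or [])


# Hypothetical documents for illustration.
docs = {
    "letter-001.txt": {"author": "Tom Sawyer", "date": "000315",
                       "subjects": ["widgets", "design specs"]},
    "letter-002.txt": {"author": "Tom Sawyer", "date": "991201",
                       "subjects": ["widgets", "christmas party"]},
}
inventory = build_inventory(docs)
spec_letters = find(inventory, author="tom sawyer", subjects="design specs")
widget_letters = find(inventory, subjects="widgets")
```

Combining criteria narrows the candidates (author plus subject returns one letter, subject alone returns both), and the person doing the work then decides which, if any, deserve a link.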
131906 -
131907 -
1320 -

SUBJECTS
Computational Linguistics
Subject Indexing
Automated Topic Ontology Mapping
AI Cannot Think No Biological Drives
Engine Knowledge SDS Augment Intelligence
Thinking, Reasoning, Automated
Speech Pattern Mapping, LANL 000331

1810 -
181001 -  ..
181002 - Speech Pattern Mapping, Recognition Supports Subject Management
181003 -
181004 - Apart from the Xerox PARC project, ref SDS 0 2405, John's work, per
181005 - above, ref SDS 0 2475, could be very valuable for enhancing the SDS
181006 - subject management capability for alerting people to consider making
181007 - links based on text patterns.
181008 -
181009 -     [On 000501 cite this record to indicate understanding of LANL
181010 -     objectives and possible fit with SDS. ref SDS 21 2278
181011 -
181012 -     [On 000623 Jack Park proposes similar capability. ref SDS 23 2915
181013 -
181014 - Building ontologies based on repetitive text patterns, and possibly
181015 - some rules of hierarchy, could greatly reduce the time required to
181016 - manage subjects, which is one of the two engines of knowledge, without
181017 - which, no useful management can occur.  That is what got my attention
181018 - initially on 000314 discussing the ability to manage subjects (also,
181019 - ontologies). ref SDS 14 4972
181020 -
181021 - Part of this effort entails defining a baseline epistemology that
181022 - provides a framework for assigning subjects to different chunks of
181023 - information in Knowledge Space, per recent review on 000311.
181024 - ref SDS 11 0783  Essentially, we need to jump-start people using SDS by
181025 - providing a first order subject framework that reduces the level of
181026 - effort for organizing information.
181027 -
181028 -
181029 -  ..
181030 - Mathematics to Assess Alignment that Reduces Meaning Drift
181031 -
181032 - On 000314 explained possibility for developing mathematical algorithms
181033 - and software to support alignment of information with contextual
181034 - domains defined by organic subject structure. ref SDS 14 2838  This
181035 - would improve productivity by reducing the time needed to find and
181036 - create connections of cause and effect, i.e., generate useful
181037 - knowledge.
181038 -
181039 - It would accomplish the aims of NSF reviewers reported on 991213 to do
181040 - something helpful with computational linguistics, ref SDS 8 3234, for
181041 - Com Metrics, which is not really available yet, as reported on
181042 - 000110. ref SDS 9 0001
181043 -
181044 -      [On 000424 Sandy Klausen proposes the purpose of Doug's DKR
181045 -      project is to automate linguistic pattern matching. ref SDS 19
181046 -      4212
181047 -
181048 -      [On 000503 Jack Park recognizes potential for using mathematics
181049 -      to process SDS records. ref SDS 22 6860  He suggests Henry Liu
181050 -      develop an idea proposed by Cliff Joslyn, reported previously on
181051 -      000428. ref SDS 20 5722
181052 -
181053 -
181054 -
181055 -
181056 -
181057 -
181058 -
181059 -
181060 -
181061 -
Distribution. . . . See "CONTACTS"