THE WELCH COMPANY
440 Davis Court #1602
San Francisco, CA 94111-2496
415 781 5700
rodwelch@pacbell.net
S U M M A R Y
DIARY: October 1, 1999 04:57 PM Friday;
Rod Welch
Little mistake exploded into $150M loss on Mars.
1...Summary/Objective
.............Simple Error Doomed Mars Polar Orbiter
2...Intelligence Avoids Mistakes: Analysis, Alignment, Feedback, Summary
3...Communication Metrics Supports Concurrent Discovery of Alignment
4...Risk Management Identifies Small Risks Before They Grow to Disaster
......Email Likely Caused Crash on Mars
5...Communication Primary Cause of Mistakes, SDS Alignment Needed
..............
CONTACTS
0201 - Intel Corporation O-00000704 0201
020101 - Mr. Morris E. Jones; Director of Architecture
SUBJECTS
Risk Communication Main Factor of Management Success
Rework Cycle Due to Miscommunication
Difficult to Calculate, Difficult to Believe
Mistakes Avoided, Saves Money, Lawsuits
Not Enough Time to Succeed
Denial Managers Making Mistakes Reject Savings
Communication Biggest Risk of Mistakes
Traceability & Aligning Communications
Engineering Management Mistakes
Simple Error Doomed Mars Polar Orbiter
2612 -
2612 - ..
2613 - Summary/Objective
2614 -
261401 - Follow up ref SDS 60 0000, ref SDS 51 0000.
261402 -
261403 - The front page headline in the San Francisco Chronicle today has an
261404 - article on....
261405 -
261406 -
261407 - Simple Error Doomed Mars Polar Orbiter
261408 -
261409 -
261410 - ...causing $125M loss of a space craft, ref OF 8 0001, plus $100M more
261411 - in planning and management costs, because engineers submitted
261412 - information on guidance using pounds, and the computer at another
261413 - location was programmed to use grams under the metric system.
261414 - ref OF 8 4028
261415 -
261416 - [On 991008 contacted author of article. ref SDS 68 0001]
261418 - ..
261419 - [On 991207 President announces initiative to reduce medical
261420 - mistakes, ref SDS 69 0889; letter to Intel cites further
261421 - problems in NASA's Mars program. ref SDS 70 2881]
261423 - ..
261424 - [On 000328 report on NASA problems cites continual bumbling from
261425 - striving to implement TQM objective -- faster, better, cheaper.
261426 - ref SDS 71 0001]
261428 - ..
261429 - [On 000822 Intel trying to improve management. ref SDS 73 019R]
261431 - ..
261432 - [On 001011 Firestone and Ford trying to improve management to avoid
261433 - product defects causing accidents, cost $450M. ref SDS 74 0001]
261435 - ..
261436 - NASA officials report Lockheed should have used metric units.
261437 - ref OF 8 0450
261439 - ..
261440 - However, the larger flaw was failure to add metrics to communication
261441 - which aligns information with original sources under ISO criteria
261442 - requiring traceability to original sources, reviewed on 950721.
261443 - ref SDS 8 1740 Here, guidance data did not align with submission
261444 - requirements, which is an engineering management failure.
261446 - ..
261447 - As a result, Lockheed's communication lacked context, and so was clear
261448 - and concise, but not complete. On 950412 executives worry that
261449 - project managers do not tell the truth, yet insist communication be
261450 - limited to 25 words or 30 seconds, i.e., cursory, due to limited time.
261451 - ref SDS 7 3920
261452 - ..
261453 - On 970524 case study found small communication mistakes caused
261454 - Challenger Space Shuttle crash in 1986. ref SDS 18 7298
261456 - ..
261457 - On 990524 proposed Communication Metrics to improve engineering
261458 - management with traceability to original sources. ref SDS 36 0876 On
261459 - 990525 this could not be used because engineers don't like to write.
261460 - ref SDS 37 0966 On 990817 managers need commitment before getting
261461 - support that makes it easier to align work with requirements.
261462 - ref SDS 58 6829 This reflects cultural forces that require change in
261463 - attitude to improve management, reported on 990527. ref SDS 38 1233
261464 - ..
261465 - $125M is an expensive "attitude" problem at IBM, Turner, USACE,
261466 - SFIA... everywhere. How and where to begin solving "attitude"???
261467 -
261468 - [On 991006 "attitude" caused medical mistakes. ref SDS 65 3596]
261469 -
261470 - [On 991007 cited Mars incident in letter to Dave at Intel on using
261471 - Com Metrics for risk management because communication is the
261472 - biggest risk in enterprise. ref SDS 66 0001]
261474 - ..
261475 - [On 991008 wrote to Millie's daughter Pam, about setting a good
261476 - example to develop constructive "attitude" for her son, who is 4
261477 - years old. ref SDS 67 4140]
261478 -
261479 -
261480 -
2615 -
SUBJECTS
Little Deviations Lead to Big Problems
Murphy's Law, Avoiding Mistakes Requires More than Luck
Risk Caused by Complexity which causes Uncertainty
Intelligence Competence Enhanced Knowledge Space Saves Time Money Acc
Align Communications to Maintain Shared Meaning from Meetings, Calls
Align Avoid Rework Mistakes Subjects Discover all Factors
Communication Biggest Risk Enterprise Too Many Problems Stock Market
Error Too Small to Notice by Observing Daily Operations
Email Likely Caused Crash on Mars
5213 -
521401 - ..
521402 - Intelligence Avoids Mistakes: Analysis, Alignment, Feedback, Summary
521403 - Communication Metrics Supports Concurrent Discovery of Alignment
521404 - Risk Management Identifies Small Risks Before They Grow to Disaster
521405 -
521406 - The error was too small to notice by observing daily operations
521407 - following launch of the Mars space craft; but, it multiplied manyfold
521408 - over the months of space travel, and this caused the catastrophic end.
521409 - ref OF 8 6958
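The arithmetic behind an error that is "too small to notice" day to day, yet catastrophic after months, can be sketched as follows. This is a minimal illustration that assumes the same small error is repeated at every step and simply adds up; the function names and the figures are hypothetical and are not taken from mission data or from the Chronicle article.

```python
def accumulated_drift(error_per_step: float, steps: int) -> float:
    """Total drift when the same small error is applied at every step.

    Assumption: the error is constant and simply adds up (linear growth).
    """
    return error_per_step * steps


def steps_until_noticeable(error_per_step: float, tolerance: float) -> int:
    """Smallest number of steps after which the drift exceeds the tolerance."""
    if error_per_step <= 0:
        raise ValueError("error_per_step must be positive")
    steps = 0
    while accumulated_drift(error_per_step, steps) <= tolerance:
        steps += 1
    return steps


if __name__ == "__main__":
    # Hypothetical figures: an error of 0.5 units per step against a
    # tolerance of 100 units stays inside the tolerance for many steps.
    for steps in (1, 10, 100, 1000):
        print(steps, accumulated_drift(0.5, steps))
    print("exceeds tolerance at step", steps_until_noticeable(0.5, 100.0))
```

Under these assumptions each single step looks harmless, which is why observation of daily operations does not reveal the problem; only comparison against the original requirement does.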
521411 - ..
521412 - This fits Aristotle's point cited in the NWO... paper. ref OF 10 6056
521414 - ..
521415 - On 970524 case study found communication mistakes caused Challenger
521416 - Space Shuttle crash in 1986. ref SDS 18 7298
521417 -
521418 - [On 010725 analogy of a train wreck because bolts are out of spec;
521419 - cannot see wheels wobbling. ref SDS 75 02EW]
521421 - ..
521422 - Since information overload increases the risk of "meaning drift," only
521423 - proactive Risk Management can maintain alignment of communication, as
521424 - reviewed on 960518. ref SDS 11 3734
521425 -
521426 - On 970524 case study linking thousands of communications over 10
521427 - years showed Challenger Space Shuttle disaster in 1986 was caused by
521428 - small problems that grew due to the telephone game. ref SDS 18 7298
521430 - ..
521431 - On 940611 a case study of an oil tanker that sank at sea, shows
521432 - how "expediting" to avoid paperwork relies on conversation that
521433 - ignores good management practice to align communications,
521434 - ref SDS 3 2066, specified by ISO criteria, reviewed on 950721.
521435 - ref SDS 8 1740 Lack of alignment causes small mistakes that are
521436 - overlooked and grow over time into big catastrophes that are
521437 - attributed to Murphy's Law, explained in the NWO... paper.
521438 - ref OF 10 9449
521440 - ..
521441 - The letter on 990924, ref DIP 2 1680, cites this same cause of medical
521442 - mistakes. ref SDS 62 0001
521444 - ..
521445 - Email Likely Caused Crash on Mars
521446 -
521447 - NASA engineers overwhelmed by information overload, reported by
521448 - CBS News on 60 Minutes, see 980412, ref SDS 29 8956, likely fell
521449 - behind and got fouled up trying to make up time, (see later
521450 - example reported on 011006, ref SDS 76 O99K), as occurred on the
521451 - Challenger Space Shuttle that crashed in 1986, reported 921021.
521452 - ref SDS 2 4499 Trying to catch up by "expediting," the engineers
521453 - used email to "collaborate," which omitted units of measure, as
521454 - people often do in email, as explained in POIMS. ref OF 9 YF5L,
521455 - causing failure and loss of a $125M space craft, reported today.
521456 - ref SDS 0 0001
521458 - ..
521459 - This seems likely because, if Lockheed had given the wrong units
521460 - of measure along with the figures, it likely would have been
521461 - noticed by the NASA people in Pasadena based on the report that
521462 - this group has used metrics for a long time. ref OF 1 0820
521463 - Evidently cursory methods were used to expedite. Email is a
521464 - popular method to "expedite" where people assume common
521465 - understandings and omit information, as occurred here.
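The point that a wrong unit stated with the figure would likely be noticed, while a bare figure is not, can be sketched in code. This is only an illustration under stated assumptions: the `Measurement` type, the `receive` function and the unit labels are hypothetical names, and the pound/gram pairing follows the description in this record rather than any NASA or Lockheed interface. The conversion factor is the defined value of the avoirdupois pound.

```python
from dataclasses import dataclass

# Conversion factors to grams. 453.59237 g per pound is the defined value
# of the international avoirdupois pound.
TO_GRAMS = {
    "g": 1.0,
    "lb": 453.59237,
}


@dataclass(frozen=True)
class Measurement:
    """A figure that always travels with its unit of measure (hypothetical type)."""
    value: float
    unit: str

    def in_grams(self) -> float:
        if self.unit not in TO_GRAMS:
            raise ValueError(f"unknown unit of measure: {self.unit!r}")
        return self.value * TO_GRAMS[self.unit]


def receive(figure) -> float:
    """Accept a submitted figure only if it states its unit.

    A bare number, as often sent in a cursory message, is rejected
    instead of being silently read in the receiver's own units.
    """
    if not isinstance(figure, Measurement):
        raise TypeError("figure submitted without a unit of measure")
    return figure.in_grams()


if __name__ == "__main__":
    print(receive(Measurement(1.0, "lb")))  # converted on receipt
    try:
        receive(1.0)  # bare figure, unit omitted
    except TypeError as err:
        print("rejected:", err)
```

The design choice is that the receiver refuses to guess: a stated but unexpected unit is converted, and an omitted unit stops the exchange before the figure is used.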
521466 -
521467 - [On 000505 email proposed as core of KM system. ref SDS 72
521468 - 4392]
521469 - ..
521470 - Risks of conventional email are explained in the letter on
521471 - medical mistakes, ref DIP 2 1045
521472 - ..
521473 - The Chronicle reports today that Lockheed is checking their
521474 - contract with NASA to see if units of measure were specified.
521475 - ref OF 8 4148 This applies traceability to original sources, see
521476 - again ISO criteria, ref SDS 8 1740, also, called "alignment" to
521477 - explain the role of "intelligence" in POIMS. ref OF 9 0582 However,
521478 - it is after-the-fact. Proactive Risk Management recognizes it is too
521479 - late to discover requirements after the ship has crashed on Mars, or
521480 - at sea, or on the operating table.
521482 - ..
521483 - Risk Management needs Concurrent Discovery supported by SDS that adds
521484 - and maintains alignment to make communication effective, developed on
521485 - 960620. ref SDS 12 1101
521487 - ..
521488 - On 951212 a study on Risk Management reviewed communication alignment
521489 - as the most difficult requirement to maintain. ref SDS 10 8870
521490 - ..
521491 - Lockheed's conduct reflects NWO... sequence of discovering
521492 - correct alignment after disaster strikes. ref OF 10 0645
521494 - ..
521495 - On 990912 articles on medical mistakes say hospitals use Risk
521496 - Management after costly accidents occur, to avoid liability,
521497 - rather than use management to reduce risk of mistakes.
521498 - ref SDS 60 0165
521500 - ..
521501 - On 990925 Intel chip set delayed again due to poor management,
521502 - ref SDS 63 0001, reflecting report on 970603.
521504 - ..
521505 - On 960620 Concurrent Discovery was developed to enable review and
521506 - alignment with the record before mistakes occur. ref SDS 12 1101
521507 -
521508 -
521509 -
521510 -
521511 -
521512 -
5216 -
SUBJECTS
Communication Biggest Risk in Enterprise, Dilemma
90% Managers Time Communication
Risk Management Complexity Mistakes Communication
Too Many Problems Stock Market Crashes Downsizing Information Overloa
Bumbling Information Overload Highway Compounds Impact of Mistakes Re
Communication Biggest Risk of Enterprise Paradigm Shift of Millennium
$125M Mars Program Failed NASA Needs Loss Avoidance Communication Met
Communication Metrics Avoid Bumbling Discover & Fixes Mistakes
Align Avoid Rework Mistakes Subjects Discover all Factors
6211 -
6212 - 2128
621301 - ..
621302 - Communication Primary Cause of Mistakes, SDS Alignment Needed
621303 -
621304 - Called and discussed space craft loss of $125M, per above, ref SDS 0
621305 - 0001, and related communication issues, per above, ref SDS 0 3192,
621306 - with Morris.
621307 -
621308 - We recalled our meeting on 960721 where Morris advised that Lockheed
621309 - had communication difficulty on a project with Intel and Chips.
621310 - ref SDS 14 0896 Possibly the same people or processes caused the
621311 - space craft to crash on Mars, because a small deviation grew into a
621312 - major problem over time, causing the loss of $125M. ref SDS 0 0001 On
621313 - 950303 Morris felt SDS could help avoid communication problems that
621314 - Chips people encountered at a meeting in Paris. ref SDS 5 3333
621315 - ..
621316 - I mentioned similarities between loss due to mistakes by NASA
621317 - and Lockheed, and public reports on 990925 of Intel's problems
621318 - releasing the 820 chipset. ref SDS 63 0001 Previously, on 990226
621319 - Intel delayed release of the chipset to September. ref SDS 32 0001
621320 - Now that target has come and gone.
621322 - ..
621323 - Morris advised this matter is very secret. He feels SDS can't help
621324 - because communication was not the cause of Intel's delayed release of
621325 - the 820 chipset. He cited 30,000 or so emails that have been sent to
621326 - ensure effective communication on this problem.
621328 - ..
621329 - We reviewed my letter on 990718 explaining email is cursory and
621330 - incomplete. ref SDS 48 5251 Morris reported on 980722 that people
621331 - read and write email during long meetings at Intel. ref SDS 30 0464
621332 - Meeting notes consist of Power Point slide presentations, ref SDS 30
621333 - 4826, rather than alignment of what is discussed with requirements,
621334 - objectives and history, as reported by Dave Vannier on 970603.
621335 - ref SDS 19 5803
621336 - ..
621337 - On 951212 a study on Risk Management found that "understanding"
621338 - and "problem handling" are integral to communication which is the
621339 - biggest risk factor that causes mistakes. ref SDS 10 4433 Alignment,
621340 - also called "traceability to original sources" is a key aspect of
621341 - "understanding." ref OF 10 4212 On 970910 executives reported not
621342 - having time to think, which reduces understanding. ref SDS 21 3479
621344 - ..
621345 - On 950927 Dave Vannier reported that email is Intel's least effective
621346 - business system. ref SDS 9 4939 On 970603 Dave related the need to
621347 - maintain alignment of communications at Intel. ref SDS 19 5803
621349 - ..
621350 - Morris and I recalled this evening our discussion on 980722 about the
621351 - practice of reading and answering email during meetings, ref SDS 30
621352 - 0464, which causes cursory analysis and understanding set out in a
621353 - letter on 990718. ref SDS 48 5251
621354 - ..
621355 - I asked if Morris has read the letter explaining the cause and
621356 - solution to management mistakes, based on articles published the past
621357 - month about the high cost of medical mistakes? ref DIP 2 0001 It was
621358 - linked in the letter, ref DIP 3 0899, sent yesterday. ref SDS 64 5974
621360 - ..
621361 - Morris advised that he has not had time to read the letter, since our
621362 - call yesterday. ref SDS 64 5696
621364 - ..
621365 - The letter uses the "Telephone game" to illustrate how dialog and
621366 - documents that are not aligned compound error, as in the Lockheed
621367 - space craft crash on Mars. ref SDS 0 3192 Email is worse than dialog
621368 - and documents, because it is a stream-of-consciousness rendering that
621369 - is devoid of alignment by virtue of sheer volume. Email is a series of
621370 - momentary impressions, that distribute errors faster and wider than
621371 - ordinary guess and gossip. ref DIP 2 1045 When a memo or letter is
621372 - printed and signed there is some level of review between admin and
621373 - author. This reflection does not occur in email.
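How the "Telephone game" compounds error can be put in rough numbers. The sketch below assumes, purely for illustration, that each relay of a message independently introduces an error with the same probability and that nothing is checked against the original source along the way; the function name and the 5% figure are hypothetical and do not come from the letter or the case studies.

```python
def chance_intact(error_rate: float, relays: int) -> float:
    """Probability a message survives `relays` hand-offs with no error.

    Assumption: each hand-off independently garbles the message with
    probability `error_rate`, and no hand-off is reviewed against the
    original source.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    if relays < 0:
        raise ValueError("relays must not be negative")
    return (1.0 - error_rate) ** relays


if __name__ == "__main__":
    # Hypothetical 5% chance of error per hand-off.
    for relays in (1, 5, 10, 20, 50):
        print(relays, round(chance_intact(0.05, relays), 3))
```

Under these assumptions the chance of an intact message falls steadily with every unchecked hand-off, which is the sense in which small errors compound over meetings, calls and email.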
621374 - ..
621375 - We reviewed examples from the Broadwater Dam project where
621376 - errors in communication were repeated and compounded over months and
621377 - years through endless meetings, calls, fax and email. See case study
621378 - on 990316. ref SDS 34 3088
621380 - ..
621381 - Morris said Intel has a lot of sophisticated and powerful business
621382 - metrics to monitor mistakes.
621384 - ..
621385 - Intel does not have a system of metrics for communication, which is
621386 - the biggest cause of errors in human endeavors.
621388 - ..
621389 - On 921021 a JPL executive reported at a Cal Tech seminar that the
621390 - Challenger Space Shuttle crashed in 1986 because JPL and NASA business
621391 - metrics, which are sophisticated and powerful, were inadequate.
621392 - ref SDS 2 4499 On 921021 this was still a problem. ref SDS 2 4390 On
621393 - 960712 Dave Vannier reported at Asilomar that technology was making
621394 - the problem worse. ref SDS 13 1552
621395 - ..
621396 - On 940611 the Asilomar Conference reported communication
621397 - problems that caused the loss of an oil tanker at sea, costing $500M.
621398 - ref SDS 3 8473
621400 - ..
621401 - On 970524 Morris reported having attended a Cal Tech seminar that
621402 - traced the "root cause" of the Challenger Space Shuttle disaster to
621403 - communication failure. ref SDS 18 4401 This cost billions of dollars.
621405 - ..
621406 - On 961218 the U.S. Army Corps of Engineers made a decision that likely
621407 - resulted in a loss of $6M, without realizing it, that was outside the
621408 - system of business metrics, because counsel and executives do not use
621409 - these methods. ref SDS 15 5790
621411 - ..
621412 - On 970405 a meeting at USACE illustrates how lack of alignment in
621413 - communication causes meetings to fail, despite a lot of email, as used
621414 - at Intel. ref SDS 17 0001
621415 - ..
621416 - On 990525 Morris reported that engineers and managers cannot use
621417 - Communication Metrics because they don't like to write, and are
621418 - anxious to perform engineering and management. ref SDS 37 0966 On
621419 - 990817 Morris reported that people need a commitment to diligence in
621420 - order for business systems to be effective. ref SDS 58 6829
621422 - ..
621423 - On 990625 Fortune reported communication is biggest cause of failure
621424 - by CEOs, and that "psyche" prevents CEOs from writing copious notes,
621425 - ref SDS 41 4914, which Andy Grove says (reviewed on 980307) removes
621426 - ambiguity of mental maps that otherwise cause errors. ref SDS 27 3668
621428 - ..
621429 - This record suggests that the design problem and consequent delay in
621430 - releasing the 820 chipset likely can be traced to communication that
621431 - was not aligned with requirements, because SDS is the only system that
621432 - aligns communication over time, and it is not used at Intel.
621433 - ..
621434 - The only challenge is overcoming denial, cited by Andy Grove as
621435 - the "inertia of success" reviewed on 980307. ref SDS 26 3740
621437 - ..
621438 - As things stand, only the CEO has authority to generate intelligence
621439 - in an organization, as reviewed on 980307, ref SDS 27 8488, and most
621440 - of them don't want to do this work because of "psyche."
621441 -
621442 - $125M is a lot of money for psyche.
621443 -
621444 -
621445 -
621446 -
621447 -
621448 -
6215 -
Distribution. . . . See "CONTACTS"