Birth of a Thinking Machine
By MICHAEL A. HILTZIK, Times Staff Writer
..
AUSTIN, Texas--Popular culture has long held strong opinions about what
the world's smartest machine should look like. There's the unblinking red
eye of HAL, the brilliant, homicidal computer of Stanley Kubrick's "2001: A
Space Odyssey"; the gilded humanoids of pulp sci-fi; and the flashing
lights and gleaming boxes of countless doomsday scenarios.
..
But it's a safe bet that nobody has imagined artificial intelligence
the way it is taking shape inside a low-slung brown brick building hidden
deep within a leafy research park north of town. Yet here beats the heart
of the system known as Cyc.
..
For 17 years a small band of engineers and programmers has been
slaving away at the task of teaching Cyc much of what a human being knows.
(The name comes from "encyclopedia" and is pronounced "psych.") The idea,
as articulated by the project's creator, computer scientist Douglas B.
Lenat, has been to create the most sophisticated artificial intelligence
system ever devised--the closest a computer has come to replicating the
human brain's reasoning, learning ability and perhaps even its
consciousness.
..
Whether Lenat and his team have achieved that goal is about to be
subjected to public scrutiny. For years, Cyc has largely been kept under
wraps, seen and used only by specialists and a handful of commercial firms
and government agencies licensed by Cyc's corporate owner, Cycorp Inc.
But Cyc is about to have its coming-out party. Late this summer (soon
after Steven Spielberg's film "A.I." presents moviegoers with the latest
fictional personification of thinking machines), Cycorp will release to the
public a sizable portion of Cyc's so-called knowledge base: the body of
assertions and inferences that corresponds to its heart and soul.
..
The release, under the name OpenCyc, will allow people to use the
limited knowledge base for free, browsing it via a Web site or
incorporating it in applications ranging from speech-recognition software
to database searches.
Users can even supplement the existing knowledge base with new facts
and concepts, although these will be subject to review by a technical
board. The full knowledge base (about 20 times the size of the public
portion) will be licensed to commercial users for a fee. The company also
will release two stand-alone applications, one to identify security holes
in large computer networks and another designed to answer queries posed in
natural language.
..
Cyc already exhibits a level of shrewdness well beyond that of, say,
your average computer running Windows. In one recent demonstration for a
Defense Department project, a Cycorp engineer informed the system they
would be discussing anthrax. Cyc responded: "Do you mean Anthrax (the heavy
metal band), anthrax (the bacterium), or anthrax (the disease)?" Asked to
comment on the bacterium's toxicity to people, it replied: "I assume you
mean people (homo sapiens). The following would not make sense: People
Magazine."
..
A Lot of Time, a Lot of Money
Getting to that point has not been easy. The project already has
consumed an estimated 500 person-years and $50 million in investments from,
among others, the Defense Department, the pharmaceuticals company
GlaxoSmithKline, and Microsoft co-founder Paul Allen.
..
The scale of the project elicits awe-struck appreciation from
supporters and critics alike.
"Having an encyclopedic knowledge base that would cover all of common
sense is an absolutely critical goal in AI," says Benjamin J. Kuipers,
chairman of the University of Texas computer science department. A former
advisor to the Cyc project, Kuipers disagrees with some aspects of Lenat's
technical approach, but acknowledges: "Doug was the guy with the guts to do
this, and he deserves a lot of credit."
..
The system today encompasses more than 1.4 million
assertions--hundreds of thousands of root words, names, descriptions,
abstract concepts, and a method of making inferences that allows the system
to understand that, for example, a piece of wood can be smashed into
smaller pieces of wood, but a table can't be smashed into a pile of smaller
tables. As a software program, Cyc is not embodied in any physical thing: A
visitor to Cycorp would see only cubicles filled with programmers
contemplating conventional computer monitors displaying Cyc's "knowledge."
..
An intelligent system on this scale, Cycorp officials contend, has
countless potentially profitable applications. "My vision of this company
is to be the Intel of intelligent software," says Dwight Lodge, chief
executive of Cycorp. "I'd like to have [Cyc inside] a whole range of
existing applications."
As an experimental project, Cyc emerged as a response to an
intellectual crisis in the field of artificial intelligence. The field had
taken shape as a formal science in the mid-1950s, but after 30 years its
goals had proved elusive.
..
Projects from chess-playing computers to robots that could haltingly
negotiate uneven terrain took years longer to achieve than expected. Some
highly touted AI programs could exceed human performance only in certain
narrow fields--such as diagnosing diseases where the choice among possible
causes was limited. Lenat, then a member of the Stanford University
computer faculty, thought he understood the problem. Expert systems were
"not savants, but idiot savants," he said. They lacked the basic
information possessed by an average 8-year-old human: Green is not blue;
sweating makes you wet; a car can rust but not run a fever. They broke down
when they were asked questions that relied, however subtly, on such mundane
observations.
..
The machines had been stuffed with the wrong kind of knowledge.
Computers had become repositories of the esoteric facts one found in
reference works, textbooks and the brains of experts. What was missing was
the vast fabric of ordinary facts and observations that humans acquire and
use almost subconsciously and without which life would be
unintelligible--what Lenat terms "common sense."
This common sense deficit, Lenat argued, is what makes computer
intelligence so shallow. It is why the same system capable of mapping the
trajectory of a hurtling ICBM can be brought to a grinding halt by a
trivial misspelling in a typed command--one that a child would
disregard--or why a computer familiar with human diseases and asked what
ails a rusting car is likely to answer: "measles."
..
Lenat's solution was to program the computer not with the basic
information one finds in the library, but with all the information "the
author of an article assumes the reader already knows."
Many people in AI had accepted that the inability to represent common
sense was an obstacle, but few had confronted the horrifying necessity of
entering so much of it by hand. Lenat resolved to shoulder a task he later
called "a 20-year detour to pull the mattress off the road so the traffic
can flow."
..
Whether the result adds up to genuine intelligence, much less the
elusive quality we call "consciousness" or "mind," touches on one of the
central debates in artificial intelligence. Within the contentious
community of AI researchers, Cyc has drawn its share of criticism.
Some critics argue that Cyc's focus on sorting facts and observations
into logical categories, even given its powerful ability to make
inferences, is too restrictive.
..
"I don't believe in the idea that intelligence is founded upon having
vast amounts of facts about the world," says Douglas R. Hofstadter, the
Pulitzer Prize-winning author of "Gödel, Escher, Bach: An Eternal Golden
Braid," a study of human creativity and artificial intelligence.
"Intelligence is about making decisions based on imperfect knowledge and
among partially good choices."
But others believe that the exponential surge in the processing power
and speed of computers in recent years may finally be enough to give
systems such as Cyc the critical mass they need to cross the consciousness
threshold.
..
"We finally are getting to the point where machines will be able to do
what the human brain alone can do," says James C. Spohrer, chief technical
officer of IBM's venture capital relations group, who has studied Cyc's
potential as a commercial project. "The time feels right."
Lenat believes that systems such as Cyc could replicate human
cognition closely enough to simulate consciousness, emotion and motivation.
Some of these qualities already have appeared in the system, he says.
..
"Cyc has goals, long- and short-range," he says. "It has an awareness
of itself. It doesn't 'care' about things in the same sense that we do, but
on the other hand, we ourselves are only a small number of generations away
from creatures who operated solely on instinct."
The potential applications of such a discerning system are vast, he
says: Systems could converse with their users in plain English or perform
accurate translations. Or automated systems could be entrusted with
life-and-death responsibilities. "Will you let a robot around your house if
it doesn't understand the [relative] value of things, like a moth versus a
baby?" Lenat asks.
..
Cyc already has displayed the ability to identify common-sense
absurdities. "Cyc already knows that people have to be a certain age before
they're hired for a job," Lenat says, meaning that it could clear such
inaccurate entries as mistaken birth dates from corporate payroll records.
Cyc also can extract and compile facts scattered among diverse sources of
information and use them to draw conclusions--in one test responding to a
request for an image of people relaxing by turning up a photo of some men
holding surfboards.
..
An Ongoing Dialogue With Colleague
The center of all this activity is an exceedingly unusual place, even
for an emerging technology company. Cycorp's 65-member staff engages in a
dialogue day and night with its unremittingly curious electronic
colleague.
..
Most of Lenat's programmers are trained not in computer engineering
but in fields related to logic and human thought: The staff includes about
20 philosophers and smaller teams of experts in subjects ranging from
theology to physics.
Among them is Charles Klein, 33, a University of Virginia-trained
metaphysician who joined Cycorp in 1999 after finding its want-ad for
"ontological engineers" in a meager professional quarterly called Jobs for
Philosophers.
..
In a room he shares with a large monitor displaying Cyc's
characteristic rows of logical queries and responses, Klein spends hours
inculcating the system with such abstract concepts as "belief"--a difficult
notion for a computer program to grasp, possibly because it has more to do
with point of view than with anything true or false about the real world.
"People who do this enjoy the process of decoding thought," he says of
his daily routine of typing assertions into Cyc's database and replying to
the computer's minute requests for clarifications. It is the kind of work
that only a specialist could love. "Take the phrase, 'I like to go
shopping,' " Klein says. "Connecting each word to a concept is fascinating
to any philosopher who's interested in the structure of thought and
inference."
..
One thing everybody agrees on is that Cyc would never have got off the
ground, much less stayed aloft for 17 years, if not for Doug Lenat.
A former wunderkind of computer science, Lenat is now 50. Brash,
barrel-chested and with an unruly mop of black hair, he has a distinctive
way of interrupting his technical explanations with a wide smile, as though
delighted by his own perspicacity.
..
"If I have an idea, he's one of the five people I can expect to
understand it and see what's wrong with it," says Marvin Minsky, a
professor at the Massachusetts Institute of Technology and one of the
pioneers of the field.
Lenat burst upon the AI scene in the 1970s with a string of
now-legendary programming feats. The first, which became the basis of his
Stanford doctorate, was a program called Automated Mathematician, or AM.
The program was designed to learn not by being fed a diet of new facts but
by "discovery"-- given a handful of starting principles, it was to search
for new ones.
..
AM started with 78 basic concepts such as mathematical sets and 243
"rules of thumb" for making hypotheses, judging the intellectual value of
its discoveries on a scale of 0 to 1,000, and so on. If it found something
intriguing, such as multiplication, it looked for its inverse, thus
discovering division.
Launched into action, the system rapidly evolved into a sophisticated
scholar engaged in abstruse speculations in mathematical theory. Every so
often AM would latch onto a concept so obscure that Lenat would believe it
to be original, only to find that it already had been discovered--often by
some real-life theoretical genius.
..
Eventually, AM acquired an uncommon ailment for a computing system:
intellectual exhaustion. Having explored the esoteric reaches of
mathematics, AM suddenly downshifted into a preoccupation with rudimentary
arithmetic. Finally, with the remark, "Warning! No task on the agenda has
priority over 200," the system virtually expired, as though from boredom.
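
AM's parting message suggests an agenda of tasks ranked by priority, with work stopping once nothing left scores above a cutoff. A minimal Python sketch of that idea follows. It is an illustration under assumptions, not Lenat's program: the function run_agenda, the task names and their scores are invented, and only the 0-to-1,000 scale and the cutoff of 200 come from the account above.

```python
import heapq

def run_agenda(tasks, threshold=200):
    """Work through (name, priority) tasks from highest priority down.

    Hypothetical sketch: stops as soon as no remaining task has a
    priority above `threshold`, and returns the completed task names
    plus the number of tasks left untouched.
    """
    # heapq is a min-heap, so store negated priorities to pop the largest first.
    heap = [(-priority, name) for name, priority in tasks]
    heapq.heapify(heap)

    completed = []
    while heap and -heap[0][0] > threshold:
        _, name = heapq.heappop(heap)
        completed.append(name)
    return completed, len(heap)

# Invented example tasks, scored on the 0-to-1,000 scale.
agenda = [
    ("look for the inverse of multiplication", 700),
    ("re-examine simple addition", 150),
    ("generalize a promising concept", 450),
]
done, left = run_agenda(agenda)
if left:
    print("Warning! No task on the agenda has priority over 200")
```

Here the two high-scoring tasks run and the low-scoring one is left on the agenda, which is the point at which the warning is printed.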
With his next program, Eurisko, Lenat attempted to repair the glitches
that had caused AM's ennui. Eurisko's spectacular debut came at the 1981
Traveller Trillion Credit Squadron tournament, a futuristic war game that
attracted players nationwide. Having been fed the game's 100-page rule
book, Eurisko exploited previously unnoticed loopholes, crafting innovative
spaceship designs and ingeniously novel strategies to deploy a small,
nimble fleet.
..
Eurisko easily blew the competition out of the heavens, a feat
duplicated the following year. Before the 1983 competition, the organizers
informed Lenat that if he chose to enter again, they would cancel the
games. He retired gracefully, holding the rank of intergalactic admiral.
In 1984, after he had formulated the design principles that would
underlie Cyc, he was invited to set up the project at the Microelectronics
& Computer Technology Corp., a research consortium founded in Austin by a
group of high-tech companies. (MCC spun off Cycorp in 1995.)
..
His first step was to design a framework of concepts and categories
akin to those of a standard thesaurus. The concept "space," for example,
might encompass terrain, elbow room and nothingness, as well as city,
country and town. Within each frame his team would place the appropriate
common-sense assertions: Terrains can vary, cities have borders, and so on.
As implemented by teams of programmers that occasionally exceeded 100,
the knowledge base grew exponentially. Each new assertion had to be
categorized and its contradictions with other assertions resolved: Cyc
could know both that Dracula was a vampire and that vampires do not exist
only by understanding that the first was true in a fictional world and the
second in the physical world. This was done by positioning each assertion
in its own "microtheory," or context, such as fiction versus fact.
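
The microtheory idea can be illustrated with a minimal Python sketch. It is not Cyc's actual representation or code; the class name ContextualKnowledgeBase, its methods and the plain-string statements are all hypothetical, chosen only to show how filing each assertion under a context lets two seemingly contradictory claims coexist.

```python
class ContextualKnowledgeBase:
    """Toy store of assertions, each filed under a named context.

    Hypothetical illustration only: a contradiction is rejected
    within one context but allowed across different contexts.
    """

    def __init__(self):
        self._contexts = {}  # context name -> {statement: True/False}

    def assert_fact(self, context, statement, value=True):
        facts = self._contexts.setdefault(context, {})
        if statement in facts and facts[statement] != value:
            raise ValueError(
                f"contradiction in context {context!r}: {statement!r}"
            )
        facts[statement] = value

    def ask(self, context, statement):
        # Returns True, False, or None when the context has no opinion.
        return self._contexts.get(context, {}).get(statement)


kb = ContextualKnowledgeBase()
kb.assert_fact("fiction", "Dracula is a vampire")
kb.assert_fact("fiction", "vampires exist")
kb.assert_fact("physical world", "vampires exist", False)

print(kb.ask("fiction", "vampires exist"))         # True
print(kb.ask("physical world", "vampires exist"))  # False
```

Asserting that vampires do not exist inside the "fiction" context would raise an error, while the same claim sits comfortably in the "physical world" context.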
..
Although not exactly a secret, Cyc remained an enigma to outsiders.
Press reports tended to describe it in terms of the large, glowering
electronic brain of popular fancy, rather than a sophisticated software
system. A local film crew once published a photo purportedly showing the
master computer on the Cycorp premises, a massive hardware unit with
blinking lights they had encountered in a storeroom; in fact, it was the
building's air conditioner.
..
Does It Know People Can Run?
Computer professionals given a chance to interact with the growing
system often came away excited by its capabilities and disillusioned by its
shortcomings. When Vaughan Pratt, a Stanford computer scientist, was
allowed a 2 1/2-hour session supervised by a Cyc engineer in 1994, he came
away impressed by its ability to flag inconsistencies between databases and
perform other selected tasks. But when he attempted to ply Cyc with a
number of common-sense questions such as "Can people run?" and "How big is
the Earth?" he recalled, the system seemed to get tied up searching for
relevant facts within its vast database.
..
"Cyc didn't know what it knew," Pratt said in a recent interview. "The
stumbling block was that there was no mechanism for finding specific
facts." (Cycorp programmers subsequently told him they had fixed the flaw.)
But in other tests, Cyc blew away the competition as decisively as
Eurisko's space cruisers. In July 1998, the Pentagon put Cyc and a dozen
other AI systems through their analytical paces, giving each team a package
of 300 pages of abstruse data to program into their systems and following up
with a series of complicated strategic queries. Cyc scored better than all
the other systems put together, according to the company, leading the
Pentagon to make it the core of a new experimental program aimed at
developing large knowledge bases.
..
Now, three years later, Lenat believes Cyc is much closer to
fulfilling the role of an intelligent system that augments human
capabilities, which after all is the central goal of AI research. "Once you
have a truly massive amount of information integrated as knowledge, then
the human-software system will be superhuman," Lenat says, "in the same
sense that mankind with writing is superhuman compared to mankind before
writing."
..
But confident as he is that Cyc is about to emerge as a truly
intelligent machine, Lenat is thinking hard about the responsibilities
programmers have to ensure the software works exclusively to humans'
advantage.
"HAL killed the ['2001'] crew because it had been told not to lie to
them, but also to lie to them about the mission," he observes. "No one ever
told HAL that killing is worse than lying. But we've told Cyc."
..
Copyright 2001 Los Angeles Times