440 Davis Court #1602
San Francisco, CA 94111-2496
415 781 5700

March 29, 2000

03 00050 61 00032901

Mr. Ross H. Armstrong
U.S. Sales Development
IBM Global Financing
NNNN Executive Parkway
Somewhere, NY 94583

Subject:   IBM's voice recognition

Dear Ross,

Here is a little something on the voice recognition thing you mentioned the other day, along with Hutch's paper on Prometheus. You might mention to Hutch that his paper has been quite a hit at a Colloquium sponsored by Stanford on developing a dynamic knowledge repository. The IBM voice thing came up in connection with some folks trying to create an SDS capability, although they have not figured that out yet.

Here is also a little something on a recent meeting I was asked to attend, though I am not sure why. As you know, once people discover what "knowledge management" really entails, they are not going to want it, as Prometheus found out...

Hope all is well with Claudia, you and the kids.



Rod Welch


Date: Wed, 29 Mar 2000 21:28:57 -0500

From:   Paul Fernhout
Organization: Kurtz-Fernhout Software


Subject:   Speech Input (was: Twiddler chord keyboard)

Neil Scott wrote:

The Archimedes Total Access System interface allows you to use your Palm Pilot to emulate the keyboard and mouse functions on a computer. It is surprisingly convenient to use since the keyboard and mouse functions are performed on the same device. It is very comfortable to use for doing text editing and making corrections. It is infinitely better than the Twiddler.

Sounds interesting. Do you have a URL for this?

IBM has demonstrated speech recognition on the Palm Pilot, which means it is possible to talk, point, and make corrections from a single hand-held device.

I spent 18 months as one of the developers on this particular IBM project. The Personal Speech Assistant is a neat device (IMHO). While I still can't comment in detail on the device, in general the speech recognition is not done on the Pilot itself -- rather it is done on an add-on device which wraps around the Palm. Inside that box is a faster processor. It does not support full dictation -- just command and control of several hundred commands. For more details see:

The PSA is one of several initiatives IBM has in the pervasive computing space. [Note: I don't claim to speak for IBM -- these opinions are my own.]

Other companies like Lernout & Hauspie and Microsoft have recently announced their own hand-held devices you can talk to as well (L&H's does dictation, I believe Microsoft just does command and control). Vendors like AT&T/Lucent have demonstrated devices that do speech recognition (again with the Palm) using wireless link to a centralized server.

Based on industry trends, and the exponential growth in computing power per cost, you can expect to see some amazing things in handheld speech products over the next decade. A dictation product like IBM's ViaVoice or Dragon Dictate takes around 200 MIPS to run well (obviously more is better). A modern low-power CPU like the StrongArm can do about 200 MIPS (and a pair of AAAs can run it at full speed for somewhere around 1/2 an hour).

A new version of the StrongArm coming out in the next few months will do 600 MIPS on 1/2 the power (or 200 MIPS on 1/6 the power). So you can do the math and see what will be available when, as MIPS/Watt doubles every year or so...
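[The "do the math" above can be sketched out. The figures come from the message itself (a 200 MIPS baseline running about half an hour on two AAAs, and MIPS/Watt roughly doubling yearly); the function names and the exact yearly doubling rate are illustrative assumptions, not anything IBM or Intel published.]

```python
# Sketch of the MIPS/Watt projection described above.
# Assumptions (from the message): a StrongArm-class CPU does ~200 MIPS
# at some baseline power, lasting ~0.5 hours on two AAAs at full speed,
# and MIPS/Watt roughly doubles every year.

BASELINE_MIPS = 200          # MIPS at baseline power draw
BASELINE_RUNTIME_HOURS = 0.5 # two AAAs, full speed

def projected_mips_per_watt_factor(years: float) -> float:
    """Improvement factor in MIPS/Watt after `years` of yearly doubling."""
    return 2.0 ** years

def projected_runtime_hours(years: float) -> float:
    """Battery life on the same batteries at a fixed 200-MIPS workload."""
    return BASELINE_RUNTIME_HOURS * projected_mips_per_watt_factor(years)

# The new StrongArm quoted above: 600 MIPS on 1/2 the power is a
# 6x MIPS/Watt gain, i.e. 200 MIPS on 1/6 the power, so the same
# dictation workload would run ~6x longer: 0.5 h -> ~3 h.
new_chip_factor = 6.0
print(BASELINE_RUNTIME_HOURS * new_chip_factor)  # -> 3.0 hours
```

At one doubling per year, that 6x jump is a little under three years of the trend arriving in a single part, which is why the handheld-dictation prediction looked plausible at the time.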

Still, it is a well-known fact in the speech industry that getting the last few percent of accuracy under varying acoustic conditions and with multiple speakers is a difficult problem. So this is still a very hot area of research.


Kurtz-Fernhout Software

Paul Fernhout
Developers of custom software and educational simulations
Creators of the Garden with Insight(TM) garden simulator