|March 29, 2000|
03 00050 61 00032901
Mr. Ross H. Armstrong
U.S. Sales Development
IBM Global Financing
NNNN Executive Parkway
Somewhere, NY 94583
Subject: IBM's voice recognition
Here is a little something on the voice recognition thing you mentioned the other day, along with Hutch's paper on Prometheus. You might mention to Hutch that his paper has been quite a hit at a Colloquium sponsored by Stanford on developing a dynamic knowledge repository. The IBM voice thing came up in connection with some folks trying to create an SDS capability, although they have not figured that out yet.
Here is a little something on a recent meeting I was asked to attend, though I am not sure why. As you know, once people discover what "knowledge management" really entails, they are not going to want it, as Prometheus found out...
Hope all is well with Claudia, you and the kids.
THE WELCH COMPANY
Date: Wed, 29 Mar 2000 21:28:57 -0500
Paul Fernhout
Organization: Kurtz-Fernhout Software
Subject: Speech Input (was: Twiddler chord keyboard)
Neil Scott wrote:
Sounds interesting. Do you have a URL for this?
I spent 18 months as one of the developers on this particular IBM project. The Personal Speech Assistant is a neat device (IMHO). While I still can't comment in detail on the device, in general the speech recognition is not done on the Pilot itself -- rather it is done on an add-on device which wraps around the Palm. Inside that box is a faster processor. It does not support full dictation -- just command and control of several hundred commands. For more details see:
The PSA is one of several initiatives IBM has in the pervasive computing space. [Note: I don't claim to speak for IBM -- these opinions are my own.]
Other companies like Lernout & Hauspie and Microsoft have recently announced their own hand-held devices you can talk to as well (L&H's does dictation; I believe Microsoft's just does command and control). Vendors like AT&T/Lucent have demonstrated devices that do speech recognition (again with the Palm) using a wireless link to a centralized server.
Based on industry trends, and the exponential growth in computing power per cost, you can expect to see some amazing things in handheld speech products over the next decade. A dictation product like IBM's ViaVoice or Dragon Dictate takes around 200 MIPS to run well (obviously more is better). A modern low-power CPU like the StrongArm can do about 200 MIPS (and a pair of AAAs can run this at full speed for somewhere around 1/2 an hour).
A new version of the StrongArm coming out in the next few months will do 600 MIPS on 1/2 the power (or 200 MIPS on 1/6 the power). So you can do the math and see what will be available when, as MIPS/Watt doubles every year or so...
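The math alluded to above can be sketched in a few lines. This is a back-of-envelope projection only: it assumes the email's figures (a fixed 200-MIPS dictation workload, roughly half an hour on a pair of AAAs in 2000) and its rule of thumb that MIPS/Watt doubles every year; none of these numbers are measured data.

```python
# Back-of-envelope projection: runtime on the same batteries for a
# fixed 200-MIPS speech workload, assuming MIPS/Watt doubles yearly.
# Baseline figures are the email's estimates, not measurements.

BASELINE_YEAR = 2000
BASELINE_HOURS = 0.5  # ~1/2 hour at full speed on two AAA cells


def projected_runtime_hours(year: int) -> float:
    """Runtime if efficiency doubles each year after the baseline."""
    return BASELINE_HOURS * 2 ** (year - BASELINE_YEAR)


for year in (2000, 2001, 2003, 2005, 2010):
    print(year, projected_runtime_hours(year), "hours")
```

Under that doubling assumption, the half hour of 2000 becomes a full hour in 2001 and several hours by mid-decade, which is the point the paragraph is making about what handheld speech products become feasible.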
Still, it is a well-known fact in the speech industry that getting the last few percent of accuracy under varying acoustic conditions and with multiple speakers is a difficult problem. So, this is still a very hot area of research.
Developers of custom software and educational simulations
Creators of the Garden with Insight(TM) garden simulator