Thursday, June 24, 2010

Computers Make Strides in Recognizing Speech

There's another good installment in the NYT's Smarter Than You Think series. This one concerns progress in computer speech recognition and human-computer interaction in general.
The number of American doctors using speech software to record and transcribe accounts of patient visits and treatments has more than tripled in the past three years to 150,000. The progress is striking. A few years ago, supraspinatus (a rotator cuff muscle) got translated as “fish banana.” Today, the software transcribes all kinds of medical terminology letter perfect, doctors say. It has more trouble with other words and grammar, requiring wording changes in about one of every four sentences, doctors say.

“It’s unbelievably better than it was five years ago,” said Dr. Michael A. Lee, a pediatrician in Norwood, Mass., who now routinely uses transcription software. “But it struggles with ‘she’ and ‘he,’ for some reason. When I say ‘she,’ it writes ‘he.’ The technology is sexist. It likes to write ‘he.’ ”
Meanwhile, translation software being tested by the Defense Advanced Research Projects Agency is fast enough to keep up with some simple conversations. With some troops in Iraq, English is translated to Arabic and Arabic to English. But there is still a long way to go. When a soldier asked a civilian, “What are you transporting in your truck?” the Arabic reply was that the truck was “carrying tomatoes.” But the English translation became “pregnant tomatoes.” The speech software understood “carrying,” but not the context.

Yet if far from perfect, speech recognition software is good enough to be useful in more ways all the time. Take call centers. Today, voice software enables many calls to be automated entirely. And more advanced systems can understand even a perplexed, rambling customer with a misbehaving product well enough to route the caller to someone trained in that product, saving time and frustration for the customer. They can detect anger in a caller’s voice and respond accordingly — usually by routing the call to a manager.

So the outlook is uncertain for many of the estimated four million workers in American call centers or the nation’s 100,000 medical transcriptionists, whose jobs were already threatened by outsourcing abroad. “Basic work that can be automated is in the bull’s-eye of both technology and globalization, and the rise of artificial intelligence just magnifies that reality,” said Erik Brynjolfsson, an economist at the Sloan School of Management at the Massachusetts Institute of Technology.
Read the whole thing.


Greg said...

It's not hard to imagine the production version of the "medical assistant" incorporating remote-sensing temperature and respiration sensors (for triage), or guiding people through self-administration of flu shots, teaching new diabetics how to manage their condition, getting prescriptions filled and delivered to patients' homes... and probably much more. Querying a prescription with the doctor when there are contra-indications?

This is the start of the long-awaited productivity surge in health care.

And that's only one industry. Hmmm... the economic "singularity" may be coming much sooner than even I thought.

jdl75 said...

This singularity belief is the archetype of intellectual lazyness (besides being an avatar of puritan classical self hate).

Stuart Staniford said...

jdl75: I think you should spell "intellectual lazyness" correctly and make your argument in more than one sentence before you are in a position to accuse others of it :-)

jdl75 said...

Some things do not show up well using pure dialectics (besides there being no point in addressing them through this method); that's why poetry exists, and maybe also the reason why people who don't understand it usually fall for this "singularity" baby lollipop :-)

And sorry for the spelling mistake :)

Eric Hacker said...

Soon the game will be to identify who is using speech recognition software and not catching the errors. Or should we just expect to accept things like "...that cities of all kinds will fall into a deficit within the next five years and be unable to provide the level of services residents and businesses have come to accept,” the report states." from here, as spotted on today's TOD Drumbeat?

Perhaps the report writer is seriously grammar challenged, but at least now they can have a good excuse by blaming the speech recognition software.

Greg said...


The economic singularity is not the same as Kurzweil's mystical rapture/singularity.

The amount of economic production is related to the amount of labour employed. Economists pay close attention to productivity, the amount of production per person per hour. It has been rising, overall, since the start of the industrial revolution.

The economic singularity is when productivity becomes undefined - any desired amount of economic production can be achieved with any arbitrarily chosen amount of human labour.

It's a mathematical singularity: a point at which an otherwise continuous function is undefined.

KLR said...

FWIW, in his 2005 book The Singularity Is Near, Kurzweil predicted things for 2010, such as commonplace full-immersion VR and real-time translation of foreign languages, that simply aren't ubiquitous at all.

On my last visit to an MD, he spent about 50% of his time entering data into some medical database program, an imposition courtesy of Obamacare.

I'd be content simply with not being logged out every day... and all I need to do is go to the sign-in page and I'm logged in. One of computing's seemingly infinite little irritations.