Q. What are Continuous-speech and Discrete-speech recognition?
A. Continuous-speech recognition is the process of speaking or dictating continuously to a PC, without pauses between individual words, as in conversational speech, in order either to execute program or system commands or to obtain a text transcription of very high quality. The goal is to completely replace the keyboard and the mouse as the main PC interfaces, in order to increase the comfort and professional performance of the user. In fact, several studies have already confirmed that anybody can generate text by dictating to a PC approximately three times faster than the best typist. In comparison, Discrete-speech recognition requires users to pause slightly between individual words, which takes a certain amount of practice. Continuous-speech recognition is much more computationally demanding, but also much more gratifying for the user. Current applications of this kind are Speaker-Dependent: they require some initial training, in which the user pronounces sample words so that the program learns to recognize his or her unique voice. This training usually takes 15 to 20 minutes, though Intel Pentium IV and Intel Core Duo compliant software (for instance, Dragon NaturallySpeaking Professional) only needs 3 to 5 minutes.
Q. When and how did this technology appear in the PC market?
A. Continuous-speech recognition has been a goal of the PC industry since the very beginning. However, it was only a couple of years ago, somewhat earlier than expected, that Continuous-speech recognition became a workable and practical solution, thanks to the advent of the Pentium III class of computers and significant improvements in sound signal processing, acoustic modelling, speaker enrolment and language modelling. In June 1997, Dragon Systems introduced the first general-purpose Continuous-speech recognition program for the PC under the registered name Dragon NaturallySpeaking. It didn't take long for the other speech vendors to respond. IBM Corp. soon followed with ViaVoice. Two others, Lernout & Hauspie Speech Products (VoiceXPress) and Philips Electronics (FreeSpeech), also entered this emerging market, though both products have recently been discontinued.
Q. Who could benefit from Continuous-speech recognition? Is it adequate for me?
A. The target market for these applications includes:
Sales professionals, telecommuters, and mobile professionals
Anyone who takes notes after meetings or creates action item lists will find that thoughts can quickly be captured by dictating naturally and continuously to a PC, even on the go.
Business executives and others who do not like to type
Most executives must now work without a dedicated administrative staff. Many of them did not learn how to type efficiently or may be uncomfortable with a keyboard. These applications now let them dictate text much more quickly, and with correct spelling every time!
Small business and home office workers (SOHO)
Anyone without a secretary will find that written work gets done more quickly and easily in a variety of business applications, saving time and increasing productivity.
Legal and Medical Professionals
These professionals usually need to create a great many reports, records, notes, letters and other documents. In addition, some manufacturers offer specialized products for them, such as Dragon NaturallySpeaking Legal Suite and Dragon NaturallySpeaking Medical Suite, which provide comprehensive professional vocabularies and mobile support. These suites are currently only available in American English and German; support for other languages is being developed or is provided by third parties.
Writers and authors
Anyone who creates a significant amount of text can just speak to a PC and see their thoughts flow onto the screen, which enhances both creativity and fun.
Blind and disabled people, and users at risk of RSI
As the professional version of most of these programs incorporates a playback function, blind people can hear the text they dictate as it is recognised by the PC. In addition, anyone who spends a significant amount of time typing documents will find that speech recognition can reduce the risk of injury from repetitive keyboard and mouse movements. By the way, European Directive 2000/78/EC, in force since December 2003, compels companies and public administrations to guarantee full "equality of chances" to their workers suffering from these disabilities, by providing them with whatever technological means (for instance, speech recognition) they need.
Q. Is there any Continuous-speech recognition program that supports Mac OS?
A. Both Dragon Systems and IBM announced the release of special versions of their programs, Dragon NaturallySpeaking and ViaVoice, for the iMac by the end of 1999. However, only ViaVoice for Mac is currently available.
Q. Which is the best Continuous-speech recognition PC program of all?
A. According to one of the most authoritative and comprehensive reviews ever published ("Speech Recognition: Finding Its Voice", PC Magazine):
"Of the four, NaturallySpeaking consistently delivered the best accuracy. Voice Xpress makes an impressive debut with tight Microsoft Word-integration and the second-best accuracy. ViaVoice offers unparalleled command-and-control capabilities, but its accuracy was disappointing. Finally, FreeSpeech costs a lot less than the competition, but it also gives you a lot less".
Note, however, that accuracy and speed are no longer the only rules of the game. With most professional versions of these programs tending to offer similar performance, a friendly interface for correcting and editing the text you dictate makes a great difference.
Q. What is the accuracy and speed?
A. Independent reviewers and many users achieve dictation speeds of up to 160 words per minute with an accuracy of 95-98%, that is to say with just 2-5% transcription errors, using the professional version of these products. For instance, Dragon NaturallySpeaking "achieved up to 99% accuracy" in independent testing performed by PC Magazine ("Speech Recognition: Finding Its Voice"). These impressive results sometimes require a certain amount of training and customisation (development of specialised dictionaries, which facilitate the recognition process, and even voice-activated macros for inserting boilerplate text). Even so, you can expect rates of 150+ words per minute with an accuracy of 92-94% straight out of the box, after the initial 3 to 5 minutes of training, with the professional version of these products. As happened with Optical Character Recognition (OCR) software a few years ago, any lower rate than that is not really cost-effective for professional work, as the user will spend most of the time saved by dictating on correcting transcription errors. This is why SpeechWare only carries the best, professional versions of these products.
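To see why accuracy matters as much as raw speed, the cost-effectiveness argument can be turned into a little arithmetic. The Python sketch below is purely illustrative: the function name and, above all, the time needed to fix one misrecognised word (10 seconds here) are assumptions chosen for the example, not figures from any review or vendor.

```python
def net_words_per_minute(dictation_wpm, accuracy, seconds_per_correction):
    """Effective throughput once the time spent fixing errors is included.

    dictation_wpm          -- raw dictation speed in words per minute
    accuracy               -- fraction of words recognised correctly (0..1)
    seconds_per_correction -- ASSUMED average time to fix one wrong word
    """
    # Misrecognised words produced during one minute of dictation.
    errors_per_minute = dictation_wpm * (1.0 - accuracy)
    # Extra minutes needed to correct those words.
    correction_minutes = errors_per_minute * seconds_per_correction / 60.0
    # One minute of speaking yields dictation_wpm words, but costs
    # (1 + correction_minutes) minutes of the user's time in total.
    return dictation_wpm / (1.0 + correction_minutes)


if __name__ == "__main__":
    # Out-of-the-box figures quoted above: 150 wpm at about 93% accuracy.
    print(round(net_words_per_minute(150, 0.93, 10), 1))
    # Trained and customised system: 160 wpm at 98% accuracy.
    print(round(net_words_per_minute(160, 0.98, 10), 1))
```

Under the assumed 10 seconds per correction, the out-of-the-box case nets roughly 54 words per minute and the trained case roughly 104. A few points of accuracy therefore almost double the real throughput, which is the reason lower recognition rates quickly stop paying off.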
Q. What languages are currently available for Continuous-speech recognition? Is there any multilingual version for polyglots?
A. Dragon NaturallySpeaking is currently available in:
In addition, multilingual use is fully supported in the professional versions of these products. Polyglot users simply switch back and forth between the language modules of the same application as they change languages. By the way, some multilingual applications, such as Dragon NaturallySpeaking, are bundled with several language models (see the "Software" section of this website for more information). This significantly reduces the total cost of ownership.
Q. How easy is it to dictate to a PC and to eventually import the text into another application?
A. Users can dictate directly into the window of the streamlined word processor provided by these programs, or into virtually any Windows application, for example the major word processors (Microsoft Word, WordPerfect and Lotus WordPro) or e-mail applications. The first solution is more effective, in particular on slow computers, as the built-in word processors are optimised for voice input, speed and performance. Users who work this way can export the text into other applications using the standard Windows "copy/paste" method. Files can also be saved as ASCII, Rich Text Format or Doc, which are easily read by the major word processors. Besides, NaturallySpeaking, ViaVoice and VoiceXpress all support "modeless operation", so the user can switch seamlessly between dictation and other tasks (correction, editing, issuing voice commands, using the keyboard at any time, etc.). The three also recognise Microsoft Word "natural" or plain language commands during dictation, so users do not even have to remember the exact name of a command in order to execute it.
Q. Can you dictate, correct or edit later, or must you complete the whole dictation at once? How?
A. All current professional systems allow dictation, correction and editing within the same session. Moreover, some of them can even play back the user's voice at a later stage, simply by highlighting the words and issuing the appropriate command. This function facilitates correction, proofreading and editing by an assistant in your absence. With our Transcription Aid for Dragon NaturallySpeaking, your secretary can even update the User Profile (a necessary task for improving recognition accuracy) while correcting, which increases the accuracy for the person who generated the file. Many executives are now moving to speech recognition thanks to this impressive, first-of-its-kind feature.
Q. Is it possible to dictate in a noisy environment with good results?
A. With the recent introduction of Active Noise Cancelling microphones, which filter out background noise very efficiently, you can dictate in virtually any environment with excellent results: a noisy office, a railway station, even a plane.
Q. Are multiple users supported?
A. Technically speaking, several people within an office can use the same program at different times of the day, provided that it is installed on a single PC and each of them creates an individual User Profile by training the program on his or her own voice for a few minutes. Legally speaking, however, each individual user needs to buy a license, even if they all share the same application.
Q. Can you use these applications on a network?
A. Some professional systems, such as Dragon NaturallySpeaking Professional 9, offer networking facilities that allow simultaneous speech recognition by many users on a network. Each user keeps his or her own User Profile files on his or her own computer, whereas the recognition engine and other associated files are shared. This solution is ideal for big organisations, as it is much more efficient and lowers software purchasing costs.
Q. I need portability above all: can a notebook, sub-notebook, hand-held digital recorder, personal digital assistant (PDA) or even a Smart Phone be used with these applications?
A. All existing professional Continuous-speech recognition programs can be used efficiently on a good laptop, notebook or sub-notebook running on batteries. However, a major problem with this equipment is that it doesn't entirely meet the stringent sound requirements of these applications: portable machines are much more prone to "internal noise" than desktop machines, due to the high level of physical integration of their components. SpeechWare recommends several models with exceptional acoustics which work flawlessly. If you already have a notebook with poor sound performance, you can use a USB microphone and sound card combo optimised for speech recognition.
The advantages of these units are threefold:
Regarding hand-held digital recorders, there are a number of units on the market (made by Olympus, Sony, Norcom, etc.) which can be used with the professional version of these programs, as you can see in the Hand-held digital recorders section of this website. They offer "deferred" transcription of recorded files on virtually any PC, even slow ones. Their accuracy rate is very high, up to 96%, especially when they are used with a professional speech recognition microphone connected to the Mic-In port.
Finally, with Transcription Aid for Dragon NaturallySpeaking you can now achieve the same results as with a digital recorder, but using virtually any personal digital assistant (Pocket PC, PalmPilot or Clié) and even a Windows Mobile Smartphone, a world-first feature.