Business in Vancouver: News that works for you

    It's been a long time coming, but now your computer can actually take a letter from you

    by Alan Zisman (c) 1995. First published in Business in Vancouver, Issue #295, June 20, 1995, High Tech Office column

    Do you remember HAL, the computer that turned homicidal in the film 2001: A Space Odyssey? You didn't see the astronauts in that film typing commands to HAL. No mouse, either. Instead, they talked, and HAL listened and (for much of the film) obeyed.

    But that's just science fiction, right? Not necessarily. Computers have had the potential to listen and recognize voices for some time now. Over a decade ago, the puny personal computers of the day could be harnessed to respond to verbal commands, and connected to robotic devices to carry out simple tasks. And as computers have become more powerful and sound capabilities more common, voice recognition has become more widespread, though by no means commonplace. SoundBlaster cards, for example, include a utility to allow users to talk back to their computer, and multimedia Macs include software allowing voice commands.

    (A particularly nice implementation is LISTEN for Windows, from Verbex Voice Systems Inc., 1090 King Georges Post Rd., Bldg. 107, Edison, NJ 08837; tel. 800-275-8729. For $139, it lets the user start programs by voice, and it comes complete with voice commands that work in a wide range of popular Windows software.)

    But when users hear about voice recognition, their eyes light up with visions of telling their computer to "take a letter." And that's simply not possible with any of these programs.

    Enter IBM. It has been quietly working on voice recognition for a number of years now, and late in 1994, released the fruits of its labour. Originally released for OS/2 only as the IBM Personal Dictation System, it is now available for either OS/2 or Windows as VoiceType Dictation.

    Amazingly, it works pretty much as advertised. It comes with a 22,000-word vocabulary in standard (U.S.) English (UK English and several other languages are already available for OS/2 and will soon be made available for Windows). Users can add up to 2,000 customized words, and specialized dictionaries are available for law, emergency medicine, radiology, and journalism (complete with umpteen American politicians' names).

    It claims to be able to keep up with dictation at a pace of 70 words per minute, somewhat faster than most typists. And while it may make more mistakes than a good typist, it generally provides the correct version of homonyms like there/their/they're. (It can do this because it doesn't look at single words, but rather at patterns of three words at a time.) And it learns as it goes along, so with time it tends to make fewer mistakes.
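For the curious, here is a rough sketch of how looking at three words at a time can sort out there/their/they're. It is only an illustration of the general idea, not IBM's method: the tiny training text and the names train_trigrams and pick_homophone are made up for this example, and a real system would draw on far more text and on the sound of the word as well.

```python
from collections import Counter

# The sound-alike words the sketch chooses between.
HOMOPHONES = {"there", "their", "they're"}


def train_trigrams(sentences):
    """Count every run of three consecutive words in the training text."""
    counts = Counter()
    for sentence in sentences:
        # Pad with one start marker and one end marker (an assumption of this sketch).
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(len(words) - 2):
            counts[tuple(words[i:i + 3])] += 1
    return counts


def pick_homophone(prev_word, next_word, counts, candidates=HOMOPHONES):
    """Return the candidate seen most often between its two neighbours."""
    # Sorting makes ties resolve the same way on every run.
    return max(sorted(candidates),
               key=lambda word: counts[(prev_word, word, next_word)])


# Hypothetical training text, just large enough to show the effect.
counts = train_trigrams([
    "they left their coats here",
    "the coats are over there now",
    "i think they're coming soon",
])

print(pick_homophone("left", "coats", counts))
print(pick_homophone("over", "now", counts))
print(pick_homophone("think", "coming", counts))
```

The three words all sound the same, so the choice comes entirely from the neighbours: the sketch picks whichever spelling it has seen most often between the word before and the word after.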

    Of course, it's not quite that simple: for starters, this is one huge program--16 floppy disks--and IBM recommends you have at least 62 megs of hard-drive space free before you begin installation. And sorry, you can't use a compressed drive for that.

    You'll need at least 12 megs of RAM, and 16 are recommended. You'll also have to increase your Windows swap file (virtual memory) to at least 14 megs, using up still more hard-drive space. You can reclaim some of that drive space after you've successfully installed the program and used its enrolment feature--training it to recognize your way of speaking.

    And it's not just software. You have to crack open your computer's case and install a card. This is available in an ISA version for standard PCs, or a Micro-Channel version for IBM PS/2s, or as a PC Card (PCMCIA card) for portables. As with most such add-in cards for PCs, you may have to fiddle with technical obscurities such as IRQ and I/O addresses (especially if you're using the ISA version with Windows: OS/2, Micro-Channel, and Win95 all help protect the user from this sort of thing).

    Once you've installed the card and the 16 disks, be prepared to spend a lot of time with it. It's not particularly complex software to use (despite the 400-page manual), but you need to train it to recognize your voice, and it needs to train you to speak in a manner that it understands. It doesn't recognize a continuous flow of words--normal conversational speech--because it can't tell where one word stops and another starts. Instead, you need to learn to speak accurately, and with tiny pauses between words, sort ... of ... like ... this ... period. (Yes, you do need to tell it about punctuation, unusual capitalization, and so forth.)
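To picture what happens to those spoken words, here is a small sketch that turns a list of separately recognized words into text, treating some of them as punctuation. It is a guess at the general idea rather than a description of VoiceType Dictation itself: only "period" comes from the example above, and the other command words, the SPOKEN_PUNCTUATION table, and the render_dictation function are invented for illustration.

```python
# Spoken command words and the marks they stand for.
# Only "period" appears in the article; the others are assumptions.
SPOKEN_PUNCTUATION = {
    "period": ".",
    "comma": ",",
    "question-mark": "?",
}


def render_dictation(tokens):
    """Join separately spoken words into text, applying dictated punctuation."""
    out = []
    capitalize = True  # the first word of the text starts a sentence
    for token in tokens:
        if token in SPOKEN_PUNCTUATION:
            mark = SPOKEN_PUNCTUATION[token]
            if out:
                out[-1] += mark  # attach the mark to the previous word
            # A new sentence begins after a period or question mark.
            capitalize = mark in ".?"
        else:
            out.append(token.capitalize() if capitalize else token)
            capitalize = False
    return " ".join(out)


print(render_dictation(["sort", "of", "like", "this", "period"]))
```

Because each word arrives on its own, with a pause on either side, the program never has to guess where one word ends and the next begins; it only has to decide whether a word is text or a command.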

    When you dictate, your text appears in a window. To get it into the word processor of your choice, you need to copy it to the clipboard and paste it into the other program.

    (An add-on from a local company, however, promises to make this simpler: Digital Dictate, from Vancouver's Barr Business Systems, works together with VoiceType Dictation, allowing users to dictate directly into Microsoft Word. WordPerfect will soon be supported as well. More information on this US$295 product is available from Barr Business Systems, 291 East 2nd Avenue, Vancouver V5T 1B8, tel. 872-2277.)

    Plan on taking at least two hours to work through the training scripts for VoiceType Dictation. After that, the machine needs to be left running on its own for at least two hours to process the results of the training. Oh, one more thing: the package--card, software, and microphone--costs about $1,500.

    So this is not a package to pick up on a whim. But it promises some real benefits for people in a wide variety of areas, including users who simply can't or won't type, and people who haven't been able to use computers because their hands are busy with something else. While it demands a lot of computer (and financial) resources, VoiceType Dictation makes good on its promises to a surprising degree.


Alan Zisman is a Vancouver educator, writer, and computer specialist.