
The eSpeak Speech Synthesizer

Your author has been interested in computer speech synthesis since the late 1970s, when he interfaced a Votrax SC-01A speech synthesizer chip to his Imsai 8080 computer with some wire-wrap wire. News of the recently created eSpeak project naturally piqued his long-time interest in speech synthesis.

eSpeak is a compact phoneme-based speech synthesis system that is available under version 2 of the GNU General Public License.

eSpeak is a software speech synthesizer for English, and potentially other languages. eSpeak produces good quality English speech. It uses a different synthesis method from other open source TTS engines, and sounds quite different. It's perhaps not as natural or "smooth", but I find the articulation clearer and easier to listen to for long periods.

eSpeak is a much simpler system than Festival, a popular speech synthesis project from the University of Edinburgh's Centre for Speech Technology Research. Unfortunately, the Festival project has been stuck at version 1.95 (2.0 beta) for the last two years.

The installation and usage document explains how to set up the software. Installation is trivial, if somewhat different than for most applications: it involves copying the binary speak file to an executable directory and moving a library directory to /usr/share. The combined executable and library files weigh in at under 500 KB, which makes eSpeak suitable for use in embedded systems. Source code for eSpeak is available for those who wish to compile the software locally.

Using the software is trivial: typing "speak 'what you want to say'" causes the desired speech to be rendered and output to the speaker. Speaking the contents of a file can be done with the command: speak -f filename. eSpeak can also read its input from stdin, allowing it to be used with other applications. There are currently nineteen English phoneme sets available, which provide a variety of British accents, male and female voices, and tonal characteristics. German and Esperanto phoneme sets are also available. Other languages could be supported as well, but that work has not yet been done.

eSpeak can output directly to the sound driver, create .wav files, or send the audio to stdout. The -x option causes the program to output phoneme mnemonics to the screen.
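For readers who want to drive the program from another application, the invocations described above can be sketched in Python. This is only an illustration, not part of eSpeak: the helper names build_speak_command and speak are hypothetical, the sketch assumes the binary is installed on the PATH under the name speak, and it assumes that running speak with neither text nor -f makes it read from stdin. Only the -f and -x options mentioned in this article are used.

```python
import shutil
import subprocess


def build_speak_command(text=None, filename=None, show_phonemes=False):
    """Return the argument list for one run of the `speak` binary.

    Hypothetical helper: it only covers the invocations described in the
    article (quoted text, -f filename, -x for phoneme mnemonics).
    """
    if text is not None and filename is not None:
        raise ValueError("pass either text or filename, not both")
    argv = ["speak"]
    if show_phonemes:
        argv.append("-x")            # print phoneme mnemonics to the screen
    if filename is not None:
        argv += ["-f", filename]     # speak the contents of a file
    elif text is not None:
        argv.append(text)            # speak the text given on the command line
    return argv


def speak(text=None, filename=None, show_phonemes=False, stdin_text=None):
    """Run `speak` and return whatever it printed, or None if it is not installed.

    Assumption: with neither text nor filename, the program reads its input
    from stdin, so stdin_text is piped to it in that case.
    """
    if text is None and filename is None and stdin_text is None:
        raise ValueError("nothing to speak")
    argv = build_speak_command(text, filename, show_phonemes)
    if shutil.which(argv[0]) is None:
        return None                  # binary not found on the PATH
    result = subprocess.run(
        argv, input=stdin_text, capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling speak(text="hello world", show_phonemes=True) would run speak -x with the text as a single argument and return the printed phoneme mnemonics, while speak(stdin_text="hello") pipes the text in on stdin. Passing the arguments as a list avoids shell quoting problems with apostrophes in the text.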

The speech quality is quite mechanical, but is fairly easy to understand. It is not as refined as the output of Festival, but should suffice for many applications. As with most speech synthesis applications, mispronunciation is fairly common: English pronunciation rules involve many special exceptions and ambiguities, so accurate text-to-speech conversion is a non-trivial software task.

The most recent release of eSpeak is version 1.10, released on April 29, 2006. The change log file indicates recent work on UTF-8 encoding, support for embedded pitch and amplitude modulation, improvements to numerical pronunciations, several new command line capabilities and more.

If you need a decent open-source speech synthesis application for your latest project, or simply want to play with some interesting software, give eSpeak a try.



The eSpeak Speech Synthesizer

Posted Jul 7, 2006 6:04 UTC (Fri) by jonabbey (guest, #2736) [Link]

This sounds like AmigaDOS, which had a 'say' command line utility back in the day that used the speech device driver the Amiga had.

I'm disappointed that there do not appear to be any speech samples on the eSpeak site.

Other similar systems

Posted Jul 9, 2006 0:12 UTC (Sun) by job (guest, #670) [Link]

There is also Flite, a simple and small Festival-compatible system.

The same research group also has other text-to-speech tools, for example for building voices, as well as the Sphinx system, which is quite complex.

(A quick Google search also revealed another research project called Mbrola which had some pretty impressive singing demos.)

The eSpeak Speech Synthesizer

Posted Jul 12, 2006 19:35 UTC (Wed) by Zenith (guest, #24899) [Link]

Ubuntu hopes to feature eSpeak in Edgy. I saw this in their LaunchPad system, and this LWN contributed article makes mention of it as well.


Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds