Text to phoneme conversion (& .NET interface)

2008-10-09
2013-06-12
  • Rob Summers
    2008-10-09

    I'm writing some software (Windows) to generate phonetic transcriptions of English words/phrases. In order to do this I call the espeak command line program and grab its output. This is a bit painful when I'm calling espeak a few hundred times!

    What I'd like to do is call espeak through the API, but just get the phonetic translation without performing any synthesis. Is there an easy way to disable the synthesis when compiling the DLL?

    The reason I ask is that I'm writing the software in VB.NET. With a little help from a C++ to C# tool and a C# to VB.NET tool, I've converted the speak_lib.h file into something VB.NET can use. So far I appear to be able to initialize espeak, set the voice, and call espeak_SetPhonemeTrace successfully. But when I call espeak_Synth, either the program hangs (when espeak is initialized with espeak_AUDIO_OUTPUT.AUDIO_OUTPUT_SYNCHRONOUS) or a memory exception error occurs (for any of the other espeak_AUDIO_OUTPUT types).

    I've tried working through the source, but I couldn't see an easy way to bypass the synthesis and just extract the phonetic transcription.

    If anyone could help I'd be grateful.

    Rob

    PS. I'll tidy up the VB.NET "espeak_lib.h" file if anyone's interested in it. It should be easy to convert to C# too.

     
    • Rob Summers
      2008-10-09

      Actually, espeak_Synth just returns -1 with AUDIO_OUTPUT_PLAYBACK or AUDIO_OUTPUT_RETRIEVAL set. With AUDIO_OUTPUT_SYNCHRONOUS I get a memory exception error. With AUDIO_OUTPUT_SYNCH_PLAYBACK I get a hang.

      Rob

       
    • You want AUDIO_OUTPUT_SYNCHRONOUS mode.  This performs the synthesis, but returns the speech WAV data in buffers to a callback function in your program, so you can ignore it.  It can also give your callback function a list of phonemes (as espeakEVENT_PHONEME events).

      Call espeak_Initialize() with AUDIO_OUTPUT_SYNCHRONOUS. Set its "options" parameter to 1 to allow phoneme events.

      Register a callback function (a function in your own program) by calling espeak_SetSynthCallback(MyCallbackFunction). Then, when you call espeak_Synth(), the eSpeak DLL will call your callback function, and its "events" parameter points to a list of events, including phoneme events. A phoneme event contains the phoneme name in the "number" field of the espeak_EVENT structure. The phoneme name, of up to 4 characters, is packed into the 4 bytes of "number".
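
      The packing described above could be undone along these lines. This is only a minimal sketch in C, assuming the characters sit in memory order inside a 4-byte int and that unused bytes are zero; the helper name unpack_phoneme_name is hypothetical and not part of the eSpeak API.

```c
#include <string.h>

/* Hypothetical helper (not part of eSpeak): copy the up-to-4-character
 * phoneme name out of the packed integer into a NUL-terminated string.
 * Assumes int is 4 bytes wide and that the name was stored byte-for-byte
 * in memory order, with unused trailing bytes set to zero.
 * 'out' must have room for at least 5 bytes. */
static void unpack_phoneme_name(int packed, char out[5])
{
    memset(out, 0, 5);                          /* guarantees termination */
    memcpy(out, &packed,
           sizeof packed < 4 ? sizeof packed : 4);
}
```

      Inside the callback, this would be applied to the "number" field of each event whose type is espeakEVENT_PHONEME, and the WAV data can simply be ignored.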

       
      • Rob Summers
        2008-10-10

        Ah, thank you. The memory exception error is probably due to my not converting the function prototype properly to VB.NET. I will do some more digging.

        Rob