Transcripts making problem by using Sphinx

Help
Anonymous
2004-07-26
2012-09-22
  • Anonymous

    Anonymous - 2004-07-26

    I have just started using Sphinx-4 and am a complete beginner. My task is to use Sphinx to generate transcripts for a continuous wav input file.

    Can anyone show me the basic procedure for using Sphinx to get the transcripts? I know the raw transcripts could contain lots of errors, but that doesn't matter; raw transcripts will be fine. Can anyone help me out? Thank you very much!

     
    • Philip Kwok

      Philip Kwok - 2004-07-27

      Hi,

      This will basically be very similar to what the HelloWorld, HelloDigits, or HelloNGram demos are doing, except that the input will be your continuous wav file, not live audio from a microphone. There is also the wavfile demo (which you can access only by downloading Sphinx-4 via CVS). A combination of those demos will do it. Moreover, it looks like you might be doing large vocabulary recognition, in which case you will need a good language model.

      I will put up a very simple transcription demo to show you how to transcribe a continuous wav file that contains digits only. You will have to adapt that to recognize your target vocabulary. I'll get back to you once it's done.

      philip

       
    • Anonymous

      Anonymous - 2004-07-27

      Thank you very much for your nice reply! I am really looking forward to seeing your transcription demo here.

      BTW, will "make an4_words_unigram | tee an4_words_unigram.out" work well for getting transcripts under Windows? Thanks again!

      Larry

       
    • Philip Kwok

      Philip Kwok - 2004-07-27

      Hi Larry,

      The an4 stuff is really just for regression tests. For more information about regression tests, please refer to:

      http://cmusphinx.sourceforge.net/sphinx4/#batch_tests

      The 'an4_words_unigram' (or AN4 in general) regression test has a very limited vocabulary (~100-200 words), and is for test setups that have multiple audio files to decode, not one continuous audio file (in which case you will need to use the endpointer to segment the audio into utterances).

      I wonder why you would be using 'make'. Sphinx-4 was converted from 'make' to 'ant' a long while ago. Please make sure that you're using the latest code.

      philip
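
To illustrate what segmenting one continuous audio file into utterances involves, here is a minimal sketch of energy-based endpointing in Java. It is not Sphinx-4's endpointer and does not use the Sphinx-4 API. The class name EnergySegmenter, the method segment, and the frame size, threshold, and silence-length parameters are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Sphinx-4: splits 16-bit PCM samples
// into utterances by looking for runs of low-energy (silent) frames.
final class EnergySegmenter {

    /**
     * Returns one {startFrame, endFrameExclusive} pair per detected utterance.
     * A frame counts as speech when its RMS amplitude is at least `threshold`.
     * An utterance ends once `minSilenceFrames` consecutive silent frames follow it.
     */
    static List<int[]> segment(short[] samples, int frameSize,
                               double threshold, int minSilenceFrames) {
        List<int[]> utterances = new ArrayList<>();
        int numFrames = samples.length / frameSize;
        int start = -1;      // first frame of the current utterance, -1 while in silence
        int silenceRun = 0;  // consecutive silent frames seen inside an utterance

        for (int f = 0; f < numFrames; f++) {
            double sumSquares = 0.0;
            for (int i = f * frameSize; i < (f + 1) * frameSize; i++) {
                sumSquares += (double) samples[i] * samples[i];
            }
            double rms = Math.sqrt(sumSquares / frameSize);

            if (rms >= threshold) {
                if (start < 0) {
                    start = f;           // speech begins
                }
                silenceRun = 0;
            } else if (start >= 0) {
                silenceRun++;
                if (silenceRun >= minSilenceFrames) {
                    // Close the utterance just after its last speech frame.
                    utterances.add(new int[] {start, f - silenceRun + 1});
                    start = -1;
                    silenceRun = 0;
                }
            }
        }
        if (start >= 0) {
            // Audio ended while still inside an utterance.
            utterances.add(new int[] {start, numFrames - silenceRun});
        }
        return utterances;
    }
}
```

Each returned pair could then be cut out of the sample array and decoded as a separate utterance. A real endpointer is usually more careful, for example by adapting the threshold to background noise and keeping a little padding around each utterance.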

       
    • Philip Kwok

      Philip Kwok - 2004-07-27

      Hi Larry,

      I've put a transcriber demo under demo/sphinx/transcriber. As you might notice, the code is very similar to the other demos. The demo simply takes an audio file that contains three utterances of connected digits, separates it into the three utterances, decodes them, and prints out the results. Since the demo is for digits data, you need to modify the config.xml file so that it will decode according to your vocabulary set and size. Refer to the README.html page (which points you to the Programmer's Guide) for details.

      Finally, in order to get the demo, you must download Sphinx-4 using CVS. Good luck!

      philip

       
    • Anonymous

      Anonymous - 2004-07-27

      Hi, Philip,

      Thank you so much for your help. However, when I use CVS to update my files, I don't find the transcriber you mentioned here. Could you double check it? Thanks again!

      Larry

       
    • Philip Kwok

      Philip Kwok - 2004-07-28

      Hi Larry,

      Sorry, I forgot to check in the ant target for building the demo. I just checked it in. If you do an update, and then type:

      ant -find demo.xml
      java -jar bin/Transcriber.jar

      at the top level, you should see it running. The demo code itself is at demo/sphinx/transcriber. Let me know if you still can't find it.

      philip

       
    • Anonymous

      Anonymous - 2004-07-29

      Hi, Philip:

      Thanks for your effort. It's very weird. I used CVS to get the update, and it runs well under Unix. However, it can't run under Windows, since there is no Transcriber.jar in the bin directory. I copied one into it, but that doesn't work either. The system always gives me the following error:

      Exception in thread "main" java.util.zip.ZipException: The system cannot find the file specified

      at java.util.zip.ZipFile.open(Native Method)
      at java.util.zip.ZipFile.<init>(ZipFile.java:112)
      at java.util.jar.JarFile.<init>(JarFile.java:117)
      at java.util.jar.JarFile.<init>(JarFile.java:55)

      Could you help me? Thanks a lot!

      Larry

       
    • Philip Kwok

      Philip Kwok - 2004-07-29

      Hi Larry,

      Can you cut and paste the entire exception (not just the JarFile or ZipFile part) here? I want to see where the exception was thrown.

      philip

       
    • Anonymous

      Anonymous - 2004-07-29

      OK. But one more thing here. When I use CVS to get the whole Sphinx package under Windows, I want to execute jsapi.exe first. However, the system doesn't allow me to do that. Do you know why?

      Another issue: if I download the files sphinx4-0.1alpha-bin.zip and sphinx4-0.1alpha-src.zip and unzip them, the new folder is sphinx4-0.1alpha. In this case, if I use CVS to update it, it doesn't work well, because the folder name is different. The CVS folder is sphinx4. I don't know whether I have made my question clear.

      Thank you!

      Larry

       
    • Philip Kwok

      Philip Kwok - 2004-07-29

      Hi Larry,

      I think the "jar file not found" error is referring to jsapi.jar. Do a 'chmod a+x jsapi.exe' to make sure you have permission to execute jsapi.exe. If it still doesn't work, please cut and paste the exact error message here. In any case, if nothing works, copy lib/jsapi.jar from UNIX to Windows.

      You cannot update the contents of sphinx4-0.1alpha-bin.zip or sphinx4-0.1alpha-src.zip via CVS. Please follow the instructions here:

      http://sourceforge.net/cvs/?group_id=1904

      to get Sphinx-4 via CVS. You should use the instructions at 'Anonymous CVS Access'. Let me know if you have any more problems.

      philip
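
Since the ZipException above typically means that a jar file could not be found, one quick check is to confirm that the expected jar files exist before launching the demo. The following is a minimal Java sketch of such a check. The class name JarCheck and the method missing are hypothetical and are not part of Sphinx-4; the paths bin/Transcriber.jar and lib/jsapi.jar are the ones mentioned in this thread.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Sphinx-4: reports which of the
// expected files are absent under a given base directory.
final class JarCheck {

    /** Returns the relative paths that do not exist as regular files under baseDir. */
    static List<String> missing(File baseDir, String... relativePaths) {
        List<String> result = new ArrayList<>();
        for (String path : relativePaths) {
            if (!new File(baseDir, path).isFile()) {
                result.add(path);
            }
        }
        return result;
    }
}
```

Run from the top-level directory, a call such as JarCheck.missing(new File("."), "bin/Transcriber.jar", "lib/jsapi.jar") would list whichever of the two jars is absent. An empty list means both are present and the problem lies elsewhere.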

       
    • Anonymous

      Anonymous - 2004-07-29

      Hi, Philip:

      Using 'chmod a+x jsapi.exe' solved the problem of executing the jsapi.exe file. But in the second step, when I type ant to build, some errors come up again. Since I use the DOS environment under Windows, I do not know how to copy and paste the error information here. Could you enlighten me?

      Thanks!

      Larry

       
    • Anonymous

      Anonymous - 2004-07-29

      Hi, Philip:

      I finally figured out how to solve the problem, though it's still a little bit weird. Since I can't build the bin directory when I download Sphinx via CVS, I first used sphinx4-0.1alpha-bin.zip and sphinx4-0.1alpha-src.zip to build Sphinx. Then I copied the demo.xml file from the Sphinx I got from CVS to overwrite the original demo.xml.

      Now I can run the transcriber successfully. Thanks!

      Larry

       
