Need help finding the definition of a function in pocketsphinx.py

Help
2016-10-18
2016-11-05
  • Pratik Edavi

    Pratik Edavi - 2016-10-18

    Hi,

    I am using the SpeechRecognition Python package, which uses pocketsphinx. I am trying to understand how the package works. I followed the code as described below:
    In the __init__.py file of speechrecognition we set the paths to the acoustic model, language model and dictionary.
    Later we create the decoder object using

    decoder = pocketsphinx.Decoder(config)
    Then it runs decoder.start_utt().
    I checked pocketsphinx.py and went through the code.
    The function returns _pocketsphinx.Decoder_start_utt(self),
    and I found
    _pocketsphinx = swig_import_helper()

    But I am unable to find the _pocketsphinx file. There is a _pocketsphinx.a file, but I cannot open it.

    Regards,
    Pratik

     
    • Nickolay V. Shmyrev

      Pocketsphinx wraps its C API for Python using SWIG, so you need to study how SWIG works. See http://swig.org/Doc3.0/Contents.html#Contents
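
      A minimal sketch of how one might locate the compiled module from Python. SWIG generates a thin pocketsphinx.py wrapper that forwards each call to a compiled extension named _pocketsphinx, so the function bodies live in the C sources rather than in a Python file, and a .a file is a static library archive, not something to open in an editor. The helper name locate_module is hypothetical, and the module names tried at the bottom are assumptions about how the package is laid out in a given installation.

```python
import importlib.util


def locate_module(name):
    """Return the file path that backs module `name`, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or a module has no spec.
        return None
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Assumed module names; adjust them to match your installation.
    for candidate in ("pocketsphinx", "pocketsphinx._pocketsphinx", "_pocketsphinx"):
        print(candidate, "->", locate_module(candidate))
```

      If a path to a shared library is printed, that is the binary built from the SWIG-generated C wrapper; to read the actual implementation of Decoder_start_utt you would then go to the pocketsphinx C sources.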

       
  • Dattatreya

    Dattatreya - 2016-10-18

    Hi Nickolay,
    Thank you for answering Pratik's query.
    In the __init__.py file of the Python speech recognition package,
    I want to print the phonetic transcriptions of the words that are detected from the .wav file.
    I need this to understand how the words detected in the audio file map to phonemes.
    It would also be helpful if you could suggest a way to debug the __init__.py in the speech recognition package step by step.

    Can you explain what the goforward.raw file does (it is in the pocketsphinx folder inside Python's site-packages)?

    Regards,
    Dattatreya

     
    • Nickolay V. Shmyrev

      I want to print the corresponding phonetic words of the words that are detected from the .wav file.

      This is not possible: for performance reasons, phonemes are omitted in the recognizer. You can only take them from the dictionary file.
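
      Looking pronunciations up in the dictionary can be sketched as follows. This assumes the plain-text dictionary layout of one word followed by its phonemes per line, with alternate pronunciations marked as WORD(2); the function names and the sample entries are illustrative and not part of pocketsphinx.

```python
import re
from collections import defaultdict

# Matches the "(2)", "(3)" suffix used for alternate pronunciations.
_VARIANT = re.compile(r"\(\d+\)$")


def parse_dictionary(lines):
    """Map each lowercase word to a list of pronunciations (lists of phonemes)."""
    entries = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word = _VARIANT.sub("", parts[0]).lower()
        entries[word].append(parts[1:])
    return dict(entries)


def phonemes_for(hypothesis, entries):
    """Return (word, first pronunciation or None) for each word in a hypothesis."""
    result = []
    for word in hypothesis.lower().split():
        prons = entries.get(word)
        result.append((word, prons[0] if prons else None))
    return result


if __name__ == "__main__":
    # Illustrative entries; in practice read the lines from the dictionary file.
    sample = ["go G OW", "forward F AO R W ER D", "ten T EH N", "meters M IY T ER Z"]
    entries = parse_dictionary(sample)
    for word, pron in phonemes_for("go forward ten meters", entries):
        print(word, " ".join(pron) if pron else "<not in dictionary>")
```

      Note that this simply picks the first listed pronunciation for each word; it cannot tell which variant the recognizer actually matched.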

      Also it would be helpful if you can suggest a way for performing step-by-step debugging of the init.py in speech-recognizer package .

      Run it under gdb:

        gdb --args python file.py
      

      Put a breakpoint on any pocketsphinx function and proceed from there.

      Can you explain what does goforward.raw file do (present in pocketsphinx folder inside python site-packages)?

      goforward.raw is an audio file containing a recording of "go forward ten meters". You can listen to it in Audacity.
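
      If you would rather play the file in an ordinary audio player, the headerless data can be wrapped in a WAV container with Python's standard wave module. This is a sketch that assumes the file holds 16 kHz, 16-bit, mono, little-endian PCM; if playback sounds wrong, those parameters are the first thing to check. The function name raw_to_wav is hypothetical.

```python
import wave


def raw_to_wav(raw_path, wav_path, sample_rate=16000, sample_width=2, channels=1):
    """Wrap headerless PCM samples in a WAV container without changing them."""
    with open(raw_path, "rb") as raw_file:
        pcm = raw_file.read()
    with wave.open(wav_path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample: 2 means 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return len(pcm) // (sample_width * channels)  # number of frames written


if __name__ == "__main__":
    import sys

    # Usage: python raw_to_wav.py input.raw output.wav
    if len(sys.argv) == 3:
        frames = raw_to_wav(sys.argv[1], sys.argv[2])
        print("wrote", frames, "frames")
```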

       
  • Dattatreya

    Dattatreya - 2016-10-21

    Hi Nickolay,
    We are trying to adapt the existing acoustic model.
    We followed the CMUSphinx article on adapting an acoustic model.
    For this we downloaded some videos from YouTube and converted them to audio in the required format (16-bit, 16 kHz) using the Python package moviepy.
    We split each audio file into 30-second intervals using Python (moviepy),
    then used the Google API to get the transcriptions,
    and used the output to train the model.
    We had 3 1/2 hours of training videos.
    After training, we tested the model with other audio from the same speaker, but the output still wasn't satisfactory.
    What could be the issue? Could it be because of the speed of the speech or something else?
    Or because of the way the speaker pronounces a particular word?

    Eg:
    Actual Transcription(Google API):
    banks are required to pay interest deposits with banks are known as liability what is liability what is asset you have a friend who always gives you money he is your message you have a friend who always gives you money so always helps you he is your assets (Banking Awareness Lecture - Module 1_12)

    Output using adapted model:
    then so they live update this
    the message that the banks are known as the elites
    what these lady
    what the stashed in you have a friend whole lot las vegas you whiny
    he is service that
    you have a friend who was initially use your money
    well antunes gets you he is you as a

    Note: we are running this command to test the output
    (pocketsphinx version is prealpha5)

    pocketsphinx_continuous -hmm en-us1 -lm en-us.lm.bin -dict cmudict-en-us.dict -mllr mllr_matrix -infile /home/eight/Videos/Banking_Awareness_Lecture_Module_1_12_converted.wav

    We also tried running the above command with the default values of hmm, lm and dict
    (the files that are stored in /usr/local/share/pocketsphinx), but the result was the same.
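
    One way to put a number on "not satisfactory" is the word error rate (WER) between a reference transcription and the recognizer output. The sketch below computes it with a word-level edit distance; the function name is illustrative, and because the reference here is itself a Google API transcription rather than a verified one, the figure is only a rough guide.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between ref[:i-1] and hyp[:j].
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from hypothesis
            insertion = current[j - 1] + 1  # extra word in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    # One sentence pair taken from the transcripts quoted in this post.
    reference = "you have a friend who always gives you money"
    hypothesis = "you have a friend who was initially use your money"
    print("WER: {:.2f}".format(word_error_rate(reference, hypothesis)))
```

    Computing this before and after adaptation, on the same test audio, shows whether the adaptation helped at all.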

    Regards,
    Dattatreya

     

    Last edit: Dattatreya 2016-10-21
  • Dattatreya

    Dattatreya - 2016-10-21

    Hi Nickolay,
    Here is the link to the videos we downloaded for testing:
    https://www.youtube.com/watch?v=BZoIuv1kLh0&list=PLswCVWtC7kMT6Q-ZCjsJcHomJzgXQbCez

     
    • Nickolay V. Shmyrev

      This video has music in the background, which is very bad for accuracy. The accent also hurts, of course. You should remove the music first with some kind of NMF algorithm, then submit the audio for transcription.

       
  • Dattatreya

    Dattatreya - 2016-10-24

    Hi Nickolay,
    Can you please clarify the accent problem?
    How should we decide whether the speaker's accent is going to be a problem for training the acoustic model?

    Regards,
    Dattatreya

     
    • Nickolay V. Shmyrev

      How should we decide whether the speaker's accent is going to be a problem for training the acoustic model ?

      For training, accent is not a problem provided you have sufficient data. Adaptation, however, is not going to work well for accents.

       
  • Dattatreya

    Dattatreya - 2016-10-25

    Thank you

     
  • RUPA SINGH

    RUPA SINGH - 2016-11-04

    Hi Nickolay,
    I am following the CMUSphinx tutorial and I have added and replaced some words in the existing dictionary. For example:
    1 W AH N
    10 W AH N Z IY R OW
    11 W AH N W AH N
    11TH W AH N W AH N T IY EY CH
    12 W AH N T UW
    19 W AH N AY N
    1939 W AH N AY N TH R IY N AY N
    1946 W AH N AY N F OW R S IH K S
    1984 W AH N AY N EY T F OW R
    1985 W AH N AY N EY T F AY V
    1988 W AH N AY N EY T EY T
    1989 W AH N AY N EY T N AY N
    1ST W AH N EH S T IY
    2 T UW
    25 T UW F AY V
    3 TH R IY
    39 TH R IY N AY N
    4 F OW R
    7000 S EH V AH N Z IY R OW Z IY R OW Z IY R OW
    72 S EH V AH N T UW
    7500 S EH V AH N F AY V Z IY R OW Z IY R OW
    8 EY T
    9 N AY N

    I have replaced '8' with 'EIGHT'.

    output before replacement : YOU TO ALL IS THAT IS THAT IS THE MOTOR USE AUTOS YOU TO AS BY THE LORD WHO VEHICLE ACT. THE PROVISIONS OF CHAPTER 8 THAT IS REGARDING THE TP THE THE SHORT IS WAS IN THE EFFECT DEAL WITH EFFECT FROM JULY AMENDMENT IN WHAT IS THE THAT STILL TALKING LORD THE MOTOR VEHICLE THAT 19 IN

    output after replacement : YOU TO ALL IS THAT IS THAT IS THE MOTOR USE AUTOS YOU TO AS BY THE LORD WHO VEHICLE ACT. THE PROVISIONS OF CHAPTER THE THAT IS REGARDING THE TP THE THE SHORT IS WAS IN THE EFFECT DEAL WITH EFFECT FROM JULY AMENDMENT IN WHAT IS THE THAT STILL TALKING LORD THE MOTOR VEHICLE THAT 19 IN

    In the second output, 'EIGHT' should appear instead of 'THE'. I am unable to add or replace words in the dictionary.

    Thank you

     
    • Nickolay V. Shmyrev

      You need to start a new thread to ask a new question.

       
  • RUPA SINGH

    RUPA SINGH - 2016-11-05

    okay, thank you.

     
