CMU Sphinx / Forums / Speech Recognition Theory: Problem with python "import

Joshua Fialkoff - 2009-11-17

Hello all,
I have successfully run "setup.py install" on the python module for Sphinx3
using svn checkouts of Sphinxbase and Sphinx3. The installation went fine
after I fixed the incomplete function call on line 172 that multiple people
have noted previously. However, when I try to run "import _sphinx3" from the
Python command line I get:

/home/josh/Documents/v2sm/<ipython console> in <module>()

<type 'exceptions.ImportError'>: /usr/local/lib/libs3decoder.so.0: undefined symbol: hash_new

Has anyone else come across this error? Any help would be much appreciated.

Cheers,
Josh

Nickolay V. Shmyrev - 2009-11-17

> python module for Sphinx3

We don't recommend using sphinx3. Use pocketsphinx instead.

> I fixed the incomplete function call on line 172

Could you submit a patch?

> undefined symbol: hash_new

This is also a bug. Try replacing hash_new with hash_table_new in
libs3decoder/libcfg/s3_cfg_convert.c

Joshua Fialkoff - 2009-11-21

Thanks so much for your help.

> We don't recommend using sphinx3. Use pocketsphinx instead.

Could you tell me why this is the preferred package? I read the comparison, and
it doesn't seem to state that one is any better than the other. In any case, I'm
amenable to a switch. However, I have some model and language files that work
properly with Sphinx3 but produce a whole bunch of "bad ciphone" errors when I
initialize pocketsphinx with the same arguments. Here's my config file:

-samprate 16000
-hmm /home/josh/Documents/v2sm/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd
-dict /home/josh/Documents/v2sm/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.dic
-fdict /home/josh/Documents/v2sm/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.filler
-lm /home/josh/Documents/v2sm/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp_3gram.arpa.DMP
-ctl ctlfile

When I run pocketsphinx_batch -argfile cfgfile, I get

ERROR: s3dict.c, line 239: Line 5895: Bad ciphone: z; word zhu ignored

over and over with different line numbers, ciphones and words. I get the sense
this is an easy thing to fix (or at least I hope it is). I'm pretty new to this
package, though. Any help would be much appreciated. Thanks.

> This is also a bug. Try replacing hash_new with hash_table_new in
> libs3decoder/libcfg/s3_cfg_convert.c

I made the fix you suggested to sphinx3 and can now successfully import
_sphinx3 from Python. I run the following series of commands:

import _sphinx3
_sphinx3.parse_argfile("cfgfile")
_sphinx3.init()

The last line produces a segmentation fault. The full error message is "INFO:
kbcore.c(439): Begin Initialization of Core Modules: Segmentation fault"

Have you come across this before?

> Could you submit a patch?

I'd be happy to, except I'm not sure what I used for that argument is actually
appropriate. I just got it to compile. So, not having researched an
appropriate fix, I think it's best that I don't submit it.

Thanks,
Josh

Nickolay V. Shmyrev - 2009-11-24

> Could you tell me why this is the preferred package?

Because it's supported and under more or less active development, unlike
sphinx3.

> When I run pocketsphinx_batch -argfile cfgfile, I get ERROR: s3dict.c, line
> 239: Line 5895: Bad ciphone: z; word zhu ignored over and over with different
> line numbers, ciphones and words.

You need to convert the dictionary to use lower-case words, as in the language
model, and upper-case phones, as in the acoustic model.
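
The case conversion described above could be sketched in Python roughly as follows. This is only a sketch under assumptions: it assumes each dictionary line has the form "word phone phone ...", separated by whitespace, and the function names and the sample entries are made up for illustration, not taken from the actual dictionary.

```python
def normalize_dict_line(line):
    """Lower-case the word and upper-case the phones of one dictionary line.

    Assumes the line looks like "word ph1 ph2 ..." (whitespace separated).
    """
    parts = line.split()
    if not parts:
        return ""
    word, phones = parts[0], parts[1:]
    return " ".join([word.lower()] + [p.upper() for p in phones])


def normalize_dict(lines):
    """Normalize every non-blank line of a pronunciation dictionary."""
    return [normalize_dict_line(line) for line in lines if line.strip()]


# Hypothetical entries, for illustration only.
sample = ["ZHU z uw\n", "Hello hh ah l ow\n"]
for entry in normalize_dict(sample):
    print(entry)
```

To convert a real file, read its lines, pass them through normalize_dict, and write the result to a new dictionary file rather than overwriting the original.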

> The last line produces a segmentation fault.

Ok, I've just committed the fix. _sphinx3_test.py now works for me.

Joshua Fialkoff - 2009-11-24

Thanks again for your help. I'll let you know how everything turns out.

Thanks for all your guidance. I have gotten pocketsphinx running, but when I
try to use it to process a group of MFC files, it doesn't seem to output the
converted text (e.g. if I input an audio file that has me saying "yellow", I
don't see the text "yellow" or something similar anywhere). I converted a
group of "raw" files using the following sphinx_fe command:

sphinxbase/src/sphinx_fe/sphinx_fe `cat ./usr/share/pocketsphinx/model/hmm/wsj1/feat.params` -c ctlfile_wav -di ./tests -do ./mfc/ -ei wav -eo mfc -raw no -mswav yes -samprate 16000

Then I ran pocketsphinx with this command:

usr/bin/pocketsphinx_batch -argfile cfgfile

This is my cfgfile:

-samprate 16000
-hmm /home/setarisv2sm/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd
-dict /home/setarisv2sm/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.dic
-fdict /home/setarisv2sm/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.filler
-lm /home/setarisv2sm/sphinx/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp_3gram.arpa.DMP
-cepdir /home/setarisv2sm/sphinx/mfc
-ctl ctlfile
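
When switching between decoders with a config like the one above, a quick sanity check that every path in the file exists can rule out one class of problems. The following is a minimal sketch, assuming the argfile is simply whitespace-separated "-flag value" pairs as in the listing; the function names and the list of path-valued flags are choices made for this illustration, not part of pocketsphinx.

```python
import os
import shlex

# Flags from the listing above whose values are file or directory paths.
PATH_KEYS = ("-hmm", "-dict", "-fdict", "-lm", "-cepdir", "-ctl")


def parse_argfile(text):
    """Parse '-flag value' pairs into a dict (assumed argfile format)."""
    tokens = shlex.split(text)
    if len(tokens) % 2:
        raise ValueError("expected an even number of tokens (-flag value pairs)")
    return dict(zip(tokens[0::2], tokens[1::2]))


def missing_paths(args, keys=PATH_KEYS):
    """Return the path-valued flags whose target does not exist on disk."""
    return [k for k in keys if k in args and not os.path.exists(args[k])]


# Hypothetical config text, for illustration only.
example = "-samprate 16000\n-hmm /no/such/dir\n"
args = parse_argfile(example)
print(args)
print(missing_paths(args))
```

Relative paths such as "ctlfile" are resolved against the current working directory, so the check should be run from the same directory as the decoder.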

The output contains a bunch of configuration and initialization information
followed by a block that looks like this for each MFC file in the input
directory:

  
INFO: cmn.c(175): CMN: 48.65 -0.26 2.69 2.02 1.16 1.66 1.32 0.54 1.31 0.80 0.78 0.13 0.22
INFO: ngram_search_fwdtree.c(1470): 472 words recognized (2/fr)
INFO: ngram_search_fwdtree.c(1472): 115471 senones evaluated (558/fr)
INFO: ngram_search_fwdtree.c(1474): 77014 channels searched (372/fr), 73178 1st, 2230 last
INFO: ngram_search_fwdtree.c(1478): 2230 words for which last channels evaluated (10/fr)
INFO: ngram_search_fwdtree.c(1481): 24 candidate words for entering last phone (0/fr)
INFO: ngram_search_fwdflat.c(253): Utterance vocabulary contains 1 words
INFO: ngram_search_fwdflat.c(866): 335 words recognized (2/fr)
INFO: ngram_search_fwdflat.c(868): 618 senones evaluated (3/fr)
INFO: ngram_search_fwdflat.c(870): 719 channels searched (3/fr)
INFO: ngram_search_fwdflat.c(872): 719 words searched (3/fr)
INFO: ngram_search_fwdflat.c(874): 50 word transitions (0/fr)
WARNING: "ngram_search.c", line 1056: </s> not found in last frame, using <s> instead
INFO: ngram_search.c(1101): lattice start node <s>.0 end node <s>.0
INFO: ps_lattice.c(1228): Normalizer P(O) = alpha(<s>:0:205) = -536996864
INFO: batch.c(570): yellow: 2.06 seconds speech, 0.15 seconds CPU, 0.15 seconds wall
INFO: batch.c(572): yellow: 0.07 xRT (CPU), 0.07 xRT (elapsed)

The audio file that corresponds to the output above contains me saying
"yellow". I see that the output says "Utterance vocabulary contains 1 words".
Does that mean it recognizes that there is actually only one word in the
audio file, or am I misinterpreting that?

I have also gotten sphinx3 working, and in that case the converted text was
very obvious in the output. Is the output formatting different for
pocketsphinx? Any help in getting the text output from pocketsphinx would be
greatly appreciated.

Thanks, Josh

Nickolay V. Shmyrev - 2009-12-03

> Any help in getting the text output from pocketsphinx would be greatly
> appreciated.

Try "-backtrace yes" to dump some info to stdout and "-hyp file.txt" to store
the hypothesis in a file. If you'd like to use the Python API, the result can
be retrieved programmatically; check

pocketsphinx/python/ps_test.py

for details.

Joshua Fialkoff - 2009-12-04

Thanks again for all the help. I updated the cfgfile to include the two
extra lines you listed ("-backtrace yes" and "-hyp hyp.txt"). I posted the
results I got to pastebin (first link below). It seems that it's not
recognizing any words in the first three audio recordings and just the word
"i" in the last one (which is much longer). Any thoughts on what could cause
this?

And here are the contents of my hyp.txt file:

(hello -3192705)
(please -3149908)
(yellow -3149908)
i (muchEasier -15046535)
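
For anyone who wants to check a hypothesis file like this automatically, here is a rough Python sketch. It assumes one utterance per line in the form "recognized text (utterance-id score)", which is inferred from the sample above rather than from any format documentation; the function name and the regular expression are made up for this illustration.

```python
import re

# Assumed line shape: optional text, then "(uttid score)" at the end.
HYP_LINE = re.compile(
    r"^(?P<text>.*?)\s*\((?P<uttid>\S+)\s+(?P<score>-?\d+)\)\s*$"
)


def parse_hyp_line(line):
    """Split one hypothesis line into (text, utterance id, integer score)."""
    match = HYP_LINE.match(line)
    if match is None:
        raise ValueError("unrecognized hypothesis line: %r" % line)
    return match.group("text"), match.group("uttid"), int(match.group("score"))


# The four entries from the hyp.txt contents quoted above.
lines = [
    "(hello -3192705)",
    "(please -3149908)",
    "(yellow -3149908)",
    "i (muchEasier -15046535)",
]
for line in lines:
    text, uttid, score = parse_hyp_line(line)
    status = "EMPTY" if not text else "ok"
    print(uttid, repr(text), score, status)
```

An empty text field for an utterance that clearly contains speech, as with the first three entries here, is a sign that the decoder found nothing it could match.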

For reference, the audio files (before being converted to MFC) are available at the second link below.

Thank you very much again for all your help.

Results: http://pastebin.com/mc623bae
Audio files: http://drop.io/6m2atma

Nickolay V. Shmyrev - 2009-12-07

The sample rate of the input audio is certainly wrong. It should be 16 kHz,
16-bit, mono. You can also use "-adcin yes -cepext wav" to avoid the MFC
conversion.
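
One way to confirm the format of a WAV file before decoding is Python's standard wave module. The sketch below is illustrative only: the helper names are made up, and write_silence exists just to produce test files so the example runs on its own.

```python
import os
import tempfile
import wave


def wav_format(path):
    """Return (sample rate in Hz, bits per sample, channel count)."""
    with wave.open(path, "rb") as w:
        return w.getframerate(), w.getsampwidth() * 8, w.getnchannels()


def is_16k_16bit_mono(path):
    """True if the file is 16 kHz, 16-bit, mono."""
    return wav_format(path) == (16000, 16, 1)


def write_silence(path, rate, channels, seconds=0.1):
    """Hypothetical helper: write a short silent 16-bit WAV for the demo."""
    with wave.open(path, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)  # 2 bytes = 16 bits per sample
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(rate * seconds) * channels)


# Demo on two generated files; replace the paths with real recordings.
tmpdir = tempfile.mkdtemp()
for name, rate, channels in [("good.wav", 16000, 1), ("bad.wav", 44100, 2)]:
    path = os.path.join(tmpdir, name)
    write_silence(path, rate, channels)
    print(name, wav_format(path), is_16k_16bit_mono(path))
```

A file that fails the check needs to be resampled or downmixed with an audio tool before feature extraction; the -samprate option only tells the front end what rate to expect and does not convert the audio.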
