Hello all,
I have successfully run "setup.py install" on the python module for Sphinx3
using svn checkouts of Sphinxbase and Sphinx3. The installation went fine
after I fixed the incomplete function call on line 172 that multiple people
have noted previously. However, when I try to run "import _sphinx3" from the
Python command line I get:
/home/josh/Documents/v2sm/<ipython console> in <module>()
We don't recommend using sphinx3. Use pocketsphinx instead.
Could you tell me why this is the preferred package? I read the comparison, and
it doesn't seem to state that one is any better than the other. In any case, I'm
amenable to a switch. However, I have some model and language files that work
properly with Sphinx3 but produce a whole bunch of "bad ciphone" errors when I
initialize pocketsphinx with the same arguments. Here's my config file:
When I run pocketsphinx_batch -argfile cfgfile, I get "ERROR: s3dict.c, line
239: Line 5895: Bad ciphone: z; word zhu ignored" over and over, with different
line numbers, ciphones, and words. I get the sense this is an easy thing to fix
(or at least I hope it is), but I'm pretty new to this package. Any help would
be much appreciated. Thanks.
This is also a bug. Try replacing hash_new with hash_table_new in
libs3decoder/libcfg/s3_cfg_convert.c
I made the fix you suggested to sphinx3 and can now successfully import
_sphinx3 from Python. I then run the following series of commands:
The last line produces a segmentation fault. The full error message is "INFO:
kbcore.c(439): Begin Initialization of Core Modules: Segmentation fault"
Have you come across this before?
Could you submit a patch?
I'd be happy to, except I'm not sure what I used for that argument is actually
appropriate. I just got it to compile. So, not having researched an
appropriate fix, I think it's best that I don't submit it.
Thanks,
Josh
Could you tell me why this is the preferred package
Because it's supported and under more or less active development unlike
sphinx3.
When I run pocketsphinx_batch -argfile cfgfile, I get ERROR: s3dict.c, line
239: Line 5895: Bad ciphone: z; word zhu ignored over and over with different
line numbers, ciphones and words.
You need to convert the dictionary to use lowercase words, as in the language
model, and uppercase phones, as in the acoustic model.
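As a rough illustration of that conversion, here is a minimal Python sketch. It assumes each dictionary line is a word followed by whitespace-separated phones; the function names are made up for this example and are not part of any Sphinx tool.

```python
def normalize_dict_line(line):
    """Lowercase the word and uppercase the phones of one dictionary entry.

    Assumes the entry is "WORD PHONE1 PHONE2 ..." separated by whitespace.
    """
    parts = line.split()
    if not parts:
        return ""
    word, phones = parts[0], parts[1:]
    return " ".join([word.lower()] + [p.upper() for p in phones])


def normalize_dict(lines):
    """Normalize every non-blank entry in an iterable of dictionary lines."""
    return [normalize_dict_line(line) for line in lines if line.strip()]


# Hypothetical entries in the style of the failing dictionary.
sample = ["ZHU zh uw", "YELLOW y eh l ow"]
for entry in normalize_dict(sample):
    print(entry)
```

To convert a real file, read its lines, pass them through normalize_dict, and write the result to a new .dic file rather than overwriting the original.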
The last line produces a segmentation fault.
Ok, I've just committed the fix. _sphinx3_test.py now works for me.
Thanks for all your guidance. I have gotten pocketsphinx running, but when I
try to use it to process a group of MFC files it doesn't seem to output the
converted text (e.g. if I input an audio file that has me saying "yellow" I
don't see the text "yellow" or something similar anywhere). I converted a
group of "raw" files using the following sphinx_fe command:
The output contains a bunch of configuration and initialization information
followed by a block that looks like this for each MFC file in the input
directory:
INFO: cmn.c(175): CMN: 48.65 -0.26 2.69 2.02 1.16 1.66 1.32 0.54 1.31 0.80 0.78 0.13 0.22
INFO: ngram_search_fwdtree.c(1470): 472 words recognized (2/fr)
INFO: ngram_search_fwdtree.c(1472): 115471 senones evaluated (558/fr)
INFO: ngram_search_fwdtree.c(1474): 77014 channels searched (372/fr), 73178 1st, 2230 last
INFO: ngram_search_fwdtree.c(1478): 2230 words for which last channels evaluated (10/fr)
INFO: ngram_search_fwdtree.c(1481): 24 candidate words for entering last phone (0/fr)
INFO: ngram_search_fwdflat.c(253): Utterance vocabulary contains 1 words
INFO: ngram_search_fwdflat.c(866): 335 words recognized (2/fr)
INFO: ngram_search_fwdflat.c(868): 618 senones evaluated (3/fr)
INFO: ngram_search_fwdflat.c(870): 719 channels searched (3/fr)
INFO: ngram_search_fwdflat.c(872): 719 words searched (3/fr)
INFO: ngram_search_fwdflat.c(874): 50 word transitions (0/fr)
WARNING: "ngram_search.c", line 1056: </s> not found in last frame, using <s> instead
INFO: ngram_search.c(1101): lattice start node <s>.0 end node <s>.0
INFO: ps_lattice.c(1228): Normalizer P(O) = alpha(<s>:0:205) = -536996864
INFO: batch.c(570): yellow: 2.06 seconds speech, 0.15 seconds CPU, 0.15 seconds wall
INFO: batch.c(572): yellow: 0.07 xRT (CPU), 0.07 xRT (elapsed)
The audio file that corresponds to the output above contains me saying
"yellow". I see that the output says "Utterance vocabulary contains 1 words".
Does that mean that it recognizes that there is actually only one word in the
audio file, or am I misinterpreting that?
I have also gotten sphinx3 working, and in that case the output of the
converted text was very obvious. Is the output formatting different for
pocketsphinx? Any help in getting the text output from pocketsphinx would be
greatly appreciated.
Thanks, Josh
Any help in getting the text output from pocketsphinx would be greatly
appreciated.
Try "-backtrace yes" to dump some info to stdout and "-hyp file.txt" to store
the hypothesis in a file. If you'd like to use the Python API, the result can be
retrieved programmatically; check
pocketsphinx/python/ps_test.py
for details.
Thanks again for all the help. I updated the cfgfile to include the two
extra lines you listed ("-backtrace yes" and "-hyp hyp.txt"). I posted the
results I got here: http://pastebin.com/mc623bae
It seems that it's not recognizing any words in the first three audio
recordings and just the word "i" in the last one (which is much longer). Any
thoughts on what could cause this?
And here are the contents of my hyp.txt file:
(hello -3192705)
(please -3149908)
(yellow -3149908)
i (muchEasier -15046535)
For reference, the audio files (before being converted to MFC) are available
here: http://drop.io/6m2atma
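For anyone wanting to inspect a hypothesis file like the one above programmatically, here is a small Python sketch. It assumes every line has the shape shown in this post: optional recognized text, then an utterance ID and an integer in parentheses. The meaning of that integer (some kind of score) is an assumption, and parse_hyp_line is a made-up helper name, not part of pocketsphinx.

```python
import re

# Optional recognized text, then "(utterance_id integer)" at the end of the line.
HYP_LINE = re.compile(r"^(?P<text>.*?)\s*\((?P<uttid>\S+)\s+(?P<score>-?\d+)\)\s*$")


def parse_hyp_line(line):
    """Split one hypothesis line into (text, utterance_id, score)."""
    match = HYP_LINE.match(line)
    if match is None:
        raise ValueError("unrecognized hypothesis line: %r" % line)
    return match.group("text"), match.group("uttid"), int(match.group("score"))


# The four lines from the hyp.txt above.
lines = [
    "(hello -3192705)",
    "(please -3149908)",
    "(yellow -3149908)",
    "i (muchEasier -15046535)",
]
for line in lines:
    text, uttid, score = parse_hyp_line(line)
    status = repr(text) if text else "no words recognized"
    print(uttid, "->", status)
```

An empty text field, as in the first three lines, means the decoder produced no words for that utterance.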
The sample rate of the input audio is certainly wrong. It should be 16 kHz,
16-bit, mono. Also, you can use "-adcin yes -cepext wav" to avoid the MFC
conversion.
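One quick way to check a recording against those requirements is Python's standard wave module. This is only a sketch: it works for WAV files with a header (headerless "raw" files carry no format information to inspect), and check_wav_format is a made-up helper name.

```python
import wave


def check_wav_format(path, rate=16000, width_bytes=2, channels=1):
    """Return a list of mismatches against 16 kHz, 16-bit, mono.

    An empty list means the file header matches the expected format.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append("sample rate is %d Hz, expected %d" % (wav.getframerate(), rate))
        if wav.getsampwidth() != width_bytes:
            problems.append("sample width is %d bytes, expected %d" % (wav.getsampwidth(), width_bytes))
        if wav.getnchannels() != channels:
            problems.append("channel count is %d, expected %d" % (wav.getnchannels(), channels))
    return problems
```

Calling check_wav_format("yellow.wav") (a hypothetical file name) on each recording before decoding would show whether resampling is needed.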
Has anyone else come across this error? Any help would be much appreciated.
Cheers,
Josh
Thanks so much for your help.
Here's my config file:
-samprate 16000
-hmm /home/josh/Documents/v2sm/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd
-dict /home/josh/Documents/v2sm/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.dic
-fdict /home/josh/Documents/v2sm/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp.sphinx.filler
-lm /home/josh/Documents/v2sm/lm_giga_5k_nvp_3gram/lm_giga_5k_nvp_3gram.arpa.DMP
-ctl ctlfile
I run the following series of commands:
import _sphinx3
_sphinx3.parse_argfile("cfgfile")
_sphinx3.init()
Thanks again for your help. I'll let you know how everything turns out.
I converted a group of "raw" files using the following sphinx_fe command:
Then I ran pocketsphinx with this command:
This is my cfgfile:
Thank you very much again for all your help.
: http://pastebin.com/mc623bae
: http://drop.io/6m2atma