The error message in the log file says: ERROR: "sphinx_fe.c", line 119: Failed to open /data/SpeechData/train_si.bd4/TH001_1.wav: No such file or directory. Start by following the error message and checking whether that file actually exists in that directory.
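If many recordings are involved, it can be quicker to check them all at once instead of fixing one failure per run. The following is a minimal sketch, not part of Sphinx: the helper name `find_missing_files` is hypothetical, and the only path used is the one quoted in the log message.

```python
from pathlib import Path


def find_missing_files(paths):
    """Return the entries of `paths` that do not exist as regular files."""
    return [p for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    # Path copied from the log message; extend this list with your own files.
    wav_paths = ["/data/SpeechData/train_si.bd4/TH001_1.wav"]
    for path in find_missing_files(wav_paths):
        print(f"missing: {path}")
```

If the file is reported as missing, it is also worth checking whether the directory name or the file extension is spelled differently on disk.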
As Nickolay suggested, Kaldi should help you build a good engine. If you need help, please email me directly.
Q1 & Q2: No, you don't have to use Sphinx. With the large amount of training data you have collected, you may want to try other toolkits that use deep learning methods. Q3: You are talking about several speech tasks other than STT. They are possible, but you may have to collect and annotate speech data differently for these tasks, and hire experienced speech scientists/engineers to work on these projects. Q4: Since Google's and IBM's speech models are built for the open domain, it is possible to build/optimize your domain-specific...
The SLP textbook by Huang should cover all the topics you listed.
Kaldi has recipes for speaker identification, e.g., https://github.com/kaldi-asr/kaldi/tree/master/egs/sre08
You need to use 'int2sym.pl', for example, cat decode_tg_test/scoring/14.tra | utils/int2sym.pl...
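To show what that conversion does, here is a minimal Python sketch of the idea. It is not Kaldi's script. It assumes a symbol table with one `word id` pair per line and transcript lines of the form `utt-id id id ...`. The helper names and the toy data are hypothetical.

```python
def load_symbol_table(lines):
    """Build an id -> symbol mapping from 'symbol id' lines."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            symbol, idx = parts
            table[idx] = symbol
    return table


def ints_to_symbols(tra_line, table):
    """Replace every integer id after the utterance id with its symbol."""
    utt_id, *ids = tra_line.split()
    # An id that is missing from the table raises KeyError.
    return " ".join([utt_id] + [table[i] for i in ids])


if __name__ == "__main__":
    # Toy symbol table and transcript line, invented for illustration.
    words = ["<eps> 0", "hello 1", "world 2"]
    table = load_symbol_table(words)
    print(ints_to_symbols("utt1 1 2", table))
```

For real decoding output, the Kaldi script itself is the right tool; the sketch only shows the mapping it applies.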
Hi Dan, I have a similar question. I had a 1 GPU machine (K20) in the cluster before,...
Thanks. What's the performance difference in terms of speed and accuracy between...