CMU Sphinx / Forums / Help: Errors during using Scripts

Speech Recognition Toolkit

Errors during using Scripts_pl to train

Forum: Help

Creator: MaggieYou

Created: 2005-05-11

Updated: 2012-09-22

MaggieYou - 2005-05-11

From this forum, I learned that training can be done using the scripts_pl files, but I am getting an error.
Nothing I try seems to go smoothly ...

[root@localhost /]# mkdir time
[root@localhost /]# cd time
[root@localhost time]# ./../SphinxTrain/scripts_pl/setup_SphinxTrain.pl -SPHINXTRAINDIR /SphinxTrain -task time
Making basic directory structure
Platform: .i686-pc-linux-gnu
Copying executables from /SphinxTrain/bin.i686-pc-linux-gnu
Copying scripts from /SphinxTrain/scripts_pl
Generating SphinxTrain specific scripts and config file
Set up for acoustic training for time complete
[root@localhost time]#

Then I created the following files in time/etc/:

time.fileids:
A1
A2
A3
A4
A5
IH1
IH2
IH3
IH4
IH5

time.phone:
SIL
A
IH

time.filler:
SIL
SIL
SIL

time.transcription:
A A A A A A A A A A A A A A A A A A A A (A1)
......
A A A A A A A A A A A A A A A A A A A A (A5)
IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH (IH1)
......
IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH (IH5)

After that, I ran this command:
[root@localhost time]# ./bin/make_dict etc/time.transcription
known words is 0
(A1)
(A2)
(A3)
(A4)
(A5)
(IH1)
(IH2)
(IH3)
(IH4)
(IH5)
SIOD ERROR: damaged env : #
BACKTRACE:
0: (make_dict_main)
1: (load "./bin/make_dict")
[root@localhost time]#
- Anonymous - 2005-05-11
  
  Your time.filler file is in the wrong format. See tinydoc.txt.
  
  As I recall, bin/make_dict uses Festival to scan your transcription file and look up words in its internal dictionary. If the internal dictionary does not "know" the words, then nothing is output. This may be what happened in your case. I have never found make_dict very good, even in English, and of course, it's not useful at all in other languages.
  
  If you have access to some large dictionary that uses your model's phone set, you should write your own Perl script that does the same job (which is what I did). Otherwise, you must create your training dictionary (which must contain EVERY word contained in your transcription file) manually -- for a small training set, this is not difficult.
  
  I hope that helps.
  
  cheers,
  jerry
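
The script suggested in the reply above (look up every transcription word in a larger pronunciation dictionary) might look roughly like the sketch below. It is written in Python rather than Perl and is not the script the poster used. It assumes transcription lines end in an utterance ID in parentheses, as in time.transcription, and that the large dictionary has one "WORD PHONE PHONE ..." entry per line; the function names are made up for illustration.

```python
import re

# Matches a trailing utterance ID such as "(A1)" at the end of a transcript line.
UTT_ID = re.compile(r"\s*\([^()\s]+\)\s*$")

def transcript_words(lines):
    """Collect the unique words in transcript lines, ignoring utterance IDs."""
    words = set()
    for line in lines:
        text = UTT_ID.sub("", line.strip())
        words.update(text.split())
    return words

def load_pronunciations(lines):
    """Parse 'WORD PHONE PHONE ...' lines (assumed format) into word -> phones."""
    lexicon = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            # Keep only the first pronunciation seen for each word.
            lexicon.setdefault(fields[0], fields[1:])
    return lexicon

def build_training_dict(transcript_lines, lexicon_lines):
    """Return (dictionary entries, words that have no pronunciation)."""
    lexicon = load_pronunciations(lexicon_lines)
    entries, missing = [], []
    for word in sorted(transcript_words(transcript_lines)):
        if word in lexicon:
            entries.append(word + " " + " ".join(lexicon[word]))
        else:
            missing.append(word)
    return entries, missing

if __name__ == "__main__":
    # Tiny in-memory example; real use would read the files from etc/.
    transcript = ["A A A (A1)", "IH IH IH (IH1)"]
    lexicon = ["A A", "IH IH"]
    entries, missing = build_training_dict(transcript, lexicon)
    print("\n".join(entries))
    print("missing:", missing)
```

Any word reported as missing would still need a hand-written entry, since the training dictionary must cover every word in the transcription file.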
  