From this forum, I learned that training can be done with the scripts_pl scripts, but I get an error!
None of the methods I have tried runs smoothly...
[root@localhost /]# mkdir time
[root@localhost /]# cd time
[root@localhost time]# ./../SphinxTrain/scripts_pl/setup_SphinxTrain.pl -SPHINXTRAINDIR /SphinxTrain -task time
Making basic directory structure
Platform: .i686-pc-linux-gnu
Copying executables from /SphinxTrain/bin.i686-pc-linux-gnu
Copying scripts from /SphinxTrain/scripts_pl
Generating SphinxTrain specific scripts and config file
Set up for acoustic training for time complete
[root@localhost time]#
Then I created the following files in time/etc/:
time.fileids
A1
A2
A3
A4
A5
IH1
IH2
IH3
IH4
IH5
time.phone
SIL
A
IH
time.filler
SIL
SIL
SIL
time.transcription
A A A A A A A A A A A A A A A A A A A A (A1)
......
A A A A A A A A A A A A A A A A A A A A (A5)
IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH (IH1)
......
IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH IH (IH5)
After that, I ran this command:
[root@localhost time]# ./bin/make_dict etc/time.transcription
known words is 0
(A1)
(A2)
(A3)
(A4)
(A5)
(IH1)
(IH2)
(IH3)
(IH4)
(IH5)
SIOD ERROR: damaged env : #
BACKTRACE:
0: (make_dict_main)
1: (load "./bin/make_dict")
[root@localhost time]#
Anonymous
-
2005-05-11
Your time.filler file is the wrong format. See tinydoc.txt.
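To make the format point concrete, here is a rough checker, offered as a sketch rather than anything shipped with SphinxTrain. It assumes the filler dictionary uses the same "WORD PHONE ..." layout as a normal dictionary, which is how I remember it; tinydoc.txt is the authority. The function name check_dict_lines and the filler words `<s>`, `<sil>` and `</s>` in the usage note are my own illustrative choices, not taken from this thread.

```python
def check_dict_lines(lines, phones):
    """Return a list of problems found in 'WORD PHONE ...' dictionary lines.

    Assumption: each non-empty line is a word followed by one or more
    phones, and every phone must appear in the model's phone list.
    """
    problems = []
    for n, line in enumerate(lines, start=1):
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        if len(parts) < 2:
            problems.append(
                f"line {n}: expected 'WORD PHONE ...', got {line.strip()!r}"
            )
            continue
        for phone in parts[1:]:
            if phone not in phones:
                problems.append(
                    f"line {n}: phone {phone!r} is not in the phone list"
                )
    return problems
```

Run against the three bare SIL lines posted above, with the phone list SIL, A, IH, it reports a problem on every line because no line has a word in front of the phone. Lines such as `<s> SIL`, `<sil> SIL` and `</s> SIL` pass the check.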
As I recall, bin/make_dict uses Festival to scan your transcription file and look up each word in Festival's internal dictionary. If the internal dictionary does not "know" a word, nothing is output for it. This may be what happened in your case. I have never found make_dict very good, even for English, and of course it is not useful at all for other languages.
If you have access to a large dictionary that uses your model's phone set, you should write your own Perl script that does the same job (which is what I did). Otherwise, you must create your training dictionary manually, and it must contain EVERY word that appears in your transcription file. For a small training set, this is not difficult.
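The do-it-yourself approach can be sketched roughly as follows. This is not the Perl script mentioned above, only a minimal Python illustration of the same idea. It assumes transcription lines end with the utterance id in parentheses, as in the files posted in this thread, and that the large dictionary has one "WORD PHONE ..." entry per line. All function names here are hypothetical.

```python
import re

# Trailing "(utterance-id)" at the end of a transcription line, e.g. "(A1)".
UTT_ID = re.compile(r"\(([^()\s]+)\)\s*$")

def transcript_words(lines):
    """Return the sorted unique words in the transcription, ids removed."""
    words = set()
    for line in lines:
        text = UTT_ID.sub("", line.strip())
        words.update(text.split())
    return sorted(words)

def load_dictionary(lines):
    """Parse 'WORD PHONE ...' lines into {word: [phones]} (first entry wins)."""
    entries = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            entries.setdefault(parts[0], parts[1:])
    return entries

def build_training_dict(transcript_lines, big_dict_lines):
    """Return (dictionary lines for known words, list of missing words)."""
    big = load_dictionary(big_dict_lines)
    found, missing = [], []
    for word in transcript_words(transcript_lines):
        if word in big:
            found.append(f"{word} {' '.join(big[word])}")
        else:
            missing.append(word)
    return found, missing
```

Any word that ends up in the missing list still needs a hand-written pronunciation before training, since the training dictionary has to cover the whole transcription.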
I hope that helps.
cheers,
jerry