hi,
I had the following problem while running the RunAll.pl script in sphinxtrain. I'll be thankful if someone could help.
MODULE: 45 Prune Trees (2007-12-21 09:49)
mk_mdef_gen Log File
completed
Phase 1: Tree Pruning
prunetree Log File
completed
Phase 2: State Tying
tiestate Log File
completed
MODULE: 50 Training Context dependent models (2007-12-21 09:49)
Phase 1: Cleaning up directories:
accumulator... logs... qmanager... completed
Phase 2: Copy CI to CD initialize
init_mixw Log File
completed
Phase 3: Forward-Backward
Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
bw Log File
FATAL_ERROR: "main.c", line 1052: initialization failed
FAILED
Failed to start bw
The log file of the state tying step shows the following, without any earlier warning or error:
tiestate: acmod_set.c:761: acmod_set_id2tri: Assertion `acmod_set_tri2id(acmod_set, multi[addr].base, multi[addr].left_context, multi[addr].right_context, multi[addr].posn) == id' failed.
So it doesn't write the mdef file as output, which causes the error in the next stages.
the error is the same as before. i tried editing the allphones.mdef file but it was still no use.
i've made sure that there are no extra spaces, but when the .mdef file is generated it always contains duplicates. how can i avoid the creation of duplicates?
Start everything from the beginning with ./scripts/RunAll.pl. Then pack all the logs in logdir and upload them somewhere or send them by mail.
respected mr. shmyrev,
thanks for your interest in my problem. I've packed everything and sent it to your sourceforge email as you asked.
Kindly have a look.
thank you.
Hm, the mail is probably too big. Send a copy to nshmyrev@yandex.ru too, to be sure.
Ok, I've looked into it
1) It works for me with your config. I can even build the cd model. Which sphinxtrain version
are you using? That is probably where the difference is.
2) The good news is that for 16-word recognition you don't need a cd model, so you can continue
recognition with the ci one. But it would be nice to find the problem.
And after all I've found the following difference:
-INFO: mk_mdef_gen.c(835): 24 n_base, 458 n_tri
+INFO: mk_mdef_gen.c(835): 24 n_base, 296 n_tri
As you can see, our numbers of triphones differ significantly. Did you modify anything in your sources? Why does the number of triphones differ? Can you please pack all your files (including the mdef and scripts) and share or send them too? I'll try to compare them.
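One way to see where the extra triphones come from is to compare the triphone sets of the two mdef files directly. Below is a minimal Python sketch. It assumes the usual text mdef layout, where each phone line reads base, left, right, position, attribute (n/a or filler), tmat, state ids, N. The function name and the sample lines are made up for illustration and are not part of SphinxTrain.

```python
def read_triphones(mdef_lines):
    """Collect (base, left, right, position) tuples from mdef text lines.

    Assumed layout per phone line: base left right posn attrib tmat ids... N
    """
    triphones = set()
    for line in mdef_lines:
        if line.startswith("#"):
            continue  # comment / column header lines
        # Split on spaces only, so a stray '\r' glued to a phone name survives.
        fields = [f for f in line.rstrip("\n").split(" ") if f]
        if len(fields) < 6 or fields[4] not in ("n/a", "filler"):
            continue  # version line, counts such as "458 n_tri", etc.
        if fields[1] == "-":
            continue  # context-independent phone, not a triphone
        triphones.add(tuple(fields[:4]))
    return triphones


# Illustrative input. With real files, read them with
# open(path, newline="\n") so that a lone '\r' is not treated as a line break.
mine = read_triphones([
    "A V B e n/a 0 102 103 104 N\n",
    "A\r V B e n/a 0 105 106 107 N\n",
])
yours = read_triphones([
    "A V B e n/a 0 102 103 104 N\n",
])

# repr() makes invisible characters in phone names visible.
for tri in sorted(mine - yours):
    print("only in first mdef:", repr(tri))
```

Printing with repr() is deliberate: a phone name that looks the same in an editor can still differ by a character you cannot see.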
thank you very much shmyrev
Please use search before asking. Go through this list:
https://sourceforge.net/search/index.php?group_id=1904&search_subject=1&search_body=1&type_of_search=forums&all_words=acmod_set_tri2id&exact_phrase=&some_word=&without_words=&forum_id%5B%5D=5471&forum_id%5B%5D=5470&forum_id%5B%5D=395833&forum_id%5B%5D=395832&forum_id%5B%5D=382337&forum_id%5B%5D=395831&posted_by=&posted_date_start=&posted_date_end=&form_submit=Search
Answers like
https://sourceforge.net/forum/message.php?msg_id=4124495
or
https://sourceforge.net/forum/message.php?msg_id=3494884
are useful. Most probably you have a duplicated or incorrectly named phone in your phoneset. Share your .phone file to get more help.
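A quick way to check for this yourself is to look for duplicate entries in the phone list and for phones that the dictionary uses but the phone list does not contain. Here is a minimal Python sketch. The function name is hypothetical, and it assumes one phone per line in the phone file and "word phone phone ..." lines in the dictionary.

```python
import re
from collections import Counter


def check_phoneset(phone_lines, dict_lines):
    """Return (duplicated phones, dictionary phones missing from the phone list)."""
    # Only the '\n' is removed, so stray spaces or control characters
    # that are part of a phone name stay visible.
    phones = [line.rstrip("\n") for line in phone_lines if line.strip()]
    duplicates = sorted(p for p, n in Counter(phones).items() if n > 1)

    known = set(phones)
    unknown = set()
    for line in dict_lines:
        # Split on spaces and tabs only. A bare str.split() would also
        # swallow other whitespace and hide exactly what we are looking for.
        fields = [f for f in re.split(r"[ \t]+", line.rstrip("\n")) if f]
        unknown.update(p for p in fields[1:] if p not in known)
    return duplicates, sorted(unknown)


# Illustrative data, not taken from a real training setup.
dups, unknown = check_phoneset(
    ["SIL\n", "K\n", "EE\n", "AA\n", "K\n"],
    ["dh K EE\r\n", "dk K AA\n"],
)
print("duplicated phones:", dups)
print("not in phone list:", [repr(p) for p in unknown])
```

With real files, open them with open(path, newline="\n") so that Python does not silently translate line endings before the check runs.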
hi nickolay,
thanks for the links provided, they are really helpful as i had the same problem. I do search before posting but just couldn't get my hands on this one.
phone file...
SIL
A
AA
I
EE
OO
E
AU
NG
ND
K
G
J
D
TX
DX
N
B
BH
M
R
L
V
S
dict file...
fotujh V I J A N A R I
yhMj L EE D A R
dh K EE
Hkwfedk BH OO M I K AA
dk K AA
fooj.k V I V A R A ND
ns DX E
ldsaxs S A K E NG G E
usr`Ro N E TX R I TX V A
ds K E
Lrj S TX A R
ekWMy M AU D A L
ckjs B AA R E
esa M E
cryk B A TX A L AA
filler dict file ...
<s> SIL
</s> SIL
Hm, is it broken still? What is the error now?
i've uploaded it to this link ...
http://www.2shared.com/file/2630088/eb8cb5f9/col_shani2004_timetar.html
thank you very much for the help mr shmyrev. the files you requested are here..
http://www.2shared.com/file/2632936/4a18b231/col_shani2004_mdef_files.html
Hm, now it's clear, look:
A TX R^ i n/a 0 96 97 98 N
A V R i n/a 0 99 100 101 N
A^ V B e n/a 0 102 103 104 N
A^ V K e n/a 0 105 106 107 N
AA B R i n/a 1 108 109 110 N
The ^ symbol here is a carriage return (\r), which Windows uses as part of its line ending. Your dictionary contains phones both with and without a trailing carriage return:
fotujh V I J A N A R I^
yhMj L EE D A R^
dh K EE^
Hkwfedk BH OO M I K AA^
dk K AA^
fooj.k V I V A R A ND^
Sphinx gets confused here and treats them as different phones. Can you find a good text editor (WinEdit can do it, for example) that is able to remove these carriage returns, and strip them everywhere in the dictionary? On Unix, for example, you can do it with:
tr -d '\r' < "$f" > "$f.new"
So this is actually a bug that must be fixed.
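If you don't have tr available on Windows, the same cleanup can be done with a few lines of Python. This is only a sketch of the idea. The function names and the file names in the usage comment are hypothetical, so adjust them to your own setup.

```python
def strip_carriage_returns(data: bytes) -> bytes:
    """Remove every '\\r' byte, the same effect as tr -d '\\r'."""
    return data.replace(b"\r", b"")


def clean_file(src_path, dst_path):
    """Write a copy of src_path without carriage returns.

    Returns the number of '\\r' bytes that were removed.
    """
    # Binary mode, so Python does not translate line endings on its own.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(strip_carriage_returns(data))
    return data.count(b"\r")


# Hypothetical usage, file names are placeholders:
# removed = clean_file("etc/mytask.dic", "etc/mytask.dic.new")
# print(removed, "carriage returns removed")
```

It writes to a new file instead of overwriting the original, like the tr command above, so you can compare the two before replacing anything. Run it on the dictionary, the filler dictionary and the phone list.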
i forgot to post the scripts. they are here:
http://www.2shared.com/file/2632963/4705b2fb/scripts_pl.html