In an attempt to make an acoustic model as good as one created years ago, I decided to try "forced alignment" (Viterbi alignment) as described in the documentation. I'd hoped that inserting the "alternate pronunciation" markers would improve the model.
I assume I'm missing something, because the log file output (stdout) is full of all sorts of other messages. As I sit here writing Perl to pull the alignment information out of it and fold it back into my "tactfn" file (a rough sketch of that script is below), it strikes me that surely the CMU people don't write a program to post-process this output every time they want to force-align. Sphinx could just as easily modify the tactfn file, or create a new one. Or do the CMU people go through thousands of lines by hand to update the transcription file? I sure hope not.
So, what am I missing? Thanks for any help in getting those alternate pronunciation indicators into my file!
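For reference, here is the sort of thing I've been hacking together. This is only a rough sketch: the assumption that the aligned hypotheses show up in the log as transcript-style "WORDS (utt_id)" lines, and the regex that picks them out, are guesses that would need adjusting to whatever your aligner actually prints.

    #!/usr/bin/perl
    # Rough sketch, not a general tool: scrape aligned transcripts out of
    # an alignment log read on STDIN and re-emit them as a transcript file.
    # ASSUMES aligned hypotheses appear as lines like
    #     HELLO WORLD(2) (utt_0001)
    # with the utterance id in trailing parentheses; adjust the regex to
    # match your actual log format.
    use strict;
    use warnings;

    my %aligned;
    while (my $line = <STDIN>) {
        chomp $line;
        # keep only lines that end in "(utt_id)"
        if ($line =~ /^(.*\S)\s+\(([^()\s]+)\)\s*$/) {
            my ($words, $uttid) = ($1, $2);
            # require at least one word token, to skip other log chatter
            next unless $words =~ /[A-Za-z]/;
            $aligned{$uttid} = $words;
        }
    }

    # emit a new transcript file, one "words (utt_id)" line per utterance
    for my $uttid (sort keys %aligned) {
        print "$aligned{$uttid} ($uttid)\n";
    }

Run as something like "perl pull_align.pl < align.log > new.transcription".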
The alignment result is dumped into the log; that's the CMU style. But you can set a switch so that it dumps the result to its own file, which gives you only the result.
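If I remember right, with the Sphinx-3 aligner something like the following writes just the aligned transcript, one line per utterance, to a separate file. All the paths and file names here are placeholders, and you should check your version's usage message for the exact flags:

    sphinx3_align \
        -mdef    model/mdef \
        -mean    model/means \
        -var     model/variances \
        -mixw    model/mixture_weights \
        -tmat    model/transition_matrices \
        -dict    main.dic \
        -fdict   filler.dic \
        -ctl     utts.ctl \
        -cepdir  feat \
        -insent  original.transcription \
        -outsent aligned.transcription \
        -logfn   align.log

Here -insent is the input transcript and -outsent should receive the aligned version with the "(2)"-style alternate pronunciation markers filled in, so there is no need to scrape the log.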
Also, can you tell me: are you sure you are getting a correct alignment? I tried it, but the result was very bad with the turtle model.