In running the scripts I see the following in Module 10, Phase 3 (Forward-Backward):
WARNING: WARNING: NEGATIVE CONVERGENCE RATIO AT ITER 2! CHECK BW AND NORM LOGFILES
When reading the logs I see:
WARNING: NEGATIVE CONVERGENCE RATIO! CHECK YOUR DATA AND TRASNCRIPTS
I checked my audio against the transcripts and everything seems OK. I tried
forced alignment by setting CFG_FALIGN_CI_MGAU, CFG_CI_MGAU, and CFG_FORCEDALIGN
to yes, but did not see any additional details on what exactly the problem
could be. My audio files are not sliced, nor is there too much of a gap or
silence between words.
Any idea what the problem could be?
I started pulling out feature files and narrowed it down to a single file. I
think there is some slight noise in the recording that is causing the
convergence issue. Out of curiosity, how sensitive is Sphinx to noise?
The reasons are the same: insufficient data size, or audio that doesn't match
the transcripts. I wouldn't rely on your "seems OK".
The robustness to noise is not known. Following common practice could help
you avoid making mistakes. In particular, I'm sure that with a large enough
database everything will be fine. With a small database you are just
shooting yourself in the foot.
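
One way to go beyond "seems OK" when checking that the audio matches the transcripts is to verify mechanically that the control file and the transcription file line up. The sketch below is only an illustration, not part of SphinxTrain: it assumes the usual layout in which each transcription line ends with the utterance ID in parentheses and that ID equals the last path component of the corresponding fileids entry. The function name check_alignment is hypothetical. It cannot detect audio whose content differs from the text, only bookkeeping mismatches.

```python
import re
from pathlib import PurePosixPath

# Matches a trailing "(utt_id)" at the end of a transcription line.
UTT_ID = re.compile(r"\(([^()\s]+)\)\s*$")


def check_alignment(fileids, transcripts):
    """Return a list of problems found; an empty list means the two lists agree.

    fileids     -- lines of the control file, e.g. "speaker1/utt_001"
    transcripts -- lines of the transcription file,
                   e.g. "<s> hello world </s> (utt_001)"
    """
    problems = []
    if len(fileids) != len(transcripts):
        problems.append(
            f"count mismatch: {len(fileids)} fileids vs "
            f"{len(transcripts)} transcripts"
        )
    # Compare entries pairwise; both files are assumed to be in the same order.
    for n, (fid, line) in enumerate(zip(fileids, transcripts), start=1):
        match = UTT_ID.search(line)
        if match is None:
            problems.append(f"line {n}: no (utt_id) at end of transcript")
            continue
        expected = PurePosixPath(fid.strip()).name
        if match.group(1) != expected:
            problems.append(
                f"line {n}: fileid {expected!r} but transcript has "
                f"{match.group(1)!r}"
            )
        # Strip the sentence markers and check that some words remain.
        text = line[:match.start()].replace("<s>", "").replace("</s>", "").strip()
        if not text:
            problems.append(f"line {n}: empty transcript for {expected!r}")
    return problems
```

To use it, read both files into lists of lines (for example with open(path).read().splitlines()) and print whatever check_alignment returns. A clean result rules out only ordering and naming errors, so the advice above about database size still applies.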