My assumption was that if a model is tested on files that are part of its training set, the WER should be minimal. However, this assumption failed. Below are the results:
We selected 3 audio files on which the model has already been trained. The manual transcripts for these audio files are:
• so you save the money and then at the one go you (800861022103750_55)
• okay sir sir i will send the request now sir and you will receive the confirmation s m s also (800861022103750_56)
• getting late sir call back and the same number and the reconfirm what happen to your service request sir (800861022103750_57)
On testing the model on the above audio files, the following output was generated:
• since you have one hundred and one don't you (800861022103750_55 -9690)
• officers have a friend request thousand you will receive a confirmation sms also (800861022103750_56 -17548)
• getting a check for that i'm the same i'm american so what happened the last census cases (800861022103750_57 -22385)
I need some help understanding what the possible solutions could be.
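For reference, the WER being discussed can be measured with a word-level edit distance between the manual and decoded transcripts. Below is a minimal self-contained sketch (tools such as jiwer or NIST's sclite compute the same metric with more options; the example strings are taken from the transcripts above):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = word-level Levenshtein distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

ref = "so you save the money and then at the one go you"
hyp = "since you have one hundred and one don't you"
print(f"WER: {wer(ref, hyp):.2f}")
```

A WER near 0 on training-set audio is the usual expectation; values this high suggest a model or decoding-setup problem rather than simple acoustic mismatch.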
This can also happen when the language model is not appropriate. Can you check the perplexity of your language model with respect to your test dataset?
Can you suggest how to check perplexity?
Assuming you have installed SRILM on your machine, test.txt is the file containing the test transcripts, and xyz.lm.arpa is your language model, use the command:
ngram -lm xyz.lm.arpa -ppl test.txt > test.ppl
Generally (not always), the smaller the perplexity number, the better the performance.
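To make the number concrete, here is a toy sketch of what perplexity measures: 10 raised to the negative average log10 probability per token. SRILM's `ngram -ppl` computes this over your full smoothed n-gram model; the unigram model and floor probability below are illustrative assumptions, not what SRILM actually uses:

```python
import math

# "Train" a toy unigram model on one sentence from the transcripts above.
train = "you will receive the confirmation sms also".split()
unigram = {w: train.count(w) / len(train) for w in set(train)}

def perplexity(tokens, model, floor=1e-6):
    # Unseen words get a small floor probability (a crude stand-in
    # for the smoothing a real n-gram model would apply).
    logprob = sum(math.log10(model.get(t, floor)) for t in tokens)
    return 10 ** (-logprob / len(tokens))

# Test text identical to training text: 7 equally likely words.
test = "you will receive the confirmation sms also".split()
print(perplexity(test, unigram))  # → 7.0
```

If the perplexity of your language model on the test transcripts is very large, the language model is steering the decoder away from the words actually spoken, which matches the garbled hypotheses shown above.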
Thanks Pankaj