After several failed attempts, I created my first model for the Albanian language. It has just 19 words or sentences because I built it only for testing, but the problem is accuracy: it recognizes words, but they are almost always the wrong ones. I have read some articles, and they say I need more training data.
1. What does this mean?
2. Should I keep trying with CMU Sphinx, or is it better to use other software such as Kaldi to get better accuracy?
Thanks in advance! :)
Last edit: Mariano Baci 2019-10-24
You need a much bigger dataset.
You first need to get a big dataset, then proceed with Kaldi. Without a dataset you won't get good accuracy with any toolkit.
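To get a rough idea of how much audio a dataset actually contains, something like the following sketch may help. It is not part of CMU Sphinx or Kaldi; it only uses Python's standard `wave` module, and the function name `total_audio_seconds` is made up for this illustration. It assumes the recordings are uncompressed PCM `.wav` files under one directory.

```python
import wave
from pathlib import Path


def total_audio_seconds(root):
    """Return (file_count, total_seconds) for all .wav files under root.

    Hypothetical helper for illustration; assumes PCM WAV files that
    the standard-library wave module can read.
    """
    count = 0
    seconds = 0.0
    for path in sorted(Path(root).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # Duration of one file = number of frames / frames per second.
            seconds += wav.getnframes() / wav.getframerate()
        count += 1
    return count, seconds
```

Calling `total_audio_seconds("wav")` (where `wav` is an assumed directory name for your recordings) returns the number of files and their combined length in seconds, which makes it easier to see how far a 19-utterance test set is from a large dataset.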
Dataset? Do you mean several WAV files for the same word?
Last edit: Mariano Baci 2019-10-24