Hi guys, first off, I'm a speech recognition (SR) noob, so I'm quite clueless about things. I'm thinking of creating a web UI for the adaptation process described here: https://cmusphinx.github.io/wiki/tutorialadapt/#adapting-the-acoustic-model
The purpose is to let people submit recordings of themselves reading a set of transcriptions, much like Mozilla's Common Voice, but only for adapting Sphinx's English acoustic models. The server would then rebuild the model hourly if anything has changed and let people see and download, say, the latest 100 models.
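To make the "rebuild hourly if there's any change, keep the latest 100" part concrete, here is a rough Python sketch. It is only an illustration under my own assumptions: the directory layout, the `.tar.gz` model naming, and the `build_model` callback are all hypothetical, and the actual adaptation step (the commands from the tutorial) is deliberately left to that callback rather than guessed at here.

```python
import hashlib
from pathlib import Path
from typing import Callable, List, Optional

MAX_MODELS = 100  # "the latest 100 models" from the post


def submissions_fingerprint(submissions_dir: Path) -> str:
    """Hash names, sizes and mtimes so any new or changed recording is noticed."""
    digest = hashlib.sha256()
    for path in sorted(submissions_dir.rglob("*")):
        if path.is_file():
            stat = path.stat()
            entry = f"{path.relative_to(submissions_dir)}|{stat.st_size}|{stat.st_mtime_ns}\n"
            digest.update(entry.encode("utf-8"))
    return digest.hexdigest()


def prune_models(models_dir: Path, keep: int = MAX_MODELS) -> List[Path]:
    """Delete all but the `keep` newest model archives; return what was removed."""
    # Assumption: each built model is stored as one *.tar.gz file in models_dir.
    models = sorted(
        models_dir.glob("*.tar.gz"),
        key=lambda p: p.stat().st_mtime_ns,
        reverse=True,
    )
    removed = models[keep:]
    for old in removed:
        old.unlink()
    return removed


def rebuild_if_changed(
    submissions_dir: Path,
    models_dir: Path,
    build_model: Callable[[Path, Path], None],  # hypothetical: runs the adaptation
    last_fingerprint: Optional[str],
) -> str:
    """Run one scheduled check: rebuild only if the submissions changed."""
    current = submissions_fingerprint(submissions_dir)
    if current != last_fingerprint:
        build_model(submissions_dir, models_dir)
        prune_models(models_dir)
    return current
```

An hourly cron job (or any scheduler) could call `rebuild_if_changed`, store the returned fingerprint, and pass it back in on the next run, so an hour with no new recordings costs only a directory scan.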
I'm wondering if you guys think it would be a valuable thing to do? Thanks!
Hello Jacky
Overall it is a great idea, you could implement it.
The problem is that modern systems do not require adaptation; they simply work out of the box (check Google's API). Another problem is that there is already an abundance of data on the net; 10000+ transcribed hours is not a problem. The missing part is an algorithm to train a reasonably compact system in a reasonable time.
Thanks for the tip! What a shame though. I guess I'll still publish it somewhere then since I'm already almost done with the basic site and server (luckily it didn't take much effort).