First, how much does it help to write your own linguistic questions? It seems to me that a computer would do this task better. Second, how many senones should one use for optimal performance? I guess this depends on how much data I have. Are there any guidelines?
> First, how much does it help to write your own linguistic questions?
Up to a 20% relative improvement in performance. There was a thread about it some time ago. The quality of the automatically generated questions depends on the data.
https://sourceforge.net/forum/message.php?msg_id=4402059
> How many senones should one use for optimal performance?
It depends on the size of your training data. The FAQ lists estimates:
http://www.speech.cs.cmu.edu/sphinxman/fr3.html
Thanks for your answer. I have 220 hours of speech (in Swedish). Would it then help to use more than 8000 senones? Will decoding demand more CPU when I have more senones, or is it just a matter of memory?
Are there any examples of a good set of linguistic questions for English? I need to write one for Swedish. As I understand it, the set of questions should be limited (to about 10 questions?), so I need some input on what kind of questions make sense.
> Thanks for your answer. I have 220 hours of speech (in Swedish). Would it then help to use more than 8000 senones? Will decoding demand more CPU when I have more senones, or is it just a matter of memory?
I don't think more senones would be sensible. Of course, you can simply experiment and measure the accuracy to find the optimal value; it would be nice to know what it is. The number of senones affects the speed of the decoder as well, not just memory.
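For reference, in recent SphinxTrain setups the senone count is set in etc/sphinx_train.cfg. A minimal excerpt might look like the lines below; the variable names come from the current templates, so double-check them against your own version:

    # Number of tied states (senones) produced by decision-tree clustering.
    $CFG_N_TIED_STATES = 8000;

    # Gaussians per senone in the final continuous models; this also affects
    # decoding speed, since more densities mean more likelihood computations.
    $CFG_FINAL_NUM_DENSITIES = 16;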
> Are there any examples of a good set of linguistic questions for English?
Inside sphinxtrain there are setups for training on wsj, for example, and they include a set of English questions.
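For orientation, a question set is essentially a list of named phone classes; during decision-tree building each question asks whether a context phone belongs to that class. The exact file layout depends on your SphinxTrain version, and the Swedish phone symbols below are only illustrative, so treat this as a schematic rather than a ready-to-use file:

    VOWEL      A E I O U Y AE OE
    FRONT_VOW  E I Y AE OE
    NASAL      M N NG
    STOP       P T K B D G
    FRICATIVE  F S SJ TJ H V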
> As I understand it, the set of questions should be limited (to about 10 questions?)
Well, for a database as large as yours it can be bigger, but only experiment will show you the best number.
> I need some input on what kind of questions make sense.
Try to generate them automatically first and then edit them by hand.
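If it helps, here is a small hypothetical Python sketch of that workflow: it writes a starter question file from a few broad phone classes, which you can then prune and extend by hand. The class names, phone symbols, and output filename are made up for illustration, and only the generic name-plus-phone-list layout is assumed:

    #!/usr/bin/env python
    # Hypothetical starter script: dump a first draft of linguistic questions
    # (one named phone class per line) that can then be edited by hand.
    # The classes and phone symbols below are illustrative, not a real
    # Swedish phone set; adjust them to match your dictionary.

    BROAD_CLASSES = {
        "VOWEL":     ["A", "E", "I", "O", "U", "Y", "AE", "OE"],
        "FRONT_VOW": ["E", "I", "Y", "AE", "OE"],
        "NASAL":     ["M", "N", "NG"],
        "STOP":      ["P", "T", "K", "B", "D", "G"],
        "FRICATIVE": ["F", "S", "SJ", "TJ", "H", "V"],
    }

    def write_questions(classes, path):
        """Write one question per line: the class name followed by its phones."""
        with open(path, "w") as out:
            for name, phones in sorted(classes.items()):
                out.write("%s %s\n" % (name, " ".join(phones)))

    if __name__ == "__main__":
        write_questions(BROAD_CLASSES, "swedish.questions")
        print("Wrote %d starter questions to swedish.questions" % len(BROAD_CLASSES))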