After the training iterations complete, I get this error during the forced-alignment step:
FATAL_ERROR: "mdef.c", line 495: Duplicate base phone: g - - - n/a 24 72 73 74 N
I suspect this is due to the phone list, as I have some phones with closely related names:
||\g_0
|\g_0
!\g_0
Please find the attached link to the zip file, which contains the logdir folder plus my phone list. I have included the phone list because I suspect the same error might be thrown for two other phones with similar characteristics. Looking forward to your assistance.
Thanks
Last edit: Nickolay V. Shmyrev 2015-08-05
I suspect this is due to the phone list as I have some that are somewhat related:
Tutorial says:
SphinxTrain doesn't support some special characters like '*' or '/', and supports most others like '+', '-' or ':'. But to be safe, we recommend that you use an alphanumeric-only phone set.
Please find the attached link to the zip file which has the logdir folder plus my phone list.
For file sharing, it is better to use services that do not distribute spyware, for example Dropbox or Google Drive.
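To apply the tutorial's recommendation before retraining, the phone list can be checked for names that are not purely alphanumeric. The following is a minimal sketch, not part of SphinxTrain: the function names `find_unsafe_phones` and `read_phone_list` are hypothetical, the file layout (one phone per line) is an assumption, and "alphanumeric-only" is read here as ASCII letters and digits. That is stricter than what the tutorial says SphinxTrain actually accepts, so names containing '+', '-' or ':' are flagged too.

```python
import re

# Assumption: "alphanumeric-only" means ASCII letters and digits, nothing else.
ALNUM_ONLY = re.compile(r"[A-Za-z0-9]+")


def find_unsafe_phones(phones):
    """Return the phone names that contain any non-alphanumeric character."""
    return [p for p in phones if not ALNUM_ONLY.fullmatch(p)]


def read_phone_list(path):
    """Read a phone list file (assumed: one phone per line), skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


if __name__ == "__main__":
    # Sample names: three from the post above plus a few plain ones.
    sample = ["AA", "B", "SIL", "||\\g_0", "|\\g_0", "!\\g_0"]
    for phone in find_unsafe_phones(sample):
        print("not alphanumeric-only:", phone)
```

Any phone the check reports would then need to be renamed consistently in both the phone list and the dictionary before training again.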
Also, I get this notification:
* 100+ hours of training data is goodly amount of data
* Rule of thumb suggests 8000 for 100 hours, you can adjust accordingly
First, does this 8000 refer to the $CFG_WAVFILE_SRATE = 8000.0; variable? I ask because I had set it before training, as advised in the training guide. Second, what does "adjust accordingly" mean?
Thanks
Thank you so much for the quick response as well as the clarification and corrections :)
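The 8000 refers to the number of tied states ($CFG_N_TIED_STATES in sphinx_train.cfg), not to the sample rate set by $CFG_WAVFILE_SRATE. "Adjust accordingly" means you can adjust this value slightly to get the best accuracy.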
Much appreciated.
Last edit: Morebodi Modise 2015-08-08