I am going to do my final year project on Arabic speech recognition using CMU Sphinx.
1- How long does it take to build the acoustic model and language model for this project, assuming we want to recognize 23 continuous Arabic words and check whether a person read them correctly or not?
2- For the first step, the recording, what I need is: 1- 16 bit, 2- sample rate 16000, 3- mono, on Windows.
Is there anything else?
Thank you so much
Last edit: hasan ali gamal al-kaf 2015-10-15
About a couple of months of work.
Detailed instructions are provided on the wiki page.
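For the recording format listed in the question (16 bit, 16000 Hz, mono), a quick check of each WAV file before training can save time. This is only a minimal sketch using Python's standard wave module; the function names check_wav_format and make_test_wav are made up for illustration and are not part of CMU Sphinx.

```python
import io
import wave


def check_wav_format(source, rate=16000, channels=1, sample_bytes=2):
    """Return a list of format problems; an empty list means the file matches.

    `source` may be a file path or a binary file-like object.
    sample_bytes=2 corresponds to 16-bit samples.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(f"sample rate is {wav.getframerate()}, expected {rate}")
        if wav.getnchannels() != channels:
            problems.append(f"channel count is {wav.getnchannels()}, expected {channels}")
        if wav.getsampwidth() != sample_bytes:
            problems.append(f"sample width is {wav.getsampwidth()} bytes, expected {sample_bytes}")
    return problems


def make_test_wav(rate, channels, sample_bytes, n_frames=160):
    """Build a short silent WAV in memory (hypothetical helper for the demo)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_bytes)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * n_frames * channels * sample_bytes)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(check_wav_format(make_test_wav(16000, 1, 2)))   # matching file
    print(check_wav_format(make_test_wav(44100, 2, 2)))   # wrong rate and channels
```

In practice you would pass the path of each recording instead of the in-memory test file.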
thank you so much
I want my system to detect 23 continuous words. Should I record the 23 words individually, word by word, or can I record two or more together? What is the best way to record?
thank you
You need to record what your users of the final system are supposed to say. If they will say individual words you record individual words. If they will say two or more, record two or more.
You can get good results in a quiet room with laptop disconnected from power to avoid electrical noise. Or on mobile phone.
Hi Nickolay,
I have finished preparing my audio recordings, phonetic dictionary, transcription file and filler dictionary. How can I check that my data match each other and are correct?
Is there any tutorial I should follow for this?
Thank you so much Nickolay
Last edit: Nickolay V. Shmyrev 2015-10-30
Run the training. It performs all necessary checks and reports errors
Our acoustic model training tutorial is here
http://cmusphinx.sourceforge.net/wiki/tutorialam
You can try to go through the steps described there.
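The training run is the real check, but a rough pre-check can catch the most common mismatches earlier. The sketch below assumes the file layouts described in the tutorial: transcription lines that end with the utterance id in parentheses, a fileids list in the same order, and dictionaries with the word as the first column. The function names are hypothetical, and the script is not a replacement for the checks SphinxTrain performs.

```python
import re


def load_dict_words(text):
    """Collect the first column (the word) of every non-empty dictionary line."""
    words = set()
    for line in text.splitlines():
        parts = line.split()
        if parts:
            words.add(parts[0])
    return words


def check_training_data(transcription, fileids, dictionary, fillers):
    """Return a list of human-readable problems found in the training files."""
    known = load_dict_words(dictionary) | load_dict_words(fillers)
    # The utterance id is assumed to be the last path component of each fileid.
    ids = [re.split(r"[\\/]", line.strip())[-1]
           for line in fileids.splitlines() if line.strip()]
    lines = [line.strip() for line in transcription.splitlines() if line.strip()]

    problems = []
    if len(lines) != len(ids):
        problems.append(f"{len(lines)} transcription lines but {len(ids)} fileids")

    for number, (line, file_id) in enumerate(zip(lines, ids), start=1):
        match = re.match(r"^(.*)\((\S+)\)$", line)
        if not match:
            problems.append(f"line {number}: no (utterance_id) at the end")
            continue
        text, utt_id = match.groups()
        if utt_id != file_id:
            problems.append(f"line {number}: id {utt_id} does not match fileid {file_id}")
        for word in text.split():
            if word not in known:
                problems.append(f"line {number}: word {word} is not in any dictionary")
    return problems


if __name__ == "__main__":
    # Tiny made-up example; the phone spellings are placeholders.
    transcription = "<s> BISMI LAHI </s> (file_1)\n<s> ALLAHOU SAMAD </s> (file_2)\n"
    fileids = "speaker_1/file_1\nspeaker_1/file_2\n"
    dictionary = "BISMI B I S M I\nLAHI L A H I\nALLAHOU A L L A H U\n"
    fillers = "<s> SIL\n</s> SIL\n<sil> SIL\n"
    for problem in check_training_data(transcription, fileids, dictionary, fillers):
        print(problem)
```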
I am using Windows 8.
1- I downloaded sphinxtrain and the an4 data, put them in the same tutorial folder and extracted both of them.
2- I downloaded Perl and Microsoft Visual C++ 15.
3- I opened sphinxtrain.sln and did a batch build and rebuild all.
4- In the sphinxtrain folder I changed scripts_file to perl scripts_file and setup_tutorial to setup_tutorial.pl an4.
5- In the command line of an4 I ran perl scripts_pl\make_feats.pl -ctl etc\an4_train.fileids
but I didn't get the MFCC files.
What could be the possible mistakes I have made?
Sorry, I am very new to these things.
Thank you so much
Last edit: Nickolay V. Shmyrev 2015-11-03
I'm not sure which version you downloaded; you did not mention that. You need to download the latest version, 5prealpha.
You also need Python.
There is no such step in our tutorial. If you follow our tutorial precisely you will avoid issues.
Hi Nickolay
When I use the latest version of SphinxTrain and build it in Microsoft Visual Studio, I get this problem:
Error C1083 Cannot open include file: 'sphinxbase/cmd_ln.h': No such file or directory libclust C:\Users\Al-kaf\Desktop\sphinxtrain-5prealpha\include\s3\common.h 60
I found some answers but I didn't understand them. How can I solve this problem?
Last edit: Nickolay V. Shmyrev 2015-11-09
You need to unpack and build sphinxbase first. The sphinxbase folder must be renamed to just sphinxbase, without the version, as per the instructions. If you do not have the expertise to compile, I recommend downloading the precompiled bin version available in our downloads.
Hi Nickolay
After I build the acoustic model, I want to do live continuous speech recognition using pocketsphinx. Which tutorial should I follow? The tutorial on cmusphinx only mentions a simple hello world using pocketsphinx.
Thank you Nickolay
You can still try the pocketsphinx hello world tutorial with the Arabic model; the tutorial describes the process.
Hi Nickolay
For my language model I am going to use a grammar in JSGF, so I will write my grammar in JSGF format inside an4.lm.DMP. Is this step right?
Is the grammar format correct and long enough? Can I improve it?
Thank you so much
Last edit: Nickolay V. Shmyrev 2016-01-17
There is special configuration for grammars, you can use them instead.
It looks ok.
Hi Nickolay
While I was training on my data I got this error:
Phase 3: Forward-Backward
Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
bw Log File
ERROR: FATAL: "main.c", line 1846: initialization failed
ERROR: This step had 1 ERROR messages and 0 WARNING messages. Please check
the log file for details.
FAILED
ERROR: Failed to start bw
ERROR: Only 0 parts of 1 of Baum Welch were successfully completed
ERROR: Parts 1 failed to run!
ERROR: Training failed in iteration 1
I attached my html file and log file.
I have 0,18 of data, so how much should I increase it to?
What possible mistakes should I fix?
Thank you so much for guiding me.
I attached my logdir folder also.
Thank you so much
Last edit: Nickolay V. Shmyrev 2016-01-23
Instead of attaching the logs you can open them and read
You have duplicated phones in your phoneset file in etc/an4.phone
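Duplicated phones are easy to miss by eye. As a minimal sketch, assuming the phoneset file lists one phone per line, something like the following reports the repeated entries. The function name find_duplicate_phones is only illustrative.

```python
from collections import Counter


def find_duplicate_phones(phone_text):
    """Return the sorted list of phones that appear more than once."""
    phones = [line.strip() for line in phone_text.splitlines() if line.strip()]
    counts = Counter(phones)
    return sorted(phone for phone, count in counts.items() if count > 1)


if __name__ == "__main__":
    # In practice, read the text from etc/an4.phone instead of this sample.
    sample = "A\nB\nSIL\nA\n"
    print(find_duplicate_phones(sample))
```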
Yes, I already read them.
Is there any way or guide for finding the mistake in the log file?
Thank you so much
Last edit: Nickolay V. Shmyrev 2016-01-23
Open it and look for errors. It is easy.
Thank you so much
I will try
Last edit: Nickolay V. Shmyrev 2016-01-23
I am using the latest version of pocketsphinx on Ubuntu.
For the special configuration for the JSGF format, I have a JSGF file called an4.jsgf.
Then I use this command: sphinx_jsgf2fsg < an4.jsgf > an4.fsg
but I didn't find the fsg file.
Or I use
pocketsphinx_continuous - an4.jsgf
What could be the possible error?
Thank you so much
Last edit: Nickolay V. Shmyrev 2016-01-27
The correct command must be
The correct command is
No idea, you didn't provide enough information on the subject. You could provide at least the logs.
Hi Nickolay
After I got the hmm files, I want to convert an audio file to text.
I use
pocketsphinx_continuous -infile myvoice.wav -hmm an4.ci_cont -jsgf an4.gram -dict an4.dic
but I got this error:
ERROR: "fsg_search.c", line 913: Final result does not match the grammar in frame 294
What could be the possible error?
I use the latest version of pocketsphinx and sphinxbase.
Thank you so much
Last edit: Nickolay V. Shmyrev 2016-02-02
This error is commonly discussed on the forum; you can just search. It means that the decoder cannot map the grammar to the actual speech content. It might be due to several reasons:
1) Acoustic model is not accurate.
2) Grammar is too strict, you might want to relax the grammar.
1) Acoustic model is not accurate.
For the test data I got an accuracy of 100 percent for each file I tested:
Words: 4 Correct: 4 Errors: 0 Percent correct = 100.00% Error = 0.00% Accuracy = 100.00%
Does that indicate that my acoustic model is accurate?
But I have a small amount of data, 0.126719444444444, so I set CFG_CD_TRAIN to "no".
What should I change to make my acoustic model accurate?
2) Grammar is too strict, you might want to relax the grammar.
My grammar is actually relaxed. For example, my grammar is
public <an4> = (Bismi Lahi Rahmani Rahim|Koul houwa llahou Ahad|Allahou Samad|Lam Yaled wa Lam youlad|wa Lam yakoun lahou koufouan Ahad);
and my audio, or the output that I want, is Bismi Lahi Rahmani Rahim, then Koul houwa llahou Ahad, then Allahou Samad, then Lam Yaled wa Lam youlad, then wa Lam yakoun lahou koufouan Ahad, so it is in sequence.
Thank you so much
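One thing that may be worth checking against the grammar quoted above: a JSGF group of alternatives separated by | accepts exactly one of the phrases per utterance, so a single recording that contains several phrases in a row would not match it. The sketch below only imitates that matching logic in plain Python to illustrate the difference; it is not how pocketsphinx evaluates grammars, and the function names are hypothetical.

```python
# The five phrases from the grammar in the post above.
ALTERNATIVES = [
    "Bismi Lahi Rahmani Rahim",
    "Koul houwa llahou Ahad",
    "Allahou Samad",
    "Lam Yaled wa Lam youlad",
    "wa Lam yakoun lahou koufouan Ahad",
]


def matches_single_alternative(utterance, alternatives=ALTERNATIVES):
    """True when the utterance is exactly one phrase, like (a | b | c)."""
    words = utterance.split()
    return any(words == alt.split() for alt in alternatives)


def matches_sequence(utterance, alternatives=ALTERNATIVES):
    """True when the utterance is one or more phrases back to back, like (a | b | c)+."""
    words = utterance.split()
    # reachable[i] is True when words[:i] can be split into whole phrases.
    reachable = [True] + [False] * len(words)
    for start in range(len(words)):
        if not reachable[start]:
            continue
        for alt in alternatives:
            alt_words = alt.split()
            if words[start:start + len(alt_words)] == alt_words:
                reachable[start + len(alt_words)] = True
    return len(words) > 0 and reachable[len(words)]


if __name__ == "__main__":
    one_phrase = "Allahou Samad"
    two_phrases = "Bismi Lahi Rahmani Rahim Koul houwa llahou Ahad"
    print(matches_single_alternative(one_phrase), matches_single_alternative(two_phrases))
    print(matches_sequence(one_phrase), matches_sequence(two_phrases))
```

If each recording holds only one phrase, the grammar as written already covers it and the cause is more likely elsewhere; if a recording holds several phrases in a row, a repetition operator such as + after the group in JSGF would be the counterpart of matches_sequence.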