Hi Serguei,
I am working on an installation where I need to detect whether someone is speaking a vowel into a microphone and, if so, which one. After trying Sphinx, which seems to be slight overkill for my problem, I found MARF, and it appears to suit my needs.
I read the PDF describing all the elements of MARF, and based on your SpeakerIdent application I wrote my own little program. So far I have got everything running, both the testing part and the recognition part, but the recognition isn't really reliable: the accuracy is somewhere between 40 and 50%.
These are my settings:
MARF.setPreprocessingMethod(MARF.DUMMY);
MARF.setFeatureExtractionMethod(MARF.LPC);
MARF.setClassificationMethod(MARF.NEURAL_NETWORK);
MARF.setDumpSpectrogram(false);
MARF.setSampleFormat(MARF.WAV);
Is there any way to get better results?
I also need to find out some characteristics of the speaker's voice, such as pitch, amplitude, etc. I thought I might be able to grab some of these values from the LPC output, but I couldn't find out which value corresponds to the pitch period. Or am I wrong here? Could you explain in just a couple of words how the vector of 20 elements is built up?
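To make clear what I mean by pitch period and amplitude, here is a rough sketch of how I would compute them myself from one frame of raw samples, without MARF: the pitch period as the lag with the highest autocorrelation, and the amplitude as the RMS of the frame. The class name VoiceFrameStats, its methods, and the 8000 Hz sample rate are my own assumptions for illustration and not part of MARF. I would still prefer to read these values from the LPC output if that is possible.

```java
// Hypothetical helper, not part of MARF: basic per-frame voice statistics.
public final class VoiceFrameStats {
    private VoiceFrameStats() {}

    /** Root-mean-square amplitude of one frame of samples. */
    public static double rms(double[] frame) {
        double sum = 0.0;
        for (double s : frame) {
            sum += s * s;
        }
        return Math.sqrt(sum / frame.length);
    }

    /**
     * Estimated pitch period in samples: the lag in [minLag, maxLag]
     * at which the (unnormalized) autocorrelation of the frame is largest.
     */
    public static int pitchPeriod(double[] frame, int minLag, int maxLag) {
        int bestLag = minLag;
        double best = Double.NEGATIVE_INFINITY;
        for (int lag = minLag; lag <= maxLag && lag < frame.length; lag++) {
            double acc = 0.0;
            for (int i = 0; i + lag < frame.length; i++) {
                acc += frame[i] * frame[i + lag];
            }
            if (acc > best) {
                best = acc;
                bestLag = lag;
            }
        }
        return bestLag;
    }

    public static void main(String[] args) {
        int sampleRate = 8000;   // assumed sample rate in Hz
        double f0 = 200.0;       // synthetic test tone standing in for a voiced vowel
        double[] frame = new double[1024];
        for (int i = 0; i < frame.length; i++) {
            frame[i] = 0.5 * Math.sin(2.0 * Math.PI * f0 * i / sampleRate);
        }
        // Search lags 20..160, i.e. roughly 400 Hz down to 50 Hz at 8000 Hz.
        int period = pitchPeriod(frame, 20, 160);
        System.out.println("pitch period (samples): " + period);
        System.out.println("pitch (Hz): " + (double) sampleRate / period);
        System.out.println("RMS amplitude: " + rms(frame));
    }
}
```

The lag range limits the search to plausible speaking pitches. On real speech I would expect to need some windowing and a voiced/unvoiced check before trusting the result.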
Thanks in advance for an answer,
Vincent