I finally managed to train the acoustic model, but I have a problem with my application in Sphinx 4. These are my settings and files for SphinxTrain and for Sphinx 4:
I recorded about 3000 utterances of:
HOLA AMIGO COMO ESTAS ADIOS
and I ran scripts 02 through 07. I have a small phone list:
A
E
I
O
L
M
C
S
T
D
SIL
My dictionary is also small:
HOLA O L A
AMIGO A M I G O
COMO C O M O
ESTAS E S T A S
ADIOS A D I O S
My filler dictionary is:
<s> SIL
</s> SIL
SIL SIL
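Since the trainer expects every phone used in the dictionary to appear in the phone list, a quick consistency check can be worth running before training. The following is a minimal sketch, not part of SphinxTrain; the class name `PhoneCheck` and the method `missingPhones` are hypothetical, and the data is copied from the lists posted above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper (not a Sphinx API): cross-checks a dictionary against a phone list.
final class PhoneCheck {

    /** Returns every phone used in the dictionary that is absent from the phone list. */
    static Set<String> missingPhones(Set<String> phoneList,
                                     Map<String, List<String>> dictionary) {
        Set<String> missing = new TreeSet<>();
        for (List<String> pronunciation : dictionary.values()) {
            for (String phone : pronunciation) {
                if (!phoneList.contains(phone)) {
                    missing.add(phone);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Phone list exactly as posted.
        Set<String> phoneList = new TreeSet<>(Arrays.asList(
                "A", "E", "I", "O", "L", "M", "C", "S", "T", "D", "SIL"));

        // Dictionary exactly as posted.
        Map<String, List<String>> dictionary = new LinkedHashMap<>();
        dictionary.put("HOLA", Arrays.asList("O", "L", "A"));
        dictionary.put("AMIGO", Arrays.asList("A", "M", "I", "G", "O"));
        dictionary.put("COMO", Arrays.asList("C", "O", "M", "O"));
        dictionary.put("ESTAS", Arrays.asList("E", "S", "T", "A", "S"));
        dictionary.put("ADIOS", Arrays.asList("A", "D", "I", "O", "S"));

        System.out.println("Phones missing from the phone list: "
                + missingPhones(phoneList, dictionary));
    }
}
```

With the lists as posted, the check reports G, which AMIGO uses but the phone list does not contain. Whether that is related to the empty recognition results is not established here, but it is an inconsistency worth ruling out.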
When I finished training I did not build a language model. I only wrote a grammar, very similar to the grammar for the TIDIGITS example:
grammar cincopalabras;
public <palabras> (HOLA | AMIGO | COMO | ESTAS | ADIOS)*;
In addition, I wrote a .java file for my application, modified demo.xml, and built my application .jar file, but when I run it, it does not recognize anything. It only shows:
Por favor comience a hablar:
Usted dijo:
Usted dijo:
Can anybody tell me what could be wrong (the acoustic model, the grammar used as language model, the application Java file, or the config file)?
I checked all of these but I cannot find a mistake, and compiling does not show any error.
Omar
Anonymous - 2005-06-11
Omar -- you have given us important information, but not enough! It's like saying "My automobile won't start. Can anyone tell me why?" We need more information in order to be able to help you.
Is your microphone connected correctly? Can you run the standard live Sphinx-4 demos?
You said you have made your own .java file for your application. Perhaps you made a mistake in it. Show it to us.
Show us the Sphinx-4 configuration file.
You should add some instrumentation to your configuration file to display more detail about what your application is doing. See "Understanding Sphinx-4 Instrumentation" in the Sphinx-4 home page. You should be able to find out the problem with the resulting information, but if not, then show us what it prints when you run the application.
Are all of your 3000 training utterances exactly the same text? If so, then some of the beginning- and end-of-word triphones that you need for your grammar and application are not trained. Sphinx-4 will substitute other triphones or uniphones in those cases, but the result will not be optimal. However, I don't think this is the reason for your no-recognitions.
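The point about untrained word-boundary triphones can be illustrated with a rough sketch. It assumes the dictionary posted earlier in the thread, treats a triphone simply as left-base+right, and ignores the word-position distinctions a real trainer makes. The class `TriphoneCoverage` and its methods are hypothetical, not Sphinx code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration: which triphones does the grammar need that training never saw?
final class TriphoneCoverage {

    /** Dictionary as posted in the thread. */
    static Map<String, List<String>> dictionary() {
        Map<String, List<String>> d = new LinkedHashMap<>();
        d.put("HOLA", Arrays.asList("O", "L", "A"));
        d.put("AMIGO", Arrays.asList("A", "M", "I", "G", "O"));
        d.put("COMO", Arrays.asList("C", "O", "M", "O"));
        d.put("ESTAS", Arrays.asList("E", "S", "T", "A", "S"));
        d.put("ADIOS", Arrays.asList("A", "D", "I", "O", "S"));
        return d;
    }

    /** Triphones ("left-base+right") of a word sequence padded with SIL on both sides. */
    static Set<String> triphones(List<List<String>> words) {
        List<String> phones = new ArrayList<>();
        phones.add("SIL");
        for (List<String> word : words) {
            phones.addAll(word);
        }
        phones.add("SIL");
        Set<String> result = new TreeSet<>();
        for (int i = 1; i < phones.size() - 1; i++) {
            result.add(phones.get(i - 1) + "-" + phones.get(i) + "+" + phones.get(i + 1));
        }
        return result;
    }

    /** Triphones the grammar (any word after any word) can need but training never saw. */
    static Set<String> untrained(Map<String, List<String>> dict, List<String> trainingSentence) {
        List<List<String>> trainingWords = new ArrayList<>();
        for (String word : trainingSentence) {
            trainingWords.add(dict.get(word));
        }
        Set<String> trained = triphones(trainingWords);

        // Every word here has at least two phones, so single words and
        // ordered word pairs cover all contexts the (...)* grammar can produce.
        Set<String> needed = new TreeSet<>();
        for (List<String> first : dict.values()) {
            needed.addAll(triphones(Collections.singletonList(first)));
            for (List<String> second : dict.values()) {
                needed.addAll(triphones(Arrays.asList(first, second)));
            }
        }
        needed.removeAll(trained);
        return needed;
    }

    public static void main(String[] args) {
        Set<String> missing = untrained(dictionary(),
                Arrays.asList("HOLA", "AMIGO", "COMO", "ESTAS", "ADIOS"));
        System.out.println(missing.size() + " untrained triphones: " + missing);
    }
}
```

For example, the grammar allows HOLA followed by HOLA, which needs L-A+O, while the single training sentence only ever provides L-A+A (HOLA followed by AMIGO). Recording the words in varied orders would be one way to cover more of these contexts.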
cheers,
jerry
public static void main(String[] args) {
    try {
        URL url;
        if (args.length > 0) {
            url = new File(args[0]).toURI().toURL();
        } else {
            url = cincopalabras.class.getResource("cincopalabras.config.xml");
        }

        ConfigurationManager cm = new ConfigurationManager(url);

        Recognizer recognizer = (Recognizer) cm.lookup("recognizer");
        Microphone microphone = (Microphone) cm.lookup("microphone");

        /* allocate the resource necessary for the recognizer */
        recognizer.allocate();

        /* the microphone will keep recording until the program exits */
        if (microphone.startRecording()) {
            System.out.println("Diga alguna de las palabras hola amigo como estas adios:");
            while (true) {
                System.out.println("Comience a hablar. Presione Ctrl-C para salir.\n");
                /*
                 * This method will return when the end of speech
                 * is reached. Note that the endpointer will determine
                 * the end of speech.
                 */
                Result result = recognizer.recognize();
                if (result != null) {
                    String resultText = result.getBestResultNoFiller();
                    System.out.println("Usted dijo: " + resultText + "\n");
                } else {
                    System.out.println("No puedo escuchar lo que usted dijo.\n");
                }
            }
        } else {
            System.out.println("No se pudo iniciar el microfono.");
            recognizer.deallocate();
            System.exit(1);
        }
    } catch (IOException e) {
        System.err.println("Problemas cuando se cargo cincopalabras: " + e);
        e.printStackTrace();
    } catch (PropertyException e) {
        System.err.println("Problemas configurando cincopalabras: " + e);
        e.printStackTrace();
    } catch (InstantiationException e) {
        System.err.println("Problemas creando cincopalabras: " + e);
        e.printStackTrace();
    }
}
Hi,
I checked, and the microphone is connected correctly. I can run the demos and they recognize my speech, but my application does not.
I made my own .java file just by modifying the .java file of the TIDIGITS application. This is the Java file:
/*
 * test for five words
 */
package demo.sphinx.cincopalabras;
import edu.cmu.sphinx.frontend.util.Microphone;
import edu.cmu.sphinx.recognizer.Recognizer;
import edu.cmu.sphinx.result.Result;
import edu.cmu.sphinx.util.props.ConfigurationManager;
import edu.cmu.sphinx.util.props.PropertyException;
import java.io.File;
import java.io.IOException;
import java.net.URL;
public class cincopalabras {
}
This is the configuration file:
<?xml version="1.0" encoding="UTF-8"?>
<!--
Sphinx-4 Configuration file
-->
<!-- ******** -->
<!-- an4 configuration file -->
<!-- ******** -->
<config>
</component>
</config>
And yes, all 3000 training utterances have the same text: HOLA AMIGO COMO ESTAS ADIOS. Do I need other kinds of text?
I also tried changing the phone list to a unique set of phones per word, that is:
HOLA O_hola L_hola A_hola
AMIGO A_amigo M_amigo I_amigo G_amigo O_amigo
COMO K_como O_como M_como O_como_2
ESTAS E_estas S_estas T_estas A_estas S_estas_2
ADIOS A_adios D_adios I_adios S_adios
I trained the acoustic model again with this change (in the phone list and in the dictionary), but my application still does not recognize anything.
So I have another question. The model.props file for the acoustic model has the options featureType and vectorLength, and I read in the manual that cepstra_delta_doubledelta and 39 are the common values for these parameters. But I extracted the features as plain cepstra. Is this a problem?
My model.props is:
description = cincopalabras acoustic models
modelClass = edu.cmu.sphinx.model.acoustic.cincopalabras_8gau_13dCep_16k_40mel_130Hz_6800Hz.Model
modelLoader = edu.cmu.sphinx.model.acoustic.cincopalabras_8gau_13dCep_16k_40mel_130Hz_6800Hz.ModelLoader
dataLocation = espmexacmo_cd_cont_500_8
modelDefinition = etc/espmexacmo.500.mdef
isBinary = true
featureType = cepstra_delta_doubledelta
vectorLength = 39
sparseForm = false
numberFftPoints = 512
filters = 40
gaussians = 8
maxFreq = 6800
minFreq = 130
sampleRate = 16000
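On the featureType question, the arithmetic behind the two values is that a 13-dimensional cepstral vector plus its delta and double-delta gives 13 + 13 + 13 = 39. Delta features are normally derived from the cepstra rather than stored separately, so files containing plain 13-dimensional cepstra are not by themselves in conflict with cepstra_delta_doubledelta. The sketch below only illustrates that stacking. It uses a simple symmetric difference with clamped edges as an assumption, which is not necessarily the exact window Sphinx uses, and `DeltaFeatures` is a hypothetical class name.

```java
// Hypothetical illustration of how 13 cepstra become a 39-dimensional feature vector.
final class DeltaFeatures {

    /** Frame-to-frame difference x[t+1] - x[t-1], with indices clamped at the edges. */
    static double[][] differences(double[][] x) {
        int frames = x.length;
        double[][] d = new double[frames][];
        for (int t = 0; t < frames; t++) {
            double[] next = x[Math.min(t + 1, frames - 1)];
            double[] prev = x[Math.max(t - 1, 0)];
            d[t] = new double[x[t].length];
            for (int i = 0; i < d[t].length; i++) {
                d[t][i] = next[i] - prev[i];
            }
        }
        return d;
    }

    /** Stacks cepstra, delta, and double-delta into one vector per frame. */
    static double[][] addDeltas(double[][] cepstra) {
        double[][] delta = differences(cepstra);
        double[][] doubleDelta = differences(delta);
        double[][] features = new double[cepstra.length][];
        for (int t = 0; t < cepstra.length; t++) {
            int dim = cepstra[t].length;
            features[t] = new double[3 * dim];
            System.arraycopy(cepstra[t], 0, features[t], 0, dim);
            System.arraycopy(delta[t], 0, features[t], dim, dim);
            System.arraycopy(doubleDelta[t], 0, features[t], 2 * dim, dim);
        }
        return features;
    }

    public static void main(String[] args) {
        double[][] cepstra = new double[5][13]; // 5 frames of 13 cepstral coefficients
        for (int t = 0; t < cepstra.length; t++) {
            cepstra[t][0] = t; // dummy data: first coefficient rises by 1 per frame
        }
        double[][] features = addDeltas(cepstra);
        System.out.println("vectorLength = " + features[0].length); // prints "vectorLength = 39"
    }
}
```

What does need to agree is the front end: the settings used to compute the cepstra at training time (sample rate, number of filters, frequency range) should match what the Sphinx-4 configuration computes at recognition time.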
omar