Dear Sir,
I have successfully completed training tri-phone models, and the speech recognition system is working fine. I used Sphinx-4.
Now I am trying to develop syllable models. I have 1200 sentences, 2740 words, and 1120 syllables in my data.
I have followed the steps described by someone in this forum.
I have increased the number of states per HMM here. Is that OK?
elsif ($CFG_HMM_TYPE eq '.cont.') {
    $CFG_DIRLABEL = 'cont';
    $CFG_STATESPERHMM = 7;
    $CFG_SKIPSTATE = 'no';
I have commented out the following lines in Runall.pl to build only CI models:
# What pieces would you like to compute.
my @steps =
("$ST::CFG_SCRIPT_DIR/00.verify/verify_all.pl",
"$ST::CFG_SCRIPT_DIR/01.vector_quantize/slave.VQ.pl",
"$ST::CFG_SCRIPT_DIR/02.falign_ci_hmm/slave_convg.pl",
"$ST::CFG_SCRIPT_DIR/03.force_align/slave_align.pl",
"$ST::CFG_SCRIPT_DIR/04.vtln_align/slave_align.pl",
"$ST::CFG_SCRIPT_DIR/05.lda_train/slave_lda.pl",
"$ST::CFG_SCRIPT_DIR/06.mllt_train/slave_mllt.pl",
"$ST::CFG_SCRIPT_DIR/20.ci_hmm/slave_convg.pl",
"$ST::CFG_SCRIPT_DIR/30.cd_hmm_untied/slave_convg.pl",
"$ST::CFG_SCRIPT_DIR/40.buildtrees/slave.treebuilder.pl",
"$ST::CFG_SCRIPT_DIR/45.prunetree/slave.state-tying.pl",
"$ST::CFG_SCRIPT_DIR/50.cd_hmm_tied/slave_convg.pl",
"$ST::CFG_SCRIPT_DIR/90.deleted_interpolation/deleted_interpolation.pl",
"$ST::CFG_SCRIPT_DIR/99.make_s2_models/make_s2_models.pl",
);
The training ran without any errors.
Now, which files do I have to use for the decoding part? Can anybody suggest?
Can anybody suggest where to modify the code for the above problem?
Is HMMPool required for CI models as well, or is it only used to build context-dependent HMMs?
All required model files are in
model_parameters/your_db_name.ci_cont
Dear Sir,
When I try to decode, I get the following error. Please give me a suggestion:
Loading...
06:50.356 WARNING dictionary Missing word: <unk>
in edu.cmu.sphinx.linguist.dictionary.FastDictionary:getWord-dictionary
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at edu.cmu.sphinx.linguist.util.HMMPool.<init>(HMMPool.java:78)
at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.compileGrammar(LexTreeLinguist.java:476)
at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.allocate(LexTreeLinguist.java:406)
at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.allocate(WordPruningBreadthFirstSearchManager.java:323)
at edu.cmu.sphinx.decoder.Decoder.allocate(Decoder.java:109)
at edu.cmu.sphinx.recognizer.Recognizer.allocate(Recognizer.java:182)
at demo.sphinx.hellosyllable.HELLOSYLLABLE.main(HELLOSYLLABLE.java:53)
Dear Sir,
<!-- ******** -->
<!-- frequently tuned properties -->
<!-- ******** -->
<property name="absoluteBeamWidth" value="20"/>
<property name="relativeBeamWidth" value="1E-80"/>
<property name="absoluteWordBeamWidth" value="20"/>
<property name="relativeWordBeamWidth" value="1E-30"/>
<property name="wordInsertionProbability" value="1E-20"/>
<property name="languageWeight" value="7.0"/>
<property name="silenceInsertionProbability" value=".1"/>
<property name="linguist" value="lexTreeLinguist"/>
<property name="frontend" value="epFrontEnd"/>
<property name="recognizer" value="recognizer"/>
<property name="showCreations" value="false"/>
These are my settings in the config file. Even after playing with the absolute beam width, I still get the error.
Any suggestions for this?
I have 3 GB of RAM. The problem persists even when I raise the heap space:
java -mx1500m -jar bin/HELLOSYLLABLE.jar
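A back-of-the-envelope calculation suggests that no heap setting that fits in 3 GB of RAM can satisfy this allocation. The sketch below assumes, per the unitTable allocation discussed elsewhere in this thread, that HMMPool allocates a numCIUnits × numCIUnits × numCIUnits array of Unit references, and that a reference takes 4 bytes (compressed oops; plain 64-bit references would double the figure):

```java
// Rough memory estimate for HMMPool's unitTable with 1120 syllable CI units:
// unitTable = new Unit[numCIUnits * numCIUnits * numCIUnits]
public class UnitTableEstimate {
    public static void main(String[] args) {
        long numCIUnits = 1120;                            // syllables in this model
        long slots = numCIUnits * numCIUnits * numCIUnits; // array length
        long bytes = slots * 4;                            // 4 bytes per reference (compressed oops)
        System.out.println(slots + " slots");              // 1404928000 slots
        System.out.println((bytes >> 20) + " MiB for the empty table alone"); // 5359 MiB
    }
}
```

At roughly 5.4 GiB for the empty reference array alone (before a single Unit object is created), the allocation itself has to change; raising -mx cannot help on a 3 GB machine.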
Dear Sir,
Please tell me whether the error is only because of heap space, or because of other mistakes I have made in configuring or training. This is important for me because, if it is only due to insufficient heap space, I want to run on a higher-end system.
I need your valuable suggestion.
You made a mistake in the configuration. To reproduce your problem I need access to all your changes; without that it would be hard to help you.
<?xml version="1.0" encoding="UTF-8"?>
<!--
Sphinx-4 Configuration file
-->
<!-- ******** -->
<!-- biship configuration file -->
<!-- ******** -->
<config>
<!-- ******** -->
<!-- frequently tuned properties -->
<!-- ******** -->
<property name="absoluteBeamWidth" value="20"/>
<property name="relativeBeamWidth" value="1E-80"/>
<property name="absoluteWordBeamWidth" value="10"/>
<property name="relativeWordBeamWidth" value="1E-30"/>
<property name="wordInsertionProbability" value="1E-20"/>
<property name="languageWeight" value="7.0"/>
<property name="silenceInsertionProbability" value=".1"/>
<property name="linguist" value="lexTreeLinguist"/>
<property name="frontend" value="epFrontEnd"/>
<property name="recognizer" value="recognizer"/>
<property name="showCreations" value="false"/>
</config>
This tells me nothing. Pack everything into a ready-to-run archive, upload it somewhere, and give me a link.
Copyright 1999-2002 Carnegie Mellon University.
Portions Copyright 2002 Sun Microsystems, Inc.
Portions Copyright 2002 Mitsubishi Electronic Research Laboratories.
All Rights Reserved. Use is subject to license terms.
See the file "license.terms" for information on usage and
redistribution of this file, and for a DISCLAIMER OF ALL
WARRANTIES.
description = SYLLABLE acoustic models
modelClass = edu.cmu.sphinx.model.acoustic.SYLLABLE_16gau_13dCep_16k_40mel_130Hz_6800Hz.Model
modelLoader = edu.cmu.sphinx.model.acoustic.SYLLABLE_16gau_13dCep_16k_40mel_130Hz_6800Hz.ModelLoader
isBinary = true
featureType = 1s_c_d_dd
vectorLength = 39
sparseForm = false
numberFftPoints = 512
numberFilters = 40
gaussians = 16
minimumFrequency = 130
maximumFrequency = 6800
sampleRate = 16000
dataLocation = ci_continuous_16gau
modelDefinition = etc/SYLLABLE_clean_13dCep_16k_40mel_130Hz_6800Hz.ci.mdef
OK, sir. I will do it by tomorrow. Thanks for your quick reply.
Hello
I checked this. Indeed, Sphinx-4 is currently not efficient enough to handle 1000+ CI units. In particular, edu.cmu.sphinx.linguist.util.HMMPool needs modifications to work with your model:
unitTable = new Unit[numCIUnits * numCIUnits * numCIUnits];
Here it tries to allocate a very large table without much gain for you. I think it could be reworked to use a hash table or something similar, but that requires some coding.
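One possible shape for that rework, sketched here with a plain HashMap and a placeholder Object standing in for the real Sphinx-4 Unit class (an illustration of the idea, not the actual HMMPool patch):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: replace HMMPool's dense unitTable with a sparse map.
// Only the (base, left, right) triples that actually occur are stored,
// instead of allocating numCIUnits^3 array slots up front.
public class SparseUnitTable {
    private final Map<Long, Object> table = new HashMap<>();
    private final int numCIUnits;

    public SparseUnitTable(int numCIUnits) {
        this.numCIUnits = numCIUnits;
    }

    // Same index arithmetic as the dense array, used as a map key.
    private long key(int base, int left, int right) {
        return ((long) base * numCIUnits + left) * numCIUnits + right;
    }

    public void put(int base, int left, int right, Object unit) {
        table.put(key(base, left, right), unit);
    }

    public Object get(int base, int left, int right) {
        return table.get(key(base, left, right)); // null if the triple was never stored
    }

    public int size() {
        return table.size();
    }
}
```

Memory then grows with the number of unit triples actually inserted rather than with numCIUnits³ — for 1120 units, that is the difference between a few thousand map entries and 1.4 billion array slots.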
Thank you very much for your quick reply. I will try to understand the module. If you come across a fix for this, please post the solution here; I will also try to understand it and come up with a workaround.