From: Daniel P. <dp...@gm...> - 2014-02-26 19:03:59
Hi,

> I am using the Switchboard recipe to force-align some audio files for
> which I have the transcriptions. I also have the phoneme transcriptions of
> all the words.
> I am doing it using the scripts in steps/align_si.sh
> and steps/get_train_ctm.sh
> Whenever one word is not found in the Switchboard lexicon it aligns an OOV
> word instead (i.e. <unk>). Is there a way to tell kaldi at alignment time
> the transcription of these words?

I think what you mean is, "is there a way to give Kaldi at alignment time
the lexicon entry for these words?".
The easiest way to do this is to create a new "lang" directory that has a
larger lexicon, including the new words, and provide this directory to the
script that does the alignment. You can use the prepare_lang.sh script for
this; just give it an input directory that has a larger lexicon.txt or
lexiconp.txt. Make sure after you do this that the phones.txt files are
identical in the old and new directories, except possibly for extra
disambiguation symbols (#1, #2, etc.).

Dan

> Thanks in advance,
>
> Xavi Anguera
>
> _______________________________________________
> Kaldi-developers mailing list
> Kal...@li...
> https://lists.sourceforge.net/lists/listinfo/kaldi-developers
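
The two manual steps described above (enlarging the lexicon, then checking that phones.txt did not change) can be sketched in Python. This is only an illustration, not part of Kaldi: the function names merge_lexicon, read_phones and phones_compatible are hypothetical, and the sketch assumes lexicon.txt lines of the form "word phone1 phone2 ..." and phones.txt lines of the form "phone integer-id", with disambiguation symbols written as "#" followed by digits.

```python
import re

# Disambiguation symbols in phones.txt look like "#0", "#1", "#2", ...
DISAMBIG = re.compile(r"^#\d+$")


def merge_lexicon(base_lines, extra_lines):
    """Return sorted, de-duplicated lexicon lines from both sources.

    Each line is assumed to be "word phone1 phone2 ..." (hypothetical helper,
    not a Kaldi function). Lines with fewer than two fields are skipped.
    """
    entries = set()
    for line in list(base_lines) + list(extra_lines):
        parts = line.split()
        if len(parts) >= 2:  # a word plus at least one more field
            entries.add(" ".join(parts))
    return sorted(entries)


def read_phones(lines):
    """Parse "phone id" lines into a dict, leaving out disambiguation symbols."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2 and not DISAMBIG.match(parts[0]):
            table[parts[0]] = int(parts[1])
    return table


def phones_compatible(old_lines, new_lines):
    """True if both phone tables agree once disambiguation symbols are ignored."""
    return read_phones(old_lines) == read_phones(new_lines)
```

In practice one would read the old lexicon.txt and the extra entries from files, write the merged lines into a new dictionary directory, run prepare_lang.sh on that directory, and then pass the old and new phones.txt contents to phones_compatible before aligning. In the standard recipes the script is typically invoked as utils/prepare_lang.sh <dict-dir> "<unk>" <tmp-dir> <new-lang-dir>, but check the usage message of your own copy.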