Sphinx 4 New acoustic Model Creation and training

Help
2015-11-26
2015-12-09
  • Suranga Premakumara

    I have gone through the tutorial at http://cmusphinx.sourceforge.net/wiki/tutorialam and tried to create an acoustic model.

    After I run "sphinxtrain run" in a Linux terminal, I get the following output:

    suranga@ubuntu:~/Downloads/acoustic_model$ cd an4
    suranga@ubuntu:~/Downloads/acoustic_model/an4$ sphinxtrain run
    Sphinxtrain path: /usr/local/lib/sphinxtrain
    Sphinxtrain binaries path: /usr/local/libexec/sphinxtrain
    Running the training
    MODULE: 000 Computing feature from audio files
    Extracting features from  segments starting at  (part 1 of 1) 
    Extracting features from  segments starting at  (part 1 of 1) 
    Feature extraction is done
    MODULE: 00 verify training files
        Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
            Found 70098 words using 42 phones
    WARNING: This phone (on) occurs in the dictionary (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic), but not in the phonelist (/home/suranga/Downloads/acoustic_model/an4/etc/an4.phone)
    WARNING: This phone (cn jh) occurs in the phonelist (/home/suranga/Downloads/acoustic_model/an4/etc/an4.phone), but not in the dictionary (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
        Phase 2: Checking to make sure there are not duplicate entries in the dictionary
    WARNING: This word (නීතා) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (දිනයක්) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ඕනා) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (උලෙළ) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (කිරිල්ලී) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ලිබ්බොක්කා) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ජනකතා) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (වැලි) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (නොරූස්) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (දෙමාපිය) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (දූපතේදී) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word () has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ධාරා) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (කනිෂ්ක) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (පිදීම) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (යාම) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (රසවිත) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (නැහැ) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (පාටා) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (රාළහාමි) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ගංවතුර) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (දිනය) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (දුස්රා) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ගී) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (අම්මා) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (පුස්තකාලය) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ඒම්) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (යයි) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ඩෙලීවින්) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ක්ලිනික්) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (කැල්ටෙක්ස්) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (කැරිණ) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ඉන්දියා) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (මව) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (මිථ්‍යාදෘෂ්ටික) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ජෙට්වින්) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (මිට්සුයි) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (පියවර) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ගීය) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (කැළුම්) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ව්‍යාපෘතියට) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (චක්‍රවර්තී) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ඉඩම්%) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (දැ) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ඇය) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ඉඳුරා) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (පාරේ) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (යන්නැයි) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (කැදැල්ල) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word () has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (කබඩ්) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ප්‍රචණ්ඩ) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ව්‍යාපෘතිය) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (උදාන) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (දෛවය) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (අල්කයිඩා) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (දිනමිණ) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ඩී) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (මව්පිය) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ඕස්ට්‍රේලියා) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (අල්වා) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (මරණය) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (මෙහෙයුමක්) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ලන්ඩනයට) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (මන්ත්‍රීවරුන්) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ඇට්ලස්) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (නිදහස) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (බන්ධනයක්) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (පැහැති) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (කරපටි) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (බී) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (කයිසා) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ටිමොති) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (වර්ඩ්) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (ඔඩෙල්) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (චක්‍රවර්ති) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (සිට) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (වගයි) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
    WARNING: This word (කැබිලිත්ත) has duplicate entries in (/home/suranga/Downloads/acoustic_model/an4/etc/an4.dic)
        Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
    WARNING: Error in '/home/suranga/Downloads/acoustic_model/an4/etc/an4_train.fileids', the feature file '/home/suranga/Downloads/acoustic_model/an4/feat/User1/SentNum_1.mfc' does not exist, or is empty
        Phase 4: Checking number of lines in the transcript file should match lines in fileids file
        Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
            Estimated Total Hours Training: 0.0289025641025641
    ERROR: Not enough data for the training, we can only train CI models (set CFG_CD_TRAIN to "no")
        Phase 6: Checking that all the words in the transcript are in the dictionary
            Words in dictionary: 70095
            Words in filler dictionary: 3
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  මම ගෙදර යමි  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  මම ගෙදර යමි  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  මම ගෙදර යන්නෙමි  </S> ). Do cases match?
    WARNING: This word: යන්නෙමි was in the transcript file, but is not in the dictionary (<S>  මම ගෙදර යන්නෙමි  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  මම ගෙදර යන්නෙමි  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  අපි පාසලට ගියෙමු  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  අපි පාසලට ගියෙමු  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  සුභ දවසක් වේවා  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  සුභ දවසක් වේවා  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  ඔබට සුභ දවසක්  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  ඔබට සුභ දවසක්  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  රාජ්‍ය සේවය පිණිසයි  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  රාජ්‍ය සේවය පිණිසයි  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  උදෑසන ආහාරය සමබල අහාරයක් විය යුතුයි  </S> ). Do cases match?
    WARNING: This word: අහාරයක් was in the transcript file, but is not in the dictionary (<S>  උදෑසන ආහාරය සමබල අහාරයක් විය යුතුයි  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  උදෑසන ආහාරය සමබල අහාරයක් විය යුතුයි  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  කොළඹ විශ්ව විද්‍යාලීය පරිගණක අධ්‍යනායතනය  </S> ). Do cases match?
    WARNING: This word: අධ්‍යනායතනය was in the transcript file, but is not in the dictionary (<S>  කොළඹ විශ්ව විද්‍යාලීය පරිගණක අධ්‍යනායතනය  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  කොළඹ විශ්ව විද්‍යාලීය පරිගණක අධ්‍යනායතනය  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  අපේ මව් බිම රැකගත යුතුය  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  අපේ මව් බිම රැකගත යුතුය  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  ඔහු නීතිය අකුරටම ක්‍රියාත්මක කළේය  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  ඔහු නීතිය අකුරටම ක්‍රියාත්මක කළේය  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  අපි අපේ රට වෙනුවෙන් කැපවිය යුත්තෙමු  </S> ). Do cases match?
    WARNING: This word: කැපවිය was in the transcript file, but is not in the dictionary (<S>  අපි අපේ රට වෙනුවෙන් කැපවිය යුත්තෙමු  </S> ). Do cases match?
    WARNING: This word: යුත්තෙමු was in the transcript file, but is not in the dictionary (<S>  අපි අපේ රට වෙනුවෙන් කැපවිය යුත්තෙමු  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  අපි අපේ රට වෙනුවෙන් කැපවිය යුත්තෙමු  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  අපි ඔබගේ කාරුණික අවදානය අපේක්ෂා කරමු  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  අපි ඔබගේ කාරුණික අවදානය අපේක්ෂා කරමු  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  කරුණාකර පහත සඳහන් දුරකථන අංකයට අමතන්න  </S> ). Do cases match?
    WARNING: This word: අංකයට was in the transcript file, but is not in the dictionary (<S>  කරුණාකර පහත සඳහන් දුරකථන අංකයට අමතන්න  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  කරුණාකර පහත සඳහන් දුරකථන අංකයට අමතන්න  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  ආහාර හිඟකම නිසා බොහෝ දරුවන් මන්ද පෝෂණයෙන් පෙලෙති </S> ). Do cases match?
    WARNING: This word: පෙලෙති was in the transcript file, but is not in the dictionary (<S>  ආහාර හිඟකම නිසා බොහෝ දරුවන් මන්ද පෝෂණයෙන් පෙලෙති </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  ආහාර හිඟකම නිසා බොහෝ දරුවන් මන්ද පෝෂණයෙන් පෙලෙති </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  ඇය ස්ථාන කිහිපයක විමසා බැලුවාය  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  ඇය ස්ථාන කිහිපයක විමසා බැලුවාය  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  මදුරුවන් බෝවන ස්ථාන විනාශ කල යුතුය  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  මදුරුවන් බෝවන ස්ථාන විනාශ කල යුතුය  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  රෝග කාරක මදුරුවන් පැතිරීම වැළක්විය යුතුය  </S> ). Do cases match?
    WARNING: This word: වැළක්විය was in the transcript file, but is not in the dictionary (<S>  රෝග කාරක මදුරුවන් පැතිරීම වැළක්විය යුතුය  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  රෝග කාරක මදුරුවන් පැතිරීම වැළක්විය යුතුය  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  ඔව්හු ස්ථාන කිහිපයක පත්‍රිකා බෙදා දුන්හ  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  ඔව්හු ස්ථාන කිහිපයක පත්‍රිකා බෙදා දුන්හ  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  ඔබට ලගම ඇති අපේ කාර්යාලය අමතන්න  </S> ). Do cases match?
    WARNING: This word: ලගම was in the transcript file, but is not in the dictionary (<S>  ඔබට ලගම ඇති අපේ කාර්යාලය අමතන්න  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  ඔබට ලගම ඇති අපේ කාර්යාලය අමතන්න  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  ඔහුට පලායාම හැර වෙන විකල්පයක් නොතිබුනි  </S> ). Do cases match?
    WARNING: This word: නොතිබුනි was in the transcript file, but is not in the dictionary (<S>  ඔහුට පලායාම හැර වෙන විකල්පයක් නොතිබුනි  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  ඔහුට පලායාම හැර වෙන විකල්පයක් නොතිබුනි  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  දහ තුන සුබ ඉලක්කමක් ලෙස ඇතැම්හු සලකති  </S> ). Do cases match?
    WARNING: This word: ඉලක්කමක් was in the transcript file, but is not in the dictionary (<S>  දහ තුන සුබ ඉලක්කමක් ලෙස ඇතැම්හු සලකති  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  දහ තුන සුබ ඉලක්කමක් ලෙස ඇතැම්හු සලකති  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  මෙවැනි පිළිම දිගු කාලයක් පවතින්නේ නැත  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  මෙවැනි පිළිම දිගු කාලයක් පවතින්නේ නැත  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  එය නිදහසින් පසු ලැබූ දෙවැනි ඉහලම වර්ධනයයි  </S> ). Do cases match?
    WARNING: This word: ඉහලම was in the transcript file, but is not in the dictionary (<S>  එය නිදහසින් පසු ලැබූ දෙවැනි ඉහලම වර්ධනයයි  </S> ). Do cases match?
    WARNING: This word: වර්ධනයයි was in the transcript file, but is not in the dictionary (<S>  එය නිදහසින් පසු ලැබූ දෙවැනි ඉහලම වර්ධනයයි  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  එය නිදහසින් පසු ලැබූ දෙවැනි ඉහලම වර්ධනයයි  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  මුළු ලෝකයම තිබුනත්  අයට සතුටක් නැත  </S> ). Do cases match?
    WARNING: This word: තිබුනත් was in the transcript file, but is not in the dictionary (<S>  මුළු ලෝකයම තිබුනත්  අයට සතුටක් නැත  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  මුළු ලෝකයම තිබුනත්  අයට සතුටක් නැත  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  නෙවිල් වික්‍රමසිංහ නව නිර්මාණ කිහිපයක්ම අපට දායාද කර ඇත  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  නෙවිල් වික්‍රමසිංහ නව නිර්මාණ කිහිපයක්ම අපට දායාද කර ඇත  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  ඇය හිස් අහස් කුස දෙස බලමින් පවසන්නීය  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  ඇය හිස් අහස් කුස දෙස බලමින් පවසන්නීය  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  නෙවිල් මහතා තාක්ෂණික නිලධාරියෙකි  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  නෙවිල් මහතා තාක්ෂණික නිලධාරියෙකි  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  පසු ගිය දින කිහිපය තුල ඔහුගේ කුසට නිසි ආහාරයක් ලැබී නැත  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  පසු ගිය දින කිහිපය තුල ඔහුගේ කුසට නිසි ආහාරයක් ලැබී නැත  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  ඔහු ඉතා කැපවීමෙන් සේවය කරන හෙද නිලධාරියෙකි  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  ඔහු ඉතා කැපවීමෙන් සේවය කරන හෙද නිලධාරියෙකි  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  මීගමුවේ අංක එකේ හෝටලය ලෙස සැලකෙන මීගමුව බ්‍රව්න්ස් බීච් හෝටලය  </S> ). Do cases match?
    WARNING: This word: බ්‍රව්න්ස් was in the transcript file, but is not in the dictionary (<S>  මීගමුවේ අංක එකේ හෝටලය ලෙස සැලකෙන මීගමුව බ්‍රව්න්ස් බීච් හෝටලය  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  මීගමුවේ අංක එකේ හෝටලය ලෙස සැලකෙන මීගමුව බ්‍රව්න්ස් බීච් හෝටලය  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  පකිස්තානයේ තත්වයද මෙයට සමානය  </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  පකිස්තානයේ තත්වයද මෙයට සමානය  </S> ). Do cases match?
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  ඉන්දියාව මේ රටවලට  නායකත්වය දෙයි   </S> ). Do cases match?
    WARNING: This word: </S> was in the transcript file, but is not in the dictionary (<S>  ඉන්දියාව මේ රටවලට  නායකත්වය දෙයි   </S> ). Do cases match?
    Use of uninitialized value $_[0] in substitution (s///) at /usr/share/perl/5.18/File/Basename.pm line 341, <TRN> line 33.
    fileparse(): need a valid pathname at /usr/local/lib/sphinxtrain/scripts/00.verify/verify_all.pl line 352.
    suranga@ubuntu:~/Downloads/acoustic_model/an4$
    

    My problem is that this did not create the "model_parameters" and "model_architecture" folders. What is my mistake here?
    My current output folder structure is:
    https://drive.google.com/file/d/0B6VUxeXOnMe7MjRQLUs2djFJLWc/view?usp=sharing
    Thank you for your help.

     
    • Nickolay V. Shmyrev

      Warning messages explain your problems in plain English. You just need to fix them before you proceed.
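
The Phase 1 and Phase 2 warnings in the log above (phones that appear in only one of the dictionary and the phonelist, and words with duplicate entries) can also be reproduced outside the training script. The following is a minimal sketch, not SphinxTrain's own check. It assumes the usual dictionary layout of one entry per line, with the word first and its phones after it, and a phonelist with one phone per line. The function names are hypothetical.

```python
from collections import Counter


def parse_dictionary(lines):
    """Return (word, phones) pairs from dictionary lines, skipping blank lines."""
    entries = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        entries.append((parts[0], parts[1:]))
    return entries


def duplicate_words(entries):
    """Return the words that have more than one entry, sorted."""
    counts = Counter(word for word, _ in entries)
    return sorted(word for word, n in counts.items() if n > 1)


def phone_mismatches(entries, phonelist):
    """Return (phones only in the dictionary, phones only in the phonelist)."""
    used = {phone for _, phones in entries for phone in phones}
    listed = set(phonelist)
    return sorted(used - listed), sorted(listed - used)


if __name__ == "__main__":
    # Toy data for illustration only; read the real lines from etc/an4.dic
    # and etc/an4.phone with open(path, encoding="utf-8").
    entries = parse_dictionary(["a ah", "b b iy", "a ah", "", "c on"])
    print(duplicate_words(entries))
    print(phone_mismatches(entries, ["ah", "b", "iy", "jh"]))
```

When running this against real files, pass the lines of the main dictionary and the filler dictionary together, since phones such as SIL normally come from the filler dictionary.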

       
  • Suranga Premakumara

    I went through all the errors and warnings, but I am stuck at:
    Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
    WARNING: Error in '/home/suranga/Downloads/acoustic_model/an4/etc/an4_train.fileids', the feature file '/home/suranga/Downloads/acoustic_model/an4/feat/User1/SentNum_1.mfc' does not exist, or is empty

    I also went through a previous question, https://sourceforge.net/p/cmusphinx/discussion/help/thread/fdb9db19/, but my log files seem okay.

    Here is the link to my an4 folder:
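
The Phase 3 warning can be narrowed down by listing every fileid whose feature file is missing or empty. The sketch below is not part of SphinxTrain. It assumes, based on the path in the warning, that each line of the fileids file maps to a file named after that fileid with an .mfc extension under the feat directory. The function name and its parameters are hypothetical.

```python
from pathlib import Path


def missing_features(fileids, feat_dir, ext=".mfc"):
    """Return the fileids whose feature file is absent or has zero size."""
    bad = []
    for line in fileids:
        fileid = line.strip()
        if not fileid:
            continue
        path = Path(feat_dir) / (fileid + ext)
        if not path.is_file() or path.stat().st_size == 0:
            bad.append(fileid)
    return bad


if __name__ == "__main__":
    # Example paths taken from the log; adjust them to the local layout.
    with open("etc/an4_train.fileids", encoding="utf-8") as f:
        for fileid in missing_features(f, "feat"):
            print("missing or empty:", fileid)
```

If every fileid is reported, the feature extraction step most likely produced nothing, which would match the empty "Extracting features from  segments starting at" lines in the log.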
    https://drive.google.com/folderview?id=0B6VUxeXOnMe7ZmJRQzgzckQyRjA&usp=sharing

    And I still get all the warnings like:
    WARNING: This word: <S> was in the transcript file, but is not in the dictionary (<S>  මම ගෙදර යමි  </S> ). Do cases match?

    I went through the link below to find a solution:
    https://sourceforge.net/p/cmusphinx/discussion/help/thread/52d70a41/

    I have used UTF-8 encoding for all the text files that I created, and based on the solution above I think the issue may be with my dictionary, but I could not identify the issue.

    Thank you very much for your consideration and your time.

     
  • Nickolay V. Shmyrev

    All those warnings are valid; for example, <s> must be in lowercase.
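
The case issue mentioned above can be checked with a small script that compares every transcript token against the known words, case-sensitively. This is a rough sketch under stated assumptions, not the check that verify_all.pl performs. It assumes each transcript line ends with the utterance id in parentheses, and that the set of known words is built from the first column of the dictionary and the filler dictionary. The function names are hypothetical.

```python
import re


def transcript_words(line):
    """Split a transcript line into tokens, dropping a trailing (fileid)."""
    text = re.sub(r"\([^()]*\)\s*$", "", line)
    return text.split()


def missing_words(transcript_lines, known_words):
    """Return tokens absent from known_words, in order of first appearance."""
    missing = []
    for line in transcript_lines:
        for word in transcript_words(line):
            if word not in known_words and word not in missing:
                missing.append(word)
    return missing


if __name__ == "__main__":
    # Toy data for illustration; the comparison is case-sensitive,
    # so <S> does not match a filler entry written as <s>.
    known = {"<s>", "</s>", "<sil>", "hello"}
    print(missing_words(["<S> hello world </S> (file_1)"], known))
    print(missing_words(["<s> hello </s> (file_1)"], known))
```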

     
