#273 small bugs in Sphinxtrain/bw with incorrect dict

Kris Thielemans


Below is a small fix for a segmentation fault that occurs when a word in the training list isn't in the dictionary: E_WARN tries to print n_word (which is an int) with a %s format. With this fix, the user gets the warning and the program continues. However, maybe the code should actually abort in such a case? I'm not sure...

--- src/programs/bw/next_utt_states.c (revision 11227)
+++ src/programs/bw/next_utt_states.c (working copy)
@@ -76,7 +76,7 @@

 phone = mk_phone_list(&btw_mark, &n_phone, word, n_word, lex);
 if (phone == NULL) {
-    E_WARN("Unable to produce phonetic transcription for the word '%s'\n", n_word);
+    E_WARN("Unable to produce phonetic transcription for the word '%s'\n", word);
    return NULL;
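
As a minimal sketch of why the original call crashes: a %s conversion expects a pointer to a NUL-terminated string, so handing it an int is undefined behavior and the integer value ends up being dereferenced as an address. The helper below is hypothetical (it is not part of SphinxTrain) and assumes that whatever is passed for the word is a plain char *; it only illustrates a matching format/argument pair.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a SphinxTrain function: formats the same
 * warning text into a caller-supplied buffer. */
static int format_missing_word_warning(char *buf, size_t size, const char *word)
{
    assert(buf != NULL && word != NULL);

    /* %s consumes a char *. Passing an int here instead (as the old call
     * did with n_word) is undefined behavior and typically segfaults,
     * because the integer is treated as a string address. */
    return snprintf(buf, size,
                    "Unable to produce phonetic transcription for the word '%s'\n",
                    word);
}
```

Building with gcc or clang and -Wall (which enables -Wformat) flags this kind of mismatch at compile time for printf-style functions, and for wrapper macros as long as the underlying function is declared with the format attribute.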

There is another bug when the (noise) dictionary uses an unknown phone (I hit this when I used a noisedict with +UM+ etc. while the original model didn't have these). At the end of running bw, there's a segmentation fault in lexicon_free. Valgrind reports:

==10206== Invalid read of size 8
==10206== at 0x412FA9: lexicon_free (lexicon.c:279)
==10206== by 0x40DA78: main (main.c:1913)
==10206== Address 0xd7af150 is 0 bytes inside a block of size 40 free'd
==10206== at 0x4C282ED: free (vg_replace_malloc.c:366)
==10206== by 0x412FCB: lexicon_free (lexicon.c:282)
==10206== by 0x40DA78: main (main.c:1913)

Apparently the offending word was still added to the lexicon (as it's found by the iterator), but ckd_free(entry->ortho); crashes. However, I'm not sure how to fix this (as lexicon.c line 210 seems fine).
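
One pattern that would give a valgrind trace like the one above is an entry that is released on an error path while it is still reachable from the container, so the cleanup loop visits freed memory. Whether lexicon.c actually does this is not confirmed here. The sketch below uses a toy linked-list lexicon (all names are hypothetical, none come from SphinxTrain) to show one way to avoid the situation: validate the pronunciation before the entry is linked in, so a rejected word never has to be undone.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for illustration only; not the SphinxTrain types. */
typedef struct toy_entry {
    char *ortho;              /* orthographic form of the word */
    struct toy_entry *next;
} toy_entry_t;

typedef struct {
    toy_entry_t *head;
    size_t n_entries;
} toy_lexicon_t;

/* Adds a word. phones_ok is an assumed flag standing in for "every phone
 * of the pronunciation is known to the model". Returns 0 on success and
 * -1 if the word is rejected or allocation fails. */
static int toy_lexicon_add(toy_lexicon_t *lex, const char *ortho, int phones_ok)
{
    toy_entry_t *e;

    assert(lex != NULL && ortho != NULL);

    /* Reject before allocating or linking: nothing is left to clean up. */
    if (!phones_ok)
        return -1;

    e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->ortho = malloc(strlen(ortho) + 1);
    if (e->ortho == NULL) {
        free(e);              /* e was never linked, so this is safe */
        return -1;
    }
    strcpy(e->ortho, ortho);

    e->next = lex->head;
    lex->head = e;
    lex->n_entries++;
    return 0;
}

/* Frees every entry exactly once. */
static void toy_lexicon_free(toy_lexicon_t *lex)
{
    toy_entry_t *e = lex->head;

    while (e != NULL) {
        toy_entry_t *next = e->next;  /* read the link before freeing e */
        free(e->ortho);
        free(e);
        e = next;
    }
    lex->head = NULL;
    lex->n_entries = 0;
}
```

With this ordering, adding a word such as +UM+ with unknown phones returns -1 and leaves the container untouched, so the final cleanup only ever sees fully constructed entries.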



  • This has been fixed now