Sphinxtrain:Phone Not Occur in Transcription

Help
Anonymous
2010-09-15
2012-09-22
  • Anonymous

    Anonymous - 2010-09-15

    Hello everyone,

    I've been working on training an acoustic model from the TIMIT corpus. I have
    followed the instructions on the CMU Sphinx wiki page about building
    an acoustic model. I have reached the point where I have all the
    necessary files and have started running RunAll.pl.

    However, I receive warnings similar to the one below:

    Phase 7: TRANSCRIPT - Checking that all the phones in the transcript are in
    the phonelist, and all phones in the phonelist appear at least once
    WARNING: This phone (AA) occurs in the phonelist
    (/home/frostshoxx/Desktop/tutorial/myTIMIT/etc/myTIMIT.phone), but not in any
    word in the transcription
    (/home/frostshoxx/Desktop/tutorial/myTIMIT/etc/myTIMIT_train.transcription)

    There are similar warnings for other phones as well. I double-checked the
    dictionary file (myTIMIT.dic) and myTIMIT_train.transcription, and I'm pretty
    sure that these phones do appear in my transcription and dictionary.

    For example,
    in the dictionary:

    ACCOMPLISHED AH K AA M P L IH SH T

    in the transcription:

    AMBIDEXTROUS PICKPOCKETS ACCOMPLISH MORE (TRAIN/DR2/MARC0/SX378)

    The dictionary file I used comes from the lmtool web service on the CMU site,
    which I found through the wiki page about building a language model. I noticed
    that the spacing in the file is inconsistent, as shown below:

    ABOUT AH B AW T
    ABOVE AH B AH V
    ABRUPTLY AH B R AH P T L IY
    ABSENCES, AE B S AH N S IH Z

    Does it matter if the number of spaces between a word and its phones differs
    from one entry to another?

    Thank you in advance for the help. I appreciate your patience in guiding a
    beginner like me.

    Regards,

     
  • Nickolay V. Shmyrev

    Does it matter if the number of spaces between a word and its phones differs
    from one entry to another?

    It depends on the SphinxTrain version you are using. It shouldn't matter in
    the latest snapshot.

    Anyway, this is very easy to rule out: make the spacing consistent. And it's
    not just about the dictionary. For example, your transcription file seems to
    have double spaces, which is not a good thing.
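
    As a rough illustration of the cleanup suggested above, the sketch below
    collapses runs of whitespace and then repeats a check similar in spirit to
    the Phase 7 warning. It is not SphinxTrain's own code. The function names
    are made up for this example, and it assumes one dictionary entry per line
    (word first, then phones) and transcript lines that end with an utterance
    id in parentheses.

    ```python
    import re


    def normalize_line(line):
        """Collapse any run of spaces or tabs into one space and trim the ends."""
        return " ".join(line.split())


    def parse_dictionary(lines):
        """Map each word to its list of phones (assumed layout: WORD PH1 PH2 ...)."""
        entries = {}
        for line in lines:
            fields = line.split()
            if len(fields) >= 2:
                entries[fields[0]] = fields[1:]
        return entries


    def phones_not_in_transcript(phones, dictionary, transcript_lines):
        """Return the phones that are not used by any word in the transcript."""
        used = set()
        for line in transcript_lines:
            # Drop the trailing "(utterance-id)" before splitting into words.
            text = re.sub(r"\([^()]*\)\s*$", "", line)
            for word in text.split():
                # Words missing from the dictionary contribute no phones.
                used.update(dictionary.get(word, []))
        return [p for p in phones if p not in used]


    if __name__ == "__main__":
        dictionary = parse_dictionary([
            "ABOUT    AH B AW T",
            "ACCOMPLISHED  AH K AA M P L IH SH T",
        ])
        transcript = ["ABOUT  ACCOMPLISH (TRAIN/DR2/MARC0/SX378)"]
        print(normalize_line(transcript[0]))
        print(phones_not_in_transcript(["AA", "AH"], dictionary, transcript))
    ```

    In this toy input a phone is reported as missing whenever the only
    dictionary word that contains it does not literally appear in the
    transcript, so comparing the exact word forms in both files is worth doing
    alongside the spacing cleanup.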

     
