Help in converting transcription files

Help
narayana
2007-08-30
2012-09-22
  • narayana

    narayana - 2007-08-30

    Hi
    I am using the Aurora database, and I want to set up the Sphinx trainer and the Sphinx decoder for it.

    I have checked the transcription files in that database; they are in HTK format, as follows.

    #!MLF!#
    "/FAC_13A.lab"
    sil
    one
    three
    sil
    .
    "/FAC_1473533A.lab"
    sil
    one
    four
    seven
    three
    five
    three
    three
    sil
    .
    "/FAC_172A.lab"
    sil
    one
    seven
    two
    sil
    .
    "/FAC_1911446A.lab"
    sil
    one
    nine
    one
    one
    four
    four
    six
    sil
    .
    Can anyone help me convert these transcription files to Sphinx format? Is there an existing command or program for it?

    Please help me. As I am new to Sphinx, I am facing lots of problems.

    Thanks
    Narayana

     
    • Nickolay V. Shmyrev

      On Linux you can use something like:

      cat a.mlf | grep -v MLF | tr -d '\n' | sed -e 's/.lab//g;s:"*/::g;s:"::g;' | tr '.' '\n' |
      awk '{for(i=2;i<=NF;i++)printf("%s ", $i);printf ("(%s)\n", $1)}'

      Otherwise, a simple script or C program can do that.

       
    • narayana

      narayana - 2007-08-31

      Thanks for the help.

      There is a small correction in the script.

      cat a.mlf | grep -v MLF | tr '\n' ' '| sed -e 's/.lab//g;s:"*/::g;s:"::g;' | tr '.' '\n' |
      awk '{for(i=2;i<=NF;i++)printf("%s ", $i);printf ("(%s)\n", $1)}'

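      The change is in the first tr command: it now replaces each newline with a space instead of deleting it, so that adjacent words are not run together.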

      --Narayana.
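For readers who prefer the "simple script" route mentioned earlier in the thread, here is a minimal sketch of the same conversion as a single awk program wrapped in a shell function. It is not taken from the posts above: the function name mlf_to_sphinx is hypothetical, and the sketch assumes that each label file name sits on one quoted line, that each label line holds a single word (as in the sample in the question), and that a lone "." ends an utterance. It keeps the "sil" labels, as the pipeline in the thread does.

```shell
#!/bin/sh
# Sketch: convert an HTK master label file (MLF) into one transcript line
# per utterance, in the form "word word ... (utterance_id)".
# mlf_to_sphinx is a hypothetical helper name, not part of Sphinx or HTK.
mlf_to_sphinx() {
    awk '
        { sub(/^[ \t]+/, ""); sub(/[ \t\r]+$/, "") }  # trim whitespace and CR
        $0 == "#!MLF!#" { next }                       # skip the MLF header
        /^"/ {                                         # quoted label file name
            id = $0
            gsub(/"/, "", id)                          # drop the quotes
            sub(/^.*\//, "", id)                       # drop any directory part
            sub(/\.lab$/, "", id)                      # drop the .lab extension
            words = ""
            next
        }
        $0 == "." {                                    # a lone dot ends the utterance
            if (id != "") printf "%s(%s)\n", words, id
            id = ""
            next
        }
        NF { words = words $1 " " }                    # assumes one word per label line
    ' "$@"
}
```

A possible invocation, with the output file name chosen only for illustration, is `mlf_to_sphinx a.mlf > a.transcription`. Unlike the one-line pipeline, this version escapes the dot in ".lab" and only emits a line when an utterance name has been seen, so trailing whitespace at the end of the file should not produce an empty "()" entry.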

       
