forced-aligned transcripts

Jake
2011-05-13
2012-09-22
  • Jake

    Jake - 2011-05-13

    In general, words in a training transcript don't have an appended pronunciation
    variant marker (e.g., DID(2)). I've read some old notes online about creating
    force-aligned transcripts. Could you please verify that the following steps are
    correct and not missing anything?
    1. Build the CI models using the training transcript without pronunciation variants.
    2. Create the dictionary and filler dictionary used for forced alignment. The filler dictionary includes only <s>, </s>, and <sil>. Noise words such as ++AH++ are merged into the main dictionary.
    3. Run s3align to generate the force-aligned transcript.
    4. Use the force-aligned transcript to retrain, starting from the CI models and eventually building the CD models.
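
    Step 2 could be sketched roughly as below. This is only a minimal Python
    sketch, assuming the usual one-entry-per-line dictionary format (a word
    followed by its phones). The helper names and the sample entries, including
    the phone labels, are made up for illustration and are not taken from any
    Sphinx tool.

```python
# Hypothetical helpers for step 2: keep only silences in the filler
# dictionary and move noise words into the main dictionary.
SILENCE_WORDS = {"<s>", "</s>", "<sil>"}


def split_filler_dict(filler_lines):
    """Split filler-dictionary lines into (silence_entries, noise_entries)."""
    silences, noises = [], []
    for line in filler_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        word = line.split()[0]
        if word in SILENCE_WORDS:
            silences.append(line)
        else:
            noises.append(line)
    return silences, noises


def merge_into_dictionary(dict_lines, noise_entries):
    """Return the main dictionary with noise entries appended (no duplicates)."""
    merged = [line.strip() for line in dict_lines if line.strip()]
    seen = {line.split()[0] for line in merged}
    for entry in noise_entries:
        word = entry.split()[0]
        if word not in seen:
            merged.append(entry)
            seen.add(word)
    return merged


# Illustrative entries only; real phone sets and pronunciations will differ.
filler = ["<s> SIL", "</s> SIL", "<sil> SIL", "++AH++ +AH+", "++NOISE++ +NOISE+"]
main = ["DID D IH D", "DID(2) D IH JH"]

align_filler, noise = split_filler_dict(filler)
align_dict = merge_into_dictionary(main, noise)
```

    Here align_filler would hold only the three silence entries, and align_dict
    would hold the original words plus the noise words.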

    I realized that the force-aligned transcript will not only mark pronunciation
    variants but also insert noise words. How do I verify whether they are correct?
    Or do I just trust that it will work?

    Thanks for looking.

     
  • Nickolay V. Shmyrev

    Hi

    the following steps are correct and not missing anything?

    The steps are correct.

    I realized that the force-aligned transcript will not only mark pronunciation
    variants but also insert noise words.

    Because you left only silences in the filler dictionary, forced alignment will
    insert ONLY SILENCES. It will not insert other fillers.

    Or do I just trust that it will work?

    Yes, you need to trust it.
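
    For anyone who still wants a cheap sanity check, one option is to confirm
    that the aligner changed nothing except variant markers and silences. The
    sketch below is a minimal Python example. It assumes each transcript line
    is a sequence of words followed by the utterance id in parentheses, and
    that variants are marked with a numeric suffix such as (2). The function
    names and sample lines are hypothetical.

```python
import re

# Assumed conventions: variants look like WORD(2); silences are these tokens.
VARIANT_SUFFIX = re.compile(r"\(\d+\)$")
SILENCE_TOKENS = {"<s>", "</s>", "<sil>"}


def split_utterance(line):
    """Split 'WORD WORD ... (utt_id)' into (tokens, utt_id)."""
    tokens = line.split()
    utt_id = None
    # A trailing token wrapped entirely in parentheses is taken as the id.
    if tokens and re.fullmatch(r"\(\S+\)", tokens[-1]):
        utt_id = tokens.pop()[1:-1]
    return tokens, utt_id


def normalize(tokens):
    """Drop silence tokens and strip pronunciation-variant suffixes."""
    return [VARIANT_SUFFIX.sub("", t) for t in tokens if t not in SILENCE_TOKENS]


def alignment_matches(original_line, aligned_line):
    """True if both lines carry the same utterance id and word sequence."""
    orig_tokens, orig_id = split_utterance(original_line)
    aligned_tokens, aligned_id = split_utterance(aligned_line)
    return orig_id == aligned_id and normalize(orig_tokens) == normalize(aligned_tokens)


# Illustrative lines only.
original = "<s> DID YOU ++AH++ GO </s> (utt_001)"
aligned = "<s> DID(2) <sil> YOU ++AH++ GO </s> (utt_001)"
ok = alignment_matches(original, aligned)
```

    A check like this only confirms that the word sequence is unchanged. It
    says nothing about whether the chosen variants or inserted silences are
    acoustically right, so that part still has to be trusted.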

     
