Audio file length when building an AM

Help
Jake
2011-02-15
2012-09-22
  • Jake

    Jake - 2011-02-15

    Hi,

    I'm not sure whether this question has been asked before. I read the AM
    training tutorial on the CMUSphinx wiki. It says "Audio files shouldn't be
    very long and shouldn't be very short." What would happen (no problem,
    errors, or a hang) if I used a bunch of audio files that are each a few
    minutes long as training data? If this is a limitation of SphinxTrain, is
    audio segmentation the only solution? If so, what segmentation tool(s) do
    you recommend for cutting the long audio into the 5-30 second pieces that
    the wiki tutorial calls for? Thanks a lot.

     
  • Nickolay V. Shmyrev

    What would happen (no problem, errors, or a hang) if I used a bunch of
    audio files that are each a few minutes long as training data?

    A few minutes will work without an issue, but if a file is longer than
    ten minutes you'll get a buffer overflow. Other risks with long files are
    that the alignment does not reach the final state of the utterance, so the
    utterance is ignored, and that the model's accuracy is reduced because of
    misalignment.
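
    To see which files fall outside these limits before training, a minimal
    sketch like the following may help. It assumes the training audio is
    uncompressed PCM WAV and uses only Python's standard wave module. The
    function names and the labels are hypothetical, and the thresholds are
    just the figures mentioned in this thread (5-30 seconds from the wiki,
    ten minutes for the overflow), not values read from SphinxTrain.

```python
import wave

# Thresholds taken from this thread, not from SphinxTrain itself:
# the wiki tutorial suggests 5-30 s utterances, and files longer than
# ten minutes are reported to cause a buffer overflow.
MIN_SECONDS = 5.0
MAX_SECONDS = 30.0
OVERFLOW_SECONDS = 600.0


def wav_duration(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def classify_duration(seconds):
    """Label a duration (hypothetical labels, for illustration only)."""
    if seconds > OVERFLOW_SECONDS:
        return "too long: likely buffer overflow"
    if seconds > MAX_SECONDS:
        return "long: consider segmenting"
    if seconds < MIN_SECONDS:
        return "short"
    return "ok"


def check_files(paths):
    """Map each path to a (duration_in_seconds, label) pair."""
    report = {}
    for path in paths:
        seconds = wav_duration(path)
        report[path] = (seconds, classify_duration(seconds))
    return report
```

    Calling check_files() on the list of training WAV paths gives a quick
    overview of which recordings would need to be segmented first.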

    If this is a limitation of SphinxTrain, is audio segmentation the only
    solution?

    Yes

    If so, what segmentation tool(s) do you recommend for cutting the long
    audio into the 5-30 second pieces that the wiki tutorial calls for?
    Thanks a lot.

    You can train an AM on the long files, then use that model to segment the
    long files into shorter ones, and then retrain the model on the shorter
    files.
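
    The cutting step itself can be sketched as below, assuming the segment
    boundaries (start and end times in seconds) have already been obtained
    from the alignment. The split_wav function and its arguments are
    hypothetical and are not part of SphinxTrain; the sketch only shows how
    a PCM WAV file could be cut once the boundaries are known, using
    Python's standard wave module.

```python
import wave


def split_wav(src_path, boundaries, out_prefix):
    """Write one WAV file per (start, end) pair, both given in seconds.

    `boundaries` is assumed to come from an external alignment step;
    this function does not compute it.
    """
    out_paths = []
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        for index, (start, end) in enumerate(boundaries):
            # Convert times to frame offsets in the source file.
            first = int(start * rate)
            count = int(end * rate) - first
            src.setpos(first)
            frames = src.readframes(count)
            out_path = f"{out_prefix}_{index:04d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Same channels, sample width, and rate as the source;
                # the frame count in the header is corrected on write.
                dst.setparams(params)
                dst.writeframes(frames)
            out_paths.append(out_path)
    return out_paths
```

    For example, split_wav("long.wav", [(0.0, 12.5), (12.5, 31.0)], "seg")
    would write seg_0000.wav and seg_0001.wav. The matching transcript for
    each segment still has to be cut at the same boundaries.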

     
