
A question about VoxForge Dataset

2011-02-22
2012-09-22
  • Vassil Panayotov

    Hi,

    I want to experiment with the data collected by the VoxForge project.
    Some of the recordings, however, seem to be very quiet or noisy.
    Is there a DSP tool (preferably command-line, so I can use it in
    a script) that can do a crude filtering of the utterances that are
    too quiet? I don't know much about DSP, but I guess some
    crude heuristics could be used, maybe something like the average energy
    in the central part of the utterance.
    Are there more sophisticated tools available to assess recording quality
    in terms of noise, sound level and signal-to-noise ratio?

    Thank you!

     
  • Nickolay V. Shmyrev

    There is no ready-to-use tool to do that. You can, for example, output an SNR
    estimate from the sphinx4 VAD, and you can do many other things with various
    external tools. It would be nice to include database cleanup infrastructure
    in the core sphinxtrain process.

    However, my experience with database training shows that it's better to keep
    noisy and even incorrect prompts in the database. It gives better accuracy
    after all. It's counter-intuitive, but it has been confirmed many times. So
    you need to be careful about filtering.
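
    For reference, the crude heuristic from the question (average energy in
    the central part of the utterance) could be sketched roughly as below.
    This is only an illustration, not an existing VoxForge or Sphinx tool: it
    assumes 16-bit PCM WAV input, and the function names, the 50% central
    window and the -40 dBFS threshold are all made-up values that would need
    tuning on real data.

```python
import array
import math
import sys
import wave


def read_pcm16(path):
    """Read a 16-bit PCM WAV file and return its samples as a list of ints.

    Multi-channel audio is left interleaved, which is good enough for a
    crude level estimate.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return list(samples)


def central_rms_dbfs(samples, central_fraction=0.5):
    """Return the RMS level (dB relative to 16-bit full scale) of the
    central part of an utterance; -inf for empty or all-zero input."""
    if not samples:
        return float("-inf")
    n = len(samples)
    keep = max(1, int(n * central_fraction))
    start = (n - keep) // 2
    middle = samples[start:start + keep]
    mean_square = sum(s * s for s in middle) / len(middle)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / (32768.0 ** 2))


def is_too_quiet(path, threshold_dbfs=-40.0):
    """Hypothetical filter: flag files whose central RMS is below threshold."""
    return central_rms_dbfs(read_pcm16(path)) < threshold_dbfs


if __name__ == "__main__":
    # Usage: python quiet_check.py file1.wav file2.wav ...
    for wav_path in sys.argv[1:]:
        print(wav_path, "quiet" if is_too_quiet(wav_path) else "ok")
```

    Looking only at the central part skips the leading and trailing silence
    that most prompts have. Given the caution above, it is probably safer to
    use something like this to list suspicious files for inspection than to
    drop them automatically.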

     
  • Vassil Panayotov

    Now that I think about the issue again, I realize that you are right.
    The problematic recordings may improve the generalization power
    and robustness of the model in real-world conditions.
    Thanks!

     
