SHoUT is a toolkit for performing research on large vocabulary continuous speech recognition (LVCSR). The toolkit contains applications for training statistical models and for speech/non-speech detection, speaker diarization and decoding.
- Large Vocabulary Continuous Speech Recognition
- Acoustic Model training
- Speaker adaptation (VTLN, CMN/CVN, SMAPLR)
- Speaker diarization
- Speech/non-speech classification
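
Of the items above, CMN/CVN (cepstral mean and variance normalization) is the simplest to show in a few lines. The following is a minimal sketch of the general technique, not SHoUT's own implementation: the function name `cmvn`, the `eps` threshold, and the list-of-lists feature layout are all assumptions made for illustration.

```python
import math


def cmvn(frames, eps=1e-10):
    """Hypothetical helper: per-dimension mean and variance normalization.

    frames: list of equal-length lists of floats, one list per feature frame.
    Returns new frames where each dimension has zero mean and, when its
    variance exceeds eps, unit variance.
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    # CMN step: mean of each feature dimension over all frames.
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    # CVN step: population variance of each dimension.
    variances = [sum((f[d] - means[d]) ** 2 for f in frames) / n
                 for d in range(dim)]
    # Constant dimensions are only mean-shifted, to avoid dividing by zero.
    stds = [math.sqrt(v) if v > eps else 1.0 for v in variances]
    return [[(f[d] - means[d]) / stds[d] for d in range(dim)]
            for f in frames]
```

In practice the statistics are usually computed per utterance or per speaker; this sketch simply normalizes over whatever frames it is given.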
I still don't know how to develop shout_segment. Can anyone tell me how? Many thanks.