Help training UBM with Lium Diarization

Forum: Help

Created: 2017-10-26

Updated: 2018-01-25

chen - 2017-10-26

Hi,

I would like to train a UBM GMM and then speaker-specific GMMs for speaker diarization using the LIUM Diarization toolkit. I have been following the documentation on the LIUM website (http://www-lium.univ-lemans.fr/diarization/doku.php/gaussian_gmm_training), but I am having trouble understanding how to point to multiple feature files in the training process.

I am using the following command:
java -Xmx2048m -cp ../lium_spkdiarization-8.4.1.jar \
fr.lium.spkDiarization.programs.MTrainInit --nbComp=8 --kind=DIAG --emInitMethod=split_all --emCtrl=1,5,0.05 \
--sInputMask="./%s.seg" \
--fInputMask="./%s.mfcc" --fInputDesc="sphinx,1:1:0:0:0:0,13,0:0:0:0" \
--tOutputMask="./%s.init.gmms" "test"

The howto page describes how the segmentation file (.seg) can contain segmentations of multiple files. But how do I specify the different feature files (.mfcc) corresponding to the different files named in the segmentation file?
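
To make the question concrete, here is a rough sketch of the mapping I have in mind. It assumes, without confirmation from the howto page, that the toolkit replaces `%s` in `--fInputMask` with each show name found in the first column of the segmentation file, and that lines starting with `;;` are comments. The function name `expand_mask` is only a hypothetical helper for illustration, not part of the toolkit.

```shell
# Print the feature file path that a "%s" mask would resolve to for every
# show named in a segmentation file.
# Assumptions (not confirmed by the toolkit documentation quoted here):
#   - column 1 of each non-comment line is the show name
#   - lines starting with ";;" are comments
#   - "%s" in the mask is replaced by the show name
expand_mask() {
    seg_file=$1
    mask=$2
    awk -v mask="$mask" '
        # Skip comment lines and blank lines.
        !/^;;/ && NF {
            # Report each show only once, in order of first appearance.
            if (!seen[$1]++) {
                path = mask
                sub(/%s/, $1, path)
                print path
            }
        }
    ' "$seg_file"
}
```

Calling `expand_mask ./test.seg "./%s.mfcc"` would then list one `.mfcc` path per show, which could be checked against the files actually on disk before starting training.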

Thank you

- Nickolay V. Shmyrev - 2018-01-25
  
  Use Kaldi diarization.
  