I would like to train a UBM GMM and then speaker-specific GMMs for speaker diarization using the LIUM Diarization toolkit. I have been following the documentation on the LIUM website (http://www-lium.univ-lemans.fr/diarization/doku.php/gaussian_gmm_training), but I am having trouble understanding how to point to multiple feature files in the training process.
I am using the following command:
java -Xmx2048m -cp ../lium_spkdiarization-8.4.1.jar \
fr.lium.spkDiarization.programs.MTrainInit --nbComp=8 --kind=DIAG --emInitMethod=split_all --emCtrl=1,5,0.05 \
--sInputMask="./%s.seg" \
--fInputMask="./%s.mfcc" --fInputDesc="sphinx,1:1:0:0:0:0,13,0:0:0:0" \
--tOutputMask="./%s.init.gmms" "test"
The howto page describes how the segmentation file (.seg) can contain segmentations of multiple files. But how do I specify the different feature files (.mfcc) corresponding to the different files named in the segmentation file?
Thank you
Use Kaldi diarization.
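On the original question about the masks: my understanding, which I have not checked against the 8.4.1 source, is that the first field of each line in a .seg file is the show name, and that the %s in --fInputMask is filled in with that show name for each segment. If that assumption holds, a single .seg file that names several shows would pull in one .mfcc file per show. Here is a rough Python sketch of that expansion. The helper names (expand_mask, shows_in_seg) and the sample segment rows are made up for illustration and are not part of the toolkit.

```python
def expand_mask(mask: str, show: str) -> str:
    """Fill the %s placeholder with a show name (assumed toolkit behavior)."""
    return mask.replace("%s", show)


def shows_in_seg(seg_text: str) -> list[str]:
    """Return the unique show names (first field) in order of appearance.

    Lines starting with ';;' are treated as comments and skipped.
    """
    shows: list[str] = []
    for line in seg_text.splitlines():
        line = line.strip()
        if not line or line.startswith(";;"):
            continue
        show = line.split()[0]
        if show not in shows:
            shows.append(show)
    return shows


# Hypothetical segmentation content covering two recordings.
SEG = """\
;; cluster S0
rec1 1 0 500 U U U S0
rec2 1 0 300 U U U S0
rec1 1 600 200 U U U S1
"""

# One feature file per show named in the segmentation.
feature_files = [expand_mask("./%s.mfcc", s) for s in shows_in_seg(SEG)]
print(feature_files)  # ['./rec1.mfcc', './rec2.mfcc']
```

Under the same assumption, the trailing "test" argument in the command above would only select which segmentation file is read (./test.seg), while the feature files would be chosen by the show names listed inside it.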