Correct options to send sphinx's mfc file into LIUM

Forum: Help

Creator: Emil Lundh

Created: 2016-04-26

Updated: 2016-04-28

Emil Lundh - 2016-04-26

I have an .mfc file produced by sphinx_fe. As the next step, I want to segment the audio with LIUM SpkDiarization, using this .mfc file as input.
Setting the LIUM option --fInputDesc sphinx doesn't quite help, because I get the following error:

08:01.088 MDecode FINE | decoder.accumulation starting at 0 to 895 {make() / 10}
08:01.094 Diarization SEVERE| Exception error {run() / 10}
java.lang.ArrayIndexOutOfBoundsException: -1

after a few quantities have evaluated to NaN. Full output here: http://pastebin.com/Tp1EK9EZ
I suspect that I have provided the wrong parameters. The parameter list is long (the default seems to be audio2sphinx,1:1:0:0:0:0,13,0:0:0:0) and I have no idea what all of these should be for a feature file from sphinx_fe. Does anybody know?
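
For orientation, the default string above splits into four comma-separated groups: a type name, six colon-separated values, a single integer, and four more colon-separated values. The following is a minimal sketch that only separates those groups. `FeatureDesc`, `parse_feature_desc`, and the field names are illustrative labels, not LIUM identifiers, and the meaning of each individual value is not assumed here; it should be checked against the LIUM parameter page.

```python
from typing import NamedTuple, Tuple


class FeatureDesc(NamedTuple):
    """Hypothetical container; field names are labels, not LIUM's own."""
    kind: str                 # first group, e.g. "sphinx" or "audio2sphinx"
    flags: Tuple[int, ...]    # second group: six colon-separated values
    num_features: int         # third group: a single integer (13 by default)
    tail: Tuple[int, ...]     # fourth group: four colon-separated values


def parse_feature_desc(text: str) -> FeatureDesc:
    """Split a descriptor string into its four comma-separated groups."""
    parts = text.split(",")
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated groups, got {len(parts)}")
    kind, flags, num_features, tail = parts
    flag_values = tuple(int(v) for v in flags.split(":"))
    tail_values = tuple(int(v) for v in tail.split(":"))
    if len(flag_values) != 6 or len(tail_values) != 4:
        raise ValueError("expected 6 values in group 2 and 4 values in group 4")
    return FeatureDesc(kind, flag_values, int(num_features), tail_values)


desc = parse_feature_desc("audio2sphinx,1:1:0:0:0:0,13,0:0:0:0")
print(desc)
```

Laying the groups out this way makes it easier to change one value at a time while experimenting, rather than editing the raw string by hand.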
- Nickolay V. Shmyrev - 2016-04-26
  
  I believe audio2sphinx is for PCM input. As described here:
  
  http://www-lium.univ-lemans.fr/diarization/doku.php/commun_parameter
  
  it probably should be just "sphinx".
  

Emil Lundh - 2016-04-26

Sorry for the confusion; I have indeed set --fInputDesc sphinx. (See the link above for the exact command line I used.) But there are 11 other parameters to set there, and I have no idea what to choose. The linked page says they correspond to energy, delta, number of features, and so on. Any idea which ones are important? If I leave them at their defaults, I get the exception cited above.
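
One thing that can be checked independently of LIUM is whether the .mfc file itself contains what the descriptor promises. Below is a minimal sketch, assuming the usual Sphinx feature-file layout (a 4-byte integer giving the number of 32-bit floats, followed by that many floats) and 13 values per frame as in the default descriptor. `inspect_mfc` is a hypothetical helper, not part of sphinx_fe or LIUM, and the byte order is guessed by comparing the header against the file size.

```python
import os
import struct
import tempfile


def inspect_mfc(path, num_ceps=13):
    """Return (byte_order, n_values, n_frames) for a Sphinx-style feature file.

    Assumed layout: one 4-byte integer holding the number of 32-bit floats,
    followed by exactly that many floats.
    """
    payload_bytes = os.path.getsize(path) - 4
    if payload_bytes < 0 or payload_bytes % 4:
        raise ValueError("file size does not fit the assumed layout")
    expected = payload_bytes // 4
    with open(path, "rb") as f:
        header = f.read(4)
    # Try both byte orders; the right one makes the header match the size.
    for order, name in ((">", "big"), ("<", "little")):
        (count,) = struct.unpack(order + "i", header)
        if count == expected:
            if count % num_ceps:
                raise ValueError(
                    f"{count} values is not a multiple of {num_ceps}")
            return name, count, count // num_ceps
    raise ValueError("header does not match file size in either byte order")


# Demo on a synthetic file: 2 frames of 13 zeros, written big-endian.
values = [0.0] * 26
with tempfile.NamedTemporaryFile(suffix=".mfc", delete=False) as tmp:
    tmp.write(struct.pack(">i", len(values)))
    tmp.write(struct.pack(">26f", *values))
print(inspect_mfc(tmp.name))
os.remove(tmp.name)
```

If the value count is not a multiple of the expected frame size, or the header matches in neither byte order, the file is probably not what the descriptor says it is, which would be worth ruling out before tuning the remaining parameters.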

Last edit: Emil Lundh 2016-04-26


Emil Lundh - 2016-04-27

To clarify: setting --fInputDesc sphinx on its own does NOT help.

Presumably some further settings have to change as well. The question is which ones.

- Nickolay V. Shmyrev - 2016-04-28
  
  Hm, I think the right option is not simply sphinx but something like
  
  sphinx,1:1:0:0:0:0,13,0:0:0:0
  
  Overall, I'm quite unhappy with LIUM; you'd be better off writing code from scratch than studying it. It is extremely hard to configure LIUM properly.
  

Emil Lundh - 2016-04-28

Thanks, Nickolay. I guess I'll just keep experimenting. After all, I'm not really up to writing my own speech segmentation tool! (Were there plans for writing a segmentation library within CMU Sphinx?)
I could also redesign my workflow to give LIUM a wav file, but that feels suboptimal for a number of (internal) reasons...
