CMU Sphinx / Forums / Speech Recognition Theory: LIUM toolkit

Yifang Xu - 2010-03-18

Does anyone know where to download the LIUM speech recognition toolkit?

Thank you,

Yannick Estève - 2010-03-18

Hi,

Major parts of the LIUM ASR are available in the LIUM branch of the cmusphinx
svn.
But at this time, there is no documentation :(
In fact, we are porting it into the official CMU Sphinx version. But we need
time.

The diarization system is available here:
http://liumtools.univ-lemans.fr//index.php?option=com_content&task=blogcategory&id=32&Itemid=60

-Yannick

Maria Eskevich - 2011-04-07

Yannick, can I ask about some work with LIUM?

As the wiki says, you can run the --threshold option to get the best
settings for your data, but I have a question about the input:
java -Xmx2024m -jar ./lib/LIUM_SpkDiarization.jar --help
--thresholds=1.5:2.5,2.5:3.5,250.0:300,0:3.0 --loadInputSegmentation
--fInputMask="./sph/%s.sph" --sInputMask="./sph/%s.uem.seg"
--sInput2Mask="./ref/%s.seg" --doTuning=2 --doCEClustering $show &> out.txt

What is the format of the ref file? And what is the uem.seg file, and what is
its format? Should I produce it myself, or can the tool produce it?

Thank you very much in advance,

Maria

Hello Maria

I'm not sure if Yannick can answer your question here. You might want to
contact them directly by mail or through the cmusphinx-devel mailing list.

As for the ref and seg files, they have the same format.

20041006_0800_0900_CULTURE 1 255965 4946 M U U Jacques_Derrida
20041006_0800_0900_CULTURE 1 260912 85 M U U Nicolas_Demorand
20041006_0800_0900_CULTURE 1 260997 8191 M U U Jacques_Derrida
20041006_0800_0900_CULTURE 1 269188 3091 M U U Nicolas_Demorand
20041006_0800_0900_CULTURE 1 272280 10497 M U U Jacques_Derrida
20041006_0800_0900_CULTURE 1 282778 231 M U U Nicolas_Demorand
20041006_0800_0900_CULTURE 1 282778 231 M U U Jacques_Derrida
20041006_0800_0900_CULTURE 1 283009 5884 M U U Jacques_Derrida
20041006_0800_0900_CULTURE 1 288893 124 M U U Nicolas_Demorand
20041006_0800_0900_CULTURE 1 289018 4997 M U U Jacques_Derrida
20041006_0800_0900_CULTURE 1 294015 4147 M U U Nicolas_Demorand
20041006_0800_0900_CULTURE 1 298162 18753 M U U Jacques_Derrida
20041006_0800_0900_CULTURE 1 316915 1640 M U U Nicolas_Demorand

It's pretty self-explanatory. The first column is the record name, then the
channel, then start and end times, then a male/female flag, two reserved U
items, and the speaker name. You can create such a file with a simple script
from the data you have, or manually if you have a lot of free labour.

Mihai Dogariu - 2014-07-29

Hi, a small adjustment to what Nickolay said: the next number after the start time represents the duration of that segment, not the end time. You can easily obtain the end time by adding the two numbers. Both are expressed in hundredths of a second (10-millisecond units), so you can divide them by 100 to obtain their value in seconds.
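
For anyone who wants to script this, here is a minimal Python sketch of reading and writing such lines. It assumes the column layout described in this thread (record name, channel, start, duration, gender flag, two reserved items, speaker name) and 100 units per second. The names `Segment`, `parse_seg_line`, `format_seg_line` and `UNITS_PER_SECOND` are made up for this example and are not part of the LIUM tools.

```python
from dataclasses import dataclass

# Assumption from this thread: start and duration are counted in
# 10 ms units, so 100 units make one second.
UNITS_PER_SECOND = 100


@dataclass
class Segment:
    show: str        # record name
    channel: int
    start: int       # start time, in 10 ms units
    length: int      # duration (not end time), in 10 ms units
    gender: str      # male/female flag (M, F, or U)
    reserved1: str   # the two "U" columns
    reserved2: str
    speaker: str

    @property
    def start_sec(self) -> float:
        return self.start / UNITS_PER_SECOND

    @property
    def end_sec(self) -> float:
        # End time = start + duration, converted to seconds.
        return (self.start + self.length) / UNITS_PER_SECOND


def parse_seg_line(line: str) -> Segment:
    """Split one whitespace-separated line into its eight columns."""
    show, channel, start, length, gender, r1, r2, speaker = line.split()
    return Segment(show, int(channel), int(start), int(length),
                   gender, r1, r2, speaker)


def format_seg_line(show: str, start_sec: float, end_sec: float,
                    speaker: str, gender: str = "U", channel: int = 1) -> str:
    """Build one line from start/end times given in seconds."""
    start = round(start_sec * UNITS_PER_SECOND)
    length = round(end_sec * UNITS_PER_SECOND) - start
    return f"{show} {channel} {start} {length} {gender} U U {speaker}"


if __name__ == "__main__":
    seg = parse_seg_line(
        "20041006_0800_0900_CULTURE 1 255965 4946 M U U Jacques_Derrida")
    print(seg.speaker, seg.start_sec, seg.end_sec)
```

The same `format_seg_line` helper can be called in a loop over your own (start, end, speaker) annotations to produce a reference file, one line per segment.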
