Does anyone know where to download the LIUM speech recognition toolkit?
Thank you,
Hi,
Major parts of the LIUM ASR are available in the LIUM branch of the cmusphinx
svn.
At this time, though, there is no documentation :(
We are in the process of porting it into the official CMU Sphinx version, but
we need time.
The diarization system is available here:
http://liumtools.univ-lemans.fr//index.php?option=com_content&task=blogcategory&id=32&Itemid=60
-Yannick
Yannick, can I ask about some work with LIUM?
As the wiki says, you can run the --thresholds option to find the best
settings for your data, but I have a question about the input:
java -Xmx2024m -jar ./lib/LIUM_SpkDiarization.jar --help
--thresholds=1.5:2.5,2.5:3.5,250.0:300,0:3.0 --loadInputSegmentation
--fInputMask="./sph/%s.sph" --sInputMask="./sph/%s.uem.seg"
--sInput2Mask="./ref/%s.seg" --doTuning=2 --doCEClustering $show &> out.txt
What is the format of the ref file? And what is the uem.seg file, and what is
its format? Should I produce it myself, or can the tool produce it?
Thank you very much in advance,
Maria
Hello Maria
I'm not sure if Yannick can answer your question here. You might want to
contact him directly by mail or through the cmusphinx-devel mailing list.
As for the ref and seg files, they have the same format.
It's pretty self-explanatory. The first column is the record name, then the
channel, then the start and end times, then a male/female flag, two reserved U
items and the speaker name. You can create such a file from the data you have
with a simple script, or manually if you have a lot of free labour.
Hi, a small adjustment to what Nickolay said: the number after the start time is the duration of the segment, not the end time. You can obtain the end time by adding the two numbers. Both are expressed in units of 10 milliseconds (hundredths of a second), so dividing them by 100 gives the values in seconds.
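To make the layout discussed above concrete, here is a minimal Python sketch that reads one line of a .seg/ref file, converts the start and duration to seconds, and writes a line back out. It is only an illustration based on the descriptions in this thread, not code from the LIUM tools: the Segment class, its field names, and both function names are made up for the example, the fields are assumed to be whitespace-separated, and skipping lines that start with ";;" is an assumption about comment lines that nobody confirmed here.

```python
from dataclasses import dataclass
from typing import Optional

# Per the post above: dividing the time fields by 100 gives seconds.
UNITS_PER_SECOND = 100


@dataclass
class Segment:
    """One line of a .seg/ref file (field names are hypothetical)."""
    show: str        # record name
    channel: str     # channel
    start: int       # start time, in file units
    duration: int    # segment duration (not the end time), in file units
    gender: str      # male/female flag
    reserved1: str   # first reserved item
    reserved2: str   # second reserved item
    speaker: str     # speaker name

    @property
    def start_seconds(self) -> float:
        return self.start / UNITS_PER_SECOND

    @property
    def end_seconds(self) -> float:
        # End time is start plus duration, then converted to seconds.
        return (self.start + self.duration) / UNITS_PER_SECOND


def parse_seg_line(line: str) -> Optional[Segment]:
    """Parse one line; return None for blank lines and assumed ';;' comments."""
    line = line.strip()
    if not line or line.startswith(";;"):
        return None
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    show, channel, start, duration, gender, res1, res2, speaker = fields
    return Segment(show, channel, int(start), int(duration),
                   gender, res1, res2, speaker)


def format_seg_line(seg: Segment) -> str:
    """Write a segment back out in the same column order."""
    return " ".join([seg.show, seg.channel, str(seg.start), str(seg.duration),
                     seg.gender, seg.reserved1, seg.reserved2, seg.speaker])


if __name__ == "__main__":
    example = parse_seg_line("show1 1 150 250 M U U spk01")
    print(example.start_seconds, example.end_seconds)
    print(format_seg_line(example))
```

The example line "show1 1 150 250 M U U spk01" is invented for the illustration; with real data you would loop over the lines of your own file and skip the None results.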