Thread: [Kaldi-developers] using external features files with KALDI recipe

[Kaldi-developers] using external features files with KALDI recipe

From: Arif K. <ife...@gm...> - 2014-08-20 10:04:27

Dear Kaldi authors,

I want to use an external feature extraction program (EST, the Edinburgh
Speech Tools) to extract features. Its output is just an N x M matrix,
where N is the number of frames and M is the feature dimension (48 in my
case; I am doing some multi-modal feature fusion experiments).

How can I convert this feature matrix into Kaldi's matrix format, so that
it is usable with Kaldi?


Best regards,
Arif

Re: [Kaldi-developers] using external features files with KALDI recipe

From: Korbinian R. <kor...@gm...> - 2014-08-20 10:20:02

Hi,

it's easiest to use featbin/copy-feats.  If your feature program supports
HTK or Sphinx format, then use --htk-in or --sphinx-in; otherwise write
the features as ASCII with some script that produces the Kaldi text
archive format, and read that with the ark,t specifier.  Each utterance
is its id followed by the matrix, one frame per row:
utt_id  [
  0 0 0 ...
  0 0 0 ... ]

Korbinian.
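
A minimal sketch of the "some script" step, assuming the external tool's output has already been loaded as a list of rows (one row per frame). The function name `format_kaldi_text_matrix` and the key `utt1` are made up for illustration; only the text layout (key, opening bracket, one frame per line, closing bracket) is meant to match what Kaldi reads with `ark,t:`.

```python
def format_kaldi_text_matrix(utt_id, rows):
    """Return one utterance in Kaldi's text archive layout.

    utt_id: utterance key (no whitespace allowed in Kaldi keys).
    rows:   list of frames, each a list of floats of equal length.
    """
    if not rows:
        raise ValueError("need at least one frame")
    if any(ch.isspace() for ch in utt_id):
        raise ValueError("utterance id must not contain whitespace")
    dim = len(rows[0])
    if any(len(row) != dim for row in rows):
        raise ValueError("all frames must have the same dimension")
    # One frame per line; the closing bracket follows the last frame.
    lines = ["  " + " ".join(f"{v:.6g}" for v in row) for row in rows]
    return f"{utt_id}  [\n" + "\n".join(lines) + " ]\n"


# Hypothetical 2-frame, 3-dimensional example.
print(format_kaldi_text_matrix("utt1", [[0.0, 1.5, -2.0], [3.0, 4.25, 5.0]]), end="")
```

Writing one such block per utterance into a file, say feats.txt, should then let `copy-feats ark,t:feats.txt ark,scp:feats.ark,feats.scp` produce a binary archive with an index (the file names here are placeholders).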


Re: [Kaldi-developers] using external features files with KALDI recipe

From: Arif K. <ife...@gm...> - 2014-08-20 16:15:15

Hi,
My email has two parts, both about the same problem, i.e. a mismatch
when concatenating features.

1st part:
Thanks, Korbinian, for your useful reply. I used the featbin/paste-feats
program, but it gives me an error like:

"Code to read HTK features does not support compressed features, or
features with VQ.". I read on the Kaldi mailing list (in a post by Dan)
that the features need to be uncompressed, but I don't know which
utility to use.

I must mention that I am not using HTK or Sphinx, but the EST (Edinburgh
Speech Tools) utility to extract features, which has an option for HTK
format. Here is the link:
http://www.cstr.ed.ac.uk/projects/speech_tools/manual-1.2.0/x737.htm .

2nd part:
I also tried computing the MFCC features with Kaldi, converting the
other set of features to Kaldi archive format, and using the
"paste-feats" program, but then I got a mismatch in the number of rows
between the two files. Is there any workaround to make the number of
rows match?

Best regards,
Arif
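
One way to check whether a feature file really is flagged as compressed is to look at its header. The sketch below assumes the standard 12-byte big-endian HTK header (number of frames, sample period, bytes per frame, parameter kind) and the qualifier bits listed in the HTK Book, where `_C` marks compression and `_V` marks VQ. Whether the EST tool sets these bits is not confirmed anywhere in this thread, and `inspect_htk_header` is a made-up helper name.

```python
import struct

# Parameter-kind qualifier bits (octal), as listed in the HTK Book.
HTK_COMPRESSED = 0o002000  # _C
HTK_VQ = 0o040000          # _V
HTK_BASE_MASK = 0o77       # low six bits hold the base kind (e.g. 6 = MFCC)


def inspect_htk_header(header_bytes):
    """Decode the first 12 bytes of an HTK parameter file."""
    if len(header_bytes) < 12:
        raise ValueError("an HTK header is 12 bytes long")
    # Two 32-bit ints, one 16-bit int, one unsigned 16-bit bit field.
    n_samples, samp_period, samp_size, parm_kind = struct.unpack(
        ">iihH", header_bytes[:12]
    )
    return {
        "num_frames": n_samples,
        "sample_period_100ns": samp_period,
        "bytes_per_frame": samp_size,
        "base_kind": parm_kind & HTK_BASE_MASK,
        "compressed": bool(parm_kind & HTK_COMPRESSED),
        "vq": bool(parm_kind & HTK_VQ),
    }


# Synthetic header for illustration: MFCC (kind 6) with the _C bit set.
fake_header = struct.pack(">iihH", 100, 100000, 26, 6 | HTK_COMPRESSED)
print(inspect_htk_header(fake_header))
```

For a real file, the first 12 bytes can be read with `open(path, "rb").read(12)` and passed to the function. If `compressed` or `vq` comes back true, that would be consistent with the error message quoted above.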


Re: [Kaldi-developers] using external features files with KALDI recipe

From: Korbinian R. <kor...@gm...> - 2014-08-20 16:22:29

Jo Arif,


On Wed, Aug 20, 2014 at 6:15 PM, Arif Khan <ife...@gm...> wrote:
> Thanks, Korbinian, for your useful reply. I used the featbin/paste-feats program
Ok.

> but it gives me an error like:
>
> "Code to read HTK features does not support compressed features, or features
> with VQ.". I read on the Kaldi mailing list (in a post by Dan) that the
> features need to be uncompressed, but I don't know which utility to use.
>
> I must mention that I am not using HTK or Sphinx, but the EST (Edinburgh
> Speech Tools) utility to extract features, which has an option for HTK format.
If you're not using HTK features you shouldn't see this error.  Are you
sure you're working with the right tools?  I might be wrong, but I believe
there is a difference between the Kaldi and HTK feature compression.

> program, but then I got a mismatch in the number of rows between the two
> files. Is there any workaround to make the number of rows match?
There is an option to ignore this error, but most likely the
difference in length (rows) is due to the handling of the border
conditions at the beginning and end of the file.  HTK/Kaldi produce
one feature vector for each complete frame (and extrapolate at the
beginning and end); other tools might work differently.  As you're
importing from a third-party module anyway, assuming that you do it
via a text archive, you can just extend your processing in that
conversion script to make sure the boundary conditions match Kaldi's
behavior.

Korbinian.
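
To see how border handling alone can change the row count, here is a small sketch comparing two frame-counting conventions. The first counts only frames that fit completely inside the signal, which is how Kaldi counts frames when edges are snipped, as far as I know. The second assumes a tool that centers frames and pads the edges. The function names, the 16 kHz sampling rate and the 25 ms / 10 ms window and shift are illustrative assumptions, not values from this thread.

```python
def num_frames_complete(num_samples, frame_length, frame_shift):
    """Count only frames that lie entirely inside the signal."""
    if num_samples < frame_length:
        return 0
    return 1 + (num_samples - frame_length) // frame_shift


def num_frames_padded(num_samples, frame_shift):
    """Count frames for a tool that pads the edges (roughly one per shift)."""
    return (num_samples + frame_shift // 2) // frame_shift


# One second of audio at 16 kHz, 25 ms window (400 samples), 10 ms shift (160).
samples, window, shift = 16000, 400, 160
complete = num_frames_complete(samples, window, shift)
padded = num_frames_padded(samples, shift)
print(complete, padded, padded - complete)
```

If the external tool follows the second convention, its matrix has a couple of extra rows per utterance. A conversion script could drop rows from the ends of the longer matrix before writing the text archive, after checking which rows actually line up in time.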

Re: [Kaldi-developers] using external features files with KALDI recipe

From: Daniel P. <dp...@gm...> - 2014-08-20 19:56:10

Regarding compressed HTK features: you have to use HCopy, from the HTK
tools, to remove the compression.  I don't recall the exact options or
config file that you need.

Regarding the mismatch in length: there is an option to paste-feats,
--length-tolerance, that you can use to make it tolerate a small
difference (it will output the length of the shortest input).

Dan
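
The trimming behaviour described here can be sketched in plain Python as follows. This is not Kaldi's code, only an illustration of the idea: if the two inputs differ by at most a given number of frames, both are cut to the shorter length and joined column-wise; otherwise the utterance is rejected. The function name `paste_feats` and its `length_tolerance` argument are modeled loosely on the paste-feats program and its --length-tolerance option.

```python
def paste_feats(feats_a, feats_b, length_tolerance=0):
    """Join two per-frame feature lists column-wise.

    Returns None when the frame counts differ by more than
    length_tolerance; otherwise trims both to the shorter length.
    """
    len_a, len_b = len(feats_a), len(feats_b)
    if abs(len_a - len_b) > length_tolerance:
        return None
    num_frames = min(len_a, len_b)
    return [list(feats_a[i]) + list(feats_b[i]) for i in range(num_frames)]


# Hypothetical inputs: 3 frames of 2-dim features and 4 frames of 1-dim features.
mfcc = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
extra = [[0.1], [0.2], [0.3], [0.4]]
print(paste_feats(mfcc, extra, length_tolerance=2))
print(paste_feats(mfcc, extra, length_tolerance=0))
```

Note that trimming always removes rows from the end of the longer input, so it only gives sensible results when the extra frames really are at the end and the two streams share the same frame shift.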


