From: Peter K. <pe...@pe...> - 2014-11-19 22:19:24
On 11/19/14, 4:10 PM, Daniel Povey wrote:
> I am planning to migrate to git within a year or so, but in the
> meantime there are instructions on the installation page on how to use
> Kaldi with git, to give people time to get used to using Kaldi with
> git.

Excellent. Glad I asked. Thanks, Dan.

pek

--
Peter Karman . http://peknet.com/ . pe...@pe...

From: Daniel P. <dp...@gm...> - 2014-11-19 22:10:35
I am planning to migrate to git within a year or so, but in the meantime
there are instructions on the installation page on how to use Kaldi with
git, to give people time to get used to using Kaldi with git.

Dan

On Wed, Nov 19, 2014 at 5:07 PM, Peter Karman <pe...@pe...> wrote:
> I am a long-time Subversion user (since 2004 at least, maybe 2003 but I
> can't remember that far back). I think it's a great tool and still like it.
>
> That said, I've also been using Git for the last 3 or 4 years, and I've
> come to believe it has some social aspects that help facilitate more
> pick-up-and-help efforts amongst open source projects.
>
> So while I do not want or intend to start a thread about the relative
> technical merits of svn vs git, I do wonder if the Kaldi community has
> discussed migrating to git and if that idea holds any merit here.
>
> IME, git makes it easier for newcomers to try things out, contribute
> small improvements and generally join-in-the-fun. I find those traits
> helpful in open source projects, especially when trying to attract new
> blood.
>
> I know I can use git and svn together in my dev environment, and I will.
> I just wondered if the community has had the conversation.
>
> Thanks.
>
> --
> Peter Karman . http://peknet.com/ . pe...@pe...

From: Peter K. <pe...@pe...> - 2014-11-19 22:07:33
I am a long-time Subversion user (since 2004 at least, maybe 2003 but I
can't remember that far back). I think it's a great tool and still like it.

That said, I've also been using Git for the last 3 or 4 years, and I've
come to believe it has some social aspects that help facilitate more
pick-up-and-help efforts amongst open source projects.

So while I do not want or intend to start a thread about the relative
technical merits of svn vs git, I do wonder if the Kaldi community has
discussed migrating to git and if that idea holds any merit here.

IME, git makes it easier for newcomers to try things out, contribute
small improvements and generally join-in-the-fun. I find those traits
helpful in open source projects, especially when trying to attract new
blood.

I know I can use git and svn together in my dev environment, and I will.
I just wondered if the community has had the conversation.

Thanks.

--
Peter Karman . http://peknet.com/ . pe...@pe...

From: Daniel P. <dp...@gm...> - 2014-11-17 18:19:28
It's tricky to use a package like this, because the way queue.pl and
similar programs work is that any unrecognized arguments should be passed
through to qsub, and the qsub option format is not standardized. So for
instance we need to recognize that "-pe smp 5" is a single option. This
requires ad-hoc code.

I prefer to avoid CPAN modules like the plague. Software that requires
them tends to be a huge headache.

Dan

>> For a while it has bothered me that there is no very good unified
>> interface to the queue-invoking scripts, i.e. no universal way to say
>> that you want a certain number of threads, a certain amount of memory,
>> etc, or a GPU, independent of queue mechanism; having a unified
>> mechanism would make it easier for the scripts to tell the queue what
>> resources they need. I'm writing this email to say how I propose to
>> improve this, and to ask for help (i.e. if anyone has time to
>> implement this).
>>
>> I propose to modify queue.pl and similar scripts such as run.pl,
>> ssh.pl and slurm.pl, so that they all accept some additional options,
>> so for instance you could invoke
>>
>>   queue.pl --mem 10G --num-threads 12 JOB=1:8 exp/foo/something.JOB.log ....
>> or
>>   queue.pl --mem 10G --gpu 1 --max-jobs-run 4 JOB=1:8 exp/bar/something.JOB.log ....
>>
>> (max-jobs-run would limit the simultaneously running jobs, just like
>> -tc 4 to GridEngine).
>>
>> All the other parallelization scripts would take the same options, and
>> would probably just ignore options that they didn't already recognize
>> (for future-proofing). Some of these scripts would have to be
>> configurable, e.g. GridEngine can be configured in various ways.
>>
>> For example, queue.pl could look for a file located by default in
>> conf/queue.conf, which would tell it how to convert the things above
>> into actual options, e.g. the following, which looks a bit like bash
>> but would be interpreted by the perl script. Below I try to show a
>> case where the "gpu" option requires a change in queue, which makes
>> the script a little more complicated. But I don't want to make the
>> config language super-powerful so it's hard to implement; if someone
>> has a weird queue setup that requires extra configuration, they can
>> always modify queue.pl.
>>
>> # cat conf/queue.conf
>> standard_opts -l arch=*64*
>> mem=* -l mem_free=$0,ram_free=$0
>> num_threads=* -pe smp $0
>> max_jobs_run=* -tc $0
>> default gpu=0
>> gpu=0 -q all.q
>> gpu=* -l gpu=$0 -q gpu.q
>>
>> The idea is that once queue.pl and similar scripts are updated to
>> include these standardized options, with a mechanism to convert them
>> into "normal" options, we can then start extending the scripts to take
>> advantage of this standardization, so instead of having the user pass
>> in "gpu_opts" and so on, we can just have the script add the option
>> --gpu 1 itself. And scripts can start working out how much memory
>> different stages will need, and set the --mem option themselves.
>
> I think a sane common configuration format is a great idea, and some
> common Perl library to read it / mixin with cli options ideal.
>
> I'd be happy to contribute in this way.
>
> Do you have any restrictions on the project with requiring/using CPAN
> modules? There are several different ways to approach a solution, and
> several existing implementations on CPAN. E.g. using a common config
> format (.ini, .yml, .json, .conf) with something like
> https://metacpan.org/pod/Config::Any and coupled with
> https://metacpan.org/pod/Getopt::Long can work well.
>
> Of course, Moose combines these even more easily, but I expect a large
> dependency list like Moose includes would not be welcome.
>
> Thoughts?
>
> --
> Peter Karman . http://peknet.com/ . pe...@pe...

From: Peter K. <pe...@pe...> - 2014-11-17 16:45:46
On 11/12/14, 3:05 PM, Daniel Povey wrote:
> For a while it has bothered me that there is no very good unified
> interface to the queue-invoking scripts, i.e. no universal way to say
> that you want a certain number of threads, a certain amount of memory,
> etc, or a GPU, independent of queue mechanism; having a unified
> mechanism would make it easier for the scripts to tell the queue what
> resources they need. I'm writing this email to say how I propose to
> improve this, and to ask for help (i.e. if anyone has time to
> implement this).
>
> I propose to modify queue.pl and similar scripts such as run.pl,
> ssh.pl and slurm.pl, so that they all accept some additional options,
> so for instance you could invoke
>
>   queue.pl --mem 10G --num-threads 12 JOB=1:8 exp/foo/something.JOB.log ....
> or
>   queue.pl --mem 10G --gpu 1 --max-jobs-run 4 JOB=1:8 exp/bar/something.JOB.log ....
>
> (max-jobs-run would limit the simultaneously running jobs, just like
> -tc 4 to GridEngine).
>
> All the other parallelization scripts would take the same options, and
> would probably just ignore options that they didn't already recognize
> (for future-proofing). Some of these scripts would have to be
> configurable, e.g. GridEngine can be configured in various ways.
>
> For example, queue.pl could look for a file located by default in
> conf/queue.conf, which would tell it how to convert the things above
> into actual options, e.g. the following, which looks a bit like bash
> but would be interpreted by the perl script. Below I try to show a
> case where the "gpu" option requires a change in queue, which makes
> the script a little more complicated. But I don't want to make the
> config language super-powerful so it's hard to implement; if someone
> has a weird queue setup that requires extra configuration, they can
> always modify queue.pl.
>
> # cat conf/queue.conf
> standard_opts -l arch=*64*
> mem=* -l mem_free=$0,ram_free=$0
> num_threads=* -pe smp $0
> max_jobs_run=* -tc $0
> default gpu=0
> gpu=0 -q all.q
> gpu=* -l gpu=$0 -q gpu.q
>
> The idea is that once queue.pl and similar scripts are updated to
> include these standardized options, with a mechanism to convert them
> into "normal" options, we can then start extending the scripts to take
> advantage of this standardization, so instead of having the user pass
> in "gpu_opts" and so on, we can just have the script add the option
> --gpu 1 itself. And scripts can start working out how much memory
> different stages will need, and set the --mem option themselves.

I think a sane common configuration format is a great idea, and some
common Perl library to read it / mixin with cli options ideal.

I'd be happy to contribute in this way.

Do you have any restrictions on the project with requiring/using CPAN
modules? There are several different ways to approach a solution, and
several existing implementations on CPAN. E.g. using a common config
format (.ini, .yml, .json, .conf) with something like
https://metacpan.org/pod/Config::Any and coupled with
https://metacpan.org/pod/Getopt::Long can work well.

Of course, Moose combines these even more easily, but I expect a large
dependency list like Moose includes would not be welcome.

Thoughts?

--
Peter Karman . http://peknet.com/ . pe...@pe...

From: Daniel P. <dp...@gm...> - 2014-11-12 21:05:17
Hi everyone,

For a while it has bothered me that there is no very good unified
interface to the queue-invoking scripts, i.e. no universal way to say
that you want a certain number of threads, a certain amount of memory,
etc, or a GPU, independent of queue mechanism; having a unified
mechanism would make it easier for the scripts to tell the queue what
resources they need. I'm writing this email to say how I propose to
improve this, and to ask for help (i.e. if anyone has time to
implement this).

I propose to modify queue.pl and similar scripts such as run.pl,
ssh.pl and slurm.pl, so that they all accept some additional options,
so for instance you could invoke

  queue.pl --mem 10G --num-threads 12 JOB=1:8 exp/foo/something.JOB.log ....
or
  queue.pl --mem 10G --gpu 1 --max-jobs-run 4 JOB=1:8 exp/bar/something.JOB.log ....

(max-jobs-run would limit the simultaneously running jobs, just like
-tc 4 to GridEngine).

All the other parallelization scripts would take the same options, and
would probably just ignore options that they didn't already recognize
(for future-proofing). Some of these scripts would have to be
configurable, e.g. GridEngine can be configured in various ways.

For example, queue.pl could look for a file located by default in
conf/queue.conf, which would tell it how to convert the things above
into actual options, e.g. the following, which looks a bit like bash
but would be interpreted by the perl script. Below I try to show a
case where the "gpu" option requires a change in queue, which makes
the script a little more complicated. But I don't want to make the
config language super-powerful so it's hard to implement; if someone
has a weird queue setup that requires extra configuration, they can
always modify queue.pl.

# cat conf/queue.conf
standard_opts -l arch=*64*
mem=* -l mem_free=$0,ram_free=$0
num_threads=* -pe smp $0
max_jobs_run=* -tc $0
default gpu=0
gpu=0 -q all.q
gpu=* -l gpu=$0 -q gpu.q

The idea is that once queue.pl and similar scripts are updated to
include these standardized options, with a mechanism to convert them
into "normal" options, we can then start extending the scripts to take
advantage of this standardization, so instead of having the user pass
in "gpu_opts" and so on, we can just have the script add the option
--gpu 1 itself. And scripts can start working out how much memory
different stages will need, and set the --mem option themselves.

Dan

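[Editor's sketch] To make the proposed translation concrete, below is a
minimal Perl sketch of how a script like queue.pl might expand the
standardized options against a config in the format above. The rule
semantics assumed here (first matching line wins, exact values listed
before the "*" wildcard, $0 replaced by the option's value) are read off
this example only, not the implementation that eventually went into Kaldi:

#!/usr/bin/env perl
# Hedged sketch: expand standardized options (--num-threads, --gpu, ...)
# into queue-specific flags using the conf/queue.conf format proposed
# above. Only "name=pattern" rule lines are handled; the "standard_opts"
# and "default" lines from the example are ignored in this sketch.
use strict;
use warnings;

my %rules;                     # option name => [ [pattern, template], ... ]
while (my $line = <DATA>) {
    chomp $line;
    next if $line =~ /^\s*(#|$)/;
    my ($lhs, $rhs) = split ' ', $line, 2;
    push @{ $rules{$1} }, [ $2, $rhs ] if $lhs =~ /^(\w+)=(.*)$/;
}

sub expand_opt {
    my ($name, $value) = @_;
    for my $rule (@{ $rules{$name} || [] }) {
        my ($pattern, $template) = @$rule;
        next unless $pattern eq '*' || $pattern eq $value;
        (my $out = $template) =~ s/\$0/$value/g;   # substitute the value
        return $out;
    }
    die "queue.conf has no rule for --$name $value\n";
}

print expand_opt('num_threads', 12), "\n";   # -pe smp 12
print expand_opt('mem', '10G'), "\n";        # -l mem_free=10G,ram_free=10G
print expand_opt('gpu', 1), "\n";            # -l gpu=1 -q gpu.q

__DATA__
mem=* -l mem_free=$0,ram_free=$0
num_threads=* -pe smp $0
max_jobs_run=* -tc $0
gpu=0 -q all.q
gpu=* -l gpu=$0 -q gpu.q

A real implementation would also need the ad-hoc handling Dan mentions in
his reply earlier in this thread, e.g. keeping a multi-token qsub option
like "-pe smp 5" together when passing unrecognized arguments through.
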
From: Daniel P. <dp...@gm...> - 2014-11-11 18:00:13
Others - please ignore this, this is some kind of fraud
( http://www.complaintsaboutbusiness.in/alchemy-solutions-fraud-fake-consultancy/ ).
I was fooled by the apparently relevant subject line.

Dan

On Tue, Nov 11, 2014 at 12:53 PM, Daniel Povey <dp...@gm...> wrote:
> That is a strangely specific topic to want to be trained on. I suspect
> what you really need is an intro on machine learning in general (->Andrew
> Ng's course?) or on speech recognition in general (->HTK Book?)
> Dan
>
> On Tue, Nov 11, 2014 at 8:37 AM, nikunj <ni...@al...> wrote:
>> Hi,
>>
>> We have a Corporate training requirement on Ensemble training.
>>
>> Please inform if you can support us with the same.
>>
>> Regards,
>> Nikunj Arora
>> (Business Development Manager)
>> ALCHEMY SOLUTIONS
>> #21/1, 1st. Floor, Vasavi Chambers, Lal Bagh Fort Road, Bangalore-560004.
>> Mobile: +91 9663984279/+91 9820866974
>> Direct: 080-65690716
>> Alchemy Solutions : www.alchemysolutions.net

From: Daniel P. <dp...@gm...> - 2014-11-11 17:53:59
That is a strangely specific topic to want to be trained on. I suspect
what you really need is an intro on machine learning in general (->Andrew
Ng's course?) or on speech recognition in general (->HTK Book?)

Dan

On Tue, Nov 11, 2014 at 8:37 AM, nikunj <ni...@al...> wrote:
> Hi,
>
> We have a Corporate training requirement on Ensemble training.
>
> Please inform if you can support us with the same.
>
> Regards,
> Nikunj Arora
> (Business Development Manager)
> ALCHEMY SOLUTIONS
> #21/1, 1st. Floor, Vasavi Chambers, Lal Bagh Fort Road, Bangalore-560004.
> Mobile: +91 9663984279/+91 9820866974
> Direct: 080-65690716
> Alchemy Solutions : www.alchemysolutions.net

From: nikunj <ni...@al...> - 2014-11-11 14:17:39
Hi,

We have a Corporate training requirement on Ensemble training.

Please inform if you can support us with the same.

Regards,
-----------------------------------------------
Nikunj Arora
(Business Development Manager)

ALCHEMY SOLUTIONS
#21/1, 1st. Floor, Vasavi Chambers,
Lal Bagh Fort Road,
Bangalore-560004.
Mobile: +91 9663984279/+91 9820866974
Direct: 080-65690716

Alchemy Solutions : www.alchemysolutions.net

From: Daniel P. <dp...@gm...> - 2014-11-04 02:13:45
Hello everyone,

You will remember that a couple of weeks ago I sent an email out to this
list asking for people to say what they would like improved about Kaldi.
Part of the reason I asked this is because Sanjeev and I are applying for
an NSF Community Research Infrastructure (CRI) grant to support the work
on Kaldi that goes on here at JHU - this is one of the ways we plan to pay
for my salary, since I have a research-track appointment which means my
salary needs to be covered by grants; and some other grants are ending
soon.

After reading your responses, there seemed to be a few things that stood
out as major new features people would like:

 (i)   An easier way to do DNN experiments, including novel architectures
 (ii)  Support for convolutional and/or recurrent neural networks for
       acoustic modeling
 (iii) Decoder support for RNN language models
 (iv)  Voice Activity Detection
 (v)   Improvements to online/real-time decoding, including integration
       of Voice Activity Detection, and making it easier for novices to
       run this

We're going to put these in the grant as things that we plan to do; of
course, plans may change for various reasons, e.g. if people from outside
JHU end up contributing substantially to these features.

I want to emphasize that by applying for a grant to support me and others
(e.g. my students) at JHU, we are not in any way asserting that Kaldi's
only home is here - it is, after all, a community project. This is just
the most straightforward funding mechanism that will allow me to continue
to devote my time to Kaldi.

I do have a request right now for those on this list (and I'll be sending
out a differently-worded version of this email to some others who might
not be on the list). The National Science Foundation (NSF) requires for
CRI grants that applicants demonstrate, among other things,

 - Usage by a diverse population of researchers worldwide
 - Research community support for the enhancements.

So my request is: are you willing to put your name to the following two
oddly specific statements,

 (1) I use Kaldi for my research in computer and information science or
     engineering
 (2) The proposed enhancements would benefit my research

If you can agree to one or both of these, please just reply to me by
email (don't cc the list!) saying "Agreed", or "Agree to (1) but not (2)"
or vice versa, and state your name and institutional affiliation if it's
not obvious. Nothing else will be required (no signatures, letters, etc.)
Don't agonize about this or send emails to your legal department; if it
will be a hassle, just don't reply. Replies after Wednesday may not get
used.

Of course more detailed feedback, including feedback about the specific
enhancements being proposed, or other new enhancements that would benefit
you, is still appreciated.

Dan

From: Daniel P. <dp...@gm...> - 2014-10-30 16:50:30
This is what happens when you get parameter divergence. Certain types of
nonlinearity are more susceptible to this problem than others,
particularly unbounded nonlinearities.

Also (and I don't know if the block affine component code supports this,
but it shouldn't be super hard to change), the max-change parameter can
be helpful in preventing very large parameter changes which could lead to
divergence. You could try decreasing this value.

Bear in mind that the BlockAffineComponent uses what I refer to in
http://arxiv-web3.library.cornell.edu/abs/1410.7455v1 as "simple" natural
gradient SGD, which is about twice as slow (on GPUs) as the "online"
natural gradient SGD. That code was written before we had the faster
"online" NG-SGD, and I haven't updated it.

Dan

On Thu, Oct 30, 2014 at 3:37 AM, Dong-Hyun Kim <daw...@gm...> wrote:
> Hi, kaldi-developers,
> My name is Dong-Hyun Kim, and I have a problem using Kaldi. My system is
> composed of four GTX 760 cards per node in a 10-node cluster, so I run
> 40 GPU cards with 40 egs. When I run "nnet-train-simple", I get a
> shrink.log like the one below:
> [...]
> When debugging, the NnetUpdater::Backprop output_deriv matrix shows inf
> values.
> How can I solve this problem?
> Thank you.

From: Dong-Hyun K. <daw...@gm...> - 2014-10-30 07:37:40
Hi, kaldi-developers,

My name is Dong-Hyun Kim, and I have a problem using Kaldi. My system is
composed of four GTX 760 cards per node in a 10-node cluster, so I run 40
GPU cards with 40 egs. When I run "nnet-train-simple", I get a shrink.log
like the one below:

----------------------------------------------------------------------
nnet-subset-egs --n=2000 --randomize-order=true --srand=50 ark:data_work/data_FB40_base/train_141002/nnet-5block/egs/train_diagnostic.egs ark:-
nnet-combine-fast --num-threads=1 --verbose=3 --minibatch-size=2000 data_work/data_FB40_base/train_141002/nnet-5block/51.mdl ark:- data_work/data_FB40_base/train_141002/nnet-5block/51.mdl
LOG (nnet-combine-fast:IsComputeExclusive():cu-device.cc:209) CUDA setup operating under Compute Exclusive Mode.
LOG (nnet-combine-fast:FinalizeActiveGpu():cu-device.cc:174) The active GPU is [0]: GeForce GTX 760 free:1994M, used:53M, total:2047M, free/total:0.974084 version 3.0
LOG (nnet-combine-fast:PrintMemoryUsage():cu-device.cc:314) Memory used: 0 bytes.
LOG (nnet-subset-egs:main():nnet-subset-egs.cc:88) Selected a subset of 2000 out of 40000 neural-network training examples
LOG (nnet-combine-fast:main():nnet-combine-fast.cc:107) Read 2000 examples from the validation set.
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 0 for this minibatch is 70.0758
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 1 for this minibatch is 70.0758
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 2 for this minibatch is 0.0614423
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 3 for this minibatch is 4.40091
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 4 for this minibatch is 0.630933
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 5 for this minibatch is inf
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 6 for this minibatch is 0.692641
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 7 for this minibatch is inf
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 8 for this minibatch is 0.760484
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 9 for this minibatch is 5.29073
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 10 for this minibatch is 0.756328
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 11 for this minibatch is 3.84917
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 12 for this minibatch is 0.704473
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 13 for this minibatch is 9.91905
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 14 for this minibatch is 0.766127
VLOG[3] (nnet-combine-fast:Propagate():nnet-update.cc:82) Stddev of data for component 15 for this minibatch is 10.4979
LOG (nnet-combine-fast:GetInitialModel():combine-nnet-fast.cc:402) Objective functions for the source neural nets are [ -1.4428 ]
----------------------------------------------------------------------

Then the run stops with the following message:

----------------------------------------------------------------------
nnet-shuffle-egs --buffer-size=5000 --srand=144 ark:data_work/data_FB40_comEnv2/train_comEnv2/nnet-5block/egs/egs.26.42.ark ark:-
LOG (main():nnet-train-simple.cc:62) nnet-train-simple --minibatch-size=512 --srand=144 data_work/data_FB40_comEnv2/train_comEnv2/nnet-5block/144.mdl ark:- data_work/data_FB40_comEnv2/train_comEnv2/nnet-5block/145.26.mdl
LOG (nnet-train-simple:main():nnet-train-simple.cc:72) !!Cuda!!: CuDevice::Instantiate().SelectGpuId(use_gpu);
LOG (nnet-train-simple:IsComputeExclusive():cu-device.cc:209) CUDA setup operating under Compute Exclusive Mode.
LOG (nnet-train-simple:FinalizeActiveGpu():cu-device.cc:174) The active GPU is [3]: GeForce GTX 760 free:1993M, used:53M, total:2047M, free/total:0.973956 version 3.0
LOG (nnet-train-simple:PrintMemoryUsage():cu-device.cc:314) Memory used: 0 bytes.
LOG (nnet-train-simple:BeginNewPhase():train-nnet.cc:59) Training objective function (this phase) is -1.94988 over 25600 frames.
KALDI_ASSERT: at nnet-train-simple:PreconditionDirectionsAlphaRescaled:nnet-precondition.cc:160, failed: p_trace != 0.0
Stack trace is:
kaldi::KaldiGetStackTrace()
kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)
kaldi::nnet2::PreconditionDirectionsAlphaRescaled(kaldi::CuMatrixBase<float> const&, double, kaldi::CuMatrixBase<float>*)
kaldi::nnet2::BlockAffineComponentPreconditioned::Update(kaldi::CuMatrixBase<float> const&, kaldi::CuMatrixBase<float> const&)
kaldi::nnet2::BlockAffineComponent::Backprop(kaldi::CuMatrixBase<float> const&, kaldi::CuMatrixBase<float> const&, kaldi::CuMatrixBase<float> const&, int, kaldi::nnet2::Component*, kaldi::CuMatrix<float>*) const
.
.
kaldi::nnet2::NnetSimpleTrainer::TrainOneMinibatch()
kaldi::nnet2::NnetSimpleTrainer::TrainOnExample(kaldi::nnet2::NnetExample const&)
nnet-train-simple(main+0x905) [0x57d549]
/lib64/libc.so.6(__libc_start_main+0xfd) [0x386ba1ed1d]
nnet-train-simple() [0x57cb89]
bash: line 1: 30731 Broken pipe             nnet-shuffle-egs --buffer-size=5000 --srand=144 ark:data_work/data_FB40_comEnv2/train_comEnv2/nnet-5block/egs/egs.26.42.ark ark:-
          30733 Aborted (core dumped) | nnet-train-simple --minibatch-size=512 --srand=144 data_work/data_FB40_comEnv2/train_comEnv2/nnet-5block/144.mdl ark:- data_work/data_FB40_comEnv2/train_comEnv2/nnet-5block/145.26.mdl
# Accounting: time=37 threads=1
----------------------------------------------------------------------

When debugging, the NnetUpdater::Backprop output_deriv matrix shows inf
values.

How can I solve this problem?
Thank you.

From: 陈卓 <che...@gm...> - 2014-10-29 14:45:49
Dear list,

I read the INSTALL instructions and installed OpenFst, but when running
./configure it says:

  /home/ken/kaldi-trunk/tools/openfst/include/fst/minimize.h seems not to be patched: patch not applied? FST tools will not work in our recipe.

What is my problem, and how can I trace ./configure?

Thanks a lot,
ken

From: Daniel P. <dp...@gm...> - 2014-10-18 18:51:44
> In wave-reader.cc, on line 200, the code exits if all the data is not in
> a single chunk.

I don't think we've encountered files that have multiple chunks yet, so
let's cross that bridge when we come to it. If I recall correctly, the
wav format theoretically includes a vast range of different things, so
that if we tried to truly implement it to the standard, most of Kaldi's
code would end up being devoted to reading in wav files, and we'd
probably have to end up rewriting most of Windows.

> In some badly written wav files, the length in the header is not
> correct, but the file is usable anyways. For my use, I changed this
> ERROR to a WARNING. I am wondering if changing this for everyone makes
> sense.

I suspect you may be using an out-of-date copy of Kaldi. IIRC this issue
no longer exists and it does just print a warning; e.g. sox prints the
wrong header size when writing to a stream.

However, we found another issue: sox sometimes outputs ridiculously large
sizes in the header when writing to a stream, when you do things like
time-warping (stretching/shrinking) the audio. The current Kaldi
wav-reading code has a bug in that it outputs a wav file with the size
from the header, not the size of the actual amount of data it read, and
I'm not sure if the contents of the remaining part are even defined. Tom
Ko (cc'd) is going to fix this bug, and also make the wav-reading code
efficient in the case when the size in the header is ridiculously large.

> How often do we run across wav files with multiple chunks?

I don't think we have ever come across wav files with multiple chunks,
but if we do, we can implement it.

Dan

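[Editor's sketch] For readers unfamiliar with the chunk structure under
discussion, here is a small self-contained sketch that walks a wav file's
RIFF chunks. It is written in Perl purely for illustration (Kaldi's actual
reader is the C++ in wave-reader.cc) and it warns, rather than exiting,
when the header's declared length disagrees with the bytes present:

#!/usr/bin/env perl
# Hedged sketch, not Kaldi's wave-reader.cc: walk the RIFF chunks of a
# wav file, printing each chunk id and size, and warn (rather than die)
# when the RIFF header's declared length disagrees with the file length,
# as happens with badly written headers (e.g. sox writing to a stream).
use strict;
use warnings;

my $file = shift or die "usage: $0 file.wav\n";
open my $fh, '<:raw', $file or die "cannot open $file: $!\n";

read($fh, my $riff, 12) == 12 or die "file too short for a RIFF header\n";
my ($magic, $riff_size, $wave) = unpack 'A4 V A4', $riff;
die "not a RIFF/WAVE file\n" unless $magic eq 'RIFF' && $wave eq 'WAVE';

# The RIFF size field should equal the file size minus this 8-byte header.
my $actual = -s $file;
warn "header claims ", $riff_size + 8, " bytes but file has $actual\n"
    if $riff_size + 8 != $actual;

# Sub-chunks ("fmt ", "data", ...) each carry their own 8-byte header.
while (read($fh, my $hdr, 8) == 8) {
    my ($id, $size) = unpack 'A4 V', $hdr;
    print "chunk '$id', $size bytes\n";
    seek $fh, $size + ($size % 2), 1;   # chunk bodies are word-aligned
}

A file whose samples were split across more than one "data" chunk would
show up here as repeated data lines - the multiple-chunk case that the
thread says has not been seen in practice.
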
From: Nagendra G. <nag...@go...> - 2014-10-18 18:40:13
In wave-reader.cc, on line 200, the code exits if all the data is not in
a single chunk. In some badly written wav files, the length in the header
is not correct, but the file is usable anyways. For my use, I changed
this ERROR to a WARNING. I am wondering if changing this for everyone
makes sense.

How often do we run across wav files with multiple chunks?

From: Daniel P. <dp...@gm...> - 2014-10-14 17:13:35
There is actually some stuff in Kaldi that uses this type of backward
decoding already; I think there is an example script like
local/run_fwdbwd.sh in one of the example setups. But I doubt very much
that this is something you really want to do or would be useful to you
for alignment purposes.

Dan

On Tue, Oct 14, 2014 at 5:41 AM, Tony Robinson <to...@ca...> wrote:
> On 10/14/2014 07:00 AM, Saman Mousazadeh wrote:
>> I have trained two models for alignment, one mono and the other tri.
>> Now I want to use these models to align data in reverse (i.e. from the
>> end of the utterance to the beginning). I have changed the L fst and
>> the mono model works pretty well, but the tri model does not work at
>> all (as expected!!). Is there any way (except new training) to use
>> this tri model for reverse alignment?
>
> This is an interesting theoretical problem. There are two things to
> reverse, the WFSTs and the features.
>
> The WFSTs either all need to be reversed (i.e. each of H C L G), or you
> need to reverse the composition.
>
> You may also need to reverse the features: that is, if you view your
> frames as t+1, t, t-1 then the first order differences will have the
> opposite sign to the normal window of frames, t-1, t, t+1. Here by far
> the easiest is to compute all the higher order features (e.g. to third
> order differences) in the forward time order and then reverse these.
> Perhaps it works to flip the sign of odd differences, but I wouldn't
> trust this.
>
> I'm finding it hard to resist the temptation to ask why you want to do
> this!
>
> Tony
> --
> ** Cantab is hiring: www.cantabResearch.com/openings **
> Dr A J Robinson, Founder, Cantab Research Ltd
> Phone direct: 01223 778240  office: 01223 794497
> Company reg no GB 05697423, VAT reg no 925606030
> 51 Canterbury Street, Cambridge, CB4 3QG, UK

From: Daniel P. <dp...@gm...> - 2014-10-14 17:08:40
I can't answer your question because there are not enough details - I'm
not sure exactly what you did and how its output differed from what you
expected.

Dan

On Tue, Oct 14, 2014 at 10:31 AM, Saman Mousazadeh <smo...@gm...> wrote:
> Hi all,
> I have a model for alignment (tri). I used it for alignment and got
> something like this:
>
> Osil
> Osil
> Osil_S
> Osil_S
> Osil_S
> Osil
> Osil
> Osil_S
> Osil_S
> T_E
> T_E
> IH1_B
> IH1_B
> D_E
> D_E
> EH1_I
> EH1_I
> R_B
> R_B
> D_E
>
> instead of
>
> Osil
> Osil_S
> Osil
> Osil_S
> T_E
> IH1_B
> D_E
> EH1_I
> R_B
> D_E
>
> I mean the splitting is not correct (a phone is split into two phones)
> - why?

From: Saman M. <smo...@gm...> - 2014-10-14 14:31:11
Hi all,

I have a model for alignment (tri). I used it for alignment and got
something like this:

Osil
Osil
Osil_S
Osil_S
Osil_S
Osil
Osil
Osil_S
Osil_S
T_E
T_E
IH1_B
IH1_B
D_E
D_E
EH1_I
EH1_I
R_B
R_B
D_E

instead of

Osil
Osil_S
Osil
Osil_S
T_E
IH1_B
D_E
EH1_I
R_B
D_E

I mean the splitting is not correct (a phone is split into two phones) -
why?

From: Tony R. <to...@ca...> - 2014-10-14 09:54:16
On 10/14/2014 07:00 AM, Saman Mousazadeh wrote:
> I have trained two models for alignment, one mono and the other tri.
> Now I want to use these models to align data in reverse (i.e. from the
> end of the utterance to the beginning). I have changed the L fst and
> the mono model works pretty well, but the tri model does not work at
> all (as expected!!). Is there any way (except new training) to use this
> tri model for reverse alignment?

This is an interesting theoretical problem. There are two things to
reverse, the WFSTs and the features.

The WFSTs either all need to be reversed (i.e. each of H C L G), or you
need to reverse the composition.

You may also need to reverse the features: that is, if you view your
frames as t+1, t, t-1 then the first order differences will have the
opposite sign to the normal window of frames, t-1, t, t+1. Here by far
the easiest is to compute all the higher order features (e.g. to third
order differences) in the forward time order and then reverse these.
Perhaps it works to flip the sign of odd differences, but I wouldn't
trust this.

I'm finding it hard to resist the temptation to ask why you want to do
this!

Tony
--
** Cantab is hiring: www.cantabResearch.com/openings **
Dr A J Robinson, Founder, Cantab Research Ltd
Phone direct: 01223 778240  office: 01223 794497
Company reg no GB 05697423, VAT reg no 925606030
51 Canterbury Street, Cambridge, CB4 3QG, UK

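[Editor's note] To make the sign argument explicit, here is the standard
delta-feature algebra (added for clarity; not quoted from the thread).
With the usual regression window of half-width N,

  \Delta x_t = \frac{\sum_{\tau=1}^{N} \tau\,(x_{t+\tau} - x_{t-\tau})}
                    {2 \sum_{\tau=1}^{N} \tau^{2}},

a time-reversed stream y_t = x_{T-t} gives

  \Delta y_t = \frac{\sum_{\tau=1}^{N} \tau\,(x_{T-t-\tau} - x_{T-t+\tau})}
                    {2 \sum_{\tau=1}^{N} \tau^{2}}
             = -\,\Delta x_{T-t},

so odd-order differences (deltas, third-order differences) flip sign
under time reversal, while even-order ones are unchanged:
\Delta\Delta y_t = \Delta\Delta x_{T-t}. This is why computing all orders
in forward time and then reversing the frames is the safe route.
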
From: Saman M. <smo...@gm...> - 2014-10-14 06:00:26
Hi everybody,

I have trained two models for alignment, one mono and the other tri. Now
I want to use these models to align data in reverse (i.e. from the end of
the utterance to the beginning). I have changed the L fst and the mono
model works pretty well, but the tri model does not work at all (as
expected!!). Is there any way (except new training) to use this tri model
for reverse alignment?

Best regards,
Saman

From: Daniel P. <dp...@gm...> - 2014-09-29 18:10:08
There are a lot of reasons why decoders could fail on very long
utterances - likely some subtle issue relating to floating-point
roundoff. Without having access to a test case this will be hard to
debug. In addition, I'm not sure that I have time to do this right now,
but it does need someone who understands Kaldi and is good with
debugging. Is there someone else on this list that could help Saman debug
- maybe he could send you the files needed? I'm thinking that it might be
possible to modify the decoder to better handle these very long files.

Dan

On Mon, Sep 29, 2014 at 7:50 AM, Saman Mousazadeh <smo...@gm...> wrote:
> Hi everybody,
> I have trained a model for alignment and I want to use that model for
> aligning an audio file. Since my audio is long I decided to use the
> adaptive beam in the decoder. To do this I changed
> decode_opts.min_active and decode_opts.max_active for the decoder. Now
> something strange happened. If I do not set these parameters and use
> beam=1000 (e.g.), gmm-align-compiled succeeds, but after setting these
> parameters it fails. I even set these parameters such that the adaptive
> beam will always be greater than 1000, but gmm-align-compiled still
> fails. Why? And how can I use the adaptive beam for decoding?
> Best,
> Saman

From: Saman M. <smo...@gm...> - 2014-09-29 11:50:14
Hi everybody,

I have trained a model for alignment and I want to use that model for
aligning an audio file. Since my audio is long I decided to use the
adaptive beam in the decoder. To do this I changed
decode_opts.min_active and decode_opts.max_active for the decoder. Now
something strange happened. If I do not set these parameters and use
beam=1000 (e.g.), gmm-align-compiled succeeds, but after setting these
parameters it fails. I even set these parameters such that the adaptive
beam will always be greater than 1000, but gmm-align-compiled still
fails. Why? And how can I use the adaptive beam for decoding?

Best,
Saman

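[Editor's sketch] For context on what min_active and max_active do, below
is a rough sketch of adaptive-beam pruning, loosely modeled on how
Kaldi's decoders choose a per-frame cutoff; it illustrates the idea only
and is not the actual C++ implementation:

# Rough sketch of adaptive-beam pruning for readers wondering how
# min_active/max_active interact with the beam. Loosely modeled on the
# cutoff logic in Kaldi's decoders; an illustration, not the real code.
use strict;
use warnings;
use List::Util qw(min max);

sub get_cutoff {
    my ($costs, $beam, $min_active, $max_active) = @_;
    my @sorted = sort { $a <=> $b } @$costs;    # token costs, best first
    my $best   = $sorted[0];
    my $cutoff = $best + $beam;                 # plain fixed-beam cutoff

    # Too many live tokens: tighten the beam so only ~max_active survive.
    $cutoff = min($cutoff, $sorted[$max_active - 1])
        if @sorted > $max_active;

    # Too few tokens inside the beam: widen it so ~min_active survive.
    $cutoff = max($cutoff, $sorted[$min_active - 1])
        if $min_active > 1 && @sorted >= $min_active;

    return $cutoff;    # the effective beam is $cutoff - $best
}

# Five tokens; a fixed beam of 20 keeps three of them, and with
# min_active=2 / max_active=4 that is left unchanged, so the cutoff
# stays at best + beam = 30.
my @costs = (10.0, 11.5, 12.0, 40.0, 55.0);
printf "cutoff = %.1f\n", get_cutoff(\@costs, 20.0, 2, 4);

Under this kind of logic, choosing min_active/max_active values that
force the cutoff far from what the alignment path needs could plausibly
make an otherwise-working alignment fail, which may be worth checking in
the setup described above.
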
From: Daniel P. <dp...@gm...> - 2014-09-28 18:44:21
> Hi everybody,
> I have trained a model for alignment and I want to use that model for
> aligning an audio file which is very long (suppose one hour). If the
> audio is of low quality it is likely that the decoding is not
> successful and we get something like this: log-like per frame for
> AAAAA-AA is -inf over .....
> Is there any way to find out soon that this is not a good audio file?
> I mean not waiting a long time until the end of processing of all
> frames?
> Thanks in advance,
> Best regards

Regardless of the quality of the file, you should never get an infinite
log-like per frame. What I suspect is happening is that since the file is
so long (one hour), something is going wrong in the decoder related to
floating-point roundoff. The issue is likely not the quality of the
audio. You could try compiling Kaldi with -DKALDI_DOUBLEPRECISION=1 and
see if it helps (edit this in kaldi.mk, then make clean and make).

Dan

From: Saman M. <smo...@gm...> - 2014-09-28 06:51:35
Hi everybody,

I have trained a model for alignment and I want to use that model for
aligning an audio file which is very long (suppose one hour). If the
audio is of low quality it is likely that the decoding is not successful
and we get something like this: log-like per frame for AAAAA-AA is -inf
over .....

Is there any way to find out soon that this is not a good audio file? I
mean not waiting a long time until the end of processing of all frames?

Thanks in advance,
Best regards

From: Jan T. <af...@ce...> - 2014-09-26 16:55:41
Hi,

1) The sclite scorer treats some tokens slightly differently than others.
Those that are treated differently are word fragments and/or words at the
end of the utterance. You can also mark some words as optionally
deletable. Silence (and possibly non-speech events) can be treated
differently as well, but I don't recall what the default behavior is.
Have a look at the sclite command line switches to get insight into what
can be switched on and off.

2) I _think_ by default the timing info does not matter. There is however
something called "time-mediated scoring" (or something like that) that
takes the timing information into account. I'm not aware of it being used
in any of the kaldi recipes.

y.

On Fri, Sep 26, 2014 at 12:17 PM, Jan Chorowski <jan...@gm...> wrote:
> Hello,
>
> first of all let me thank you for bringing cutting-edge speech
> recognition to the mortals!
>
> I am using Kaldi to jump-start training of recurrent neural networks
> for phoneme recognition on Timit and to compare results between Kaldi
> decoders and the recurrent net based ones.
>
> The s5 recipe for Timit ships with two scorers: sclite and basic.
> Sclite tends to compute lower error rates, which I attribute to
> different scoring of errors relating to the silence token. However, for
> scoring it requires not only the decoded phoneme sequence, but also the
> timing of each phoneme. Since my decoder doesn't align the decoded
> phones precisely in time, I was using the basic scoring script.
>
> I have two questions:
> 1. Am I correct in attributing the difference between the two scorers'
> computed error rates to different handling of the silence token? I
> rescored models obtained using the standard recipe and they get
> consistently higher error rates using the basic scorer.
> 2. Do you have any intuitions on how precise the phone timing
> information needs to be for the sclite scorer to work? Is the timing
> quality part of the score or is it only used to save on computations?
>
> Sincerely,
> Jan Chorowski