From: Daniel P. <dp...@gm...> - 2014-01-28 16:21:23
Hi,

Regardless of what you conclude, I'd be interested to see the outcome of your project. There is no *explicit* support for such grammars, but the graph that Kaldi searches is a finite state transducer (FST), so if you can get the grammar in that form it will probably work. A tool like Thrax might be helpful here. Doing this requires some expertise, though.

Dan

_______________________________________________
Kaldi-developers mailing list
Kal...@li...
https://lists.sourceforge.net/lists/listinfo/kaldi-developers
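Dan's point, that any grammar expressible as an FST will work as a decoding graph, can be sketched with a toy word-level acceptor. This is an illustration only, not Kaldi or Thrax code, and the grammar rule shown is made up:

```python
# Illustration only -- not Kaldi or Thrax code.  A rule-based grammar
# such as the hypothetical JSGF rule
#     <command> = call (john | mary);
# is equivalent to a small finite-state acceptor over words; Kaldi can
# decode against any graph expressed in that form.

class ToyAcceptor:
    """A word-level finite-state acceptor: states, arcs, final states."""

    def __init__(self):
        self.arcs = {}        # state -> list of (word, next_state)
        self.finals = set()   # accepting states

    def add_arc(self, src, word, dst):
        self.arcs.setdefault(src, []).append((word, dst))

    def accepts(self, words, state=0):
        # Depth-first search: each arc traversal consumes one word.
        if not words:
            return state in self.finals
        head, rest = words[0], words[1:]
        return any(self.accepts(rest, dst)
                   for word, dst in self.arcs.get(state, [])
                   if word == head)

g = ToyAcceptor()
g.add_arc(0, "call", 1)
g.add_arc(1, "john", 2)
g.add_arc(1, "mary", 2)
g.finals.add(2)

print(g.accepts(["call", "mary"]))   # True
print(g.accepts(["call", "bob"]))    # False
```

In practice one would compile the grammar with Thrax/OpenFst rather than hand-code it, and the resulting FST would take the place of the grammar G in Kaldi's decoding graph.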
From: Patrick P. <pat...@gm...> - 2014-01-27 12:39:37
Dear Kaldi team,

I'm working on a project at my university (DHBW Stuttgart, Germany) with the goal of comparing Kaldi's performance to commercial speech recognizers.

My professor repeatedly asked me to find out whether Kaldi supports *rule-based grammars like JSGF or SRGS*, but I wasn't able to find any information about this.

I would be really glad if you could shed light on this topic. I'm looking forward to hearing from you.

Best regards,
Patrick Proba
From: Daniel P. <dp...@gm...> - 2014-01-23 18:04:17
There is currently no code for this. If you were to write code, it would be similar to the code for LatticeForwardBackward, but it would have to be done for CompactLattice, because the conversion from Lattice to CompactLattice wouldn't work correctly if you had posteriors in the weights.

The reason we don't write code like this is that it destroys the semantics of FST weights, which are supposed to be things that are summed along paths through the FST, not things that are meaningful for a given arc in the lattice.

Dan
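The per-state posteriors Xavier asks about can be illustrated generically, outside of any FST weight semantics. The sketch below is not Kaldi code (the toy lattice and its weights are invented); it runs forward-backward over a small acyclic graph and assigns each state a posterior:

```python
# Generic sketch (not Kaldi's lattice-to-post): forward-backward over a
# toy acyclic lattice.  Each state's posterior is the probability mass
# of paths through that state divided by the mass of all paths.
# Weights here are plain probabilities; real lattice code works in log
# space with proper semiring weights.

# state -> list of (next_state, weight); state 0 is start, 3 is final.
arcs = {0: [(1, 0.6), (2, 0.4)], 1: [(3, 1.0)], 2: [(3, 1.0)], 3: []}
start, final = 0, 3
order = [0, 1, 2, 3]          # topological order for this toy graph

alpha = {s: 0.0 for s in order}   # forward: mass of paths start -> s
alpha[start] = 1.0
for s in order:
    for t, w in arcs[s]:
        alpha[t] += alpha[s] * w

beta = {s: 0.0 for s in order}    # backward: mass of paths s -> final
beta[final] = 1.0
for s in reversed(order):
    for t, w in arcs[s]:
        beta[s] += w * beta[t]

total = alpha[final]              # mass of all complete paths
posterior = {s: alpha[s] * beta[s] / total for s in order}
print(posterior)   # states 1 and 2 get 0.6 and 0.4 respectively
```

As Dan notes, storing such posteriors back into lattice arc weights would break the sum-along-paths semantics, which is why this lives in a separate computation (lattice-to-post) rather than in the lattice itself.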
From: Xavier A. <xan...@gm...> - 2014-01-23 17:59:41
True, sorry for my misuse of terms. I was somewhat confused by the acoustic likelihood on the lattices and the posteriors that are retrieved by lattice-to-post.

Let me ask another naive question: would it be possible in Kaldi to obtain a lattice where for each node I would get a posterior probability (e.g. using the forward-backward method used to get the posteriors in lattice-to-post)?

Thanks for your answers,

Xavier Anguera
From: Arnab G. <ar...@gm...> - 2014-01-23 16:23:07
Likelihoods are not probabilities, so they don't have to be less than or equal to 1. That is, *log* likelihoods can be anything. If you are using GMMs and getting positive log-likelihoods, that usually means the variances are low.

-Arnab
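Arnab's point can be checked numerically: a Gaussian *density* with a small variance exceeds 1 at its mode, so its log is positive. A minimal sketch in plain Python (not Kaldi code):

```python
# Densities are not probabilities and need not be <= 1, so their logs
# need not be negative.  With a small variance, the Gaussian density at
# the mean is large and its log-likelihood is positive.

import math

def gauss_logpdf(x, mean, var):
    """Log-density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

print(gauss_logpdf(0.0, 0.0, 1.0))    # negative: about -0.92
print(gauss_logpdf(0.0, 0.0, 0.01))   # positive: about 1.38
```

This is why HMM acoustic scores, which sum many such log-densities, can come out positive when model variances are small, as Xavier observed in his Kaldi lattices.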
From: Xavier A. <xan...@gm...> - 2014-01-23 16:01:28
Hi,

I am using Kaldi to extract phone-level lattices and converting them to HTK format. After comparing them with HTK-extracted lattices, I see that in Kaldi I am getting acoustic likelihoods that can be positive, negative, and also some 0s (mostly in sil phonemes).

After reading the forum, I see that the lattices I can get from Kaldi have an acoustic likelihood which corresponds to an MPE likelihood. My question is whether it would make sense to scale these likelihoods to be in the same range as the HTK ones (always negative or 0), or is that impossible? In the second case, what does it mean to have a positive acoustic likelihood?

Thanks,

Xavier Anguera
From: Daniel P. <dp...@gm...> - 2014-01-21 16:14:29
The script will always exit if the training failed; that is your main clue. And the most recently produced log will generally have the errors in it, but the script will print out which part failed.

Dan
From: Xavier A. <xan...@gm...> - 2014-01-21 09:33:16
Hi Dan,

Thanks for the tip. Doing what you proposed, I was able to train the system.

I still have a question, though. Is there any general log file where I can check the overall success/failure of the training? I find many small log files in the different exp directories, but I do not find anything to tell me whether the models trained correctly.

Thanks,

Xavier Anguera
From: Daniel P. <dp...@gm...> - 2014-01-17 17:38:41
Just remove all the lines that refer to "fsh" in that script, and replace "tg fsh_tgpr" with "tg", and it should work.

Dan
From: Xavier A. <xan...@gm...> - 2014-01-17 17:04:52
Hi Dan,

I believe the option you are referring to is the variable fisher_opt. Indeed, when passing this to swbd1_train_lms.sh, if empty, the fisher data is not processed.

The problem is that right after this code, run.sh executes:

  LM=data/local/lm/sw1_fsh.o3g.kn.gz
  utils/format_lm_sri.sh --srilm-opts "$srilm_opts" \
    data/lang $LM data/local/dict/lexicon.txt data/lang_sw1_fsh_tg

  # For some funny reason we are still using IRSTLM for doing LM pruning :)
  export PATH=$PATH:../../../tools/irstlm/bin/
  prune-lm --threshold=1e-7 data/local/lm/sw1_fsh.o3g.kn.gz /dev/stdout \
    | gzip -c > data/local/lm/sw1_fsh.o3g.pr1-7.kn.gz || exit 1
  LM=data/local/lm/sw1_fsh.o3g.pr1-7.kn.gz
  utils/format_lm_sri.sh --srilm-opts "$srilm_opts" \
    data/lang $LM data/local/dict/lexicon.txt data/lang_sw1_fsh_tgpr

This expects a file data/local/lm/sw1_fsh.o3g.kn.gz, which is not generated anyway.

It is straightforward to make the above execution conditional upon the fisher_opt variable, but my fear is whether later on in the acoustic training it will again require the fisher data (I have done a quick search for fsh in the acoustic training scripts, and it seems to appear in several places).

On the other hand, recipe s5 does not seem to use the fisher data, but as you noted before, it is deprecated.

Please advise what to do.

Thanks,

Xavier Anguera
From: Daniel P. <dp...@gm...> - 2014-01-17 16:48:41
I would recommend s5b.

The Fisher data, which is used for language modeling I think, is optional anyway, as you will see in the run.sh (it's an option to a script); if you don't have the eval2000 data, you can just use the training-data subset, which the script uses as an alternative test set and which gives very similar WERs.

Dan
From: Xavier A. <xan...@gm...> - 2014-01-17 12:09:45
Hi,

I am trying to run the swbd/s5b recipe, and I realize that it is somewhat dependent on Switchboard (obviously...), Fisher (transcripts and audio), and HUB-5 (transcripts and audio).

Given that I do not currently have access to either the Fisher or Hub-5 datasets, is it possible to still run this recipe's training with Switchboard alone? Alternatively, would recipe s5 be more advisable (i.e. does it also depend on these externals)?

Thank you all,
Xavier Anguera
From: KERMORVANT, C. <Chr...@a2...> - 2014-01-16 07:48:01
Hi Dan,

OK, I'll look into it.

--
Christopher Kermorvant
From: <jen...@a2...> - 2014-01-15 23:23:47
Kaldi - Build # 370 - Still Failing: See the build log in attachment for the details.
From: Daniel P. <dp...@gm...> - 2014-01-15 17:56:19
Christopher,

Could you please run nnet2/nnet-component-test on your setup and send the output? I'm not able to reproduce this test failure on any setup that I have.

Dan
From: <jen...@a2...> - 2014-01-15 16:40:15
Kaldi - Build # 369 - Still Failing: See the build log in attachment for the details.
From: <jen...@a2...> - 2014-01-15 16:35:12
Kaldi - Build # 368 - Still Failing: See the build log in attachment for the details.
From: <jen...@a2...> - 2014-01-15 16:20:12
Kaldi - Build # 367 - Still Failing: See the build log in attachment for the details.
From: <jen...@a2...> - 2014-01-15 03:38:02
Kaldi - Build # 366 - Still Failing: See the build log in attachment for the details.
From: <jen...@a2...> - 2014-01-13 21:33:41
Kaldi - Build # 363 - Fixed: See the build log in attachment for the details.
From: <jen...@a2...> - 2014-01-13 17:06:56
Kaldi - Build # 362 - Failure: See the build log in attachment for the details.
From: Daniel P. <dp...@gm...> - 2014-01-10 18:06:24
Hi,

Firstly, that script steps/decode_nnet.sh is deprecated; you should use steps/nnet2/decode.sh, but it should still work.

Secondly, I just checked the script, and it sets the string to the empty string or to "-parallel", so the call is to nnet-latgen-faster or nnet-latgen-faster-parallel. Both should exist.

Dan
From: Xavier A. <xan...@gm...> - 2014-01-10 16:20:05
Hi,

In egs/wsj/steps/steps/decode_nnet_cpu.sh there is a call to nnet-latgen-faster$thread_string that is not found within Kaldi. I have found that another script, egs/wsj/steps/steps/decode_nnet.sh, does not depend on this missing program.

My question is whether the "_cpu"-ending script is obsolete and I should use decode_nnet.sh instead, or where to find the missing program.

Thanks,

Xavi Anguera
From: Xavier A. <xan...@gm...> - 2014-01-08 20:57:49
Dear Dan, Marcin, and all,

I finally got my code to work. I updated to revision r3391 and my problems went magically away.

Thanks for your help,

Xavier Anguera
From: Daniel P. <dp...@gm...> - 2014-01-07 21:58:29
Marcin, can you let us know what happens with kernel 3.12? E.g. what the symptoms are; a stack trace would be ideal. Perhaps running it in valgrind would be helpful.

It turns out that Xavier's problem was not resolved after all. I'm following up to debug it more.

Dan

On Tue, Jan 7, 2014 at 10:47 AM, Daniel Povey <dp...@gm...> wrote:
> Marcin (i.e. tvarog), let us know if you find out why that kernel caused
> problems. I think we resolved Xavier's problem with a bug-fix.
> Dan
>
> On Tue, Jan 7, 2014 at 9:35 AM, Tvarog <tv...@gm...> wrote:
>
>> It seems that kernel 3.12 is responsible for weird behavior of
>> src/fstbin/ binaries (no idea why, though). After downgrading to 3.11,
>> everything was OK for me.