From: Jan T. <jt...@gm...> - 2015-07-06 13:35:46

I'm afraid that WSJ is copyrighted and not publicly available for free, so I don't think anyone will be able to help you. You could try contacting LDC directly; they have an initiative providing the data to eligible students (see https://www.ldc.upenn.edu/language-resources/data/data-scholarships).

From: Sunit S. <sun...@in...> - 2015-07-06 11:55:19

Hi all,

I am getting a buffer overflow error while running the RNNLM scripts of WSJ. Any idea as to what could have gone wrong? I trained the model using a subset of WSJ utterances and, from the logs, the training seemed alright. Below are the rnnlm rescore logs followed by the RNN training logs.

steps/rnnlmrescore.sh --rnnlm_ver rnnlm-hs-0.1b --N 100 0.5 data/lang_test_tgpr_5k data/lang_rnnlm_h30_me5-1000 data/dt05_multi_r_mc exp/tri4a/decode_tgpr_5k exp/tri4a/decode_tgpr_5k_rnnlm_h30_me5-1000_L0.5
steps/rnnlmrescore.sh: converting lattices to N-best.
steps/rnnlmrescore.sh: removing old LM scores.
steps/rnnlmrescore.sh: creating separate-archive form of N-best lists.
steps/rnnlmrescore.sh: doing the same with old LM scores.
steps/rnnlmrescore.sh: Creating archives with text-form of words, and LM scores without graph scores.
steps/rnnlmrescore.sh: invoking rnnlm_compute_scores.sh which calls rnnlm, to get RNN LM scores.
*** buffer overflow detected ***: ../../../tools/rnnlm-hs-0.1b/rnnlm terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x7338f)[0x7fda06f6b38f]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7fda07002c9c]
/lib/x86_64-linux-gnu/libc.so.6(+0x109b60)[0x7fda07001b60]
../../../tools/rnnlm-hs-0.1b/rnnlm[0x4011ea]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7fda06f19ec5]
../../../tools/rnnlm-hs-0.1b/rnnlm[0x4018ac]

Training logs:

../../../tools/rnnlm-hs-0.1b/rnnlm -threads 1 -independent -train /tmp/tmp.6aF3RDTFnf -valid /tmp/tmp.aTiUNgZnWT -rnnlm data/lang_rnnlm_h30_me5-1000/rnnlm -hidden 30 -rand-seed 1 -debug 2 -class 200 -bptt 2 -bptt-block 20 -direct-order 4 -direct 1000 -binary
# Vocab size: 9066
Words in train file: 164907
Starting training using file /tmp/tmp.6aF3RDTFnf
Iteration 0 Valid Entropy 9.595403
Alpha: 0.100000 ME-alpha: 0.100000 Progress: 97.12% Words/thread/sec: 117.21k
Iteration 1 Valid Entropy 8.564008
Alpha: 0.100000 ME-alpha: 0.100000 Progress: 97.12% Words/thread/sec: 123.54k
Iteration 2 Valid Entropy 8.297136
Alpha: 0.100000 ME-alpha: 0.100000 Progress: 97.12% Words/thread/sec: 122.70k
Iteration 3 Valid Entropy 8.175531
Alpha: 0.100000 ME-alpha: 0.100000 Progress: 97.12% Words/thread/sec: 108.22k
Iteration 4 Valid Entropy 8.107678
Alpha: 0.100000 ME-alpha: 0.100000 Progress: 97.12% Words/thread/sec: 121.89k
Iteration 5 Valid Entropy 8.069274
Alpha: 0.100000 ME-alpha: 0.100000 Progress: 97.12% Words/thread/sec: 124.64k
Iteration 6 Valid Entropy 8.049375 Decay started
Alpha: 0.050000 ME-alpha: 0.050000 Progress: 97.12% Words/thread/sec: 111.30k
Iteration 7 Valid Entropy 8.009795
Alpha: 0.025000 ME-alpha: 0.025000 Progress: 97.12% Words/thread/sec: 124.70k
Iteration 8 Valid Entropy 7.989441 Retry 1/2
Alpha: 0.012500 ME-alpha: 0.012500 Progress: 97.12% Words/thread/sec: 113.82k
Iteration 9 Valid Entropy 7.982499 Retry 2/2
# Accounting: time=439 threads=1
# Ended (code 0) at Fri Jun 26 16:22:52 CEST 2015, elapsed time 439 seconds

Regards,
Sunit

From: Vassil P. <vas...@gm...> - 2015-07-06 10:14:49

Hi Neil,

Yes, I guess there might be tools, or combinations thereof, that could produce even better results. This is one of the raisons d'être for the "original-mp3" archive: it should contain enough metadata to allow re-extraction of the aligned utterances, possibly using different tools (it also contains 15-20% additional audio that was discarded in order to make LibriSpeech more balanced). It seems to me, however, that the audio quality of the current corpus is OK.

Thanks for mentioning these audio analysis tools; I wasn't aware of some of them. I've found WaveSurfer to be pretty useful too.

Vassil

From: Xu <as...@16...> - 2015-07-06 03:24:59

Dear kaldi-users,

This is Xu, a new kaldi-user. I recently ran a DNN speaker adaptation experiment on the WSJ corpus in Kaldi. Training of the adaptation model is finished. To validate the correctness of my experiment, I need the language model used for testing, but I don't have the default WSJ language model. Would you send me the default language model used for the WSJ test? In the default Kaldi script it is "../13-32.1/wsj1/doc/lng_modl/base_lm/bcb20onp.z".

Thank you very much.

Best Wishes,
Xu

From: Neil N. <nn...@in...> - 2015-07-05 17:53:21

Vassil,

I have always used Lame (Ubuntu Software Center is your friend) to convert between wav and mp3. It is well regarded. I suggest SoX for down-sampling. Spek will give a spectral-analysis picture for the entire file. Audacity will give an ongoing spectral analysis, but it is not frequency-labeled. Sonic Visualiser may have something. Upgrading to Ubuntu 14.04 can be tricky in spots but is something to consider, since the versions of GCC and all the software tend to be tied to the OS rev.

Neil

--
RSA public key for this email address at http://pgp.mit.edu/

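
For concreteness, the decode-then-downsample pipeline Neil describes might look like the following; this is a sketch assuming lame and sox are installed, and the filenames are placeholders:

# decode the MP3 to a PCM WAV with lame
lame --decode input.mp3 decoded.wav
# downsample to 16 kHz, 16-bit, mono with sox (sox inserts the rate
# conversion automatically when the requested output rate differs)
sox decoded.wav -r 16000 -b 16 -c 1 output.wav
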
From: Vassil P. <vas...@gm...> - 2015-07-05 08:05:25

BTW, when preparing LibriSpeech, I've noticed that the quality of MP3 conversion can vary substantially, depending on the particular tool used. For example, the output of mpg123 (or maybe it was mpg321) was very noisy, and the ASR WER was 10-15% absolute higher than when alternative MP3 decoders were used. When converting to 16kHz .wav, ffmpeg cuts off the frequencies higher than 7kHz. So eventually I settled on mplayer. It preserves the frequency content in the 7-8kHz range, and as far as I could tell the audio sounded a bit "closer" to the original recording, although I'm not sure if there is any measurable difference in ASR performance between ffmpeg- and mplayer-produced .wav-s. The versions of the tools I tried were those shipped with Ubuntu 10.04 and 12.04, so the issues may be fixed in more recent releases.

Vassil

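
The two conversion paths Vassil compares would look roughly like this; these invocations are illustrative sketches with placeholder filenames, not the exact commands used to build LibriSpeech:

# ffmpeg: resample to a 16 kHz mono WAV (this is the path where Vassil
# observed a cutoff above 7 kHz with the Ubuntu 10.04/12.04 builds)
ffmpeg -i input.mp3 -ar 16000 -ac 1 output_ffmpeg.wav
# mplayer: decode to PCM while resampling to 16 kHz; the channel layout
# follows the input unless remixed separately
mplayer -novideo -af resample=16000 -ao pcm:file=output_mplayer.wav input.mp3
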
From: Daniel P. <dp...@gm...> - 2015-07-03 21:08:44

No, it doesn't implement early stopping. You would have to just decode different iterations and see which seems to give the best results. I haven't really gone with early stopping because it tends to stop training before you reach the best WER. On the other hand, if you don't know what you are doing, it can be dangerous not to do early stopping, because there is a danger of seriously overtraining.

Dan

From: Mate A. <ele...@gm...> - 2015-07-03 20:53:00

Thanks Tony, that makes sense.

Does the train_more2.sh script implement early stopping using the validation set created in the egs/ directory?

From: Tony R. <to...@ca...> - 2015-07-03 19:52:09

I may be missing something, but I read the question as "When I convert the MP3 files into 16kHz sample-rate WAV files, what bitrate should I convert them to?"

The answer is that they should be converted to 16 bits per sample, 16kHz mono files, so that's 16 bits × 16,000 samples/s = 256,000 bits per second. There's not a lot of point in using more than 16 bits per sample, as the mp3 quantisation is worse than this, and there's not a lot of point in using less than 16 bits per sample, as why throw information away?

Tony

On 03/07/15 19:53, Daniel Povey wrote:
> The sampling rate is critical, but the bitrate is not really critical;
> just make sure it sounds OK without super-obvious artifacts. Vassil
> (cc'd) will know what bitrate he encoded the LibriSpeech data with,
> but matching this exactly is probably not important.
> Dan

--
Dr A J Robinson, Founder
We are hiring: www.speechmatics.com/careers
Speechmatics is a trading name of Cantab Research Limited
Phone direct: 01223 778240 office: 01223 794497
Company reg no GB 05697423, VAT reg no 925606030
51 Canterbury Street, Cambridge, CB4 3QG, UK

From: Daniel P. <dp...@gm...> - 2015-07-03 18:54:02

The sampling rate is critical, but the bitrate is not really critical; just make sure it sounds OK without super-obvious artifacts. Vassil (cc'd) will know what bitrate he encoded the LibriSpeech data with, but matching this exactly is probably not important.

Dan

On Fri, Jul 3, 2015 at 10:45 AM, Jonathan L <jon...@gm...> wrote:
> The data I want to train on is in MP3 format at a 128kbps bitrate and a
> 44.1kHz sample rate. The LibriSpeech data has a 16kHz sample rate but
> doesn't seem to have a specified bitrate. When I convert the MP3 files
> into 16kHz sample-rate WAV files, what bitrate should I convert them to?
>
> Is there anything else I should consider when converting the speech files?

From: Gupta V. <vis...@cr...> - 2015-07-03 18:10:44

Hi,

I was finally able to discriminatively train the LSTM, but only after reducing the learning rate from 0.0001 to 0.000000001. The training is rather slow: it took 8 days to train one iteration. I have not tried adjusting the gradient clipping threshold, but I will try that as well.

Thanks,
Vishwa

_____
From: Jerry.Jiayu.DU [mailto:jer...@qq...]
To: Vishwa.Gupta [mailto:Vis...@cr...]
Cc: kal...@li... [mailto:kal...@li...], Daniel Povey [mailto:dp...@gm...]
Sent: Wed, 10 Jun 2015 23:57:51 -0500
Subject: Re: [Kaldi-users] discriminative LSTM training

Hi Vishwa,

"NaN" normally means your LSTM model has exploded during training, and Dan's suggestion to tune down the learning rate should be helpful in your case. Since I encountered exactly the same problem when I was doing sequence training over an LSTM, here is an additional suggestion: apply a smaller gradient clipping threshold; it worked for me. I suggest you have a try as well, setting the gradient clipping threshold to 5 to 20 or so.

Also remember to check that the denominator lattice size is reasonable. Sometimes the default beam results in a very "sparse" (e.g. nearly linear) denominator lattice, and in that case the sequence training won't work.

best,
Jiayu (Jerry)

------------------ Original ------------------
From: "Daniel Povey" <dp...@gm...>
Date: Jun 11, 2015
To: "Vishwa.Gupta" <Vis...@cr...>
Cc: "kal...@li..." <kal...@li...>
Subject: Re: [Kaldi-users] discriminative LSTM training

Usually cases like this, where after a while you see NaNs, are due to some kind of instability in the training which causes the parameters to diverge. It could be due to too-high learning rates. It could also be that if you apply LSTMs to long pieces of audio, as happens in the discriminative training code, there is some kind of gradient explosion. However, IIRC LSTMs were specifically designed to avoid the possibility of gradient explosion, so this would be surprising. You could try smaller learning rates.
Dan

> When I try to do discriminative LSTM training I get the following errors.
>
> If I use train_mpe.sh, it runs for a few thousand utterances and then I
> get the following error, after which the program crashes:
>
> ERROR (nnet-train-mpe-sequential:LatticeForwardBackwardMpeVariants():lattice-functions.cc:833)
> Total forward score over lattice = -nan, while total backward score = 0
>
> If I use train_mmi.sh, then after a few thousand utterances I get logs
> with "nan":
>
> VLOG[1] (nnet-train-mmi-sequential:main():nnet-train-mmi-sequential.cc:346)
> Utterance 20080401_170000_bbcone_bbc_news_spk-0025_seg-0150897:0151494:
> Average MMI obj. value = nan over 595 frames. (Avg. den-posterior on ali -nan)
>
> However, the program keeps on running.
> Is there a workaround for that?
>
> Thanks,
> Vishwa

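
For reference, a learning rate that small would be passed to the nnet1 sequence-training script roughly as follows. This is a hedged sketch, not Vishwa's actual command: the directory names are placeholders, and it assumes the standard steps/nnet/train_mpe.sh argument order (data, lang, source model dir, alignments, denominator lattices, output dir).

# MPE training with the drastically reduced learning rate described above
steps/nnet/train_mpe.sh --learn-rate 0.000000001 \
  data/train data/lang exp/lstm4 exp/lstm4_ali exp/lstm4_denlats exp/lstm4_mpe
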
From: Jonathan L <jon...@gm...> - 2015-07-03 17:45:23

The data I want to train on is in MP3 format at a 128kbps bitrate and a 44.1kHz sample rate. The LibriSpeech data has a 16kHz sample rate but doesn't seem to have a specified bitrate. When I convert the MP3 files into 16kHz sample-rate WAV files, what bitrate should I convert them to?

Is there anything else I should consider when converting the speech files?

On Mon, Jun 29, 2015 at 12:24 PM, Vijayaditya Peddinti <p.v...@gm...> wrote:
> You need to provide the egs directory, not the exp directory. You can
> check stage -3 of steps/nnet2/train_multisplice_accel2.sh to see how an
> egs directory can be created from the alignment and data directories.
> The context variables necessary for creating these examples can be found
> in the nnet_ms_a_online/conf/splice.conf file.
>
> Vijay
>
> On Mon, Jun 29, 2015 at 9:14 AM, Jonathan L <jon...@gm...> wrote:
>> The train_more*.sh scripts accept an 'exp' directory instead of a
>> 'data/train' directory. Is there another script that would accept the
>> 'data/train' directory as input instead?
>>
>> On Mon, Jun 29, 2015 at 12:08 PM, Vijayaditya Peddinti <p.v...@gm...> wrote:
>>> See the scripts steps/nnet2/train_more*.sh
>>>
>>> Vijay
>>>
>>> On Mon, Jun 29, 2015 at 9:02 AM, Jonathan L <jon...@gm...> wrote:
>>>> I'm looking to further train an existing LibriSpeech nnet2_a_online
>>>> model on a new dataset.
>>>>
>>>> I have prepared the files for this new dataset inside a data/train
>>>> directory, as described in the Data Preparation tutorial. I want to
>>>> keep the nnet2_a_online model initialized to the parameters it
>>>> learned from training on LibriSpeech, but continue its training on
>>>> this new dataset. Is there a script that would allow me to specify
>>>> the nnet2_a_online model and the dataset's data/train directory as
>>>> input, in order to output a model that has been trained more on this
>>>> new dataset?

From: Daniel P. <dp...@gm...> - 2015-07-02 18:25:53

> Back to training on the Blizzard dataset, I was able to dump the iVectors
> for Blizzard's 19-hour subset. Where are they needed, though? Neither
> train_more2.sh nor get_egs2.sh seem to accept dumped iVectors as input.

It's the --online-ivector-dir option.

> Regardless, I ran the train_more2.sh script on Blizzard's data/ and egs/
> folders (generated with get_egs2.sh), and I get the following errors in
> train.*.*.log:
>
> KALDI_ASSERT: at nnet-train-parallel:FormatNnetInput:nnet-update.cc:212,
> failed: data[0].input_frames.NumRows() >= num_splice
> [...]
> LOG (nnet-train-parallel:DoBackprop():nnet-update.cc:275) Error doing
> backprop, nnet info is: num-components 17
> num-updatable-components 5
> left-context 7
> right-context 7
> input-dim 140
> output-dim 5816
> parameter-dim 10351000
> [...]
>
> The logs tell me that the left and right contexts were set to 7. However,
> I specified them both as 3 when running get_egs2.sh. The
> egs/info/{left,right}_context files even confirm that they are set to 3.
> Is it possible that train_more2.sh is using the contexts from another
> directory?

The problem is that 3 < 7. The neural net requires a certain amount of temporal context (7 frames on the left and right, here), and if you dump less than that in the egs it will crash. So you need to set them to 7 when dumping egs.

Dan

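
In concrete terms, the fix Dan describes is to re-dump the examples with the model's own context. A hedged sketch, with placeholder directory names and the argument order as given in get_egs2.sh's usage message:

# re-dump egs with left/right context matching the nnet (7 and 7 here),
# pointing at the dumped iVectors; all paths below are placeholders
steps/nnet2/get_egs2.sh --left-context 7 --right-context 7 \
  --online-ivector-dir exp/nnet2_online/ivectors_blizzard \
  data/blizzard exp/blizzard_ali exp/nnet2_online/egs_blizzard
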
From: Mate A. <ele...@gm...> - 2015-07-02 18:22:57

The graph was indeed in tri6b.

Back to training on the Blizzard dataset, I was able to dump the iVectors for Blizzard's 19-hour subset. Where are they needed, though? Neither train_more2.sh nor get_egs2.sh seem to accept dumped iVectors as input.

Regardless, I ran the train_more2.sh script on Blizzard's data/ and egs/ folders (generated with get_egs2.sh), and I get the following errors in train.*.*.log:

KALDI_ASSERT: at nnet-train-parallel:FormatNnetInput:nnet-update.cc:212, failed: data[0].input_frames.NumRows() >= num_splice
[...]
LOG (nnet-train-parallel:DoBackprop():nnet-update.cc:275) Error doing backprop, nnet info is: num-components 17
num-updatable-components 5
left-context 7
right-context 7
input-dim 140
output-dim 5816
parameter-dim 10351000
[...]

The logs tell me that the left and right contexts were set to 7. However, I specified them both as 3 when running get_egs2.sh. The egs/info/{left,right}_context files even confirm that they are set to 3. Is it possible that train_more2.sh is using the contexts from another directory?

From: Daniel P. <dp...@gm...> - 2015-06-30 18:07:24

Check the script that generated it; probably the graph directory was in a different location, e.g. in tri6 or something like that. Hopefully we would have uploaded that too. We only need to regenerate the graph when the tree changes.

Dan

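
For completeness, if the graph ever did need rebuilding (i.e. after a tree change), the standard recipe step is utils/mkgraph.sh; the directory names below are placeholders:

# compose the HCLG decoding graph from a test lang directory and the
# directory holding the model and tree
utils/mkgraph.sh data/lang_test exp/tri6b exp/tri6b/graph
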
From: Daniel P. <dp...@gm...> - 2015-06-30 18:05:42

It is still useful; i-vector adaptation does not need very much data to be effective.

Dan

From: Mate A. <ele...@gm...> - 2015-06-30 18:05:28

To ensure that the nnet_a_online model is performing well on the 19-hour Blizzard dataset and that it is producing correct alignments, I want to run the decoding script on the Blizzard data. However, the nnet_a_online model on kaldi-asr.org doesn't seem to have a graph directory needed for decoding. Is there any way I can get hold of this directory without training the entire model?

From: Kirill K. <kir...@sm...> - 2015-06-30 05:20:08

In my scenario, I am processing multiple short (1-5 words typical) utterances by new, unfamiliar speakers. This is a phone conversation between the system and a random caller. The caller connects, chats for a while (10 of these short utterances is already a lot) and disappears forever. Do you find i-vector adaptation useful in such a scenario for nnet2-online models? Will the fact that training utterances are typically much longer be detrimental?

-kkm

From: Daniel P. <dp...@gm...> - 2015-06-29 20:28:19

> I am using the nnet_a_online model. Does this model require iVectors?

Yes, it does. You would have to extract them for your data; see the commands used in the script that trained the nnet_a_online model. You'd have to download the iVector extractor. You may also have to dump 40-dim features. That will be in the script too; also see the _common.sh script, which it sources near the beginning. Some parts are in there.

> Also, regarding the left and right contexts for get_egs2.sh, would I have
> to use the values found in nnet_a_online/conf/splice.conf
> (--left-context=3, --right-context=3)?

No, I don't think so; you have to use the left and right contexts of the nnet model itself. These are printed out by nnet-am-info.

Dan

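
For reference, reading the model's required context straight off the model file looks like this; the path is a placeholder for wherever the downloaded final.mdl lives:

# print the nnet2 model's structure, including its left-context and
# right-context (the same values seen in the training log above)
nnet-am-info exp/nnet2_online/nnet_a_online/final.mdl
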
From: Mate A. <ele...@gm...> - 2015-06-29 20:25:52
|
I am using the nnet_a_online model. Does this model require iVectors? Also,
regarding the left and right contexts for get_egs2.sh, would I have to use
the values found in nnet_a_online/conf/splice.conf (--left-context=3,
--right-context=3)? Thank you for the advice. I am tackling a 19-hour subset
of Blizzard before moving on to the full 300-hour dataset.

On Mon, Jun 29, 2015 at 2:42 PM, Daniel Povey <dp...@gm...> wrote:
> Actually, you should probably be using train_more2.sh. It looks like
> the update_nnet.sh script is deprecated.
> train_more2.sh requires egs dumped by get_egs2.sh. [The "2" format of
> the egs is more compact.]
> In your scenario you would be dumping egs for the Blizzard data. You
> would need alignments for the Blizzard data. Be careful with the
> get_egs2.sh script, because like the other get_egs scripts it will dump
> egs with the left-context and right-context you specify, and the
> features you give it, but it can't check that these are correct. If you
> are using one of the "online" models that uses iVectors, you would have
> to provide dumped iVectors, and these need to be computed with the
> same iVector extractor as the model that you are starting from.
>
> You might want to run on a small subset first; make sure that the
> training objective (e.g. in compute_train_prob.*.sh) is in the normal
> range, otherwise it may mean that you did something wrong.
>
> To get the alignments you would need to align using the same model as
> was used to align the data for training the original nnet; you can
> download that from kaldi-asr.org.
> Dan
>
> On Mon, Jun 29, 2015 at 7:03 AM, Mate Andre <ele...@gm...> wrote:
> > The train_more.sh script requires an egs directory, which seems to be
> > created by update_nnet.sh. However, update_nnet.sh requires an
> > alignments directory.
> >
> > If I'm planning to run update_nnet.sh with data/train_960, does that
> > mean I have to find alignments for train_960 before running
> > update_nnet.sh? Is there a faster way to generate the egs directory
> > without having to update the neural net?
> >
> > On Thu, Jun 25, 2015 at 2:26 PM, Daniel Povey <dp...@gm...> wrote:
> >> I think the script train_more.sh might be useful here.
> >> If you only have 1 GPU it might take as long as a week, but
> >> downloading the trained models might be a better idea.
> >> Dan
> >>
> >> > I am going to train a deep neural net model with "multi-splice"
> >> > using the LibriSpeech dataset with the local/online/run_nnet2_ms.sh
> >> > script included in Kaldi's repository, which I think will give the
> >> > best resulting WER. The end goal is to use the trained model in this
> >> > phase for initializing a next model to train and do forced alignment
> >> > on the Blizzard2013 dataset, specifically the 2013-EH2 subset, which
> >> > includes 1 female speaker, 19 hours of speech, and sentence-level
> >> > alignments.
> >> > I don't have much experience with Kaldi, and my questions are:
> >> >
> >> > 1. How long does it take to train on all (960 hrs) of LibriSpeech on
> >> > a GPU (say a GTX TITAN X or K6000)? Even a rough estimate could be
> >> > useful.
> >> > 2. Is there anything to take into account before training on
> >> > LibriSpeech?
> >> > 3. And more importantly, how should I initialize/train the next
> >> > model for the Blizzard2013 dataset? I managed to go through data
> >> > preparation for that and created the necessary files.
|
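A minimal sketch of the iVector and egs steps discussed above, assuming the
downloaded online model lives in exp/nnet2_online/nnet_a_online, its iVector
extractor in exp/nnet2_online/extractor, the Blizzard data in data/blizzard,
and its alignments in exp/blizzard_ali (all four names are assumptions, not
from the thread):

  # Extract online iVectors for the new data with the SAME extractor
  # that the pre-trained model was built with:
  steps/online/nnet2/extract_ivectors_online.sh --cmd run.pl --nj 10 \
    data/blizzard exp/nnet2_online/extractor exp/nnet2_online/ivectors_blizzard

  # Dump egs using the contexts from the model's splice.conf
  # (--left-context=3 --right-context=3, per the question above), and
  # point get_egs2.sh at the freshly extracted iVectors:
  steps/nnet2/get_egs2.sh --cmd run.pl --left-context 3 --right-context 3 \
    --online-ivector-dir exp/nnet2_online/ivectors_blizzard \
    data/blizzard exp/blizzard_ali exp/nnet2_online/egs_blizzard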
From: Daniel P. <dp...@gm...> - 2015-06-29 19:04:14
|
That's OK. Those warnings happen when words in the "text" file are not
covered by words.txt (they get replaced with the designated OOV word), but
all those words are either super-rare words, misspellings, or normalization
failures, so it's not a problem that they are not in the vocabulary.
Dan

On Mon, Jun 29, 2015 at 6:45 AM, Mate Andre <ele...@gm...> wrote:
> The alignments script has been running for about a day and I've found
> these warnings in align.*.log:
>
> sym2int.pl: replacing HIGGINSES with 2
> sym2int.pl: replacing MEASTERS with 2
> sym2int.pl: replacing YO'RS with 2
> sym2int.pl: replacing HIGGINSES with 2
> sym2int.pl: replacing THEVENOT with 2
> sym2int.pl: replacing PASQUA with 2
> sym2int.pl: replacing COCHINEALS with 2
> sym2int.pl: replacing HAMPER'S with 2
> sym2int.pl: replacing HUNDRED'LL with 2
> sym2int.pl: replacing CLEMMING with 2
> sym2int.pl: replacing CLEMMING with 2
> sym2int.pl: replacing HOU'D with 2
> sym2int.pl: replacing OURSEL with 2
> sym2int.pl: replacing SWOUNDING with 2
> sym2int.pl: replacing DID' with 2
> sym2int.pl: replacing INSTINCTLY with 2
> sym2int.pl: replacing DEFYINGLY with 2
> sym2int.pl: replacing BELIEVE' with 2
> sym2int.pl: replacing BROSSEN with 2
> sym2int.pl: replacing CLEAVINGS with 2
> sym2int.pl: not warning for OOVs any more times
>
> Can these warnings be safely ignored, or am I possibly using the wrong
> lang directory? I'm currently using data/lang_nosp.
>
> On Fri, Jun 26, 2015 at 6:05 PM, Daniel Povey <dp...@gm...> wrote:
>> Use the tree from the regular nnet_a directory; the system has the same
>> tree.
>> Dan
>>
>> On Fri, Jun 26, 2015 at 5:55 PM, Mate Andre <ele...@gm...> wrote:
>> > The "tree" file is missing from the nnet_a_online directory in the
>> > Kaldi-ASR build. Is it possible to create it without retraining the
>> > entire model?
>> >
>> > On Fri, Jun 26, 2015 at 5:02 PM, Daniel Povey <dp...@gm...> wrote:
>> >> You need to point it to the nnet_a_online directory instead.
>> >> Dan
>> >>
>> >> On Fri, Jun 26, 2015 at 4:59 PM, Mate Andre <ele...@gm...> wrote:
>> >> > Thanks for the prompt reply.
>> >> >
>> >> > When using steps/online/nnet2/align.sh, I get the following error:
>> >> > "no such file
>> >> > exp/nnet2_online/nnet_a/conf/online_nnet2_decoding.conf". Do I need
>> >> > to generate "online_nnet2_decoding.conf" and the "conf" directory
>> >> > with another script, since they aren't included in the Kaldi-ASR
>> >> > build?
>> >> >
>> >> > On Fri, Jun 26, 2015 at 4:44 PM, Daniel Povey <dp...@gm...> wrote:
>> >> >> It expects 140 because 140 = 40 + 100: the 40 is the "hires" MFCC
>> >> >> features (the LibriSpeech scripts create these from the wav data),
>> >> >> and the 100 is the iVector features. You would have to get these
>> >> >> from the iVector extractor.
>> >> >> However, you may find your life easier if you use
>> >> >> steps/online/nnet2/align.sh; that will start from the wav data and
>> >> >> do the feature extraction itself.
>> >> >> Dan
>> >> >>
>> >> >> On Fri, Jun 26, 2015 at 4:41 PM, Mate Andre <ele...@gm...> wrote:
>> >> >> > My goal is to find alignments for the 960-hour LibriSpeech
>> >> >> > dataset. I am using the nnet2_online/nnet_a LibriSpeech model
>> >> >> > from the Kaldi-ASR site, and I am running the
>> >> >> > steps/nnet2/align.sh script in Kaldi's LibriSpeech folder using
>> >> >> > the following command:
>> >> >> >
>> >> >> > steps/nnet2/align.sh --nj 10 --cmd 'run.pl' data/train_960 \
>> >> >> >   data/lang_nosp exp/nnet2_online/nnet_a exp/nnet2_online/nnet_a_ali
>> >> >> >
>> >> >> > where exp/nnet2_online/nnet_a contains the files in
>> >> >> > nnet2_online/nnet_a and exp/nnet2_online/nnet_a_ali is an empty
>> >> >> > directory.
>> >> >> >
>> >> >> > I'm getting the following error in the log files:
>> >> >> >
>> >> >> > ERROR (nnet-align-compiled:NnetComputer():nnet-compute.cc:70)
>> >> >> > Feature dimension is 13 but network expects 140
>> >> >> >
>> >> >> > Am I using the correct script to generate the alignments, or is
>> >> >> > there another reason I am getting this error?
|
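Two quick sanity checks for the answers above; the paths are assumptions
based on the LibriSpeech recipe, not taken from the thread. sym2int.pl maps
any word missing from words.txt to the integer given by --map-oov (here 2,
the id of the OOV word), which is exactly what the warnings report; and
feat-to-dim prints the dimension of a feature archive, so the last two
commands should print 40 and 100, matching Dan's 140 = 40 + 100 breakdown:

  # OOV mapping (fields 2- are the words; field 1 is the utterance id):
  echo "utt1 THE HIGGINSES" | utils/sym2int.pl --map-oov 2 -f 2- data/lang_nosp/words.txt

  # Feature dimensions of the "hires" MFCCs and the online iVectors:
  feat-to-dim scp:data/train_960_hires/feats.scp -
  feat-to-dim scp:exp/nnet2_online/ivectors_train_960/ivector_online.scp -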
From: Daniel P. <dp...@gm...> - 2015-06-29 18:42:52
|
Actually, you should probably be using train_more2.sh. It looks like the
update_nnet.sh script is deprecated.
train_more2.sh requires egs dumped by get_egs2.sh. [The "2" format of the
egs is more compact.]
In your scenario you would be dumping egs for the Blizzard data. You would
need alignments for the Blizzard data. Be careful with the get_egs2.sh
script, because like the other get_egs scripts it will dump egs with the
left-context and right-context you specify, and the features you give it,
but it can't check that these are correct. If you are using one of the
"online" models that uses iVectors, you would have to provide dumped
iVectors, and these need to be computed with the same iVector extractor as
the model that you are starting from.

You might want to run on a small subset first; make sure that the training
objective (e.g. in compute_train_prob.*.sh) is in the normal range,
otherwise it may mean that you did something wrong.

To get the alignments you would need to align using the same model as was
used to align the data for training the original nnet; you can download
that from kaldi-asr.org.
Dan

On Mon, Jun 29, 2015 at 7:03 AM, Mate Andre <ele...@gm...> wrote:
> The train_more.sh script requires an egs directory, which seems to be
> created by update_nnet.sh. However, update_nnet.sh requires an alignments
> directory.
>
> If I'm planning to run update_nnet.sh with data/train_960, does that mean
> I have to find alignments for train_960 before running update_nnet.sh? Is
> there a faster way to generate the egs directory without having to update
> the neural net?
>
> On Thu, Jun 25, 2015 at 2:26 PM, Daniel Povey <dp...@gm...> wrote:
>> I think the script train_more.sh might be useful here.
>> If you only have 1 GPU it might take as long as a week, but downloading
>> the trained models might be a better idea.
>> Dan
>>
>> > I am going to train a deep neural net model with "multi-splice" using
>> > the LibriSpeech dataset with the local/online/run_nnet2_ms.sh script
>> > included in Kaldi's repository, which I think will give the best
>> > resulting WER. The end goal is to use the trained model in this phase
>> > for initializing a next model to train and do forced alignment on the
>> > Blizzard2013 dataset, specifically the 2013-EH2 subset, which includes
>> > 1 female speaker, 19 hours of speech, and sentence-level alignments.
>> > I don't have much experience with Kaldi, and my questions are:
>> >
>> > 1. How long does it take to train on all (960 hrs) of LibriSpeech on a
>> > GPU (say a GTX TITAN X or K6000)? Even a rough estimate could be
>> > useful.
>> > 2. Is there anything to take into account before training on
>> > LibriSpeech?
>> > 3. And more importantly, how should I initialize/train the next model
>> > for the Blizzard2013 dataset? I managed to go through data preparation
>> > for that and created the necessary files.
|
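A rough sketch of the small-subset sanity run Dan suggests; the subset
size, directory names, train_more2.sh argument order, and log-file names
are all assumptions, so check the usage message at the top of
steps/nnet2/train_more2.sh before running:

  # Carve out a small subset of the new data:
  utils/subset_data_dir.sh data/blizzard 1000 data/blizzard_1k

  # Align it and dump egs for it as described above, then continue
  # training from the downloaded model:
  steps/nnet2/train_more2.sh exp/nnet2_online/nnet_a_online/final.mdl \
    exp/nnet2_online/egs_blizzard_1k exp/nnet2_online/nnet_a_blizzard_1k

  # Check that the training objective is in the normal range, i.e. a
  # log-probability comparable to the original model's training logs:
  grep -h LOG exp/nnet2_online/nnet_a_blizzard_1k/log/compute_prob_train.*.log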
From: Vijayaditya P. <p.v...@gm...> - 2015-06-29 16:25:03
|
You need to provide the egs directory, not the exp directory. You can check
stage -3 of steps/nnet2/train_multisplice_accel2.sh to see how the egs
directory can be created from the alignment and data directories. The
context variables necessary for creating these examples can be found in the
nnet_ms_a_online/conf/splice.conf file.

Vijay

On Mon, Jun 29, 2015 at 9:14 AM, Jonathan L <jon...@gm...> wrote:
> The train_more*.sh scripts accept an 'exp' directory instead of a
> 'data/train' directory. Is there another script that would accept the
> 'data/train' directory as input instead?
>
> On Mon, Jun 29, 2015 at 12:08 PM, Vijayaditya Peddinti
> <p.v...@gm...> wrote:
>> See the scripts steps/nnet2/train_more*.sh
>>
>> Vijay
>>
>> On Mon, Jun 29, 2015 at 9:02 AM, Jonathan L <jon...@gm...> wrote:
>>> I'm looking to further train an existing LibriSpeech nnet2_a_online
>>> model on a new dataset.
>>>
>>> I have prepared the files for this new dataset inside a data/train
>>> directory, as described in the *Data Preparation* tutorial. I want to
>>> keep the nnet2_a_online model initialized to the parameters it learned
>>> from training on LibriSpeech, but continue its training on this new
>>> dataset. Is there a script that would allow me to specify the
>>> nnet2_a_online model and the dataset's data/train directory as input in
>>> order to output a model that has been trained more on this new dataset?
|
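A minimal sketch of the stage -3 step Vijay points to, assuming alignments
for the new data already exist in exp/new_ali (the directory names here are
placeholders, not from the thread):

  # The model's splice.conf records the context it was trained with,
  # e.g. --left-context=3 --right-context=3:
  cat exp/nnet2_online/nnet_ms_a_online/conf/splice.conf

  # get_egs2.sh takes <data> <ali-dir> <egs-dir>; pass the same context
  # values so the dumped egs match what the network expects (add
  # --online-ivector-dir if the model consumes iVectors):
  steps/nnet2/get_egs2.sh --cmd run.pl --left-context 3 --right-context 3 \
    data/train exp/new_ali exp/new_egs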
From: Jonathan L <jon...@gm...> - 2015-06-29 16:14:37
|
The train_more*.sh scripts accept an 'exp' directory instead of a
'data/train' directory. Is there another script that would accept the
'data/train' directory as input instead?

On Mon, Jun 29, 2015 at 12:08 PM, Vijayaditya Peddinti
<p.v...@gm...> wrote:
> See the scripts steps/nnet2/train_more*.sh
>
> Vijay
>
> On Mon, Jun 29, 2015 at 9:02 AM, Jonathan L <jon...@gm...> wrote:
>> I'm looking to further train an existing LibriSpeech nnet2_a_online
>> model on a new dataset.
>>
>> I have prepared the files for this new dataset inside a data/train
>> directory, as described in the *Data Preparation* tutorial. I want to
>> keep the nnet2_a_online model initialized to the parameters it learned
>> from training on LibriSpeech, but continue its training on this new
>> dataset. Is there a script that would allow me to specify the
>> nnet2_a_online model and the dataset's data/train directory as input in
>> order to output a model that has been trained more on this new dataset?
|
From: Vijayaditya P. <p.v...@gm...> - 2015-06-29 16:09:00
|
See the scripts steps/nnet2/train_more*.sh

Vijay

On Mon, Jun 29, 2015 at 9:02 AM, Jonathan L <jon...@gm...> wrote:
> I'm looking to further train an existing LibriSpeech nnet2_a_online model
> on a new dataset.
>
> I have prepared the files for this new dataset inside a data/train
> directory, as described in the *Data Preparation* tutorial. I want to
> keep the nnet2_a_online model initialized to the parameters it learned
> from training on LibriSpeech, but continue its training on this new
> dataset. Is there a script that would allow me to specify the
> nnet2_a_online model and the dataset's data/train directory as input in
> order to output a model that has been trained more on this new dataset?
|