From: Mate A. <ele...@gm...> - 2015-07-03 20:53:00
Thanks Tony, that makes sense. Does the train_more2.sh script implement early stopping using the validation set created in the egs/ directory?

On Fri, Jul 3, 2015 at 3:39 PM, Tony Robinson <to...@ca...> wrote:
> I may be missing something but I read the question as "When I convert
> the MP3 files into 16kHz sample-rate WAV files, what bitrate should I
> convert them to?"
>
> The answer is that they should be converted to 16 bits per sample, 16kHz
> mono files, so that's 256,000 bits per second. There's not a lot of
> point in using more than 16 bits per sample as the mp3 quantisation is
> worse than this and there's not a lot of point in using less than 16 bits
> per sample as why throw information away.
>
> Tony
>
> On 03/07/15 19:53, Daniel Povey wrote:
> > The sampling rate is critical, but the bitrate is not really critical-
> > just make sure it sounds OK without super-obvious artifacts. Vassil
> > (cc'd) will know what bitrate he encoded the Librispeech data with,
> > but matching this exactly is probably not important.
> > Dan
> >
> > On Fri, Jul 3, 2015 at 10:45 AM, Jonathan L <jon...@gm...> wrote:
> >> The data I want to train on is in MP3 format at a 128kbps bitrate and a
> >> 44.1kHz sample rate. The LibriSpeech data has a 16kHz sample rate, but
> >> doesn't seem to have a specified bitrate. When I convert the MP3 files
> >> into 16kHz sample-rate WAV files, what bitrate should I convert them to?
> >>
> >> Is there anything else I should consider when converting the speech
> >> files?
> >>
> >> On Mon, Jun 29, 2015 at 12:24 PM, Vijayaditya Peddinti
> >> <p.v...@gm...> wrote:
> >>> You need to provide the egs directory, not the exp directory. You can
> >>> check stage -3 of steps/nnet2/train_multisplice_accel2.sh to see how the
> >>> egs directory can be created from the alignment and data directories.
> >>> The context variables necessary for creating these examples can be
> >>> found in the nnet_ms_a_online/conf/splice.conf file.
> >>>
> >>> Vijay
> >>>
> >>> On Mon, Jun 29, 2015 at 9:14 AM, Jonathan L <jon...@gm...>
> >>> wrote:
> >>>> The train_more*.sh scripts accept an 'exp' directory instead of a
> >>>> 'data/train' directory. Is there another script that would accept the
> >>>> 'data/train' directory as input instead?
> >>>>
> >>>> On Mon, Jun 29, 2015 at 12:08 PM, Vijayaditya Peddinti
> >>>> <p.v...@gm...> wrote:
> >>>>> See the scripts steps/nnet2/train_more*.sh
> >>>>>
> >>>>> Vijay
> >>>>>
> >>>>> On Mon, Jun 29, 2015 at 9:02 AM, Jonathan L <jon...@gm...>
> >>>>> wrote:
> >>>>>> I'm looking to further train an existing LibriSpeech nnet2_a_online
> >>>>>> model on a new dataset.
> >>>>>>
> >>>>>> I have prepared the files for this new dataset inside a data/train
> >>>>>> directory, as described in the Data Preparation tutorial. I want to
> >>>>>> keep the nnet2_a_online model initialized to the parameters it
> >>>>>> learned from training on LibriSpeech, but continue its training on
> >>>>>> this new dataset. Is there a script that would allow me to specify
> >>>>>> the nnet2_a_online model and the dataset's data/train directory as
> >>>>>> input in order to output a model that has been trained more on this
> >>>>>> new dataset?
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> Kaldi-users mailing list
> >>>>>> Kal...@li...
> >>>>>> https://lists.sourceforge.net/lists/listinfo/kaldi-users
>
> --
> Dr A J Robinson, Founder
> We are hiring: www.speechmatics.com/careers
> Speechmatics is a trading name of Cantab Research Limited
> Phone direct: 01223 778240 office: 01223 794497
> Company reg no GB 05697423, VAT reg no 925606030
> 51 Canterbury Street, Cambridge, CB4 3QG, UK
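
For anyone following the conversion advice in this thread: the 256,000 bits per second figure is just 16,000 samples per second x 16 bits per sample x 1 channel. Below is a minimal Python sketch, not taken from the thread or from Kaldi, that does this arithmetic and checks whether a converted WAV file has the 16kHz, 16-bit, mono format Tony describes. It uses only the standard-library `wave` module; the helper names (`pcm_bitrate`, `wav_format`, `matches_target`, `write_silence`) are hypothetical and exist only for illustration. It does not do the MP3 decoding or resampling itself, which would need a separate tool.

```python
import os
import tempfile
import wave


def pcm_bitrate(sample_rate_hz, bits_per_sample, channels):
    """Bits per second of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


def wav_format(path):
    """Return (sample_rate_hz, bits_per_sample, channels) from a WAV header."""
    with wave.open(path, "rb") as wav:
        # getsampwidth() is in bytes, so multiply by 8 to get bits.
        return wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels()


def matches_target(path, sample_rate_hz=16000, bits_per_sample=16, channels=1):
    """True if the file has the format suggested in the thread (assumed defaults)."""
    return wav_format(path) == (sample_rate_hz, bits_per_sample, channels)


def write_silence(path, seconds=1, sample_rate_hz=16000):
    """Write a 16-bit mono WAV of silence, only to have a file to test against."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00\x00" * (sample_rate_hz * seconds))


if __name__ == "__main__":
    print(pcm_bitrate(16000, 16, 1))  # 16000 * 16 * 1 bits per second
    demo_path = os.path.join(tempfile.mkdtemp(), "silence.wav")
    write_silence(demo_path)
    print(wav_format(demo_path), matches_target(demo_path))
```

Running `matches_target` over the converted files before data preparation is one cheap way to catch a file that was left at 44.1kHz or in stereo.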