kaldi-developers Mailing List for Kaldi (Page 30)

Kaldi - Build # 17 - Successful:

Check console output at http://jenkins.a2ialab.com/jenkins/job/Kaldi/17/ to view the results.
Great, thanks!
Dan

On Wed, Jul 10, 2013 at 2:35 AM, KERMORVANT, Christopher
<Chr...@a2...> wrote:
> Hi Dan,
>
>
> I have configured a Jenkins server for continuous testing. It checks out the sources and runs:
>
> #cd tools;make; cd ..;
> cd src ;./configure;make depend; make
> cd src;make test
>
>
> The first line (building the tools) is done only for the first checkout.
>
>
> I will now configure the message sent to the kaldi-dev list; it will send some test messages. Sorry for the trouble.
>
> --
> Chris
> ________________________________________
> From: Daniel Povey [dp...@gm...]
> Sent: Monday, July 8, 2013 18:23
> To: KERMORVANT, Christopher
> Cc: KERMORVANT, Christopher; kal...@li...
> Subject: Re: [Kaldi-users] Committed change to build setup
>
> That's an excellent idea!  It's something I had been hoping to do for
> a long time but never got around to.
> If you do, set it up to automatically send an email to the list
> kal...@li... (list cc'd) if compilation or
> tests fail.
>
> Dan
>
> On Mon, Jul 8, 2013 at 6:05 AM, Christopher Kermorvant
> <chr...@a2...> wrote:
>> Hi Dan,
>>
>>
>> We are using Kaldi for handwriting recognition research at A2iA. We would
>> like to support the development the software but we don't have many
>> resources internally. However, we could setup a continuous compilation
>> server such as jenkins (http://jenkins-ci.org/) to continuously validate the
>> compilation and the tests. This server would be public and synchronized with
>> the source code.
>>
>> Do you think it is interesting for the project ?
>>
>> thank you,
>>
>> --
>> Christopher Kermorvant
>> R&D Manager - A2iA - Paris
>>
>>
>> On 07/07/2013 04:07 PM, Mailing list used for User Communication and Updates
>> wrote:
>>>
>>> The compilation(including "make ext") is working OK for me too on Ubuntu
>>> 10.04.
>>> Only tried to run the online decoders(voxforge/online_demo) so far -
>>> everything seems to be fine with them.
>>>
>>> Vassil
>>>
>>>
>>> On Sun, Jul 7, 2013 at 5:31 AM, Daniel Povey <dp...@gm...> wrote:
>>>>
>>>> Everyone,
>>>> I have just merged from ^/sandbox/sharedlibs, where Jan Trmal, Ondrej
>>>> Platek and others have been working on different build scripts that
>>>> now support a shared-library option.  If anyone can test it and make
>>>> sure it still works for them it would be great.
>>>> If people have made local changes to their Makefiles they may get
>>>> conflicts.
>>>> Dan
>>>
>>>
>>> _______________________________________________
>>> Kaldi-users mailing list
>>> Kal...@li...
>>> https://lists.sourceforge.net/lists/listinfo/kaldi-users
>>
>>
>>
>> --
>> Christopher Kermorvant
>> R&D Manager - A2iA - France
>> 39 rue de la Bienfaisance - 75008 Paris
>> +33 (0) 144 420 086 / +33 (0) 689 513 601 (mobile)
>>

Hi Dan,

I have configured a Jenkins server for continuous testing. It checks out the sources and runs:

#cd tools;make; cd ..;
cd src ;./configure;make depend; make
cd src;make test

The first line (building the tools) is done only for the first checkout.

I will now configure the message sent to the kaldi-dev list; it will send some test messages. Sorry for the trouble.

-- 
Chris
________________________________________
From: Daniel Povey [dp...@gm...]
Sent: Monday, July 8, 2013 18:23
To: KERMORVANT, Christopher
Cc: KERMORVANT, Christopher; kal...@li...
Subject: Re: [Kaldi-users] Committed change to build setup

That's an excellent idea!  It's something I had been hoping to do for
a long time but never got around to.
If you do, set it up to automatically send an email to the list
kal...@li... (list cc'd) if compilation or
tests fail.

Dan

On Mon, Jul 8, 2013 at 6:05 AM, Christopher Kermorvant
<chr...@a2...> wrote:
> Hi Dan,
>
>
> We are using Kaldi for handwriting recognition research at A2iA. We would
> like to support the development of the software, but we don't have many
> resources internally. However, we could set up a continuous compilation
> server such as Jenkins (http://jenkins-ci.org/) to continuously validate the
> compilation and the tests. This server would be public and synchronized with
> the source code.
>
> Do you think this would be of interest to the project?
>
> thank you,
>
> --
> Christopher Kermorvant
> R&D Manager - A2iA - Paris
>
>
> On 07/07/2013 04:07 PM, Mailing list used for User Communication and Updates
> wrote:
>>
>> The compilation(including "make ext") is working OK for me too on Ubuntu
>> 10.04.
>> Only tried to run the online decoders(voxforge/online_demo) so far -
>> everything seems to be fine with them.
>>
>> Vassil
>>
>>
>> On Sun, Jul 7, 2013 at 5:31 AM, Daniel Povey <dp...@gm...> wrote:
>>>
>>> Everyone,
>>> I have just merged from ^/sandbox/sharedlibs, where Jan Trmal, Ondrej
>>> Platek and others have been working on different build scripts that
>>> now support a shared-library option.  If anyone can test it and make
>>> sure it still works for them it would be great.
>>> If people have made local changes to their Makefiles they may get
>>> conflicts.
>>> Dan
>>
>>
>
>
>
> --
> Christopher Kermorvant
> R&D Manager - A2iA - France
> 39 rue de la Bienfaisance - 75008 Paris
> +33 (0) 144 420 086 / +33 (0) 689 513 601 (mobile)
>

That's an excellent idea!  It's something I had been hoping to do for
a long time but never got around to.
If you do, set it up to automatically send an email to the list
kal...@li... (list cc'd) if compilation or
tests fail.

Dan

On Mon, Jul 8, 2013 at 6:05 AM, Christopher Kermorvant
<chr...@a2...> wrote:
> Hi Dan,
>
>
> We are using Kaldi for handwriting recognition research at A2iA. We would
> like to support the development of the software, but we don't have many
> resources internally. However, we could set up a continuous compilation
> server such as Jenkins (http://jenkins-ci.org/) to continuously validate the
> compilation and the tests. This server would be public and synchronized with
> the source code.
>
> Do you think this would be of interest to the project?
>
> thank you,
>
> --
> Christopher Kermorvant
> R&D Manager - A2iA - Paris
>
>
> On 07/07/2013 04:07 PM, Mailing list used for User Communication and Updates
> wrote:
>>
>> The compilation(including "make ext") is working OK for me too on Ubuntu
>> 10.04.
>> Only tried to run the online decoders(voxforge/online_demo) so far -
>> everything seems to be fine with them.
>>
>> Vassil
>>
>>
>> On Sun, Jul 7, 2013 at 5:31 AM, Daniel Povey <dp...@gm...> wrote:
>>>
>>> Everyone,
>>> I have just merged from ^/sandbox/sharedlibs, where Jan Trmal, Ondrej
>>> Platek and others have been working on different build scripts that
>>> now support a shared-library option.  If anyone can test it and make
>>> sure it still works for them it would be great.
>>> If people have made local changes to their Makefiles they may get
>>> conflicts.
>>> Dan
>>
>>
>
>
>
> --
> Christopher Kermorvant
> R&D Manager - A2iA - France
> 39 rue de la Bienfaisance - 75008 Paris
> +33 (0) 144 420 086 / +33 (0) 689 513 601 (mobile)
>

If you create a username at sourceforge, you could try it using the
"svn+ssh" method (it should be described on the same page as the
install instructions).  This will probably use a different port and
might skirt any firewall issues.
Dan

On Sat, Jun 29, 2013 at 7:47 AM, Arnab Ghoshal <ar...@gm...> wrote:
> That's odd. It works perfectly fine for me. By any chance are you
> trying this behind some firewall that may be blocking certain ports?
> -Arnab
>
> On Tue, Jun 25, 2013 at 6:16 PM, Tina Kohler
> <tin...@al...> wrote:
>> I've tried the instructions on http://kaldi.sourceforge.net/install.html on
>> 3 different computers specifically this command:
>> svn co svn://svn.code.sf.net/p/kaldi/code/stable kaldi-stable
>>
>> and each attempt gives me this error:
>>
>> svn: Can't connect to host 'svn.code.sf.net': Connection refused
>>
>> I can see the directory structure when I navigate to
>> http://svn.code.sf.net/p/kaldi/code/stable in Chrome, though.
>>
>> What am I doing wrong?
>>
>> Thanks,
>> Tina
>>
>> _______________________________________________
>> Kaldi-developers mailing list
>> Kal...@li...
>> https://lists.sourceforge.net/lists/listinfo/kaldi-developers
>>
>

That's odd. It works perfectly fine for me. By any chance are you
trying this behind some firewall that may be blocking certain ports?
-Arnab

On Tue, Jun 25, 2013 at 6:16 PM, Tina Kohler
<tin...@al...> wrote:
> I've tried the instructions on http://kaldi.sourceforge.net/install.html on
> 3 different computers specifically this command:
> svn co svn://svn.code.sf.net/p/kaldi/code/stable kaldi-stable
>
> and each attempt gives me this error:
>
> svn: Can't connect to host 'svn.code.sf.net': Connection refused
>
> I can see the directory structure when I navigate to
> http://svn.code.sf.net/p/kaldi/code/stable in Chrome, though.
>
> What am I doing wrong?
>
> Thanks,
> Tina
>
>

Once you understand the whole hidden Markov model formalism it should
be clear.  It's picking the best sequence of hidden states.
Dan

On Wed, Jun 26, 2013 at 4:03 PM, Arif Khan <ife...@gm...> wrote:
> Hi Dan,
>
> Thanks for the quick reply!
>
> "The training automatically figures out where it is."
>
> Could you explain this in a little more detail? I have to train on another
> corpus, and I need to know the internal mechanism.
> A source or reference would be helpful.
>
> Best regards,
> Arif
>
>
>
> On Wed, Jun 26, 2013 at 9:57 PM, Daniel Povey <dp...@gm...> wrote:
>>
>> The silence is optional between each word, by default Kaldi scripts do
>> this in the lexicon.
>> The training automatically figures out where it is.
>> Dan
>>
>>
>> On Wed, Jun 26, 2013 at 3:53 PM, Arif Khan <ife...@gm...> wrote:
>> > Hi,
>> >
>> > In the WSJ0 transcripts specification, I did not find any thing for
>> > representing SILENCE,
>> > neither it is clear to me from the wsj example scripts  of kaldi that
>> > tells
>> > us how silence
>> > handled is training.
>> >
>> > My question is how we train the SILENCE for a corpus if it is not
>> > explicitly
>> > written
>> > in the transcripts of the training data.
>> >
>> > Best regards,
>> > Arif
>> >
>> >
>> >
>
>

Hi Dan,

Thanks for the quick reply!

"The training automatically figures out where it is."

Could you explain this in a little more detail? I have to train on another
corpus, and I need to know the internal mechanism.
A source or reference would be helpful.

Best regards,
Arif

On Wed, Jun 26, 2013 at 9:57 PM, Daniel Povey <dp...@gm...> wrote:

> The silence is optional between each word, by default Kaldi scripts do
> this in the lexicon.
> The training automatically figures out where it is.
> Dan
>
>
> On Wed, Jun 26, 2013 at 3:53 PM, Arif Khan <ife...@gm...> wrote:
> > Hi,
> >
> > In the WSJ0 transcripts specification, I did not find anything for
> > representing SILENCE, nor is it clear to me from the Kaldi WSJ example
> > scripts how silence is handled in training.
> >
> > My question is: how do we train SILENCE for a corpus if it is not
> > explicitly written in the transcripts of the training data?
> >
> > Best regards,
> > Arif
> >
> >
> >
>

The silence is optional between each word, by default Kaldi scripts do
this in the lexicon.
The training automatically figures out where it is.
Dan

On Wed, Jun 26, 2013 at 3:53 PM, Arif Khan <ife...@gm...> wrote:
> Hi,
>
> In the WSJ0 transcripts specification, I did not find anything for
> representing SILENCE, nor is it clear to me from the Kaldi WSJ example
> scripts how silence is handled in training.
>
> My question is: how do we train SILENCE for a corpus if it is not
> explicitly written in the transcripts of the training data?
>
> Best regards,
> Arif
>
>
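
A toy sketch of the idea in this thread, not Kaldi's own code: if silence is an optional unit between words, a Viterbi pass over the utterance picks the best sequence of hidden states and thereby decides, frame by frame, whether each silence is used or skipped. The names `Unit` and `ViterbiAlign`, the single-state units, and the absence of transition probabilities are all simplifying assumptions made for illustration; the real system aligns against graphs built from the lexicon with multi-state HMMs.

```cpp
#include <cassert>
#include <vector>

// One unit in the utterance-level chain: which acoustic model it emits from,
// and whether the aligner may skip it (this is how optional silence is modeled).
struct Unit {
  int model;
  bool optional;
};

// Viterbi alignment over a left-to-right chain of single-state units.
// loglik[t][m] is the log-likelihood of frame t under model m.
// Returns, for each frame, the index of the unit it is aligned to.
std::vector<int> ViterbiAlign(const std::vector<Unit> &units,
                              const std::vector<std::vector<double> > &loglik) {
  const double kNegInf = -1.0e30;  // stands in for "unreachable"
  const int T = static_cast<int>(loglik.size());
  const int N = static_cast<int>(units.size());
  assert(T > 0 && N > 0);
  std::vector<std::vector<double> > score(T, std::vector<double>(N, kNegInf));
  std::vector<std::vector<int> > back(T, std::vector<int>(N, -1));

  // A path starts in the first unit, or in the second if the first is optional.
  score[0][0] = loglik[0][units[0].model];
  if (units[0].optional && N > 1) score[0][1] = loglik[0][units[1].model];

  for (int t = 1; t < T; ++t) {
    for (int i = 0; i < N; ++i) {
      int best_prev = i;  // self-loop: stay in the same unit
      double best = score[t - 1][i];
      if (i >= 1 && score[t - 1][i - 1] > best) {  // advance by one unit
        best = score[t - 1][i - 1];
        best_prev = i - 1;
      }
      if (i >= 2 && units[i - 1].optional &&
          score[t - 1][i - 2] > best) {  // jump over an optional unit
        best = score[t - 1][i - 2];
        best_prev = i - 2;
      }
      score[t][i] = best + loglik[t][units[i].model];
      back[t][i] = best_prev;
    }
  }

  // A path ends in the last unit, or in the one before it if the last is optional.
  int state = N - 1;
  if (N > 1 && units[N - 1].optional &&
      score[T - 1][N - 2] > score[T - 1][N - 1]) {
    state = N - 2;
  }

  // Trace the best path backwards to recover the per-frame alignment.
  std::vector<int> alignment(T);
  for (int t = T - 1; t >= 0; --t) {
    alignment[t] = state;
    if (t > 0) state = back[t][state];
  }
  return alignment;
}
```

With units laid out as [sil?] A [sil?] B [sil?], frames that look like silence are absorbed by whichever optional silence fits, and an optional silence that matches no frame is simply skipped, so nothing needs to be marked in the transcript.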

Hi,

In the WSJ0 transcripts specification, I did not find anything for
representing SILENCE, nor is it clear to me from the Kaldi WSJ example
scripts how silence is handled in training.

My question is: how do we train SILENCE for a corpus if it is not
explicitly written in the transcripts of the training data?

Best regards,
Arif

I've tried the instructions on http://kaldi.sourceforge.net/install.html on
3 different computers, specifically this command:
svn co svn://svn.code.sf.net/p/kaldi/code/stable kaldi-stable

and each attempt gives me this error:

svn: Can't connect to host 'svn.code.sf.net': Connection refused

I can see the directory structure when I navigate to
http://svn.code.sf.net/p/kaldi/code/stable in Chrome, though.

What am I doing wrong?

Thanks,
Tina

If the data rates differ, then no.
If the data rate is the same, it's possible, but still difficult.
Dan

On Fri, Jun 21, 2013 at 3:23 PM, Arif Khan <ife...@gm...> wrote:
> Thanks Daniel for your quick response,
>
> One more question for my understanding: is it possible to train on one
> corpus and then take this model and train it further on the second corpus?
> Of course, the two have different data rates (in my case).
>
> Some kind of fusing of the two models.
>
> Best regards,
> Arif
>
>
> On Fri, Jun 21, 2013 at 9:12 PM, Daniel Povey <dp...@gm...> wrote:
>>
>> Hi,
>> In general this isn't possible-- you can't really share data from
>> different data rates.  People have tried various things for this in
>> the literature but it's not easy to do.  The most straightforward
>> thing is to down-sample to the same rate.
>> Dan
>>
>> On Fri, Jun 21, 2013 at 3:08 PM, Arif Khan <ife...@gm...> wrote:
>> > Dear Kaldi Developers,
>> >
>> >
>> > I am trying to train an acoustic model on different corpuses, but I at
>> > the
>> > end I want to have a single trained model.
>> >
>> > I want to train a model for WSJ and Aurora Corpus. The data rate of wav
>> > files is different and so I am having difficulty in
>> > computing the mfcc from single directory (which contains all the script
>> > files for both corpuses).
>> >
>> > My questions is:
>> >
>> > Is it possible to reuse the trained model of one corpus for other
>> > corpus.
>> >
>> > or
>> >
>> > Any other mechanism which can help me in finding out how to train a
>> > single
>> > model on two different corpuses have different
>> > characteristics of sound files like data rate.
>> >
>> > Thanks,
>> > Arif
>> >
>> >
>> >
>
>

Thanks Daniel for your quick response,

One more question for my understanding: is it possible to train on one
corpus and then take this model and train it further on the second corpus?
Of course, the two have different data rates (in my case).

Some kind of fusing of the two models.

Best regards,
Arif

On Fri, Jun 21, 2013 at 9:12 PM, Daniel Povey <dp...@gm...> wrote:

> Hi,
> In general this isn't possible-- you can't really share data from
> different data rates.  People have tried various things for this in
> the literature but it's not easy to do.  The most straightforward
> thing is to down-sample to the same rate.
> Dan
>
> On Fri, Jun 21, 2013 at 3:08 PM, Arif Khan <ife...@gm...> wrote:
> > Dear Kaldi Developers,
> >
> >
> > I am trying to train an acoustic model on different corpora, but in the
> > end I want to have a single trained model.
> >
> > I want to train a model for the WSJ and Aurora corpora. The data rate of
> > the wav files is different, so I am having difficulty computing the MFCCs
> > from a single directory (which contains all the script files for both
> > corpora).
> >
> > My question is:
> >
> > Is it possible to reuse the trained model of one corpus for another corpus?
> >
> > or
> >
> > Is there any other mechanism that would let me train a single model on two
> > corpora whose sound files have different characteristics, such as data
> > rate?
> >
> > Thanks,
> > Arif
> >
> >
> >
>

Hi,
In general this isn't possible-- you can't really share data from
different data rates.  People have tried various things for this in
the literature but it's not easy to do.  The most straightforward
thing is to down-sample to the same rate.
Dan

On Fri, Jun 21, 2013 at 3:08 PM, Arif Khan <ife...@gm...> wrote:
> Dear Kaldi Developers,
>
>
> I am trying to train an acoustic model on different corpora, but in the
> end I want to have a single trained model.
>
> I want to train a model for the WSJ and Aurora corpora. The data rate of
> the wav files is different, so I am having difficulty computing the MFCCs
> from a single directory (which contains all the script files for both
> corpora).
>
> My question is:
>
> Is it possible to reuse the trained model of one corpus for another corpus?
>
> or
>
> Is there any other mechanism that would let me train a single model on two
> corpora whose sound files have different characteristics, such as data
> rate?
>
> Thanks,
> Arif
>
>
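
To make the "down-sample to the same rate" suggestion concrete, here is a deliberately crude sketch that halves the sampling rate (for example 16 kHz to 8 kHz, which is an assumed pair of rates, not one stated in the thread). It smooths with a 3-tap [1 2 1]/4 kernel and keeps every second sample. `DownsampleByTwo` is a hypothetical helper; a real setup would more likely rely on a dedicated resampling tool with a proper anti-aliasing filter.

```cpp
#include <cstddef>
#include <vector>

// Halve the sampling rate of a mono signal: apply a short low-pass kernel
// ([1 2 1] / 4) centred on every second input sample and keep that value.
// The kernel is a weak anti-aliasing filter, used here only for illustration.
std::vector<double> DownsampleByTwo(const std::vector<double> &in) {
  std::vector<double> out;
  out.reserve((in.size() + 1) / 2);
  for (std::size_t i = 0; i < in.size(); i += 2) {
    // At the edges, repeat the first or last sample instead of reading
    // outside the buffer.
    const double prev = in[i == 0 ? 0 : i - 1];
    const double next = in[i + 1 < in.size() ? i + 1 : in.size() - 1];
    out.push_back(0.25 * prev + 0.5 * in[i] + 0.25 * next);
  }
  return out;
}
```

Once both corpora are at the same rate, features can be computed with a single configuration, which is the point of the suggestion.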

Dear Kaldi Developers,

I am trying to train an acoustic model on different corpora, but in the
end I want to have a single trained model.

I want to train a model for the WSJ and Aurora corpora. The data rate of
the wav files is different, so I am having difficulty computing the MFCCs
from a single directory (which contains all the script files for both
corpora).

My question is:

Is it possible to reuse the trained model of one corpus for another corpus?

or

Is there any other mechanism that would let me train a single model on two
corpora whose sound files have different characteristics, such as data
rate?

Thanks,
Arif

I think those are supposed to be the outputs of the script.  Perhaps
someone can send those to you but I don't have them conveniently right now.
Regardless, I don't recommend using that script.  I'm not sure if it's even
up to date.  It's better to start from scratch with Kaldi.

Dan

On Tue, May 21, 2013 at 1:44 AM, 蘇仲銘 <chu...@gm...> wrote:

> Hi,
> I have an HTK acoustic model and phone.txt, and I want to convert them to
> Kaldi format.
> I followed the file "convert_htk.sh" step by step.
> Could you provide example files, including
> "kaldi.topo", "kaldi.am_gmm", and "kaldi.tree"?
> I have actually already produced kaldi.topo and kaldi.am_gmm from the
> acoustic model, but I don't need a tree in my case.
> I have no idea what the format of kaldi.tree is.
>
>
>

Hi,
I have an HTK acoustic model and phone.txt, and I want to convert them to
Kaldi format.
I followed the file "convert_htk.sh" step by step.
Could you provide example files, including
"kaldi.topo", "kaldi.am_gmm", and "kaldi.tree"?
I have actually already produced kaldi.topo and kaldi.am_gmm from the
acoustic model, but I don't need a tree in my case.
I have no idea what the format of kaldi.tree is.

Hi,
That sounds like a great idea.
You could make an interface called OptionsItf and put it in
itf/options-itf.h.  It would be necessary to use a sed command to replace
ParseOptions with OptionsItf throughout most of the code.  Send me a patch
that I can check, and your Sourceforge id, and I'll give you commit rights.
 I'll add you to the list.
Dan

On Fri, May 17, 2013 at 3:39 AM, Tanel Alumäe <tan...@ph...> wrote:

> Hello,
>
> I'm trying to write a GStreamer plugin that wraps OnlineFasterDecoder.
> The plugin would make it easy to use OnlineFasterDecoder in all
> programming languages that have GStreamer bindings (Python, Ruby, Java,
> etc.).
>
> The plugin is basically working, but I was thinking about how to make the
> decoding and feature extraction options available as plugin properties,
> so that they could be configured programmatically instead of via the
> command line. For this, I need access to a list of options (name, type,
> help message) that have been "registered" in the code, and get/set
> access to the option values.
>
> My proposal is to introduce a new interface "Options" with only one
> method:
>
>   template<typename T>
>   void Register(const std::string &name,
>                 T *ptr, const std::string &doc);
>
>
> The current ParseOptions would be one implementation, but I would add
> another implementation that allows options to be set and read via code.
>
> Also, please add me to the mailing list.
>
> Regards,
> Tanel Alumäe
> Tallinn University of Technology
>
>
>
>

Hello,

I'm trying to write a GStreamer plugin that wraps OnlineFasterDecoder.
The plugin would make it easy to use OnlineFasterDecoder in all
programming languages that have GStreamer bindings (Python, Ruby, Java,
etc.).

The plugin is basically working, but I was thinking about how to make the
decoding and feature extraction options available as plugin properties,
so that they could be configured programmatically instead of via the
command line. For this, I need access to a list of options (name, type,
help message) that have been "registered" in the code, and get/set
access to the option values.

My proposal is to introduce a new interface "Options" with only one
method:

  template<typename T>
  void Register(const std::string &name,
                T *ptr, const std::string &doc);

The current ParseOptions would be one implementation, but I would add
another implementation that allows options to be set and read via code.

Also, please add me to the mailing list.

Regards,
Tanel Alumäe
Tallinn University of Technology
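
One way the proposal above might look in code, offered as a sketch under stated assumptions rather than the design that was adopted. The name `OptionsItf` comes from the reply in this thread; everything else (`MapOptions`, `DecoderConfig`, the option names and defaults, and the particular set of overloads) is hypothetical. Because C++ does not allow virtual member function templates, the single templated `Register` is written here as one virtual overload per option type.

```cpp
#include <map>
#include <string>

// Interface through which code registers its options. One overload per
// supported type, since a virtual function cannot be a template.
class OptionsItf {
 public:
  virtual void Register(const std::string &name, bool *ptr,
                        const std::string &doc) = 0;
  virtual void Register(const std::string &name, int *ptr,
                        const std::string &doc) = 0;
  virtual void Register(const std::string &name, float *ptr,
                        const std::string &doc) = 0;
  virtual void Register(const std::string &name, std::string *ptr,
                        const std::string &doc) = 0;
  virtual ~OptionsItf() {}
};

// Hypothetical implementation that keeps the registered pointers in maps so
// that options can be listed and changed from code (for example, to back
// plugin properties) instead of being parsed from the command line.
class MapOptions : public OptionsItf {
 public:
  void Register(const std::string &name, bool *ptr,
                const std::string &doc) override { bools_[name] = ptr; docs_[name] = doc; }
  void Register(const std::string &name, int *ptr,
                const std::string &doc) override { ints_[name] = ptr; docs_[name] = doc; }
  void Register(const std::string &name, float *ptr,
                const std::string &doc) override { floats_[name] = ptr; docs_[name] = doc; }
  void Register(const std::string &name, std::string *ptr,
                const std::string &doc) override { strings_[name] = ptr; docs_[name] = doc; }

  // Setters return false if no option of that name and type was registered.
  bool SetInt(const std::string &name, int value) { return Assign(ints_, name, value); }
  bool SetFloat(const std::string &name, float value) { return Assign(floats_, name, value); }

  // Help text for a registered option, or an empty string if unknown.
  std::string Doc(const std::string &name) const {
    std::map<std::string, std::string>::const_iterator it = docs_.find(name);
    return it == docs_.end() ? std::string() : it->second;
  }

 private:
  template <typename T>
  static bool Assign(const std::map<std::string, T *> &opts,
                     const std::string &name, const T &value) {
    typename std::map<std::string, T *>::const_iterator it = opts.find(name);
    if (it == opts.end()) return false;
    *it->second = value;  // writes straight into the registered variable
    return true;
  }

  std::map<std::string, bool *> bools_;
  std::map<std::string, int *> ints_;
  std::map<std::string, float *> floats_;
  std::map<std::string, std::string *> strings_;
  std::map<std::string, std::string> docs_;
};

// Hypothetical config struct; the option names and defaults are illustrative.
struct DecoderConfig {
  float beam;
  int max_active;
  DecoderConfig() : beam(16.0f), max_active(7000) {}
  void Register(OptionsItf *po) {
    po->Register("beam", &beam, "Decoding beam (illustrative)");
    po->Register("max-active", &max_active, "Max active states (illustrative)");
  }
};
```

A config struct written against `OptionsItf` does not need to know whether its values come from the command line or from a plugin: calling `cfg.Register(&opts)` on a `MapOptions` object and then `opts.SetFloat("beam", 10.0f)` changes `cfg.beam` directly.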

The "power" parameter allocates Gaussians to states proportional to the
occupation count of the state, taken to this power.

Dan

On Fri, May 10, 2013 at 12:42 PM, Arif Khan <ife...@gm...> wrote:

> Hi,
>
> Thank you very much for your reply. I have already found the scripts
> where one can introduce more than two HMM topologies.
>
> Can we somehow control or specify the number of Gaussians per state, or
> find out the number of Gaussians per state used in a given model file?
>
> Can you give me a brief explanation of the following option?
>
> po.Register("power", &power,
>             "If mixing up, power to allocate Gaussians to states.");
>
> which is found in the following two modules.
>
> 1. gmmbin/gmm-est.cc File Reference
> http://kaldi.sourceforge.net/gmm-est_8cc.html
> 2. gmmbin/gmm-mixup.cc File Reference
> http://kaldi.sourceforge.net/gmm-mixup_8cc.html
>
> Note:
>
> You sensed right: I have to redo the Aurora 2.0 experiments with
> Kaldi. Unfortunately, the default Kaldi setup gave me a higher WER than
> reported in the paper. That's why I have to come closer to the
> technique they adopted in the Aurora 2.0 experiments with HTK.
>
> Their technique:
> HMM model: word-based, not phone-based.
> Number of emitting states per HMM: 16 states for each digit, 3 states for
> silence and 1 state for short pause (sp).
> Gaussians per state: from 6 to 16 per state for digits, silence and sp.
>
> Any advice or suggestions will be welcome and acknowledged.
>
> Regards,
> Arif
>
>
> On Fri, May 10, 2013 at 5:57 PM, Daniel Povey <dp...@gm...> wrote:
>
>> To use more than two HMM topology entries you would have to change the
>> scripts.
>> Kaldi automatically allocates the #Gaussians in each state.
>> I sense that you are trying to exactly duplicate an HTK system, which is
>> not a very good idea.
>> Dan
>>
>>
>>
>> On Fri, May 10, 2013 at 11:25 AM, Arif Khan <ife...@gm...> wrote:
>>
>>> Hi,
>>>
>>> I have question two question:
>>>
>>> 1. Can we use more than two hmm topology entries in Kaldi. For
>>> example "Silence phones", "Nonsilence phones" and "Short pause" if each one
>>> of them has different
>>> no of hmm states.
>>>
>>> 2. How can we define/control the no. of Gaussians per state for
>>> different topology entries. For example if I want to use
>>> for "Silence phones", "Nonsilence phones" and "Short pause" different no
>>> of Gaussians per state during the training.
>>>
>>> Any help will be highly appreciated.
>>>
>>> Regards,
>>> Arif
>>>
>>>
>>>
>>>
>>
>
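
A minimal sketch of the rule described for the `power` option in this thread: each state's share of the Gaussians is proportional to its occupation count raised to `power`. The function name `AllocateGaussians` and the greedy one-at-a-time scheme are assumptions chosen for illustration; they are not taken from the Kaldi sources.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Distribute total_gauss Gaussians over states so that the number given to
// each state is roughly proportional to pow(occupancy, power).
// Every state starts with one Gaussian; each remaining Gaussian goes to the
// state whose weight per Gaussian it already holds is currently largest.
std::vector<int> AllocateGaussians(const std::vector<double> &state_occs,
                                   int total_gauss, double power) {
  const std::size_t num_states = state_occs.size();
  std::vector<int> targets(num_states, 1);
  if (num_states == 0) return targets;
  assert(total_gauss >= static_cast<int>(num_states));
  int assigned = static_cast<int>(num_states);
  while (assigned < total_gauss) {
    std::size_t best = 0;
    double best_score = -1.0;
    for (std::size_t s = 0; s < num_states; ++s) {
      const double score = std::pow(state_occs[s], power) / targets[s];
      if (score > best_score) {
        best_score = score;
        best = s;
      }
    }
    ++targets[best];
    ++assigned;
  }
  return targets;
}
```

With `power` equal to 1 the allocation follows the raw counts, with `power` equal to 0 every state gets the same number, and values in between give frequently seen states more Gaussians without starving rare ones.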

Hi,

Thank you very much for your reply. I have already found the scripts where
one can introduce more than two HMM topologies.

Can we somehow control or specify the number of Gaussians per state, or
find out the number of Gaussians per state used in a given model file?

Can you give me a brief explanation of the following option?

po.Register("power", &power,
            "If mixing up, power to allocate Gaussians to states.");

which is found in the following two modules.

1. gmmbin/gmm-est.cc File Reference
http://kaldi.sourceforge.net/gmm-est_8cc.html
2. gmmbin/gmm-mixup.cc File Reference
http://kaldi.sourceforge.net/gmm-mixup_8cc.html

Note:

You sensed right: I have to redo the Aurora 2.0 experiments with Kaldi.
Unfortunately, the default Kaldi setup gave me a higher WER than
reported in the paper. That's why I have to come closer to the technique
they adopted in the Aurora 2.0 experiments with HTK.

Their technique:
HMM model: word-based, not phone-based.
Number of emitting states per HMM: 16 states for each digit, 3 states for
silence and 1 state for short pause (sp).
Gaussians per state: from 6 to 16 per state for digits, silence and sp.

Any advice or suggestions will be welcome and acknowledged.

Regards,
Arif

On Fri, May 10, 2013 at 5:57 PM, Daniel Povey <dp...@gm...> wrote:

> To use more than two HMM topology entries you would have to change the
> scripts.
> Kaldi automatically allocates the #Gaussians in each state.
> I sense that you are trying to exactly duplicate an HTK system, which is
> not a very good idea.
> Dan
>
>
>
> On Fri, May 10, 2013 at 11:25 AM, Arif Khan <ife...@gm...> wrote:
>
>> Hi,
>>
>> I have two questions:
>>
>> 1. Can we use more than two HMM topology entries in Kaldi? For example,
>> "Silence phones", "Nonsilence phones" and "Short pause", if each of them
>> has a different number of HMM states.
>>
>> 2. How can we define or control the number of Gaussians per state for
>> different topology entries? For example, if I want to use a different
>> number of Gaussians per state for "Silence phones", "Nonsilence phones"
>> and "Short pause" during training.
>>
>> Any help will be highly appreciated.
>>
>> Regards,
>> Arif
>>
>>
>>
>>
>

To use more than two HMM topology entries you would have to change the
scripts.
Kaldi automatically allocates the #Gaussians in each state.
I sense that you are trying to exactly duplicate an HTK system, which is
not a very good idea.
Dan
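
One common way to allocate Gaussians automatically is to hand them out one at a time to the state whose occupancy, raised to a small power and divided by its current number of Gaussians, is largest. The sketch below illustrates that idea only; the function name, the default `power` and `min_count` values, and the example occupancies are assumptions, and Kaldi's own routine may differ in detail.

```python
import heapq


def split_targets(state_occs, total_gauss, power=0.2, min_count=20.0):
    """Distribute total_gauss Gaussians over states by an occupancy-power rule."""
    targets = [1] * len(state_occs)  # every state starts with one Gaussian
    # heapq is a min-heap, so priorities are negated to pop the largest first.
    heap = [(-(occ ** power), i) for i, occ in enumerate(state_occs)]
    heapq.heapify(heap)
    remaining = total_gauss - len(state_occs)
    while remaining > 0 and heap:
        _, i = heapq.heappop(heap)
        occ = state_occs[i]
        if occ / (targets[i] + 1) < min_count:
            continue  # too little data for another Gaussian; drop this state
        targets[i] += 1
        remaining -= 1
        heapq.heappush(heap, (-(occ ** power) / targets[i], i))
    return targets


# Made-up occupancy counts for three states.
print(split_targets([1000.0, 100.0, 10.0], total_gauss=10))
```

With a power well below 1, frequently seen states get more Gaussians, but far fewer than their share of the data would suggest, so rare states are not starved.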

On Fri, May 10, 2013 at 11:25 AM, Arif Khan <ife...@gm...> wrote:

> Hi,
>
> I have two questions:
>
> 1. Can we use more than two HMM topology entries in Kaldi? For
> example, "Silence phones", "Nonsilence phones" and "Short pause", where
> each of them has a different number of HMM states.
>
> 2. How can we define or control the number of Gaussians per state for
> different topology entries? For example, I want to use a different
> number of Gaussians per state for "Silence phones", "Nonsilence phones"
> and "Short pause" during training.
>
> Any help will be highly appreciated.
>
> Regards,
> Arif

Hi,

I have two questions:

1. Can we use more than two HMM topology entries in Kaldi? For
example, "Silence phones", "Nonsilence phones" and "Short pause", where
each of them has a different number of HMM states.

2. How can we define or control the number of Gaussians per state for
different topology entries? For example, I want to use a different
number of Gaussians per state for "Silence phones", "Nonsilence phones"
and "Short pause" during training.

Any help will be highly appreciated.

Regards,
Arif

Dear Kaldi Developers,

I still have not succeeded in reading the alignment file.
I ran the wsj s3 recipe and tried to read the alignment result
in the directory tri3b_ali_si284
using:

./bin/show-alignments phones.txt final.mdl ark:9.ali

but I still could not find the time alignment.
Is there any way to read both words and phonemes with
time alignment?

Thank you.
Sincerely yours,
Sakriani Sakti
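
An alignment stores one label per frame, so times have to be derived from frame counts. The sketch below assumes you already have a per-frame phone label sequence for one utterance (obtaining it from the .ali file is outside this sketch) and a 10 ms frame shift; both the function name and the frame shift are assumptions, so check the frame shift against your feature configuration.

```python
from itertools import groupby


def frames_to_segments(frame_labels, frame_shift=0.01):
    """Collapse per-frame labels into (label, start_sec, duration_sec) tuples."""
    segments = []
    start = 0  # index of the first frame of the current run
    for label, run in groupby(frame_labels):
        n = len(list(run))  # number of consecutive frames with this label
        segments.append(
            (label, round(start * frame_shift, 3), round(n * frame_shift, 3))
        )
        start += n
    return segments


# Made-up per-frame labels for illustration.
frames = ["sil"] * 3 + ["ah"] * 2 + ["sil"]
for label, begin, dur in frames_to_segments(frames):
    print(f"{label}\t{begin:.2f}\t{dur:.2f}")
```

Note that two adjacent instances of the same phone would be merged by this simple run-length grouping; telling them apart needs the HMM state information, which the sketch ignores.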

On 2013/04/26 20:14, Arnab Ghoshal wrote:
> On Fri, Apr 26, 2013 at 8:52 AM, Sakriani Sakti <ss...@is...> wrote:
>> Dear Kaldi Developers,
>>
>> I'm very new to Kaldi. I have just finished re-running the wsj recipe.
>> My questions are:
>> - How do I generate the Kaldi results in ctm format?
> Look at trunk/egs/swbd/s5b/local/score_sclite.sh
>
>> - How do I read the *.ali files?
> Use ./bin/show-alignments
>
>> - I want to re-train the WSJ model with new data, and we already
>> have mlf files (phone and word with time information) for the new
>> data, generated with HTK. Is there any way to convert them to the
>> Kaldi alignment file format?
> Your best bet will be to extract just the raw word transcript (forget
> about the time information) and run the kaldi recipe from start. You
> could use ./featbin/copy-feats to convert HTK features to Kaldi
> format.
>
>> It would be great if you could help me with this.
>> Thank you.
>> Sincerely yours,
>> Sakriani Sakti

Hi All,
It is also important to re-run the ./configure script, since kaldi.mk
has to be updated to put "-lcublas" and "-lcudart" at the end of the
g++ calls that build and link the tools.

K.

On 9.5.2013 4:07, Daniel Povey wrote:
> Thanks.
> This issue has been fixed.  I suspect you are using an out-of-date 
> version of Kaldi.  If you have done "svn up" and are still at revision 
> 2256 or thereabouts, it means you have not updated your repository to 
> the "new" sourceforge.  See instructions at http://kaldi.sf.net
> on how to change.
> Dan
>
>
>
> On Wed, May 8, 2013 at 9:51 PM, Chao Weng <cw...@gm...> wrote:
>
>     Hi All,
>
>     I just finished setting up the GPGPU environment and running the
>     experiments in Kaldi. At first, I could not compile the CUDA-related
>     programs because of "undefined references" link errors. I have now
>     found a workaround: put the CUDA-related ld flags "-lcublas" and
>     "-lcudart" at the end (i.e. after -lpthread). My system is Ubuntu
>     12.04 with gcc 4.6.3.  Just sending this in case you have the same
>     problem.
>
>     Bests,
>     Chao
