Hi,
I wanted to know where I can find the script to extract pitch features only.
I know there are scripts for MFCC + pitch and PLP + pitch features,
but I am unable to find a script that extracts only pitch features.
Thanks
Kaldi doesn't live on SourceForge anymore.
There isn't a script in steps/, but you can easily figure out how to
write one if you understand Kaldi I/O mechanisms, with reference to
the existing scripts.
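For reference, here is a rough sketch of the pipeline such a pitch-only script might run. It assumes the usual pitch binaries (compute-kaldi-pitch-feats, process-kaldi-pitch-feats) and copy-feats are on the PATH, and it mirrors the pitch branch of the MFCC + pitch script without the step that pastes in MFCCs. The helper name, the output file naming (raw_pitch_<name>) and the example paths are made up for illustration and are not part of Kaldi. The function only builds the command string; it does not run anything.

```python
import shlex


def pitch_only_pipeline(wav_scp, out_dir, name, pitch_config=None, postprocess_opts=()):
    """Build a shell pipeline string that would write pitch-only features.

    Hypothetical helper: it only assembles the command, it does not run Kaldi.
    """
    # Stage 1: raw pitch and voicing features, read from a wav.scp-style table.
    compute = ["compute-kaldi-pitch-feats"]
    if pitch_config:
        compute.append("--config=" + pitch_config)
    compute += ["scp,p:" + wav_scp, "ark:-"]

    # Stage 2: post-processing; extra options are passed through unchanged.
    process = ["process-kaldi-pitch-feats", *postprocess_opts, "ark:-", "ark:-"]

    # Stage 3: write an archive plus an scp index (assumed naming convention).
    ark = f"{out_dir}/raw_pitch_{name}.ark"
    scp = f"{out_dir}/raw_pitch_{name}.scp"
    copy = ["copy-feats", "--compress=true", "ark:-", f"ark,scp:{ark},{scp}"]

    stages = (compute, process, copy)
    return " | ".join(" ".join(shlex.quote(arg) for arg in stage) for stage in stages)


if __name__ == "__main__":
    # Example paths are assumptions; adjust them to your own data directory.
    print(pitch_only_pipeline("data/train/wav.scp", "pitch", "train",
                              pitch_config="conf/pitch.conf",
                              postprocess_opts=["--add-pov-feature=true"]))
```

The printed command can be pasted into a shell from a Kaldi recipe directory after sourcing path.sh. A full script would also split the job across CPUs and write data/train/feats.scp, as the existing scripts in steps/ do.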
On Tue, Jul 3, 2018 at 6:00 AM, Vaibhav
ervaibhavkumar@users.sourceforge.net wrote:
OK, thanks for your help, sir.
I am facing one more issue.
I ran the MFCC + pitch script with several of the available options
(--add-pov-feature, --add-normalized-pitch, etc.),
but I am getting a WER of about 55-60%.
Please suggest what I can do.
Thanks
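For readers unsure how to interpret the WER figures quoted in this thread: WER is the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words, expressed as a percentage. The sketch below is a plain Python illustration of that definition, not Kaldi's own scoring code; the function names are made up for this example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    # prev[j] holds the distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,                            # deletion
                           cur[j - 1] + 1,                         # insertion
                           prev[j - 1] + (ref_word != hyp_word)))  # substitution or match
        prev = cur
    return prev[-1]


def corpus_wer(pairs):
    """WER in percent over (reference, hypothesis) transcript pairs."""
    errors = 0
    ref_words = 0
    for ref_text, hyp_text in pairs:
        ref, hyp = ref_text.split(), hyp_text.split()
        errors += edit_distance(ref, hyp)
        ref_words += len(ref)
    if ref_words == 0:
        raise ValueError("no reference words")
    return 100.0 * errors / ref_words
```

Because insertions count as errors, WER can exceed 100%. A WER of 55-60% means that the number of word errors is more than half the number of reference words.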
Hi,
If you are using a standard speech database, can you mention which one? It is easier
to compare that way.
On Wed, 4 Jul 2018 at 01:46, Vaibhav ervaibhavkumar@users.sourceforge.net
wrote:
I am not using any standard database.
I have my own dataset of Punjabi, which is a tonal language, so I thought it would be good to add pitch features to the MFCCs, but the results are not good with or without pitch features.
What can I do?
Hi Vaibhav,
What is your dataset size, how many speakers do you have, and what are your training and
testing vocabularies? Which model did you use for testing?
To say anything about the WER we need to know at least these. Also, how many phones are in
your lexicon for Punjabi?
On Wed, 4 Jul 2018, 11:47 am Vaibhav, ervaibhavkumar@users.sourceforge.net
wrote:
28 speakers for training and 4 for testing.
90 minutes of training data.
The training vocabulary is 1400 words.
I trained mono, tri1, tri2, tri3 and SGMM models, but all of them give a WER in the range 55-65%.
And I don't think we can get better accuracy with just 90 minutes of
Indian-language data.
On Wed, Jul 4, 2018 at 12:13 PM Vaibhav ervaibhavkumar@users.sourceforge.net wrote:
--
Best regards,
K.Rohith Gowtham
The testing speakers are different.
Phone set size = 170.
Yes, the testing words exist in the 1400 words.
What can be done?
Also, I tried training and testing on the same speakers, but again the WER was in the range 60-75%.
Last edit: Vaibhav 2018-07-04
170 phonemes for Punjabi? I can see a maximum of 45 phonemes, including
silence, for this language (if you use position-dependent phones it will be 140 at most). How
did you get a 170-phone set? If you really have 170 phonemes, then the phonetic
coverage is too low for any phoneme in the training set.
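One way to check phonetic coverage is to count how often each phone occurs in the training transcripts. The sketch below assumes the lexicon has already been loaded into a dict mapping each word to a list of phones and that transcripts are plain strings of space-separated words. The function names, the toy data in the usage note and the threshold are made up for illustration.

```python
from collections import Counter


def phone_counts(lexicon, transcripts):
    """Count phone occurrences in transcripts; also collect out-of-lexicon words."""
    counts = Counter()
    oov = Counter()
    for text in transcripts:
        for word in text.split():
            if word in lexicon:
                counts.update(lexicon[word])  # add every phone of this word
            else:
                oov[word] += 1
    return counts, oov


def rare_phones(counts, phone_set, min_count):
    """Phones from phone_set seen fewer than min_count times (including never)."""
    return sorted(p for p in phone_set if counts[p] < min_count)
```

For example, with a toy lexicon {"ka": ["k", "a"], "pa": ["p", "a"]} and the transcripts ["ka pa ka", "pa xx"], the word "xx" is reported as out-of-lexicon, and any phone in the phone set that never appears in the data shows up in rare_phones. Phones that occur only a handful of times in 90 minutes of audio are unlikely to be modelled well.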
On Wed, Jul 4, 2018 at 12:24 PM Vaibhav ervaibhavkumar@users.sourceforge.net wrote:
--
Best regards,
K.Rohith Gowtham
Sorry, I looked at the wrong file.
It was 40.
How can I improve accuracy?
Add more data from more speakers and get good phonetic coverage.
On Wed, 4 Jul 2018, 1:33 pm Vaibhav, ervaibhavkumar@users.sourceforge.net
wrote:
Rohit: thanks for responding. For your guys' info, kaldi-help is the
primary location for these discussions, see kaldi-asr.org/forums.html.
There may be forums for Indian users of Kaldi too, which I am not
aware of, and these, if they exist, would be very suitable for new
users like Vaibhav.
On Wed, Jul 4, 2018 at 4:34 AM, rohit kodali
rohitgowtham@users.sourceforge.net wrote:
Hi Dan,
I actively follow the Kaldi forums. I don't think we have anything specifically
for Indian or other tonal languages, but if there is something, I
would like to help the people who are doing research on them. I have been
working on these since Kaldi began (proud to say the first comment on Kaldi was
mine when it was released), and have tried almost all experiments on tonal languages on
huge datasets that we collected ourselves.
On Wed, 4 Jul 2018, 11:50 pm Daniel Povey, danielpovey@users.sourceforge.net wrote:
Cool.
I know that in China they have various dedicated lists.
If you were to start one for Kaldi researchers in India it might be
helpful. If you do, try to set it up so the archives are searchable
(like kaldi-help) so that people can find it from Google.
Dan
On Wed, Jul 4, 2018 at 2:34 PM, rohit kodali
rohitgowtham@users.sourceforge.net wrote:
Sure, I will do that.
On Thu, 5 Jul 2018, 12:14 am Daniel Povey, danielpovey@users.sourceforge.net wrote:
Thanks for helping out.
Is there anything else I can try to reduce the WER?
Hello,
Can anyone describe the limitations of Kaldi, such as how it works on real-time data?
Can we stream data for 6-8 hours?
Speech recognition based on a time window?
See kaldi-asr.org/forums.html for how to ask questions, but your questions
are very unclear.
On Mon, Sep 10, 2018 at 7:12 AM shashi shashi1020@users.sourceforge.net
wrote:
Sorry Dan..
I'll make my question clear to you.
You'd want to break it up into smaller pieces, like 15 segments, as the
decoder doesn't handle too-long segments, but that's trivial.
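As a rough illustration of breaking a long recording into smaller pieces, the sketch below generates lines in the format of a Kaldi segments file (utterance ID, recording ID, start time, end time, in seconds). The function name and the utterance ID scheme are made up, and the 15-second default length is only an assumption for illustration; the reply above does not state a unit.

```python
import math


def make_segments(recording_id, duration, segment_len=15.0):
    """Return segments-file lines that cut one recording into fixed-length pieces.

    duration and segment_len are in seconds; the last piece may be shorter.
    """
    if duration <= 0 or segment_len <= 0:
        raise ValueError("duration and segment_len must be positive")
    lines = []
    for index in range(math.ceil(duration / segment_len)):
        start = index * segment_len
        end = min(start + segment_len, duration)
        utt_id = f"{recording_id}-{index:05d}"  # hypothetical ID scheme
        lines.append(f"{utt_id} {recording_id} {start:.2f} {end:.2f}")
    return lines
```

Fixed-length cuts can fall in the middle of a word, so a real setup would more likely cut at silences, for example with a voice activity detector. This sketch only shows the bookkeeping.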
On Tue, Sep 11, 2018 at 3:43 AM shashi shashi1020@users.sourceforge.net
wrote: