Ali - 2018-02-10

I am building a continuous speech recognition system for Urdu.

0) Can we use the content of the transcription file to build the language model? If not, why?

1) The recommended amounts of training data are:

1 hour of recording for command and control for a single speaker
5 hours of recordings of 200 speakers for command and control for many speakers
10 hours of recordings for single speaker dictation
50 hours of recordings of 200 speakers for many speakers dictation

Does this apply to both the transcription file and the language model file?

2) We have to write the Urdu phonemes in English alphabetic letters in both the an4.phone and an4.dic files, right?

3) Can we use Urdu words in the an4.dic file? For example:
ایک E YK

4) If I record 10 hours of single-speaker dictation, can I use one wav file containing all 10 hours, or do I need to split it? If I need to split it, what is the right segment duration and the right way to split?

5) I used the same data for both training and testing. Why am I not getting 0% WER?
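
To be explicit about what I mean by WER, here is a minimal sketch, assuming the usual definition (word-level edit distance divided by the number of reference words). The function name `wer` is hypothetical and this is not the scoring code used by the Sphinx tools.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

With this definition, 0% WER means every decoded utterance matches its transcription word for word.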

6) Regarding the preference for letter-only phone names without special symbols: can we use _ in a phone name? In Urdu, for example, the CISAMPA representation of one of the phones is T_SH_h.
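
If underscores are not allowed, the renaming I have in mind would look roughly like the sketch below. It assumes that simply dropping every non-letter character is acceptable, and, as a precaution of my own, it rejects names that collide when case is ignored. The function names `letters_only` and `build_phone_map` are hypothetical and not part of any Sphinx tool.

```python
import re


def letters_only(phone: str) -> str:
    """Drop every character that is not an ASCII letter, e.g. underscores."""
    return re.sub(r"[^A-Za-z]", "", phone)


def build_phone_map(phones):
    """Map each original phone name to a letter-only name.

    Raises ValueError if two different phones would end up with the same
    name (compared case-insensitively), since the renaming would then
    merge distinct phones.
    """
    mapping = {}
    seen = {}  # upper-cased new name -> original phone that produced it
    for phone in phones:
        new_name = letters_only(phone)
        if not new_name:
            raise ValueError(f"phone {phone!r} has no letters left after renaming")
        key = new_name.upper()
        if key in seen and seen[key] != phone:
            raise ValueError(
                f"phones {seen[key]!r} and {phone!r} both map to {new_name!r}"
            )
        seen[key] = phone
        mapping[phone] = new_name
    return mapping


if __name__ == "__main__":
    print(build_phone_map(["T_SH_h", "E", "YK"]))
```

The same mapping would have to be applied to both the phone list and the dictionary so that the two stay consistent.
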
Thanks in advance

 

Last edit: Ali 2018-02-10