From: Tony R. <to...@ca...> - 2014-10-25 10:12:21
Hi Dan,

I can contribute from both the research and the commercial perspectives. I'll keep the research side brief, as that has been the main theme of contributions so far. I think Kaldi should incorporate RNN acoustic models and a very efficient decoder (but I guess you could guess that I'd say this :-). That's our plan anyway - more detail is at http://cantabresearch.com/openings

From a commercial perspective I think that the Kaldi scripts could really benefit from a refactoring. At the moment they are set up from an academic viewpoint, where you know all your data at the outset, including the split into train and test. That's not the way things work for us. Vassil commented on this earlier.

* Kaldi code. I think it would be cleaner if there were a single bin directory into which all scripts and binaries were installed (no more symlinks). Running Kaldi would then be as simple as putting this directory into $PATH.

* Separate out data preparation, acoustic model generation, decoder build and test. In the commercial world you don't know the test set until it happens, so the same acoustic model will get used with many language models in many different deployments. Personally I like sourcing a config file which just has VARIABLE=VALUE definitions and which sets everything up for the rest of the script. Ideally there would be one setup and data preparation script, and then the main acoustic train/decoder build/test would be three standard scripts that didn't need changing (the config file would tell you whether you were going to stay at a GMM ML stage or progress to DNNs, etc).

* A stable release. I deal with many people who want to switch from HTK to Kaldi. I encourage them to do it, but to make their own rather than use kaldi-stable. An official stable release would lower the barrier to entry for many people, even if the releases are just called kaldi-2015H1 and kaldi-2015H2 and don't really change except to fix very serious bugs (have there been any of those lately? I think not).
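
The config-file idea in the second bullet could look roughly like the sketch below. It is only an illustration under stated assumptions: the variable names (DATA_DIR, EXP_DIR, FINAL_STAGE) and the stages_to_run helper are hypothetical and are not part of Kaldi, and the loop only prints what it would do rather than calling any real training script.

```shell
#!/bin/sh
# Hypothetical config file: plain VARIABLE=VALUE lines, nothing else.
# None of these names are real Kaldi settings.
conf=$(mktemp)
cat > "$conf" <<'EOF'
DATA_DIR=data/train
EXP_DIR=exp
FINAL_STAGE=dnn
EOF

# Sourcing the file turns each line into a shell variable.
. "$conf"
rm -f "$conf"

# Map the configured final stage to the list of stages to run, so the
# same unchanged script can stop at GMM or carry on to DNN training.
stages_to_run() {
  case "$1" in
    gmm) echo "gmm" ;;
    dnn) echo "gmm dnn" ;;
    *)   echo "unknown FINAL_STAGE: $1" >&2; return 1 ;;
  esac
}

# Placeholder for the real work: report each stage instead of running it.
for stage in $(stages_to_run "$FINAL_STAGE"); do
  echo "would run stage '$stage' on $DATA_DIR, writing to $EXP_DIR/$stage"
done
```

Changing FINAL_STAGE to gmm in the config is then the only edit needed to stop before the DNN stage; the script itself stays the same.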

Once this was done, someone would be able to check out the latest stable release, run different sorts of models from kaldi-asr.org on different data, easily adapt to different domains by changing the language model, etc. Someone suggested to me that all this would be a good project for a Google Summer of Code student; I agree.

Tony

--
** Cantab is hiring: www.cantabResearch.com/openings **
Dr A J Robinson, Founder, Cantab Research Ltd
Phone direct: 01223 778240 office: 01223 794497
Company reg no GB 05697423, VAT reg no 925606030
51 Canterbury Street, Cambridge, CB4 3QG, UK