[Kaldi-users] Poll about Kaldi

Hi Dan,

Regarding your poll about new features for the Kaldi toolkit:

1. A robust, DNN-based VAD, integrated with the online decoders
2. More decoder optimisations for speed (RTF):
	- I have seen some benchmarks <http://on-demand.gputechconf.com/gtc/2014/video/S4732-deep-learning-networks-automatic-speech-recognition.mp4> in which Python/Theano achieves a 3x speedup over Kaldi (for DNNs), on both CPU and GPU.
	- perhaps make use of techniques like caching to save on computation
	- optimisations for online decoding
3. Integrate RNN LMs more deeply into the decoder, i.e. use RNNs during the first pass of the decoder (there is an interesting paper by Microsoft here <http://research.microsoft.com/pubs/210168/rnnFirstPass.pdf>)

Thanks,

Dimitris