Could anybody tell me which parts of the implemented algorithms are processor-intensive and might be able to use hardware features to accelerate them? Perhaps some profiling data would help. I am interested because I would like to create an open-source hardware implementation applied to a real problem.
Hello David
There is no need to run profiling; speech decoder algorithms and their hardware acceleration are well studied. Decoding is essentially a best-path search in a large graph combined with frame classification. Both search and classification are resource-intensive, but they are of a different nature.
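To make the search part concrete, here is a minimal Viterbi-style best-path sketch in C over a tiny made-up state graph. The state count, frame count, emission scores and transition scores are all illustrative placeholders, not values from any real acoustic model, but the dynamic-programming loop is the kind of structure a decoder's search stage (and a hardware accelerator for it) has to implement:

```c
/* Minimal Viterbi best-path sketch over a tiny state graph.
 * All numbers (states, frames, scores) are illustrative only.
 * Compile with: cc viterbi.c -o viterbi */
#include <stdio.h>
#include <float.h>

#define N_STATES 3   /* hypothetical graph size */
#define N_FRAMES 4   /* hypothetical utterance length */

int main(void) {
    /* Frame classification output: log-likelihood of each state per frame
     * (in a real decoder this comes from the acoustic model). */
    double emit[N_FRAMES][N_STATES] = {
        {-1.0, -2.0, -3.0},
        {-2.5, -0.5, -2.0},
        {-3.0, -1.0, -0.7},
        {-2.0, -2.0, -0.3},
    };
    /* Transition log-probabilities between states (illustrative values). */
    double trans[N_STATES][N_STATES] = {
        {-0.5, -1.0, -2.0},
        {-2.0, -0.5, -1.0},
        {-3.0, -2.0, -0.3},
    };

    double score[N_FRAMES][N_STATES];
    int back[N_FRAMES][N_STATES];

    /* Initialise with the first frame's emission scores. */
    for (int s = 0; s < N_STATES; s++) {
        score[0][s] = emit[0][s];
        back[0][s] = -1;
    }

    /* Dynamic programming: for each frame and state, keep the best predecessor. */
    for (int t = 1; t < N_FRAMES; t++) {
        for (int s = 0; s < N_STATES; s++) {
            double best = -DBL_MAX;
            int best_prev = 0;
            for (int p = 0; p < N_STATES; p++) {
                double cand = score[t - 1][p] + trans[p][s];
                if (cand > best) { best = cand; best_prev = p; }
            }
            score[t][s] = best + emit[t][s];
            back[t][s] = best_prev;
        }
    }

    /* Pick the best final state and trace the path back through the table. */
    int path[N_FRAMES], s = 0;
    for (int i = 1; i < N_STATES; i++)
        if (score[N_FRAMES - 1][i] > score[N_FRAMES - 1][s]) s = i;
    for (int t = N_FRAMES - 1; t >= 0; t--) { path[t] = s; s = back[t][s]; }

    printf("best path:");
    for (int t = 0; t < N_FRAMES; t++) printf(" %d", path[t]);
    printf("  (score %.2f)\n", score[N_FRAMES - 1][path[N_FRAMES - 1]]);
    return 0;
}
```

In a real decoder the graph has millions of arcs and the inner loop over predecessors dominates, which is why both GPU and custom hardware work focuses on parallelising that loop together with the frame classification scores that feed it.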
You can probably get some insights from recent papers; one of them is:
Efficient Automatic Speech Recognition on the GPU
Jike Chong, Ekaterina Gonina and Kurt Keutzer
http://www.eecs.berkeley.edu/~egonina/docs/gpucg_speech.pdf
but I suggest you search for more, since the subject is quite well covered.