Intent to write a tool that can do gender (male, female -- say for adults
only) / age (infant, teen, adult) / emotion (irritation, fear, rage,
normal...) - determination / classification of speakers, from recorded /
streamed voice (not in real-time). Would CMU Sphinx be a good place to start,
and could it be used for such tasks? Of course, the idea is not to use Sphinx
as-is, but to reuse certain key components.
The requirement is that 70% accuracy, with 30% flagged with appropriate
confidence levels, is quite acceptable.
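One way to read this requirement is sketched below in Python. It is only an illustration under assumptions: the function name `classify_with_flag`, the 0.7 threshold, and the idea that the classifier returns one non-negative score per label are all hypothetical, and nothing here comes from Sphinx or any other toolkit. The sketch normalizes the scores, picks the best label, and flags the result for review when its share of the total falls below the threshold.

```python
from typing import Dict, Tuple


def classify_with_flag(
    scores: Dict[str, float], threshold: float = 0.7
) -> Tuple[str, float, bool]:
    """Return (label, confidence, flagged) for one utterance.

    `scores` maps each candidate label (e.g. "male", "female") to a
    non-negative score from some upstream classifier (hypothetical).
    `threshold` is an assumed cut-off, not a recommended value.
    """
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must contain at least one positive value")

    # Pick the label with the highest score.
    label, best = max(scores.items(), key=lambda kv: kv[1])

    # Confidence is the winning label's share of the total score.
    confidence = best / total

    # Flag the decision when the confidence is below the threshold.
    return label, confidence, confidence < threshold
```

For example, `classify_with_flag({"male": 9, "female": 1})` accepts "male" without a flag, while a near-tie such as `{"male": 11, "female": 9}` is still labelled "male" but flagged, so it can be reported together with its confidence level.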
If someone here is aware of prior work in this field, especially open-source,
I would appreciate pointers to it.
I have to admit that this isn't my field of specialization, so I appreciate
any help.
Mike D.
Sorry.
s/intent/intend/g
CMUSphinx currently doesn't provide any tools for that. You can use MARF, for
example:
http://marf.sourceforge.net/
Or LIUM speaker diarization:
http://lium3.univ-lemans.fr/diarization/doku.php/welcome
Thanks for the pointers, @nshmyrev. The LIUM page doesn't open for me. Will
check out MARF to begin with.
Finally managed to reach the LIUM page. It seems to be pretty much what I need,
and I plan to play with it extensively. Thanks again.