I am currently looking into a proof-of-concept transcription system for doctors in an A&E department. There are a dozen doctors with various accents of English, both British and non-British (for example, Indian English).

Firstly, how should the setup cater for diverse accents? For example, would I need multiple acoustic and language models?

Secondly, should I create a new acoustic model, bearing in mind that the acoustic models available for CMU Sphinx are for US English, or should I adapt one of the available models for British English?
Hello
> For example, would I need multiple acoustic and language models?
For the proof of concept, no; for further operation, maybe yes.
> Secondly, should I create a new acoustic model, bearing in mind that the acoustic models available for CMU Sphinx are for US English, or should I adapt one of the available models for British English?
The design of the system must be considered in the context of the project you want to accomplish. You can train user-specific models, use adaptation profiles, or just use a generic model. A user-specific model requires more than 20 hours of accurate transcription for that specific user, so it's up to you to decide whether you want to record each of your doctors. Adaptation improves results with 20 minutes of dictated text. A generic model does not require any enrollment. The accuracies will differ, and so will the implementation complexity.
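To make the generic-model option concrete, here is a minimal decoding sketch, assuming the pocketsphinx Python bindings; the model paths and file names are placeholders for whichever generic acoustic model, language model, and dictionary you pick, and the commented -mllr line marks where a per-speaker MLLR adaptation transform could be loaded if you later go the adaptation route:

    import os
    from pocketsphinx import Decoder

    MODEL_DIR = '/path/to/model'  # placeholder: wherever your models live

    config = Decoder.default_config()
    config.set_string('-hmm', os.path.join(MODEL_DIR, 'en-us'))        # acoustic model directory
    config.set_string('-lm', os.path.join(MODEL_DIR, 'en-us.lm.bin'))  # language model
    config.set_string('-dict', os.path.join(MODEL_DIR, 'cmudict-en-us.dict'))  # pronunciation dictionary
    # config.set_string('-mllr', 'doctor1.mllr')  # optional per-speaker adaptation transform

    decoder = Decoder(config)

    # Decode a mono 16 kHz, 16-bit PCM raw recording (file name is a placeholder).
    decoder.start_utt()
    with open('dictation.raw', 'rb') as f:
        while True:
            buf = f.read(1024)
            if not buf:
                break
            decoder.process_raw(buf, False, False)
    decoder.end_utt()

    if decoder.hyp() is not None:
        print(decoder.hyp().hypstr)

Swapping models or adding an adaptation transform is only a configuration change here, which is why the generic model needs no enrollment.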
For the proof of concept you can go with a generic British model. You will have far more issues with the language model for medical dictation.
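To illustrate the vocabulary side of that problem, the sketch below checks which words from a medical term list are missing from a stock Sphinx pronunciation dictionary; the dictionary path and the sample terms are hypothetical:

    def load_dict_words(dict_path):
        """Collect the words defined in a Sphinx pronunciation dictionary."""
        words = set()
        with open(dict_path, encoding='utf-8') as f:
            for line in f:
                if line.strip():
                    # Entries look like "word PH ON EME S"; alternates like "word(2)".
                    words.add(line.split()[0].split('(')[0].lower())
        return words

    known = load_dict_words('cmudict-en-us.dict')  # placeholder path

    medical_terms = ['haematemesis', 'tachycardia', 'paracetamol', 'cannula']
    oov = [term for term in medical_terms if term not in known]
    print('Out-of-vocabulary terms:', oov)

Every out-of-vocabulary term needs a pronunciation entry, and the language model itself has to be trained or interpolated on medical dictation text so that such words receive realistic probabilities.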
Thank you for your response. With regard to the issues with medical dictation, would you mind elaborating on what I may encounter?
Hello
There are quite a few things to consider. Contact me if you want to discuss this: nshmyrev@nexiwave.com