Hi.
I am using the open-source acoustic model provided for the Sphinx3 decoder as my base model.
I have a set of utterances with transcriptions to use for adapting the base model. I used sphinx3_align to
force-align these utterances against the base model, and then did MLLR+MAP adaptation as described in http://www.cs.cmu.edu/~archan/presentation/MAP.pdf
Now I have another set of utterances with transcriptions for adaptation.
Do I force-align the new set of utterances against the base acoustic model, or do I force-align it
against the new/updated acoustic model from my previous adaptation?
Thank you.
If I understood correctly, I'd say it depends. Consider two cases:
Generic model -> adapted female model -> male model for adaptation. Here it's probably better to align with the generic model.
Generic model -> adapted female model -> larger female model for the same speaker. Here you should probably try the adapted model instead.
Actually, you can try both :)
I see. Thank you.
In my case I have the generic model -> adapted to male utterances from one speaker -> then adapted to more utterances from the same speaker. From what you said, every time I collect new utterances from the same speaker, I force-align them with that speaker's most recent adapted model. Is that correct? Thank you. :D
> is that correct?
It is.
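For what it's worth, the rule of thumb from this thread can be written down as a small sketch. Everything in it (the `AdaptedModel` record, `pick_alignment_model`, the directory names) is hypothetical bookkeeping and not part of Sphinx3; it only decides which model directory you would then hand to sphinx3_align.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AdaptedModel:
    """Hypothetical record of one adaptation round (not a Sphinx3 structure)."""
    speaker: str      # speaker the model was adapted to
    round_no: int     # 1 for the first adaptation, 2 for the next, ...
    model_dir: str    # directory holding the adapted acoustic model


def pick_alignment_model(base_dir: str,
                         history: List[AdaptedModel],
                         speaker: str) -> str:
    """Return the model directory to force-align new utterances against.

    Same speaker already adapted: use that speaker's most recent model.
    New or different speaker: fall back to the generic base model.
    """
    same_speaker = [m for m in history if m.speaker == speaker]
    if not same_speaker:
        return base_dir
    return max(same_speaker, key=lambda m: m.round_no).model_dir


# Made-up adaptation history, for illustration only.
history = [
    AdaptedModel("female_a", 1, "models/female_a_r1"),
    AdaptedModel("male_b", 1, "models/male_b_r1"),
    AdaptedModel("male_b", 2, "models/male_b_r2"),
]
print(pick_alignment_model("models/generic", history, "male_b"))
print(pick_alignment_model("models/generic", history, "male_c"))
```

This only encodes the default choice. As said above, trying both the generic and the adapted model and comparing the results is still a reasonable thing to do.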