From: Kirill K. <kir...@sm...> - 2015-05-11 19:39:04
> -----Original Message-----
> From: Daniel Povey [mailto:dp...@gm...]
> Sent: 2015-05-11 11:58
>
> > Many recipes use per-speaker adapted features. I am working in a
> > field where we rarely see the same speaker again. To be able to compare
> > apples to apples, I want to avoid the dependency on per-speaker
> > adaptation. So one thing I want to do is drop utt <-> spk maps and use
> > global LDA and per-sentence fMLLR adaptation only.
>
> fMLLR applied globally does not really make sense, as there would be no
> adaptation going on. I recommend to make the utt<->spk maps one-to-one
> (make the utterance-id and speaker-id identical). You can see whether
> fMLLR helps, and drop it if not; or try basis fMLLR.

Right, that was what I meant by per-utterance fMLLR; I mistyped "per-sentence", too much NLP recently. :)

> (Note: by default we don't enable variance normalization, so CMVN is a
> bit of a misnomer).

Thanks for pointing this out. I had not realized the V part was not there.

-kkm
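For concreteness, here is a minimal sketch of the one-to-one mapping Dan suggests, applied to a Kaldi-style data directory. The directory path "data/train" is just an example, and the script itself is illustrative, not part of any recipe:

    #!/usr/bin/env python3
    """Make a Kaldi data directory's utt2spk/spk2utt maps one-to-one,
    so each utterance is treated as its own speaker (per-utterance
    fMLLR/CMN). Illustrative sketch; data/train is an assumed path."""

    import sys

    data_dir = sys.argv[1] if len(sys.argv) > 1 else "data/train"

    # Read the existing utt2spk to collect the utterance ids.
    with open(f"{data_dir}/utt2spk") as f:
        utts = [line.split()[0] for line in f if line.strip()]

    # Speaker-id == utterance-id: no statistics are shared across
    # utterances when estimating CMN or fMLLR transforms.
    with open(f"{data_dir}/utt2spk", "w") as f:
        for utt in utts:
            f.write(f"{utt} {utt}\n")

    # spk2utt is the trivial inverse of the one-to-one map.
    with open(f"{data_dir}/spk2utt", "w") as f:
        for utt in utts:
            f.write(f"{utt} {utt}\n")

After rewriting the maps you would still want to re-run utils/fix_data_dir.sh and recompute the CMVN stats, since the existing per-speaker statistics no longer match the new speaker ids.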