CMU Sphinx / Forums / Help: Difference between retraining and adaptation.

ben - 2008-07-14

Hi, there,

any insights on the difference between re-training and adaptation? I suppose re-training the model is a more thorough use of new data that was collected after the initial training was done? I also suppose it recalculates the parameters in the model from the ground up.

btw: I note that over-training has been mentioned quite a few times in the forum, but I still don't have a clear idea of what it means. Does it mean the model is too specific to a speaker or a group of speakers, and therefore not balanced for other speakers? Or does it mean that too many iterations were used in the training process (as mentioned in the SphinxTrain tutorial)?

Thanks in advance,

Ben

- Nickolay V. Shmyrev - 2008-07-15
  
  > any insights on the difference between re-training and adaptation? I suppose re-training the model is a more thorough use of new data that was collected after the initial training was done? I also suppose it recalculates the parameters in the model from the ground up.
  
  Something like that. The details depend on the type of adaptation you are using. For example, MLLR adaptation is just a matrix that applies a linear transform to the Gaussian means of the model. With a retrained model you get a more complicated transformation of the original model, but you need a lot of data to train the proper distribution.
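
To make the "just a linear transform" point concrete, here is a minimal sketch, not SphinxTrain code. In the common formulation of MLLR the transform is affine and is applied to every Gaussian mean of the model, mu' = A * mu + b (a feature-space variant also exists). The function name `apply_mllr` and all numbers below are hypothetical, and estimating `A` and `b` from the adaptation data is not shown.

```python
def apply_mllr(means, A, b):
    """Return adapted means, mu' = A * mu + b, for each Gaussian mean.

    means: list of mean vectors (one per Gaussian)
    A:     square matrix as a list of rows (hypothetical values)
    b:     bias vector (hypothetical values)
    """
    adapted = []
    for mu in means:
        adapted.append([
            sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)
        ])
    return adapted


# Two made-up 2-dimensional Gaussian means from an "original" model.
means = [[1.0, 2.0], [3.0, 4.0]]
A = [[1.0, 0.0], [0.0, 2.0]]
b = [0.5, -1.0]

adapted_means = apply_mllr(means, A, b)
print(adapted_means)  # [[1.5, 3.0], [3.5, 7.0]]
```

The whole adaptation is captured by `A` and `b`, which is why far less data is needed than for re-estimating every parameter of the model.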
  
  > btw: I note that over-training has been mentioned quite a few times in the forum, but I still don't have a clear idea of what it means. Does it mean the model is too specific to a speaker or a group of speakers, and therefore not balanced for other speakers?
  
  Training is the process of fitting the model to the training data, and it converges toward a perfect fit. As you train, recognition accuracy on the training data increases, up to 100%. But if the training data is small or not representative enough, that doesn't mean the same model will decode test data as well. Test data performance can even decrease. For example, if you train on the speech of a single speaker, the model most probably will not decode other speakers at all.
  
  The only way to solve this issue is to get training data that is representative enough.
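
The single-speaker case can be illustrated with a toy sketch. This is not an acoustic model: a nearest-mean classifier on one-dimensional features stands in for one, and every number is made up. It only assumes that a second speaker produces the same sounds with shifted features.

```python
# Toy illustration only: 1-D "features" and a nearest-mean classifier
# stand in for a real acoustic model. All values are hypothetical.

def train(samples):
    """Estimate one mean per class from (feature, label) pairs."""
    by_label = {}
    for feature, label in samples:
        by_label.setdefault(label, []).append(feature)
    return {label: sum(v) / len(v) for label, v in by_label.items()}


def classify(model, feature):
    """Pick the class whose mean is closest to the feature."""
    return min(model, key=lambda label: abs(feature - model[label]))


def accuracy(model, samples):
    """Fraction of samples that the model labels correctly."""
    correct = sum(1 for f, label in samples if classify(model, f) == label)
    return correct / len(samples)


speaker_a = [(1.0, "a"), (1.2, "a"), (0.8, "a"),
             (3.0, "b"), (3.2, "b"), (2.8, "b")]
# Speaker B produces the same sounds, with every feature shifted by +1.5.
speaker_b = [(f + 1.5, label) for f, label in speaker_a]

single = train(speaker_a)              # trained on one speaker only
pooled = train(speaker_a + speaker_b)  # trained on both speakers

print(accuracy(single, speaker_a))  # 1.0
print(accuracy(single, speaker_b))  # 0.5
print(accuracy(pooled, speaker_a), accuracy(pooled, speaker_b))  # 1.0 1.0
```

The single-speaker model is perfect on its own training data and no better than chance on the other speaker, while the model trained on both speakers handles both.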
  
  - ben - 2008-07-15
    
    As always, thanks, Nickolay.
    
    By balanced and representative, I suppose you mean the speaker composition of the training corpus. This composition should be close to the target speakers in the real environment. Am I right?
    
    One more question. I haven't checked this out (but I think you can answer it in no time): in order to re-train, do I need the feature files? Note that many open source acoustic models don't come with the feature files. I suppose it would be nice to grow (re-train) my model from the existing models. How about adaptation? Do I need the feature files for that? I suppose it also depends on the type of adaptation?
    
    thanks,
    
    Ben
    
    - Nickolay V. Shmyrev - 2008-07-15
      
      > This composition should be close to the target speakers in the real environment. Am I right?
      
      Yes
      
      > in order to re-train, do I need the feature files?
      
      Yes, how can you imagine training without files?
      
      > Note that many open source acoustic models don't come with the feature files.
      
      To be honest, there is no good freely available database, but you can buy WSJ, for example; it's not very expensive.
      
      > How about adaptation? Do I need the feature files for that?
      
      For adaptation you only need the target data and a model that you will adapt to that data.
      
      - ben - 2008-07-15
        
        thanks for clarifying.
        
        Ben
        
