CMU Sphinx / Forums / Sphinx4 Help: Why use 8 kHz samples for telephone and not 16 kHz?

Toine db - 2014-10-31

I'm finalizing/optimizing the acoustic model I need, but I'm wondering what the idea is behind not using 16 kHz samples when creating the model for telephone.

I have created the Windows Phone sample for PocketSphinx, and I think it's implemented with 16 kHz and it works.
What is the reason for using 8 kHz for telephone?

And a small follow-up question:
Does the 8 kHz sample need to be 8 bits?
And wouldn't it be better to use 32-bit samples if you have them?

Last edit: Toine db 2014-10-31

- Alexander Solovets - 2014-10-31
  
  This is because telephone audio is cut off at frequencies above 4 kHz.
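
The reasoning behind this answer is the sampling theorem: a signal with no content above some frequency is fully captured by a sample rate of at least twice that frequency. Taking the 4 kHz cutoff quoted above as a round figure (an assumption, not an exact channel specification):

```latex
f_s \;\ge\; 2 f_{\max} \;=\; 2 \times 4\,\text{kHz} \;=\; 8\,\text{kHz}
```

Sampling telephone audio at 16 kHz would therefore only add an empty band between 4 kHz and 8 kHz.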
  
  - Toine db - 2014-10-31
    
    Thanks for the reply.

    It's cut off above 4 kHz, but when I record with the Windows Phone microphone I get 16 kHz...

    Do you know the reason why 16-bit samples are required for the model (and not 8 or 32)?
    (http://cmusphinx.sourceforge.net/wiki/tutorialam#setting_up_the_training_scripts)

    Again, thanks for the reply.
    
    - Alexander Solovets - 2014-10-31
      
      Telephone speech is transmitted over a cellular network or a PSTN
      line. If you record audio with a microphone, regardless of whether
      it's a mobile phone or a computer, you should use 16 kHz audio models.
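
This rule of thumb can be turned into a quick check on a recording. The sketch below uses Python's standard `wave` module to read a WAV header. `describe_wav`, `suggested_model_rate`, and `make_silent_wav` are hypothetical helper names, not part of PocketSphinx, and the 8000 Hz threshold is an assumption based on the advice above.

```python
import io
import wave


def describe_wav(wav_file):
    """Return (sample_rate_hz, sample_width_bits, channels) for a WAV file.

    `wav_file` may be a path or an open binary file-like object.
    """
    with wave.open(wav_file, "rb") as w:
        return w.getframerate(), w.getsampwidth() * 8, w.getnchannels()


def suggested_model_rate(sample_rate_hz):
    """Hypothetical helper: pick the model family from the sample rate.

    Assumption: audio sampled at 8 kHz or less is telephone-band audio
    and matches an 8 kHz model; anything wider matches a 16 kHz model.
    """
    return 8000 if sample_rate_hz <= 8000 else 16000


def make_silent_wav(sample_rate_hz, sample_width_bytes=2, n_frames=160):
    """Build a tiny in-memory mono WAV so the example runs without a file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(sample_width_bytes)
        w.setframerate(sample_rate_hz)
        w.writeframes(b"\x00" * sample_width_bytes * n_frames)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    rate, bits, channels = describe_wav(make_silent_wav(16000))
    print(rate, bits, channels, suggested_model_rate(rate))
```

For a real recording, pass its path to `describe_wav`. Audio captured at some other rate (44.1 kHz, for instance) would typically need resampling to the model's rate first.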
      
      - Toine db - 2014-10-31
        
        Great, that was the answer I was looking for :-)
        
        Many Thanks Alexander.
        
        PS: do you know if the bit ratio has any effect? 16/32 bits?
        (Sounds like 32-bit is better, but if pocketsphinx won't work with this...)
        
        
        Nickolay V. Shmyrev - 2014-11-01
        
        > PS: do you know if the bit ratio has any effect? 16/32 bits?

        32 bits is not a "bit ratio" but a "sample bit width".

        http://music.columbia.edu/cmc/musicandcomputers/chapter2/02_05.php

        A 32-bit sample width is not required for speech recognition.

        > (Sounds like 32-bit is better, but if pocketsphinx won't work with this...)

        Yes
        
        
        Toine db - 2014-11-05
        
        Thanks for the link
        
        I'm sorry to ask again: does pocketsphinx support 32-bit, and does it work better with 32-bit?

        The real question is:
        Would you downsample 32-bit samples to 16-bit or just use the 32-bit?
        
        Thanks for the support
        
        
        Nickolay V. Shmyrev - 2014-11-05
        
        > I'm sorry to ask again; does pocketsphinx support 32bit

        No

        > and does it work better with 32bit?

        No

        > Would you downsample 32bit samples

        This process is not called "downsampling"; it's just format conversion. Downsampling is when you change the sample rate, for example from 16 kHz to 8 kHz.

        > to 16bit or just use the 32bit?

        Just use 16-bit in the recorder.
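
The difference between the two operations can be illustrated with a small sketch. It assumes raw signed little-endian PCM and uses only Python's standard `struct` module. `pcm32_to_pcm16` and `halve_sample_rate` are hypothetical helper names, and the pair averaging is a deliberately crude stand-in for a real resampler.

```python
import struct


def pcm32_to_pcm16(data):
    """Format conversion: 32-bit signed little-endian PCM -> 16-bit PCM.

    The number of samples and the sample rate stay the same; only the
    width of each sample changes.
    """
    n = len(data) // 4
    samples = struct.unpack("<%di" % n, data[: n * 4])
    # Keep the 16 most significant bits of every sample.
    return struct.pack("<%dh" % n, *(s >> 16 for s in samples))


def halve_sample_rate(samples):
    """Crude downsampling by 2: average each pair of neighbouring samples.

    This halves the number of samples per second (e.g. 16 kHz -> 8 kHz).
    A real resampler would use a proper anti-aliasing filter instead.
    """
    return [(a + b) // 2 for a, b in zip(samples[0::2], samples[1::2])]


if __name__ == "__main__":
    raw32 = struct.pack("<4i", 0, 65536, -65536, 0x7FFF0000)
    raw16 = pcm32_to_pcm16(raw32)
    # Same number of samples, half the bytes.
    print(len(raw32) // 4, len(raw16) // 2)
    # Half the number of samples, same width.
    print(halve_sample_rate([0, 2, 4, 6]))
```

Format conversion leaves the sample count untouched, while downsampling reduces it; as noted above, the simplest route is to record in 16-bit from the start.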
        
        
        Toine db - 2014-11-07
        
        Great answer, everything I need to know to make the best models.
        
        So all 16 bits.
        
        (Sorry for the downsampling/conversion name mix-up; I'm glad you understood the question.)
        