I am preparing a config file defining multiple instances of the same recognizer configuration for use in a multi-threaded environment.
I have been advised that I should multiply define everything from the recognizer to the frontend including the linguist.
My question is whether the items referred to by the frontend (dataSource, speechClassifier) must also be multiply defined, or whether they are "thread-safe". In more general form: what must be multiply defined, and what need not be?
This raises the more general question of the "thread-safety" of Sphinx4.
Anyone have any experience with this?
If it isn't "thread-safe" are there plans to make it so?
Jim
Jim,
The front end components are NOT thread-safe, and therefore you should create multiple instances of ALL of them (including the StreamDataSources). I believe all the other Sphinx-4 components are not thread-safe either. As far as I know, there are currently no plans to make them so.
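To make the duplication concrete, here is a sketch of what such a config might look like. This is illustrative only: the component names (recognizer1, frontEnd1, dataSource1, ...) are invented for the example, and the class names and property layout follow the usual Sphinx-4 XML config conventions rather than a tested config.

```xml
<config>
    <!-- Shared, immutable once loaded: defined ONCE -->
    <component name="acousticModel"
               type="edu.cmu.sphinx.linguist.acoustic.tiedstate.TiedStateAcousticModel">
        <!-- loader, unitManager, etc. -->
    </component>

    <!-- Instance 1: a full recognizer-to-datasource stack with unique names -->
    <component name="recognizer1" type="edu.cmu.sphinx.recognizer.Recognizer">
        <property name="decoder" value="decoder1"/>
    </component>
    <component name="frontEnd1" type="edu.cmu.sphinx.frontend.FrontEnd">
        <propertylist name="pipeline">
            <item>dataSource1</item>
            <item>speechClassifier1</item>
            <!-- remaining pipeline stages, all instance-1 copies -->
        </propertylist>
    </component>
    <component name="dataSource1"
               type="edu.cmu.sphinx.frontend.util.StreamDataSource"/>

    <!-- Instance 2: duplicate the whole stack again as recognizer2,
         frontEnd2, dataSource2, ... all referencing the same shared
         acousticModel -->
</config>
```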
philip
If the acoustic model is "thread-safe" should I consider unitManager, sphinx3Loader, and logMath "safe" as well?
Jim
Yep, when I say acoustic model, I mean anything that is managed by the AcousticModel interface. Also, when I say thread safe, I am referring to the model after it is loaded. You should make sure that all of the configuring and loading occurs in a single thread.
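The "configure and load in a single thread" advice can be sketched as follows. Note this is a minimal illustration of the pattern with stand-in classes (SharedModel, Decoder), not the real Sphinx-4 API: the shared object plays the role of the loaded acoustic model, and the per-thread object plays the role of the front end / search stack.

```java
import java.util.ArrayList;
import java.util.List;

public class LoadThenShare {
    // Immutable once constructed: safe to share across threads,
    // like an acoustic model after loading completes.
    static final class SharedModel {
        final String name;
        SharedModel(String name) { this.name = name; }
    }

    // Stateful during an utterance: one instance per thread,
    // like a front end or search manager.
    static final class Decoder {
        final SharedModel model;
        int framesSeen = 0; // per-utterance state, never shared
        Decoder(SharedModel model) { this.model = model; }
        String decode(int frames) {
            framesSeen += frames;
            return model.name + ":" + framesSeen;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // 1. ALL configuring and loading happens in one thread (here, main).
        SharedModel am = new SharedModel("wsj8k");
        List<Decoder> decoders = new ArrayList<>();
        for (int i = 0; i < 4; i++) decoders.add(new Decoder(am));

        // 2. Only after loading is complete, hand each worker thread
        //    its own decoder stack; the model itself is shared read-only.
        List<Thread> threads = new ArrayList<>();
        for (Decoder d : decoders) {
            Thread t = new Thread(() -> System.out.println(d.decode(10)));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) t.join();
    }
}
```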
HTH
Paul
Jim:
There are some components that are definitely stateless and therefore, once created, configured, and initialized, are thread-safe. This includes the acoustic models and the language models (although there is some caching in the latter that could be affected by multiple separate recognizer clients). Grammars and linguists may also be thread-safe, although we have not tested this, and there may be some issues that need to be addressed.
The search manager itself maintains quite a bit of state during an utterance, so it is not thread-safe and will likely never be.
Paul
Understood and as I expected.
To be safe, should I just take a single "instance" config file, create "n" instances with each component named uniquely with proper references, and build a new config file containing all?
This then raises the following question. Given "n" instances and the WSJ 8K support for numbers only, how big might each "instance" be ("ballpark", SWAG, WAG)?
I assume the total would be "n" * "instance-size".
Jim
> To be safe, should I just take a single "instance" config file, create "n" instances with each component named uniquely with proper references, and build a new config file containing all?
That will be the safest. You can of course share the immutable components (acoustic model etc.) without having to duplicate them. The largest component of the search is the acoustic model. Since the AM can be shared, the incremental cost per 'recognizer stack' is on the order of 5MB or less (a definite WAG).
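So the total is not simply "n" * "instance-size": the shared components are paid for once, and only the per-stack increment scales with "n". A back-of-the-envelope sketch, where both figures are assumptions for illustration (a 20MB shared AM and the ~5MB-per-stack WAG above), not measurements:

```java
public class FootprintEstimate {
    // Shared immutable components are paid once; per-thread stacks scale with n.
    static long totalMB(long sharedModelMB, long perStackMB, int n) {
        return sharedModelMB + (long) n * perStackMB;
    }

    public static void main(String[] args) {
        // e.g. 4 recognizer stacks against one shared acoustic model
        System.out.println(FootprintEstimate.totalMB(20, 5, 4) + "MB"); // prints "40MB"
    }
}
```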
Paul
BTW, it would appear that each recognizer takes 28MB. Could someone take a look at my config file to ensure that I got the separation of shared and unique components correct?
Sure thing .... just email it to me.