CMU Sphinx / Forums / Help: Model file formats

cerisara - 2007-06-14

Sorry for my last unfinished post - I write too fast and
hit the wrong key from time to time :-)

I just wanted to say that I don't want to set up a competition
between HTK and S4, but I think this kind of interoperability
is really important, especially as HTK is a kind of standard now.
If S4 supports the HTK format, then I think many more people will
use it and benefit from the advantages of both systems (adaptation
for HTK, large vocabulary for S4, ...).

Thanks again !
Best regards,
Christophe

- Nagendra Kumar Goel - 2007-06-14
  
  I fully support your argument and will be happy to help if I can.
  I feel that the Sphinx family of decoders (2, 3, 4) is great
  and has good optimizations, though it needs more work, like everything else.
  
  An HTK interface will only help by relieving the effort spent on training,
  so one can focus better on the decoder.
  
  - cerisara - 2007-06-19
    
    Hello,
    
    This is a follow-up about HTK models loading...
    
    Some news: it looks like my HTKLoader is working, but...
    (there's always a but!) I had to make several "assumptions",
    especially with regard to 1ph and 2ph, which are not used in the same way
    in HTK and S4. Actually, it is not very clear to me what the best way
    to handle these 1ph and 2ph is, but at least I have a working baseline framework.
    
    This framework can certainly be improved in many ways, but I think that if I try to do that alone, I might miss lots of details.
    
    Also, I have not yet done a real accuracy comparison, because:
    1- it requires a lot of work: weight tuning, output normalisation, ...
    2- I'm pretty sure S4 results will be lower than Julius' results,
    because of these "assumptions".
    I have just checked that I get reasonable results on a few sentences
    of a large vocabulary task.
    
    In my code, I tried to respect the S4 philosophy, and it can be applied as a patch, but it would be better to have it in SVN; otherwise
    it will inevitably "diverge" from the S4 code and sooner or later
    become incompatible.
    But of course, this first requires at least some unit testing...
    I might look into that if you believe it can be integrated into SVN.
    
    I don't know how or where to upload/post this patch so that you can have
    a look; if you're interested, please just let me know.
    
    Thank you !
    Christophe
    
    - Nickolay V. Shmyrev - 2007-06-21
      
      Amazing work. But what stops you from publishing the patch? For example, this page has a Tracker/Patches link in the header, where patches to Sphinx can in principle be uploaded.
      
      - cerisara - 2007-06-21
        
        OK, I uploaded it to the "tracker-patch" area.
        I was reluctant to do it because the software is not
        satisfactory yet: accuracy is not as good as expected,
        but I need the help of others to find out why anyway.
        
        You can look at the beginning of the file HTKLoader.java,
        where I tried to write down all the "assumptions" I had to make.
        
        Regards,
        Christophe
        
