PocketSphinx diagrams - do these make sense?

Mathijs
2010-04-27
2012-09-22
  • Nickolay V. Shmyrev

    Hello

    Nice diagrams. Are they generated automatically from the source or from a
    runtime execution trace? It would indeed be nice to put something like this
    into the documentation, probably in a more compressed form without minor
    branches such as error handling.

     
  • Mathijs

    Mathijs - 2010-04-28

    Thanks for your suggestion. Removing error branches might be a good idea, as
    it simplifies the diagrams a great deal. The diagrams were drawn in Dia (a
    charting application which I wouldn't recommend because it's quite buggy).

    Would you (or anyone else familiar with PocketSphinx) care to look through the
    comments we placed on the right in the diagrams linked to in my previous post?
    I'm especially interested in conceptual errors I might have made.

     
  • Nickolay V. Shmyrev

    "I'm especially interested in conceptual errors I might have made."

    Well, these are not conceptual errors, but it would be nice to fix several
    things:

    1. Don't mix important things (core functions) with unimportant ones (counter names; does it really matter which counter you increment?).

    2. Follow the target domain. ASR in PocketSphinx actually follows this scheme:

    Audio Input -> Endpointer Calibration

    Audio Input -> Endpointing -> Frame Sequencing -> DFT -> Mel Filters -> Cepstra
    -> Normalization -> Feature Calculation -> Feature Set

    Feature Set -> Tree Search -> Flat Search -> Bestpath Search

    Tree Search = Init -> Find Best Tokens -> Grow -> Prune

    Flat Search = Same, but using the tree result from the first pass

    Bestpath Search = Lattice Viterbi Forward + Backward
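
    To make the front-end chain above concrete, here is a minimal pure-Python
    sketch of Frame Sequencing -> DFT -> Mel Filters -> Cepstra -> Normalization
    -> Feature Calculation. It is not PocketSphinx code: the function names, the
    frame and filter sizes, the naive DFT, and the use of mean normalization and
    simple deltas are all assumptions chosen for illustration. Endpointing,
    calibration, and windowing are left out.

```python
import cmath
import math


def frames(samples, size, step):
    """Frame sequencing: split the signal into overlapping frames."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, step)]


def power_spectrum(frame):
    """DFT: naive O(n^2) transform, keeping the power of bins 0..n/2."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) ** 2
            for k in range(n // 2 + 1)]


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Mel filters: triangular weights spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = (sample_rate / 2.0) / (n_bins - 1)
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for b in range(n_bins):
            f = b * bin_hz
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def cepstra(spec, bank, n_ceps):
    """Cepstra: log mel energies followed by a DCT-II."""
    # Floor the energy so an empty filter does not produce log(0).
    logs = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
            for row in bank]
    n = len(logs)
    return [sum(v * math.cos(math.pi * c * (j + 0.5) / n)
                for j, v in enumerate(logs))
            for c in range(n_ceps)]


def normalize(ceps_frames):
    """Normalization: subtract the per-coefficient mean over the utterance."""
    n = len(ceps_frames)
    means = [sum(col) / n for col in zip(*ceps_frames)]
    return [[v - m for v, m in zip(f, means)] for f in ceps_frames]


def add_deltas(ceps_frames):
    """Feature calculation: append first-order differences to each frame."""
    last = len(ceps_frames) - 1
    out = []
    for i, f in enumerate(ceps_frames):
        prev, nxt = ceps_frames[max(i - 1, 0)], ceps_frames[min(i + 1, last)]
        out.append(f + [(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out


def front_end(samples, sample_rate=8000, size=64, step=32, n_filters=10, n_ceps=5):
    """Run the whole chain and return the feature set (one vector per frame)."""
    bank = mel_filterbank(n_filters, size // 2 + 1, sample_rate)
    ceps = [cepstra(power_spectrum(f), bank, n_ceps)
            for f in frames(samples, size, step)]
    return add_deltas(normalize(ceps))


# Hypothetical input: 256 samples of a 440 Hz tone at 8 kHz.
signal = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(256)]
features = front_end(signal)
```

    Each stage is a separate function on purpose, so that a diagram can show one
    box per stage rather than the loops inside it.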

    Diagrams should show these concepts, not counters and strange-looking loops.
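
    As an illustration of the Init -> Find Best Tokens -> Grow -> Prune cycle,
    here is a small token-passing beam search in Python. It is only a sketch of
    the general technique, not the PocketSphinx implementation: the function
    `beam_search`, the toy state graph, the scores, and the beam width are all
    hypothetical, and how the steps map onto the real code is an assumption.

```python
def beam_search(frame_scores, transitions, start, beam):
    """Token-passing Viterbi search with beam pruning.

    frame_scores: one dict per frame, mapping state -> log acoustic score.
    transitions:  dict mapping state -> list of (next_state, log transition score).
    Returns (score, path) of the best surviving token.
    """
    # Init: a single token sits in the start state.
    tokens = {start: (0.0, [start])}

    for scores in frame_scores:
        # Grow: propagate each active token along its outgoing arcs,
        # keeping only the best-scoring token per destination state.
        grown = {}
        for state, (score, path) in tokens.items():
            for nxt, trans in transitions.get(state, []):
                if nxt not in scores:
                    continue
                cand = score + trans + scores[nxt]
                if nxt not in grown or cand > grown[nxt][0]:
                    grown[nxt] = (cand, path + [nxt])
        if not grown:
            raise ValueError("no token survived this frame")

        # Find best tokens: the best score sets the pruning threshold.
        best = max(tok[0] for tok in grown.values())

        # Prune: drop every token that falls outside the beam.
        tokens = {st: tok for st, tok in grown.items() if tok[0] >= best - beam}

    return max(tokens.values(), key=lambda tok: tok[0])


# Toy example with made-up log scores.
transitions = {
    "s": [("a", -1.0), ("b", -1.0)],
    "a": [("a", -0.5), ("b", -1.0)],
    "b": [("b", -0.5)],
}
frame_scores = [
    {"a": -1.0, "b": -5.0},
    {"a": -1.0, "b": -4.0},
    {"a": -6.0, "b": -1.0},
]
score, path = beam_search(frame_scores, transitions, start="s", beam=3.0)
print(score, path)  # -5.5 ['s', 'a', 'a', 'b']
```

    A narrower beam prunes more tokens per frame and runs faster, at the risk of
    discarding the path that would have won in the end.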

     
  • Mathijs

    Mathijs - 2010-04-29

    Thanks for your feedback, Nickolay. You are right that these diagrams capture
    the design of PocketSphinx at a lower level than is desirable.

    The scheme you posted (augmented by explanation) would be very valuable for
    people who are trying to get a grip on how ASR works in the case of
    PocketSphinx. While I've got a basic understanding of what most of these steps
    are, it would take some time for me to get familiar enough with PocketSphinx
    to be able to draw them in a high level diagram. I would love to do this
    (speech recognition is awesome!) but it would be too far out of the scope of
    my project and internship.

    Thanks again for your input. We will post on this forum when we get some
    results from our project.

     
