problem in prune_tree

Help
2004-08-03
2012-09-22
  • danial ibrahim

    danial ibrahim - 2004-08-03

    Hi,
    I ran the prune_tree executable and no errors were reported, but the program could not define all of the senones that I want to train. Is that normal?

    here is my prune_tree.log file:

    bin/prunetree \
        -itreedir dtree_full \
        -nseno 240 \
        -otreedir dtree_prune \
        -moddeffn model_architecture/cd_untied_3s.mdef \
        -psetfn etc/questions.list \
        -minocc 0.00001

    [Switch]  [Default] [Value]
    -moddeffn           model_architecture/cd_untied_3s.mdef
    -psetfn             etc/questions.list
    -itreedir           dtree_full
    -otreedir           dtree_prune
    -nseno              240   
    -minocc   0.0       1.000000e-05
    INFO: main.c(82): Reading: model_architecture/cd_untied_3s.mdef
    INFO: model_def_io.c(593): Model definition info:
    INFO: model_def_io.c(594): 80 total models defined (27 base, 53 tri)
    INFO: model_def_io.c(595): 320 total states
    INFO: model_def_io.c(596): 240 total tied states
    INFO: model_def_io.c(597): 81 total tied CI states
    INFO: model_def_io.c(598): 27 total tied transition matrices
    INFO: model_def_io.c(599): 4 max state/model
    INFO: model_def_io.c(600): 20 min state/model
    INFO: main.c(88): Reading: etc/questions.list
    INFO: main.c(218): AA-0    1 [0 < 1.000000e-05]
    INFO: main.c(218): AA-1    1 [0 < 1.000000e-05]
    INFO: main.c(218): AA-2    1 [0 < 1.000000e-05]
    INFO: main.c(218): AH-0    1 [0 < 1.000000e-05]
    INFO: main.c(218): AH-1    1 [0 < 1.000000e-05]
    INFO: main.c(218): AH-2    1 [0 < 1.000000e-05]
    INFO: main.c(218): AX-0    2 [0 < 1.000000e-05]
    INFO: main.c(218): AX-1    2 [0 < 1.000000e-05]
    INFO: main.c(218): AX-2    2 [0 < 1.000000e-05]
    INFO: main.c(218): AY-0    2 [0 < 1.000000e-05]
    INFO: main.c(218): AY-1    2 [0 < 1.000000e-05]
    INFO: main.c(218): AY-2    2 [0 < 1.000000e-05]
    INFO: main.c(218): B-0    2 [0 < 1.000000e-05]
    INFO: main.c(218): B-1    2 [0 < 1.000000e-05]
    INFO: main.c(218): B-2    2 [0 < 1.000000e-05]
    INFO: main.c(218): CH-0    1 [0 < 1.000000e-05]
    INFO: main.c(218): CH-1    1 [0 < 1.000000e-05]
    INFO: main.c(218): CH-2    1 [0 < 1.000000e-05]
    INFO: main.c(218): D-0    2 [0 < 1.000000e-05]
    INFO: main.c(218): D-1    2 [0 < 1.000000e-05]
    INFO: main.c(218): D-2    2 [0 < 1.000000e-05]
    INFO: main.c(218): EH-0    6 [0 < 1.000000e-05]
    INFO: main.c(218): EH-1    6 [0 < 1.000000e-05]
    INFO: main.c(218): EH-2    6 [0 < 1.000000e-05]
    INFO: main.c(218): EY-0    3 [0 < 1.000000e-05]
    INFO: main.c(218): EY-1    3 [0 < 1.000000e-05]
    INFO: main.c(218): EY-2    3 [0 < 1.000000e-05]
    INFO: main.c(218): F-0    1 [0 < 1.000000e-05]
    INFO: main.c(218): F-1    1 [0 < 1.000000e-05]
    INFO: main.c(218): F-2    1 [0 < 1.000000e-05]
    INFO: main.c(218): IY-0    9 [0 < 1.000000e-05]
    INFO: main.c(218): IY-1    9 [0 < 1.000000e-05]
    INFO: main.c(218): IY-2    9 [0 < 1.000000e-05]
    INFO: main.c(218): JH-0    2 [0 < 1.000000e-05]
    INFO: main.c(218): JH-1    2 [0 < 1.000000e-05]
    INFO: main.c(218): JH-2    2 [0 < 1.000000e-05]
    INFO: main.c(218): K-0    3 [0 < 1.000000e-05]
    INFO: main.c(218): K-1    3 [0 < 1.000000e-05]
    INFO: main.c(218): K-2    3 [0 < 1.000000e-05]
    INFO: main.c(218): L-0    2 [0 < 1.000000e-05]
    INFO: main.c(218): L-1    2 [0 < 1.000000e-05]
    INFO: main.c(218): L-2    2 [0 < 1.000000e-05]
    INFO: main.c(218): M-0    1 [0 < 1.000000e-05]
    INFO: main.c(218): M-1    1 [0 < 1.000000e-05]
    INFO: main.c(218): M-2    1 [0 < 1.000000e-05]
    INFO: main.c(218): N-0    1 [0 < 1.000000e-05]
    INFO: main.c(218): N-1    1 [0 < 1.000000e-05]
    INFO: main.c(218): N-2    1 [0 < 1.000000e-05]
    INFO: main.c(218): OW-0    1 [0 < 1.000000e-05]
    INFO: main.c(218): OW-1    1 [0 < 1.000000e-05]
    INFO: main.c(218): OW-2    1 [0 < 1.000000e-05]
    INFO: main.c(218): P-0    1 [0 < 1.000000e-05]
    INFO: main.c(218): P-1    1 [0 < 1.000000e-05]
    INFO: main.c(218): P-2    1 [0 < 1.000000e-05]
    INFO: main.c(218): R-0    1 [0 < 1.000000e-05]
    INFO: main.c(218): R-1    1 [0 < 1.000000e-05]
    INFO: main.c(218): R-2    1 [0 < 1.000000e-05]
    INFO: main.c(218): S-0    3 [0 < 1.000000e-05]
    INFO: main.c(218): S-1    3 [0 < 1.000000e-05]
    INFO: main.c(218): S-2    3 [0 < 1.000000e-05]
    INFO: main.c(218): T-0    1 [0 < 1.000000e-05]
    INFO: main.c(218): T-1    1 [0 < 1.000000e-05]
    INFO: main.c(218): T-2    1 [0 < 1.000000e-05]
    INFO: main.c(218): UW-0    1 [0 < 1.000000e-05]
    INFO: main.c(218): UW-1    1 [0 < 1.000000e-05]
    INFO: main.c(218): UW-2    1 [0 < 1.000000e-05]
    INFO: main.c(218): V-0    1 [0 < 1.000000e-05]
    INFO: main.c(218): V-1    1 [0 < 1.000000e-05]
    INFO: main.c(218): V-2    1 [0 < 1.000000e-05]
    INFO: main.c(218): W-0    1 [0 < 1.000000e-05]
    INFO: main.c(218): W-1    1 [0 < 1.000000e-05]
    INFO: main.c(218): W-2    1 [0 < 1.000000e-05]
    INFO: main.c(218): Y-0    3 [0 < 1.000000e-05]
    INFO: main.c(218): Y-1    3 [0 < 1.000000e-05]
    INFO: main.c(218): Y-2    3 [0 < 1.000000e-05]
    INFO: main.c(218): Z-0    1 [0 < 1.000000e-05]
    INFO: main.c(218): Z-1    1 [0 < 1.000000e-05]
    INFO: main.c(218): Z-2    1 [0 < 1.000000e-05]
    INFO: main.c(235): Prior to pruning n_seno= 159
    WARNING: "main.c", line 239: n_seno_wanted= 240, but only 159 defined by trees
    INFO: main.c(243): n_twig= 45
    INFO: main.c(342): AA-0    1
    INFO: main.c(342): AA-1    1
    INFO: main.c(342): AA-2    1
    INFO: main.c(342): AH-0    1
    INFO: main.c(342): AH-1    1
    INFO: main.c(342): AH-2    1
    INFO: main.c(342): AX-0    3
    INFO: main.c(342): AX-1    3
    INFO: main.c(342): AX-2    3
    INFO: main.c(342): AY-0    3
    INFO: main.c(342): AY-1    3
    INFO: main.c(342): AY-2    3
    INFO: main.c(342): B-0    3
    INFO: main.c(342): B-1    3
    INFO: main.c(342): B-2    3
    INFO: main.c(342): CH-0    1
    INFO: main.c(342): CH-1    1
    INFO: main.c(342): CH-2    1
    INFO: main.c(342): D-0    3
    INFO: main.c(342): D-1    3
    INFO: main.c(342): D-2    3
    INFO: main.c(342): EH-0    11
    INFO: main.c(342): EH-1    11
    INFO: main.c(342): EH-2    11
    INFO: main.c(342): EY-0    5
    INFO: main.c(342): EY-1    5
    INFO: main.c(342): EY-2    5
    INFO: main.c(342): F-0    1
    INFO: main.c(342): F-1    1
    INFO: main.c(342): F-2    1
    INFO: main.c(342): IY-0    17
    INFO: main.c(342): IY-1    17
    INFO: main.c(342): IY-2    17
    INFO: main.c(342): JH-0    3
    INFO: main.c(342): JH-1    3
    INFO: main.c(342): JH-2    3
    INFO: main.c(342): K-0    5
    INFO: main.c(342): K-1    5
    INFO: main.c(342): K-2    5
    INFO: main.c(342): L-0    3
    INFO: main.c(342): L-1    3
    INFO: main.c(342): L-2    3
    INFO: main.c(342): M-0    1
    INFO: main.c(342): M-1    1
    INFO: main.c(342): M-2    1
    INFO: main.c(342): N-0    1
    INFO: main.c(342): N-1    1
    INFO: main.c(342): N-2    1
    INFO: main.c(342): OW-0    1
    INFO: main.c(342): OW-1    1
    INFO: main.c(342): OW-2    1
    INFO: main.c(342): P-0    1
    INFO: main.c(342): P-1    1
    INFO: main.c(342): P-2    1
    INFO: main.c(342): R-0    1
    INFO: main.c(342): R-1    1
    INFO: main.c(342): R-2    1
    INFO: main.c(342): S-0    5
    INFO: main.c(342): S-1    5
    INFO: main.c(342): S-2    5
    INFO: main.c(342): T-0    1
    INFO: main.c(342): T-1    1
    INFO: main.c(342): T-2    1
    INFO: main.c(342): UW-0    1
    INFO: main.c(342): UW-1    1
    INFO: main.c(342): UW-2    1
    INFO: main.c(342): V-0    1
    INFO: main.c(342): V-1    1
    INFO: main.c(342): V-2    1
    INFO: main.c(342): W-0    1
    INFO: main.c(342): W-1    1
    INFO: main.c(342): W-2    1
    INFO: main.c(342): Y-0    5
    INFO: main.c(342): Y-1    5
    INFO: main.c(342): Y-2    5
    INFO: main.c(342): Z-0    1
    INFO: main.c(342): Z-1    1
    INFO: main.c(342): Z-2    1

    Can anyone here explain this to me?

    Thanks.

     
    • Roger Wellington-Oguri

      I think something has gone wrong in a previous step, but there was nothing to alert you.  (This happened to me many times.)

      In the lines for the three states for a phoneme, e.g.

      INFO: main.c(342): AA-0 1
      INFO: main.c(342): AA-1 1
      INFO: main.c(342): AA-2 1

      the number 1 at the end of the line means that the trainer knows about only 1 triphone context for the phoneme AA.  This suggests that the AA phoneme appeared only once in the transcript for your training data.  (Or, if more than once, it was always preceded and followed by the same pair of phonemes.)  All your counts are very low, so either something is going wrong with your transcript file, or you are trying to train with far too little data.
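
      As a rough check on this reading, the per-state counts in the log can be added up.  The sketch below is a hypothetical helper (not part of SphinxTrain), and it assumes that the number on each main.c(218) line is the number of leaves in that state's tree.  For the log above the sum comes to 159, which matches the "Prior to pruning n_seno= 159" line.  That is also 3 states times the 53 triphones listed in the model definition, which would suggest that every triphone already sits in its own leaf.  The requested 240 equals 159 plus the 81 CI states, so it may be that -nseno was set to the total tied-state count rather than to the number of context-dependent senones.

```python
import re

# Matches lines such as "INFO: main.c(218): AA-0    1 [0 < 1.000000e-05]".
# Assumption: the integer after the state name is that tree's leaf count.
LEAF_LINE = re.compile(r"main\.c\(218\):\s+(\S+)-(\d+)\s+(\d+)\s+\[")


def leaf_counts(log_lines):
    """Return {(phone, state): leaves} for every pre-pruning tree line."""
    counts = {}
    for line in log_lines:
        m = LEAF_LINE.search(line)
        if m:
            counts[(m.group(1), int(m.group(2)))] = int(m.group(3))
    return counts


def total_leaves(log_lines):
    """Sum the leaf counts over all trees found in the log."""
    return sum(leaf_counts(log_lines).values())


sample = [
    "INFO: main.c(218): AA-0    1 [0 < 1.000000e-05]",
    "INFO: main.c(218): EH-0    6 [0 < 1.000000e-05]",
    "INFO: main.c(218): IY-0    9 [0 < 1.000000e-05]",
    "INFO: main.c(342): IY-0    17",  # post-pruning line, ignored
]
print(total_leaves(sample))  # 16
```

      Passing the lines of the full prune_tree.log to total_leaves gives a number to compare with the n_seno value in the WARNING line.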

      Roger

       
      • Willie Walker

        Willie Walker - 2004-08-03

        I recently started learning how to use SphinxTrain as well (I still have a ways to go), but it occurred to me that there really is a need for some pre-verification and analysis of the data:

          o Is the dictionary valid?  Is it readable and does it contain just one pronunciation per word?

          o Are all the phones used by the dictionary in the phonelist?

          o Is there "good" coverage of the phone list by the training data?
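
        The first two checks could be sketched roughly as below.  This is a hypothetical stand-alone helper, not the logic of verify_all.pl.  It assumes the common Sphinx dictionary layout of one entry per line (a word followed by whitespace-separated phones, with alternate pronunciations written as WORD(2)) and a phone list with one phone per line.

```python
import re
from collections import defaultdict


def check_dictionary(dict_lines, phone_lines):
    """Return (phones missing from the phone list, words with >1 pronunciation)."""
    phones = {p.strip() for p in phone_lines if p.strip()}
    missing = set()
    prons = defaultdict(int)
    for line in dict_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Strip an alternate-pronunciation suffix such as "(2)".
        word = re.sub(r"\(\d+\)$", "", fields[0])
        prons[word] += 1
        missing.update(p for p in fields[1:] if p not in phones)
    multi = sorted(w for w, n in prons.items() if n > 1)
    return sorted(missing), multi


# Made-up sample data for illustration only.
dictionary = ["ONE W AH N", "TWO T UW", "READ R IY D", "READ(2) R EH D", "ZERO Z IH R OW"]
phonelist = ["AH", "D", "EH", "IY", "N", "OW", "R", "T", "UW", "W", "Z", "SIL"]
print(check_dictionary(dictionary, phonelist))
# (['IH'], ['READ'])
```

        Both returned lists being empty would mean the dictionary uses only listed phones and has a single pronunciation per word.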

        In addition, what does "good" coverage mean?  For example, is there a rule of thumb for things such as how many instances of each phone are in the training data, how many instances of each triphone are there, how many instances of the phones in various contexts (e.g., beginning/middle/end) are there?
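
        One way to put numbers on coverage is to expand each transcript through the dictionary and count phones and triphones.  The following is a minimal sketch with made-up names.  It assumes a mapping from each word to a single phone list, and it ignores silence and filler insertion and any context across utterance boundaries, which the real trainer may treat differently.

```python
from collections import Counter


def coverage(transcripts, lexicon):
    """Count phones and (left, base, right) triphones within each utterance."""
    phones, triphones = Counter(), Counter()
    for words in transcripts:
        # Raises KeyError if a transcript word is not in the lexicon.
        seq = [p for w in words.split() for p in lexicon[w]]
        phones.update(seq)
        # Only interior phones have both a left and a right neighbor.
        triphones.update(zip(seq, seq[1:], seq[2:]))
    return phones, triphones


# Made-up sample data for illustration only.
lexicon = {"ONE": ["W", "AH", "N"], "TWO": ["T", "UW"]}
phones, triphones = coverage(["ONE TWO", "TWO ONE"], lexicon)
print(phones["AH"], len(triphones))  # 2 5
```

        Sorting either counter by value shows which phones or triphones are seen least often, which is where a rule of thumb for "good" coverage would apply.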

        I started borrowing from the SphinxTrain/scripts_pl/00.verify/verify_all.pl script to do some of this, but it's mostly a nighttime/weekend thing for me.

        Has anyone else done work in this space?

        Will

         
