ERROR: "sphinx_fe.c", line 933: Failed to open control file

Help
2017-01-26
2017-02-01
  • Imran Rajjad

    Imran Rajjad - 2017-01-26

    I am using Cygwin on Windows 10 and am trying to create an acoustic model. However, the sphinxtrain script is unable to generate MFC files for the provided WAV files. The errors in the log files for both test and train are similar:

    ERROR: "sphinx_fe.c", line 933: Failed to open control file /cygdrive/d/nlp/accoustic_model/russ/etc/russ_test.fileids: No such file or directory

    ERROR: "sphinx_fe.c", line 933: Failed to open control file /cygdrive/d/nlp/accoustic_model/russ/etc/russ_train.fileids: No such file or directory

    It is unable to find the file containing the list of WAV files. The file name and path in the error are correct. Could Cygwin be causing this?
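One possible explanation, not confirmed anywhere in this thread: the binaries path quoted later in the thread ends in bin/Release/Win32, which suggests native Windows builds, and native Windows programs generally do not resolve Cygwin's /cygdrive/... paths. Cygwin ships `cygpath -w` for this conversion; the sketch below does the same mapping in Python so the path can be checked outside Cygwin. The function name `cygdrive_to_windows` is hypothetical and only illustrates the idea.

```python
import re
from pathlib import PureWindowsPath


def cygdrive_to_windows(path: str) -> str:
    """Map /cygdrive/<letter>/rest to <LETTER>:\\rest.

    Paths that do not start with /cygdrive/<letter> are returned unchanged.
    This is an illustrative helper, not part of sphinxtrain.
    """
    match = re.match(r"^/cygdrive/([a-zA-Z])(/.*)?$", path)
    if match is None:
        return path
    drive = match.group(1).upper()
    rest = match.group(2) or "/"
    # PureWindowsPath normalises forward slashes to backslashes on any OS.
    return str(PureWindowsPath(f"{drive}:{rest}"))


if __name__ == "__main__":
    print(cygdrive_to_windows(
        "/cygdrive/d/nlp/accoustic_model/russ/etc/russ_test.fileids"
    ))
```

If the converted path opens fine from a plain Windows prompt while the /cygdrive form fails inside the native binary, path translation is a likely culprit; otherwise the cause lies elsewhere.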

    Earlier I tried installing the Python scripts on Windows 10, but there seem to be some compatibility issues, even though I installed ActivePerl and ActivePython as instructed by the tutorial page.

    regards,
    Imran

     
    • Arseniy Gorin

      Arseniy Gorin - 2017-01-26

      If you say that the path is accessible from the Cygwin environment, there should be no problem in the script. I must say, though, that the most stable way to work from Windows is to install a virtual machine with Linux.

      Anyway, if you want, I can try to reproduce your run, but you would need to provide the full training directory.

       
      • Imran Rajjad

        Imran Rajjad - 2017-01-27

        Sphinxtrain path: /cygdrive/d/nlp/sphinxtrain
        Sphinxtrain binaries path: /cygdrive/d/nlp/sphinxtrain/bin/Release/Win32
        Running the training
        MODULE: 000 Computing feature from audio files
        Extracting features from segments starting at (part 1 of 1)
        ERROR: This step had 1 ERROR messages and 0 WARNING messages. Please check the log file for details.
        Extracting features from segments starting at (part 1 of 1)
        ERROR: This step had 1 ERROR messages and 0 WARNING messages. Please check the log file for details.
        < I handled these errors by manually generating MFC files using some other acoustic model >
        Feature extraction is done
        MODULE: 00 verify training files
        Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
        WARNING: The phonelist (/cygdrive/d/nlp/accoustic_model/zuazu/etc/zuazu.phone) has duplicated phones
        Found 5661 words using 61 phones
        Phase 2: Checking to make sure there are not duplicate entries in the dictionary
        Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
        Phase 4: Checking number of lines in the transcript file should match lines in fileids file
        Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
        Estimated Total Hours Training: 1.16042777777778
        This is a small amount of data, no comment at this time
        Phase 6: Checking that all the words in the transcript are in the dictionary
        Words in dictionary: 5656
        Words in filler dictionary: 5
        Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once

        The sample database has been uploaded at:
        https://drive.google.com/open?id=0ByCBQceVv2USOXhqdTZMM1NUcDg

         
  • Imran Rajjad

    Imran Rajjad - 2017-01-27

    the sample database has been uploaded at

    https://drive.google.com/open?id=0ByCBQceVv2USOXhqdTZMM1NUcDg

     
    • Arseniy Gorin

      Arseniy Gorin - 2017-01-27

      I am not sure what this means:

      < I handled these errors by manually generating MFC files using some other acoustic model >

      You do not need a model for feature extraction.

      Anyway, you have a duplicate SIL in the zuazu.phone file. Just remove it and then follow the training tutorial. You should also specify the LM name without DMP at the end; otherwise the decoder does not find it.

          Aligning results to find error rate
          SENTENCE ERROR: 37.1% (263/708)   WORD ERROR RATE: 4.0% (579/14662)
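The duplicate-phone problem flagged by the "has duplicated phones" warning can be checked before training. The following is a minimal sketch, assuming the phone list holds one phone per line; the function names and the sample contents are hypothetical and are not taken from the uploaded zuazu.phone file.

```python
from collections import Counter


def duplicate_phones(lines):
    """Return the phones that occur more than once, in first-seen order."""
    phones = [line.strip() for line in lines if line.strip()]
    counts = Counter(phones)
    return [p for p in dict.fromkeys(phones) if counts[p] > 1]


def dedupe_phones(lines):
    """Return the phones with repeats removed, keeping the first occurrence."""
    phones = [line.strip() for line in lines if line.strip()]
    return list(dict.fromkeys(phones))


if __name__ == "__main__":
    # Hypothetical phone list with SIL entered twice.
    sample = ["AA", "SIL", "B", "SIL"]
    print(duplicate_phones(sample))  # ['SIL']
    print(dedupe_phones(sample))     # ['AA', 'SIL', 'B']
```

To apply it to a real file, pass the open file object (for example `open("zuazu.phone")`) as `lines` and write the result of `dedupe_phones` back, one phone per line.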
      
       
      • Imran Rajjad

        Imran Rajjad - 2017-01-27

        Thanks a lot for the help. Yes, that was a dumb mistake; I did not see that SIL.

        About the DMP: I believe the LM file name is zuazu.lm, and I did not end the file name with DMP. Is there something else I need to know as a rookie?

         
        • Arseniy Gorin

          Arseniy Gorin - 2017-01-27

          I mean that the default config file will probably have DMP at the end. Yes, change it so that it ends with just lm.

           
  • Arseniy Gorin

    Arseniy Gorin - 2017-01-27

    Here is the model and the example config file.

     
