
#233 Lattice files not g'zipped?

open
nobody
5
2012-09-21
2010-10-15
Daktari3
No

Training step 61 (lattice pruning) crashes with the following errors (end of ...latprune.log):

process sent: 085738000001956600107152009172200066-0001660-0008260
load lattice ...
Traceback (most recent call last):
  File "C:/MatchXMLDocProject/python/cmusphinx/lattice_prune.py", line 60, in <module>
    dag = lattice.Dag(os.path.join(denlatdir, c + ".lat.gz"))
  File "/cygdrive/c/MatchXMLDocProject/python/cmusphinx/lattice.py", line 182, in __init__
    self.sphinx2dag(sphinx_file)
  File "/cygdrive/c/MatchXMLDocProject/python/cmusphinx/lattice.py", line 341, in sphinx2dag
    for spam in fh:
  File "/usr/local/lib/python2.6/gzip.py", line 438, in next
    line = self.readline()
  File "/usr/local/lib/python2.6/gzip.py", line 393, in readline
    c = self.read(readsize)
  File "/usr/local/lib/python2.6/gzip.py", line 219, in read
    self._read(readsize)
  File "/usr/local/lib/python2.6/gzip.py", line 255, in _read
    self._read_gzip_header()
  File "/usr/local/lib/python2.6/gzip.py", line 156, in _read_gzip_header
    raise IOError, 'Not a gzipped file'
IOError: Not a gzipped file
Thu Oct 14 15:51:25 2010


I checked all of my lattice files, denlat/....lat.gz, and none of them is gzipped; they are all plain text. Surely they should have been gzipped in step 60, when they were created.

As far as I can tell, Python is installed with the gzip module, and everything else seems OK. I would send more data for debugging, but I can't tell why my files are not being gzipped.
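A quick way to confirm which files are affected is to look at the first two bytes of each file: every gzip stream starts with the magic number 0x1f 0x8b. The following is a minimal sketch, written for Python 3 rather than the Python 2.6 used in this report; the function names `is_gzipped` and `find_uncompressed` are made up for illustration and are not part of the cmusphinx Python modules.

```python
import os

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic number."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def find_uncompressed(latdir, suffix=".lat.gz"):
    """List files under `latdir` that are named `*suffix` but are not gzip data.

    `latdir` and `suffix` are illustrative parameters, not names taken
    from the Sphinx training scripts.
    """
    bad = []
    for root, _dirs, files in os.walk(latdir):
        for name in sorted(files):
            full = os.path.join(root, name)
            if name.endswith(suffix) and not is_gzipped(full):
                bad.append(full)
    return bad
```

Calling `find_uncompressed("denlat")` from the training directory would list every lattice file that carries the .lat.gz suffix but holds plain text.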

Environment: Windows, Cygwin, Python 2.6; the Sphinx code is up to date.

LAT

Discussion

  • Daktari3

    Daktari3 - 2010-10-21

    The lattice files in the denlat directory are written not by the Python routines but by the Sphinx 3 module sphinx3decode.exe. I cannot see any code in the C source that writes gzipped files, and I haven't found a Perl script that converts the files in bulk after they are written.
    Because the default extension for lattice files is 'lat.gz', the Python lattice routines try to read them as gzipped files and crash.
    Either both writing and reading must use gzipped files, or both must use text files.
    Maybe this would work, in the style I find in Sphinx:
    - Add a line to sphinx_train.cfg:
    $CFG_LATEXT = "lattice";
    This variable would be the suffix for lattice files, instead of the default 'lat.gz'.
    - Change the Perl script s3decode to add a parameter to the call of sphinx3_decode:
    -latext => $ST::CFG_LATEXT,
    sphinx3_decode already has code to handle the command-line parameter -latext.
    - Change the Python module to use the suffix from the config file.
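On the reading side, one alternative to a configurable suffix is to have the reader detect the format itself. The sketch below is an assumption about how that could look, not the code in lattice.py: `open_lattice` is a hypothetical helper, it targets Python 3 (the report uses Python 2.6), and the latin-1 encoding is a guess chosen only because it never fails to decode.

```python
import gzip


def open_lattice(path, encoding="latin-1"):
    """Open a lattice file for line-by-line text reading.

    Returns a text-mode file object whether `path` holds gzip data or
    plain text, regardless of its suffix. Hypothetical helper.
    """
    # Peek at the first two bytes; gzip streams always begin with 0x1f 0x8b.
    with open(path, "rb") as probe:
        magic = probe.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt", encoding=encoding)
    return open(path, "r", encoding=encoding)
```

A loop such as `for line in open_lattice(name): ...` would then work on both kinds of file, so the suffix would no longer decide how the file is read.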
    LAT

     
  • Nickolay V. Shmyrev

    I can not see any code in the c source to write gzipped files, and I haven't found a perl script to convert mass files after file writing.

    Hello Larry

    The corresponding code is located in sphinxbase; see the fopen_comp function. It automatically runs gzip through popen when the file name ends with .gz. I suppose that, since Cygwin doesn't support popen well, this method doesn't work.

    Not sure how to solve this issue. Probably we need to implement fopen_comp for Cygwin.
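Until fopen_comp is sorted out, a possible stopgap is to compress the plain-text files after decoding so that their contents match the .lat.gz suffix. This is only a workaround sketch under stated assumptions, not a fix for fopen_comp: `gzip_in_place` is a hypothetical name, the code is Python 3, and it should not run while the decoder is still writing the files.

```python
import gzip
import os
import shutil


def gzip_in_place(path):
    """Compress `path` in place unless it already holds gzip data.

    Returns True if the file was rewritten, False if it was left alone.
    Hypothetical helper; not part of SphinxTrain.
    """
    with open(path, "rb") as fh:
        if fh.read(2) == b"\x1f\x8b":  # gzip magic number: already compressed
            return False
    tmp = path + ".tmp"
    # Write the compressed copy next to the original, then swap it in.
    with open(path, "rb") as src, gzip.open(tmp, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.replace(tmp, path)
    return True
```

Running this over every file in the denlat directory between steps 60 and 61 should let the existing Python reader open the lattices unchanged.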

     
  • Nobody/Anonymous

    I checked the file sphinxbase/config.log, and it says that configure thinks my environment can handle popen:

    configure:5817: checking for popen
    configure:5817: gcc -std=gnu99 -o conftest.exe -g -O2 -Wall conftest.c >&5
    configure:5817: $? = 0
    configure:5817: result: yes

    Later, in the log, it says

    | #define HAVE_POPEN 1

    indicating, I think, that popen works.

    The file include/config.h also records that popen is available:

    /* Define to 1 if you have the `popen' function. */
    #define HAVE_POPEN 1

    However, the MS Visual Studio project points instead to include/win32/config.h, which says no:

    /* We don't have popen, but we do have _popen */
    /* #define HAVE_POPEN 1 */

    Is it possible that the VS project points to the wrong config? Is there something else wrong with Windows popen that configure doesn't test, so the Windows config.h is just stuck on 'no' on purpose?

    (I can use either VS C++ or g++, and I wonder if there is a difference in configuration.)

    LAT

     
  • Nickolay V. Shmyrev

    Is it possible that the VS project points to the wrong config?

    No, popen support is simply not implemented and not tested on the Win32 platform.

    Is there something else wrong with Windows popen that configure doesn't test, so the Windows config.h is just stuck on 'no' on purpose?

    The Win32 platform with Visual C doesn't have popen; I think it has _popen, with a leading underscore. Cygwin, on the other hand, has it. Don't mix the platforms in this bug, because the C libraries are different.

    (I can use either VS C++ or g++, and I wonder if there is a difference in configuration.)

    Right, and you need to stick with one of them: either Cygwin or Win32. If you have popen, the application calls popen, and popen doesn't work, it's more likely a Cygwin issue.

    If you stick with Win32, popen is simply not implemented.

     
