Menu

#712 fstat CUR_PTR tag reports uncorrect value for compressed files

v1.0 (example)
closed-fixed
nobody
None
5
2018-04-09
2016-10-28
giloo
No

just run the new version of testsuite/test_point_lun. The pointer position reported in fstat() is wrong when option COMPRESS is active, even when the file is not actually compressed.

Related

Bugs: #712

Discussion

  • GregJung

    GregJung - 2017-10-02

    This issue is not going away. To use fits in astron library the point_lun routine needs to be avoided for compressed files. To rewind a compressed file,

    Free_lun,unit
    openr,unit,filename, /compressed

    is used. To patch this into the astron fits library, a "point_lun_forward" routine is used. This applies to fits_read, fits_open, and fits_info. The attached directory, called "new", contains the appropriate modifications and can be extracted to the fits/ directory where these are located.

     
  • Alain C.

    Alain C. - 2017-10-03

    Thanks for reporting this regression, again.
    File "test_point_lun.pro" is not tested in regression tests :(
    One time again we have to clean up the files and ensuring they are tested :(

    See from my side, the regression was introduce by the change done on 2016 Oct 29

    http://gnudatalanguage.cvs.sourceforge.net/viewvc/gnudatalanguage/gdl/src/basic_pro.cpp?r1=1.206&r2=1.207&sortby=date

    because the versions before are OK for me.
    Before going further, may be we just have to clarify this change (related to bug 708 !)

    A.

     
  • Alain C.

    Alain C. - 2017-10-03

    Greg

    I have no time to go in your patch, sorry. I was just verifying that we had a test (yes !) really tested in regression tests (no !) and it was working in some version in the past (yes !)

    A.

     
  • GregJung

    GregJung - 2017-10-03

    I cannot run test_point_lun from windows but I have verified equivalent behavior in Linux of what I see,
    attached is "sampledat.gz", a fits file which is large enough to bring out the behavior. Perhaps the test file generated from test_point_lun is too small to snag the inconsistencies. What I observe, on both systems, is that the compressed and uncompressed data do not match once the access is attempted that isn't serial. In the windows case at least, I also find that the "rewind command", either

    "point_lun,lun,0"
    or
    close,lun & openr,lun, filename, /compressed

    Were also insufficient to get point_lun_forward to work: instead I needed to use
    free_lun , unit & openr, unit, filename, /compressed

    I had thought this was fatal behavior for the use of the astron/fits library on large collections, but in fact the rewind should only occur during casual inspection - not so much in batch processing - and since most disk devices are buffered anyway, the lag from another read/decompress may not be significant.

    (The code in bug#708 will not affect this as there is no active heap for pointers or for objects.)

     

    Last edit: GregJung 2017-10-03
    • giloo

      giloo - 2017-10-03

      Hi,

      I've looked quite deeply into this problem since Greg's mail.
      My tests seem to show that GDL does not use the correct routine when writing or reading compressed fluxes (files). So the position is lost. If one does only successive read or write, only fstat (which is not the system fstat call) fails. If you mix
      point_lun and reads, at some point they become desyncronized:
      with the point_lun example file where the file contains 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
      GDL> filegz = '/tmp/file.txt.gz' & openr, lun, filegz, /get_lun, /compress& s="    " & readu,lun,s & print,s
      abcd
      GDL> readu,lun,s&print,s
      efgh
      GDL> point_lun,lun,3L
      GDL> readu,lun,s&print,s
      ijkl
      GDL> point_lun,lun,10L
      GDL> readu,lun,s&print,s
      mnop
      GDL> point_lun,lun,0L
      GDL> point_lun,lun,10L
      GDL> readu,lun,s&print,s
      klmn

      only by rewinding ( point_lun,lun,0L) the synchro is ok again.

      This because the compiler uses the standard read() routine regardless of the nature of the flux (compressed, not compressed, etc).
      It's a bit tricky and the corresponding code is not clean at all.
      Up to the point that we may need Marc.
      Best
      Gilles

       
    • Alain C.

      Alain C. - 2017-10-03

      Sorry, I mix some old/new gz files ("test_point_lun.pro" changed), and yes, no good version :(

      I tested your code, which is working well

      It would be much better to correct GDL code than to ask AstronLib to add a patch, isn't it ?

      A.

       
  • Alain C.

    Alain C. - 2017-10-04

    Concering the point by Gilles, I think I have a suggestion :

    in point_lun function in "basic_pro_jmg.cpp" I put

    else {
    DLong64 pos, zero;
    actUnit.Seek(0);
    zero=actUnit.Tell();
    e->AssureLongScalarPar( 1, pos);
    actUnit.Seek( zero+pos);
    }

    it is working on my tests on both compress and not compress files

    GDL> filegz = '/tmp/file2.txt.gz' & openr, lun, filegz, /get_lun, /compress& s=" "
    GDL> readu,lun,s & print,s
    abcd
    GDL> point_lun,lun,3L
    GDL> readu,lun,s & print,s
    defg
    GDL> filegz = '/tmp/file1.txt' & openr, lun, filegz, /get_lun& s=" "
    GDL> readu,lun,s & print,s
    abcd
    GDL> point_lun,lun,3L
    GDL> readu,lun,s & print,s
    defg

     
  • Alain C.

    Alain C. - 2017-10-04

    but for FSTAT(), I tried the same trick
    (same file, line ~775)

    DLong64 pos, zero;
    pos=actUnit.Tell();
    actUnit.Seek(0);
    zero=actUnit.Tell();
    actUnit.Seek(pos-zero);
    
    //if (big) fstat->InitTag("CUR_PTR", DLong64GDL( actUnit.Tell()));
    //else fstat->InitTag("CUR_PTR", DLongGDL( actUnit.Tell()));
    
    if (big) fstat->InitTag("CUR_PTR", DLong64GDL( pos-zero));
    else fstat->InitTag("CUR_PTR", DLongGDL( pos-zero));
    

    but is is not working :(

     
    • giloo

      giloo - 2017-10-04

      Yes, I have also the experience that it is quite easy to cure the read/write discrepancy I reported, your code is OK, but this does not change the returns of Tell() which are deep in the zlib.h
      Probably we can keep track of the "uncompressed" position in the igzstream object, provided we keep track any time we do something with this object (open, close, read, write, seek, tell etc). tedious.
      Or, there is a bug somewhere, people do not use zlib.h as we do (no complaints ever seen on web), and our code seems unnecessary complicated. Just my guess.
      In which case a patch in astronlib and putting the patched procedure in gdl is a good temporary solution?
      Best
      Gilles

       
  • giloo

    giloo - 2017-10-12

    patched and should work. passes test_point_lun and test_zip (the latter was falsely reporting a problem)

     
  • Sylwester Arabas

    Is this ready to be closed then?

     
  • Sylwester Arabas

    • status: open --> closed-fixed
     

Log in to post a comment.

MongoDB Logo MongoDB