#2106 dirent.h functions don't work under _USE_32BIT_TIME_T

WSL
pending
None
Bug
fixed
Feature_in_WSL_4.0
True
2014-12-14
2013-10-13
No

The layout of 'struct dirent' in WSL 4.0 uses the time_t type. However, the width of time_t can be either 32 bits or 64 bits, with the latter being the default in WSL 4.0.
Presumably, when the dirent functions were compiled for inclusion in the libmingwex library, the compilation used the 64-bit default definition of time_t.

This means that applications that define _USE_32BIT_TIME_T will use an incompatible definition of 'struct dirent', which will break any code that calls the dirent functions.

I attach a test program, due to Diego Casorran dcasorran@gmail.com. When compiled with _USE_32BIT_TIME_T defined, it outputs bogus information.

1 Attachments

Related

Issues: #2098
Issues: #2112
Issues: #2113
Issues: #2125
Issues: #2180

Discussion

  • Eli Zaretskii

    Eli Zaretskii - 2013-10-13

    Here's a possible patch to fix this.

     
  • Keith Marshall

    Keith Marshall - 2013-10-14
    • status: unread --> assigned
    • assigned_to: Keith Marshall
    • Patch attached: False --> True
     
  • Keith Marshall

    Keith Marshall - 2013-10-14

    Thanks for the report, and the patch. It seems sensible for me to follow this up, alongside [#2098]. I'll take a look, later this week.

     

    Related

    Issues: #2098

    • Earnie Boyd

      Earnie Boyd - 2013-10-15

      I'll do this one Keith and create a point release.

       
  • Keith Marshall

    Keith Marshall - 2013-10-15

    Well, please don't rush in headlong, and release another screw-up. I'm not convinced that Eli's proposed patch is entirely sufficient.

     
  • Keith Marshall

    Keith Marshall - 2013-10-16

    I'm not convinced that Eli's proposed patch is entirely sufficient.

    After further consideration, I'm becoming more convinced that it is not. Here are some of my concerns:

    1. The time_t fields should be explicitly declared as __time64_t; while long long may be equivalent, it doesn't convey the same sense of purpose.
    2. A complementary change is needed within the declaration of struct _wdirent.
    3. Simply declaring these time_t fields to represent 64-bit explicit types neglects any likelihood that the particular implementations of _findfirst(), _findnext(), _wfindfirst(), and _wfindnext(), within the particular version of MSVCRT.DLL on any end user's platform, may return a 32-bit time_t representation; dirent.c also needs to call variants of these, which guarantee to return __time64_t. A perusal of MSDN suggests that the best choice, for maximum backward compatibility (at least as far as Win98/WinNT4), would be _tfindfirst64(), (which maps as either _findfirst64() or _wfindfirst64()), and _tfindnext64(), (which maps as either _findnext64() or _wfindnext64()).
    4. If we do elect to use _tfindfirst64() and _tfindnext64(), in place of the ambiguous _tfindfirst() and _tfindnext(), then the struct dirent and struct _wdirent declarations must be adapted to match explicit __finddata64_t and __wfinddata64_t respectively, in place of ambiguous _finddata_t and _wfinddata_t struct types; this adds an additional implied requirement that the file size field must change from _fsize_t, to become a uint64_t compatible entity.
    5. io.h also requires a correction, to provide a declaration for struct __wfinddata64_t; it is currently declared as struct _wfinddata64_t, (missing an underscore), which according to this MSDN reference, isn't a data type known to MSVC.

    And yes, in performing this analysis, I've also worked up the necessary implementation; however, it is currently intermingled with my proposed solution for [#2098].

     

    Related

    Issues: #2098

    • Earnie Boyd

      Earnie Boyd - 2013-10-16

      I agree except that the given test case does DTRT with the applied patch. Can you separate the intermingled patch to let us have a look at what affects this issue? The decision needs to be made for 4.1 or a point release of 4.0.

      This brings up a point that we need to modify anything in libmingwex that uses the dirent structure to take the size of the structure passed into parameters into consideration and perhaps break out into *32() and *64() functions and mapped accordingly in the headers. I'm talking things like src/libcrt/tchar/dirent.c that provides us with _topendir and friends. _USE_32BIT_TIME_T is a maddening monster created by the MS mad scientists to cause damage to those who just need to use functions opendir(), readdir() and the like.

       
      • Keith Marshall

        Keith Marshall - 2013-10-16

        It may DTRT on your box, Eli's box, and my box, but can we guarantee that it will DTRT on Joe Soap's old box, with an MSVCRT.DLL version which predates the _USE_32_BIT_TIME_T insanity? I don't think we can, unless we use _find*() variants which use an unambiguous _finddata_t implementation. Of these, _findfirst64() and friends suit well, and are historically longest lived.

        I don't think we need to complicate it with *32() and *64() variants; a 64-bit implementation should be sufficient for all cases.

        Although intermingled, the two patches don't intersect; separation wasn't particularly difficult. The section specific to this ticket is attached; notice that it does adapt src/libcrt/tchar/dirent.c, as required to always use the __finddata64_t and __wfinddata64_t data representations; (it also includes my corrections for __wfinddata64_t misspelling, in io.h).

         
        • Eli Zaretskii

          Eli Zaretskii - 2013-10-16

          Wouldn't using _findfirst64 limit the usability of libmingwex to old versions of MSVCRT.DLL, which did not have this function?

           
          • Keith Marshall

            Keith Marshall - 2013-10-16

            According to this MSDN reference, _findfirst64() and friends are available in all MSVCRT.DLL versions from Win98 and WinNT (which I assume means WinNT4) onwards; (the _w*() variants are on the WinNT derivatives only). I'm not too bothered about supporting anything which predates those; the use of 64-bit types for the time_t fields will break the implementation on anything older anyway.

             
    • Eli Zaretskii

      Eli Zaretskii - 2013-10-16

      It's your call, but I used long long deliberately, for the following reasons:

      1. __time64_t is a "cooked" type, it is not known to the compiler unless other headers are included. long long is built-in, so it will always DTRT.
      2. The meaning of these fields is sufficiently conveyed by their names.
      3. The purpose of including these fields was an implementation detail; the comments sound like saying "this is internal to implementation, don't touch it". Maybe you should make them opaque, they are non-standard and will not be used widely anyway.
       
  • Keith Marshall

    Keith Marshall - 2013-10-17

    FTR: with my patch applied, and the provided dirent_bug-2.c modified, (to account for the increased size of the d_size field, and to make it easier to compare returned size values with ls output):

    #include <stdio.h>
    #include <dirent.h>
    
    int main(void) {
        struct dirent * entry;
        DIR * dp;
        int max = 7;
    
        fprintf(stderr,"time_t = %db\n", sizeof(time_t) * 8);
    
        dp = opendir("C:\\MinGW\\bin");
        if (dp == NULL) {
            perror("opendir");
            return -1;
        }
        while (--max && (entry = readdir(dp))) {
    
            fprintf( stderr,"%s: %I64u B\n", entry->d_name, entry->d_size );
        }
    
        closedir(dp);
        return 0;
    }
    

    and with reference data, (as shown by MSYS ls):

    $ ls -l /c/mingw/bin | head
    total 57582
    -rwxr-xr-x 1 keith Administrators 1732622 Nov 15  2012 addr2line.exe
    -rwxr-xr-x 1 keith Administrators 1758222 Nov 15  2012 ar.exe
    -rwxr-xr-x 1 keith Administrators 2226190 Nov 15  2012 as.exe
    -rwxr-xr-x 1 keith Administrators   11776 Oct 16  2012 c++.exe
    -rwxr-xr-x 1 keith Administrators 1721870 Nov 15  2012 c++filt.exe
    -rwxr-xr-x 1 keith Administrators   11776 Oct 16  2012 cc.exe
    -rwxr-xr-x 1 keith Administrators 1598478 Oct 16  2012 cpp.exe
    -rw-r--r-- 1 keith Administrators  164468 May 31 09:36 depends.chm
    -rwxr-xr-x 1 keith Administrators    9216 May 31 09:36 depends.dll
    

    for the _USE_32BIT_TIME_T case, I see (correct) output:

    $ gcc -o foo.exe -D_USE_32BIT_TIME_T dirent_bug-2.c
    $ ./foo.exe
    time_t = 32b
    .: 0 B
    ..: 0 B
    addr2line.exe: 1732622 B
    ar.exe: 1758222 B
    as.exe: 2226190 B
    c++.exe: 11776 B
    

    and without _USE_32BIT_TIME_T, equally correct:

    $ gcc -o foo.exe dirent_bug-2.c
    $ ./foo.exe
    time_t = 64b
    .: 0 B
    ..: 0 B
    addr2line.exe: 1732622 B
    ar.exe: 1758222 B
    as.exe: 2226190 B
    c++.exe: 11776 B
    
     
  • Earnie Boyd

    Earnie Boyd - 2013-10-17
    • assigned_to: Keith Marshall --> Earnie Boyd
    • Resolution: none --> accepted
    • Category: Unknown --> Feature_in_WSL_4.0
     
  • Earnie Boyd

    Earnie Boyd - 2013-10-17

    I will use Keith's patch to apply the fix for this issue and do a release of 4.0.4 hopefully before Monday. I will merge the fix to 4.1-dev as well; fix in 4.0-dev, merge 4.0-dev fix in master, tag master with 4.0.4 and then merge master fix to 4.1-dev.

     
    • Keith Marshall

      Keith Marshall - 2013-10-17

      Hmm. I'm bewildered. I don't know how I got it to compile, but yesterday it did; today it didn't. Investigating, I find a bunch more bogus references to _wfinddata64_t, in include/tchar.h and include/wchar.h; fixing those resolves the issue.

      Updated patch is attached.

       
  • Jan Nijtmans

    Jan Nijtmans - 2013-11-06

    Just wanted to let you know that I agree with Keith's solution. The only remark to the patch is the following in "dirent.c":

    #undef  _finddata_t
    #define _finddata_t  __finddata64_t
    

    Why not simply search/replace all occurrences of _finddata_t in this file
    with __finddata64_t (and the same for _wfinddata_t)?

     
    • Keith Marshall

      Keith Marshall - 2013-11-06

      Why not simply search/replace all occurrences of _finddata_t in this file with __finddata64_t (and the same for _wfinddata_t)?

      Of course, I could have done so. I considered it, but decided against; there are two reasons, neither especially compelling, (but then neither is there a compelling reason to require the search and replace):--

      1. Microsoft tell us that __finddata_t and _wfinddatat are macros, so why not use that to our advantage, by ensuring that they are defined to suit our requirements? Doing so obviates any possibility, however slight, of missing any substitution.
      2. Cosmetic, but perhaps a little more compelling: the original source is tabulated for a "pretty-printable" layout; substitution of longer type names would have wrecked that, and I preferred to keep it neat, without expending effort to tidy it up.
       
      Last edit: Keith Marshall 2013-11-06
  • Keith Marshall

    Keith Marshall - 2014-12-14

    According to this MSDN reference, _findfirst64() and friends are available in all MSVCRT.DLL versions from Win98 and WinNT (which I assume means WinNT4) onwards...

    Hmm. MSDN may say this, but it appears to be untrue. Thus, as discussed in this mailing list thread, I've reimplemented dirent.c to eliminate all references to any function in the _findfirst()/_findnext() family, using the Windows API functions, FindFirstFile() and FindNextFile() instead. I have committed this replacement implementation to the legacy branch of the mingw-wsl git repository; it will be included in the next mingwrt-3.x release.

    As a side effect of this reimplementation, I've also eliminated all references to _finddata_t structures, and to any time_t members within the dirent and DIR structures. While this causes an ABI change, and is not something I do lightly, I believe it is justified in this case, because the existing structure layout is potentially ambiguous anyway, due to Microsoft's _USE_32BIT_TIME_T insanity; as a consequence, users must be advised to recompile all third party libraries which they use in conjunction with the new mingwrt.

    Additionally, in this reimplementation, I have added the prefix __mingw_ to each publicly visible function name symbol within dirent.c, (thus providing the necessary infrastructure to address issue [#2098]). All references to these functions, within mingwrt itself, will use the prefixed name; however, I have also added separate trampoline functions, as stubs in libmingwex.a, to ensure that user code may continue to access the implementation using the original function names.

     

    Related

    Issues: #2098

  • Keith Marshall

    Keith Marshall - 2014-12-14
    • status: assigned --> pending
    • Resolution: accepted --> fixed
     

Get latest updates about Open Source Projects, Conferences and News.

Sign up for the SourceForge newsletter:





No, thanks