#256 Full implementation of realpath() for Win32

closed-wont-fix
nobody
2010-01-09
2005-09-17
Nach M. S.
No

I noticed the other day that Win32 is the only port one
of my applications doesn't have access to realpath()
on. So I wrote realpath() for it, and I might as well
submit it so it becomes part of MinGW and I don't have
to make a special ifdef __WIN32__ for it ;)

The realpath() I wrote should be standards conforming.
It uses GetFullPathNameA() from Win32 as the underlying
means as getting the resolved path. It sets errno
appropriately as needed and even translates Win32
errors to the appropriate errno.

It also makes sure the path exists (via a call to
stat()) and supports the glibc extension to allocate
the resolved path buffer if it was passed in as null to
the call to realpath().

My tests show the function works quite well, but if
someone finds an error with it, I'll be happy to fix it.

I give full permission to MinGW and it's users to use
it, if I messed up on the usage rights in the file,
I'll be happy to change it too.

-Nach

Related

Issues: #2207

Discussion

1 2 3 > >> (Page 1 of 3)
  • Nach M. S.
    Nach M. S.
    2005-09-17

    realpath() for Win32

     
    Attachments
  • Keith Marshall
    Keith Marshall
    2005-12-02

    Alternative implementation

     
    Attachments
  • Keith Marshall
    Keith Marshall
    2005-12-02

    Logged In: YES
    user_id=823908

    Hmm. I normally just use

    #define realpath(N,R) _fullpath((R),(N),_MAX_PATH)

    if I need realpath().

    _fullpath() is a standard Microsoft function, which somewhat
    more closely resembles realpath() than GetFullPathNameA()
    does, although it still doesn't confirm that the referenced
    entity either exists, or is accessible.

    If you want a somewhat more SUSv3 conformant implementation
    of realpath(), you might consider the more compact version
    I've attached.

    Thanks for your patch, but I will not add it to MinGW in its
    present form. Note that for formal consideration, patches
    should be accompanied by an appropriate ChangeLog entry; in
    the case of a function such as realpath(), which is not a
    standard Microsoft function, and therfore not documented on
    MSDN, I would also like to see documentation, preferably in
    the form of a manpage, describing the function's usage, and
    any peculiarities of its behaviour in it's Win32 adaptation.
    You should also include a patch, in `diff -u' format, to
    add a suitable function prototype to an appropriate header file.

    Also note that, when you assign software to the public
    domain, then you implicitly relinquish any claim to
    copyright over that code. Thus, the preamble to your code
    should not claim copyright, while also asserting assignment
    to the public domain; rather say `Public domain software:
    written by M.S. Nach', or words to that effect, if you wish
    to be identified as the author, or simply donate it anonymously.

    Regards,
    Keith.

     
  • Nach M. S.
    Nach M. S.
    2005-12-14

    Logged In: YES
    user_id=695796

    >Hmm. I normally just use
    >
    > #define realpath(N,R) _fullpath((R),(N),_MAX_PATH)
    >
    >if I need realpath().
    >
    >_fullpath() is a standard Microsoft function, which >somewhat
    >more closely resembles realpath() than GetFullPathNameA()
    >does, although it still doesn't confirm that the
    >referenced
    >entity either exists, or is accessible.
    >
    >If you want a somewhat more SUSv3 conformant implementation
    >of realpath(), you might consider the more compact version
    >I've attached.
    That's very nice, but your implementation isn't standards
    conformant, close doesn't cut it. The one I've written makes
    sure the entity does exist and is accessable. Mine also sets
    errno properly.

    >in the case of a function such as realpath(), which is not
    >a standard Microsoft function, and therfore not documented
    >on MSDN, I would also like to see documentation, preferably
    >in the form of a manpage

    I thought you have access to the manpage for realpath, I'm
    attaching the Linux one I have which is basically what I cloned.

    >and any peculiarities of its behaviour in it's Win32
    >adaptation.

    In the case of Win32, the ELOOP won't ever be returned by my
    implementation since it doesn't do any checks for symbolic
    links. As far as I know, Win32 currently doesn't use
    symbolic links.

    >You should also include a patch, in `diff -u' format, to
    >add a suitable function prototype to an appropriate header
    >file.
    Not a problem.

    --- /usr/i586-mingw32/include/stdlib.h 2004-09-28
    22:56:17.000000000 +0200
    +++ stdlib.h 2005-12-14 15:25:07.000000000 +0200
    @@ -407,6 +407,7 @@
    _CRTIMP void __cdecl _makepath (char*, const char*, const
    char*, const char*, const char*);
    _CRTIMP void __cdecl _splitpath (const char*, char*,
    char*, char*, char*);
    _CRTIMP char* __cdecl _fullpath (char*, const char*, size_t);
    +_CRTIMP char* __cdecl realpath(const char *path, char
    resolved_path[PATH_MAX]);

    _CRTIMP char* __cdecl _itoa (int, char*, int);
    _CRTIMP char* __cdecl _ltoa (long, char*, int);

    >Also note that, when you assign software to the public
    >domain, then you implicitly relinquish any claim to
    >copyright over that code. Thus, the preamble to your code
    >should not claim copyright, while also asserting assignment
    >to the public domain; rather say `Public domain software:
    >written by M.S. Nach', or words to that effect, if you
    >wish to be identified as the author, or simply donate it
    >anonymously.

    Fine, I'll resubmit the file with a different header.

    -Nach

     
  • Nach M. S.
    Nach M. S.
    2005-12-14

    Realpath Manpage

     
    Attachments
  • Nach M. S.
    Nach M. S.
    2005-12-14

    realpath() for Win32 with corrected license

     
    Attachments
  • Logged In: YES
    user_id=1040098

    I'd just like to comment on your implementation n-a-c-h.

    * Firstly, the comment about ELOOP is interesting. Yes,
    current NTFS has limited support for both soft and hard
    links, which almost no Windows applications handle
    correctly, leading to all sorts of weird problems,
    including security problems. I wonder if realpath() for
    Win32 should do anything special. Furthermore, Vista will
    have more pronounced symlink support; I wonder what this
    will change. Anyway, there should be some documented
    rationale, at the very least, for whatever decision is
    made here. On the other hand, POSIX guarantees no
    particular semantics with regard to symbolic links.
    Still, something reasonable should happen.

    * The 'glibc extension' to handle a null resolved_filename
    is inviolation of the standard. POSIX states that it
    shall return EINVAL in that case.

    * You may consider adding 'restrict' as appropriate,
    conditionalized on C99.

    * Is PATH_MAX really the right constant to use here? In
    POSIX, there is no particular compile-time constant
    corresponding to the maximum path length, which may be
    infinity.

    * The behavior to malloc() a new buffer when PATH_MAX is
    not big enough is incorrect. This will almost certainly
    cause memory leaks. Outside of doing strlen() on the
    return value, how can client code determine whether malloc
    has been used, and consequently if free() needs to be
    called? Besides, theres no particular guarantee that the
    client pay attention to the return value of realpath() at
    all.

     
  • Nach M. S.
    Nach M. S.
    2005-12-15

    Logged In: YES
    user_id=695796

    >* The 'glibc extension' to handle a null
    >resolved_filename
    >is inviolation of the standard. POSIX states that it
    >shall return EINVAL in that case.
    Yes that is correct, I only added it because I saw an
    argument on the MinGW mailing list somewhere about
    implementing this a few months back, and one of the devs
    mentioned they wanted the glibc extension.
    It should be noted in the case where the extension fails, it
    does return EINVAL in my implementation.

    >* Is PATH_MAX really the right constant to use here? In
    >POSIX, there is no particular compile-time constant
    >corresponding to the maximum path length, which may be
    >infinity.
    That I'm not sure about, hence why I asked people to comment
    on errors and possible solutions with it.

    >* You may consider adding 'restrict' as appropriate,
    >conditionalized on C99.
    Where here is restrict appropriate? I honestly never looked
    much into that C99 addition. I'd be happy to hear more
    information on it.

    >* The behavior to malloc() a new buffer when PATH_MAX is
    >not big enough is incorrect. This will almost certainly
    >cause memory leaks. Outside of doing strlen() on the
    >return value, how can client code determine whether malloc
    >has been used, and consequently if free() needs to be
    >called? Besides, theres no particular guarantee that the
    >client pay attention to the return value of realpath() at
    >all.
    Perhaps you didn't understand what I did.
    The behavior to malloc a new buffer is when malloc(PATH_MAX)
    wasn't big enough. That is the behavior I saw somewhere on
    MSDN for using GetFullPathName*(). The user knows he has to
    call free, because he called realpath() with a 0 as his
    second parameter. I don't see how this would cause a memory
    leak. The worry about him requesting to allocate memory then
    not checking or freeing it is the same as if someone were to
    write:
    malloc(5000);
    Or
    strdup(mystr);
    We never worry about that, so I don't see the problem here.

    I'll be happy to see further link dicussions. Although the
    available C runtime for Windows seems very limited in it's
    support for things in these areas. Things like stat()'s
    st_ino are always 0 preventing checks to see if two files
    are the same, etc...

    -Nach

     
  • Logged In: YES
    user_id=1040098

    > Where here is restrict appropriate? I honestly
    > never looked much into that C99 addition. I'd be
    > happy to hear more information on it.

    In the general case, where (by the interface) two pointer
    parameters shall not alias. It provides a performance
    boost in some cases.

    In this specific case, POSIX (the modern version which is
    based on C99) specifies that both parameters shall be
    restrict-qualified.

    > Perhaps you didn't understand what I did.

    You're right; I didn't. Nonetheless, I am still concerned
    about how this behavior that differs from standard-
    mandated behavior.

    Another interesting point is that most software that needs
    this function is broken or legacy. Without some operating
    system-specific guarantee, the standard version of this
    function is difficult to use in a strictly-conforming
    manner. This, combined with how past implementations
    (including glibc's) have invariably had multiple serious
    bugs, means most software ships their own implementation
    of this. I am curious what software there is that depends
    specifically on glibc's extensions.

     
  • Nach M. S.
    Nach M. S.
    2005-12-15

    Logged In: YES
    user_id=695796

    >Another interesting point is that most software that needs
    >this function is broken or legacy. Without some operating
    >system-specific guarantee, the standard version of this
    >function is difficult to use in a strictly-conforming
    >manner. This, combined with how past implementations
    >(including glibc's) have invariably had multiple serious
    >bugs, means most software ships their own implementation
    >of this. I am curious what software there is that depends
    >specifically on glibc's extensions.

    I have a program which I currently have it running and
    tested on at least MS-DOS, Windows, Linux, FreeBSD, and Mac
    OS X. It doesn't use the glibc extension (which personally
    makes no sense - unless it becomes part of POSIX), and works
    fine everywhere. My only problem was that I had to make my
    own for Windows, everything else worked fine.

    As to why I use it; basically my program sometimes needs to
    launch another program passing it a file, since the program
    to launch could be located in one directory and the file in
    another and the current directory of the running program
    somewhere else, I pass the absolute path of the file to the
    program being launched.

    -Nach

     
1 2 3 > >> (Page 1 of 3)