#3423 Tcl should support Unicode versions of Win32 file APIs

Milestone: obsolete: 8.4.13
Status: closed-fixed
Priority: 7
Updated: 2007-02-20
Created: 2006-05-01
Creator: Chris Rose
Private: No

The Unicode filename format described in
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/fs/naming_a_file.asp
is not properly handled in Tcl.

The file normalize routines should allow the following
to work:

file exists {\\?\D:\existingfile.txt}

which is equivalent to

file exists {D:\existingfile.txt}

The only difference is that the former form uses the
Unicode file APIs in Windows, allowing the creation of
long pathnames (longer than 260 characters), which is
extremely useful when dealing with certain Java tools'
output.
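
For illustration, the extended-length spelling can be derived from
an ordinary absolute path by string manipulation alone. The sketch
below is a hypothetical helper (toExtendedPath is not a Tcl command)
and assumes the rules from the MSDN page above: backslashes only, a
\\?\ prefix for drive paths, and \\?\UNC\ in place of the leading \\
of a UNC path. It touches no files, so it should run on any platform.

```tcl
# Hypothetical helper, not part of Tcl: build the extended-length
# ("\\?\") spelling of an absolute Windows path.
# Assumption: "." and ".." segments are left alone, because the
# \\?\ form is handed to the file system without normalization.
proc toExtendedPath {path} {
    # The \\?\ form does not translate "/", so use backslashes only.
    set p [string map [list / \\] $path]

    # Already in \\?\... or \\.\... form: return it unchanged.
    if {[string match {\\\\[?.]\\*} $p]} {
        return $p
    }
    # UNC path: \\server\share\... becomes \\?\UNC\server\share\...
    if {[string match {\\\\*} $p]} {
        return "\\\\?\\UNC\\[string range $p 2 end]"
    }
    # Drive path: D:\... becomes \\?\D:\...
    if {[string match {[A-Za-z]:\\*} $p]} {
        return "\\\\?\\$p"
    }
    error "not an absolute Windows path: $path"
}

puts [toExtendedPath {D:\existingfile.txt}]    ;# \\?\D:\existingfile.txt
puts [toExtendedPath {\\server\share\a.txt}]   ;# \\?\UNC\server\share\a.txt
```

Relative paths are rejected rather than guessed at, since resolving
them would need the current directory.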

Discussion

  • Benjamin Riefenstahl

    Logged In: YES
    user_id=143885

    "Chris R" writes on c.l.t:
    > There is one issue with this, though -- what *should*
    > the results of a file normalize be,

    [file normalize] for normal file names should not
    change IMO.

    > The two forms (with and without \\?\ prefix) are
    > equivalent to the API, even if not all frontend
    > applications support them.

    No they are not. Opening "\\?\c:\tmp\com1" will create
    a file "com1" which the user will then be unable to
    delete. Opening "c:\tmp\com1" will open the first
    serial port. There are probably other
    incompatibilities and Microsoft might introduce more.
    Also "\\?\..." doesn't mean anything to W9x/Me.

    Tcl handles UNC paths fine right now. Last time I
    tested this, the error result for opening
    \\?\c:\tmp\... took some time, which hints to me that
    some code probably tried to resolve the host "?". That
    would run into a timeout, because that host doesn't
    exist, of course.

    From testing it seems that the code already handles
    ".", which is used for devices. I think we might just
    need a special case in the code that handles UNC paths
    for the special hostname "?".

     
  • Chris Rose

    Chris Rose - 2006-05-01

    Logged In: YES
    user_id=32671

    Granted, although the com1 example can be deleted the same
    way as the file was created.

    Assuming that someone can tell me where the code that
    handles that is, I could have a look at special casing it.

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2006-05-01

    Logged In: YES
    user_id=72656

    Tcl uses the W (wide) Win32 file APIs throughout, if you are
    on an NT-based system. I believe there is a bit more to it
    than that to support this feature.

     
  • Don Porter

    Don Porter - 2006-05-04

    Logged In: YES
    user_id=80530

    Perhaps I'm completely off the
    mark, but shouldn't a set of
    pathnames marked by a completely
    separate prefix get handled by
    an additional Tcl_Filesystem ?

     
  • Vince Darley

    Vince Darley - 2006-05-09

    Logged In: YES
    user_id=32170

    As remarked in other comments here, the definition of 'file
    normalize' precludes us from making Tcl itself interpret a
    leading '\\?\' at the Tcl level (of course as dgp says,
    someone can happily write another Tcl_Filesystem to do
    that), but there's nothing to stop us simply checking the
    length of paths in the core and putting in the appropriate
    prefix before calling any of the native routines.

    That, to me, would be the Tcl way -- it would just work.

    Vince.

     
  • Benjamin Riefenstahl

    Logged In: YES
    user_id=143885

    vincentdarley:
    > of course as dgp says, someone can happily write
    > another Tcl_Filesystem to do that

    Is that how other UNC paths work? Basically \\?\ and
    \\.\ are just variations of UNC paths AFAICS.

    > That, to me, would be the Tcl way -- it would just
    > work.

    For the programmer. It would not work for the user,
    because files that can only be created with this
    syntax can not be manipulated by the user outside
    of the program that created the file. Users would
    start to format their disks or re-install their
    system to get rid of those files.

    I might well be in the minority with this view,
    but if this gets implemented transparently,
    I'd want to add system-specific code to some of
    my programs to check that I don't use it
    inadvertently. It would make life more difficult
    for me.

     
  • Nobody/Anonymous

    Logged In: NO

    Are there really _no_ programs that could manipulate these
    files if Tcl created them? These are standard win32 APIs
    which have been around a fairly long time so it would seem
    somewhat bizarre if nothing else could access them. Surely
    if someone is asking to be able to create these long paths
    there must be some _other_ applications which are going to
    be involved in some way?

    Does Windows 'explorer' handle them?

    Vince.

    Note: if we wanted we could have a C variable linked to some
    tcl::unsupported::errorOnLongPaths Tcl variable which could
    be checked by the core filesystem code and not allow long
    paths. This wouldn't have any significant performance
    impact at all and might solve things for Benny.

     
  • Benjamin Riefenstahl

    Logged In: YES
    user_id=143885

    Vince:
    > Are there really _no_ programs that could
    > manipulate these files if Tcl created them?

    I never looked for them so I can't say for sure.
    I doubt that the average user would have any
    though.

    > Does Windows 'explorer' handle them?

    Not as of W2K.

    CMD.EXE (again in W2K) can use the explicit syntax
    \\?\... in some circumstances but not in others,
    e.g. in the experiment I did for the discussion on
    tcl-core MKDIR worked but RMDIR didn't :-((. See
    <http://sourceforge.net/mailarchive/forum.php?thread_id=9833535&forum_id=3854>.

    As I understand Microsoft, if you use "normal"
    paths you get normal behaviour for all your
    programs and all guarantees and functionality that
    Microsoft always gave for such paths. If you use
    \\?\ you can do what you want, but you are
    responsible yourself for every incompatibility and
    for informing the user about potential problems.

    I don't want that responsibility wherever I can
    avoid it. And I find it difficult to see how Tcl
    can use this transparently without encouraging
    programs that do what I consider bad things.

    Of course explicit (and documented) support for
    \\?\ or //?/ has my vote. I consider it a bug
    that this doesn't just work in Tcl.

    benny

     
  • Chris Rose

    Chris Rose - 2006-05-14

    Logged In: YES
    user_id=32671

    The gist of discussion here is, I think, the right direction
    to go. Basically, remove the restriction that *prevents*
    this functionality working, and ensure that the underlying
    API calls are ones that support both forms (a case which I
    believe to be true for all standard file APIs).

    There's really no sense in adding an artificial restriction.
    The risk of problems with temp files is one best handled by
    the application developer.

     
  • Don Porter

    Don Porter - 2006-05-14

    Logged In: YES
    user_id=80530

    If folks have the idea I'm against
    expanding the paths recognized by
    Tcl to match those recognized by
    the OS, that's incorrect.

    My only opinion on the subject is
    that if the set of pathnames recognized
    by Tcl is to be expanded, the correct
    way to do it is by adding on an
    additional Tcl_Filesystem, and not
    by adding in more hackery within
    the core center of the VFS system.

    If we had Tcl_Filesystems back in
    the days when UNC support was added,
    I'd have said the same thing about
    them.

     
  • Pat Thoyts

    Pat Thoyts - 2007-01-08

    patch to support extended path prefix

     
  • Pat Thoyts

    Pat Thoyts - 2007-01-08

    Logged In: YES
    user_id=202636
    Originator: NO

    I'm attaching a patch that supports the extended path prefix in file normalize and in the creation of the native form of the path. I have been successful in handling local files using the \\?\ syntax but MSDN says that UNC paths can have the same extension but using \\?\UNC\ as the prefix. I've not found a Microsoft application that supports this yet.
    File Added: 1479814-extpath.patch

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2007-02-20

    Logged In: YES
    user_id=72656
    Originator: NO

    I give a +1 to adding it.

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2007-02-20
    • priority: 5 --> 7
     
  • Don Porter

    Don Porter - 2007-02-20

    Logged In: YES
    user_id=80530
    Originator: NO

    I suppose there's a case to be
    made that a bad solution is better
    than no solution at all, but it
    still seems seriously wrong to me
    that we multiply the hacks in
    the core generic code instead of
    addressing this via the Tcl_Filesystem
    interfaces.

    That said, I obviously don't
    care enough to implement a
    better answer (at least not
    today) so HOSOGOTP rules.

    still,.... Ick!

     
  • Pat Thoyts

    Pat Thoyts - 2007-02-20
    • status: open --> closed-fixed
     
  • Pat Thoyts

    Pat Thoyts - 2007-02-20

    Logged In: YES
    user_id=202636
    Originator: NO

    At least the additional hackery has a fairly small footprint.
    Patch applied to 8.5. I don't see a reason to apply this to 8.4.
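
The path prefixes discussed in this thread can be told apart with a
few pattern checks. The following is a minimal sketch only: it is
not taken from 1479814-extpath.patch, whose contents are not shown
here, and classifyWinPath and its return values are hypothetical
names chosen for illustration. It assumes that the most specific
prefix should be tested first.

```tcl
# Hypothetical helper, not taken from the patch: name the kind of
# Windows path prefix, checking the most specific forms first.
proc classifyWinPath {path} {
    # The thread mentions both \\?\ and //?/ spellings, so compare
    # on backslashes only.
    set p [string map [list / \\] $path]

    # \\?\UNC\server\share\... : extended-length UNC path
    if {[string match -nocase {\\\\[?]\\UNC\\*} $p]} { return extended-unc }
    # \\?\D:\... : extended-length local path
    if {[string match {\\\\[?]\\*} $p]} { return extended }
    # \\.\COM1 : device namespace
    if {[string match {\\\\.\\*} $p]} { return device }
    # \\server\share\... : ordinary UNC path
    if {[string match {\\\\*} $p]} { return unc }
    # D:\... or D:... : drive-letter path
    if {[string match {[A-Za-z]:*} $p]} { return drive }
    return other
}

puts [classifyWinPath {\\?\c:\tmp\com1}]        ;# extended
puts [classifyWinPath {\\?\UNC\server\share\f}] ;# extended-unc
puts [classifyWinPath {\\.\COM1}]               ;# device
puts [classifyWinPath {\\server\share\f}]       ;# unc
puts [classifyWinPath {D:/existingfile.txt}]    ;# drive
```

Testing \\?\ and \\.\ before the generic UNC pattern matters here:
otherwise "?" or "." would be read as a host name, which is the
behaviour suspected earlier in the discussion.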

     