#4760 NUL in filenames

Milestone: current: 8.6.0
Status: open
Priority: 3
Updated: 2012-11-21
Created: 2010-11-25
Creator: Andy Goth
Private: No

Some file commands treat NUL (\0) as a string terminator. Others treat it as a normal character. This inconsistency makes it possible to spoof commands that filter based on filename, particularly extension. Example:

proc access {name} {
    if {[file extension $name] eq ".secret"} {
        error denied
    } else {
        puts granted
        open $name
    }
}
% access file.secret
denied
% access file.secret\0
granted
file5

Generally, it's the commands that talk to the operating system that treat NUL as the string terminator. (More likely it's the operating system itself that does this!) For example, [open] and [cd] do this:

% cd .
% cd .\0
% cd \0.
couldn't change working directory to "NUL.": no such file or directory
# (in the previous line, NUL is a real NUL character; to see it, use tkcon instead of xterm)
% file exists .
1
% file exists .\0
1
% file exists \0.
0
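The other half of the inconsistency, the string-level commands that keep the NUL as an ordinary character, can be inspected with a short script. This is only a sketch, assuming the 8.5/8.6-era behaviour described in this report; the variable names are illustrative.

```tcl
# A name with a trailing NUL, as in the [access] example above.
set name "file.secret\0"

# [file extension] is a pure string operation: per this report, the NUL
# stays part of the extension, so the comparison with ".secret" fails.
set ext [file extension $name]
puts "extension length: [string length $ext]"
puts "equals .secret:   [expr {$ext eq ".secret"}]"

# A glob-style filter is fooled the same way, because the pattern
# requires the name to end in ".secret" and the NUL comes after it.
puts "matches *.secret: [string match *.secret $name]"
```

Commands such as [open] and [cd], by contrast, hand the name to the operating system, which stops reading at the NUL.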

Discussion

  • Jeffrey Hobbs

    Jeffrey Hobbs - 2010-11-25

    I have a hard time understanding why this would be important. Adding more "cleanliness" on top of the direct OS layers would be ill advised imo. I think these are indeed valid on unix.

    Why are you tossing nulls in? If you are taking user input, why not strip nuls?

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2010-11-25
    • priority: 5 --> 3
     
  • Andy Goth

    Andy Goth - 2010-11-25

    The problem is that "internal" commands are not consistent with "operating system" commands. Either make the internal commands, like [file split], strip NULs and subsequent characters; or make the operating system commands treat filenames with NULs as nonexistent. I prefer the latter approach, because Tcl can't always distinguish between filenames and other strings: filenames are manipulated not only by [file] but also [string] and variable interpolation and other methods. Also it might be acceptable to have embedded NULs in the names of files on a VFS, just not for the native filesystem.

    Yes, NULs come from user input. I found this while testing Wibble: http://wiki.tcl.tk/27378#pagetocdd31402f . The user can easily insert a NUL by typing %00 in the URL; this convinces the [string match] that the requested file doesn't match any of the forbidden patterns, but then [open] drops the NUL and [chan copy] happily sends the forbidden file to the user.

    I can't easily strip NUL from filenames in Wibble because the core can't tell what parts of the URI or POST will ultimately correspond to an on-disk file, what parts are names of objects that aren't disk files and can legitimately have NULs, and what parts are arbitrary binary data.

    What can I do? Here's a script to strip NUL and subsequent:

    proc stripnul {str} {
        if {[set index [string first \0 $str]] == -1} {
            return $str
        } else {
            string range $str 0 [expr {$index - 1}]
        }
    }

    Then I must use this every time I make any decisions on the basis of a filename that's assembled from user input. That's a pain and there's a lot of room for error. Or, fix the problem everywhere for everybody by making the native filesystem code treat files whose names have embedded NULs as nonexistent. That makes sense to me, because there *aren't* any files with NULs in their names!

    Very old versions of Tcl used to strip NUL everywhere, whether you liked it or not. ;^)
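The alternative Andy Goth prefers above, treating names with embedded NULs as nonexistent on the native filesystem, would be a change in the Tcl core. A script-level approximation might look like the following sketch. The name nulSafeOpen is hypothetical and not part of Tcl, and the error text only imitates the shape of the message [open] produces for a missing file.

```tcl
# Hypothetical wrapper (not part of Tcl): refuse any name containing NUL
# before it reaches a command that talks to the operating system.
proc nulSafeOpen {name args} {
    if {[string first \0 $name] >= 0} {
        # Behave as if the file does not exist, since no native file
        # can have a NUL in its name.
        return -code error \
            "couldn't open \"$name\": no such file or directory"
    }
    # No NUL present: pass the name and any mode arguments through.
    return [open $name {*}$args]
}

# Usage: the spoofed name from the report is rejected instead of opened.
if {[catch {nulSafeOpen "file.secret\0"} msg]} {
    puts "rejected: NUL in filename"
}
```

A wrapper like this still has to be called at every place a filename is opened, which is exactly the burden the comment above argues should be lifted by fixing the native filesystem code instead.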

     
  • Andy Goth

    Andy Goth - 2012-11-21
    • milestone: 897381 --> current: 8.6.0
     
  • Jan Nijtmans

    Jan Nijtmans - 2012-11-21
    • assigned_to: vincentdarley --> nijtmans
     
  • Jan Nijtmans

    Jan Nijtmans - 2012-11-21

    The easiest solution is the one used by Cygwin: map each invalid
    character to a character in the Unicode private use range U+F000 to U+F0FF.

    See:
    <http://cygwin.com/cygwin-ug-net/using-specialnames.html#pathnames-specialchars>

    On UNIX it could mean mapping \0 to \uF000 (or EF 80 80 in UTF-8)
    in internal/external filename translation.

    After 8.6.0 is released, I'll take a further look at whether this is
    practical. So 8.6.1 is the earliest chance to get it in, if all works out well.
    Not high priority.
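The mapping Jan Nijtmans describes above can be illustrated at script level. This is a sketch only: the proposal concerns Tcl's internal/external filename translation, which is implemented in C, and the helper names nulToPrivate and privateToNul are hypothetical.

```tcl
# Hypothetical helpers (not part of Tcl) illustrating the proposed mapping.

# Internal -> external: replace each NUL with U+F000 so that the
# operating system never sees a string terminator inside the name.
proc nulToPrivate {name} {
    return [string map [list \0 \uF000] $name]
}

# External -> internal: the reverse mapping, for names read back
# from the filesystem.
proc privateToNul {name} {
    return [string map [list \uF000 \0] $name]
}

set external [nulToPrivate "file.secret\0"]
puts "NUL still present: [expr {[string first \0 $external] >= 0}]"
puts "same length:       [expr {[string length $external] == [string length "file.secret\0"]}]"
```

With such a mapping a name like file.secret\0 would refer to a different (and normally nonexistent) file than file.secret, so [open] and [file extension] would agree about which name they are handling.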

     
  • Donal K. Fellows

    Quite apart from the fact that protecting files by extension is stupid (yeah, it's just an example ;-)), it's arguably the case that NUL should always be treated as the end of the filename on all platforms. I've never encountered an OS that allowed it, though there's the prospect of its denormalized form turning up on systems that use UTF-8 (oh, what a tangled web we weave!).

    Anyway, my point is that we should fix [file]'s subcommands to do the right thing; if someone is abusing them with non-filenames, that's their fault.