#34 unzip strips trailing semicolon from filename when extracting

Status: open-fixed
Owner: nobody
Labels: None
Priority: 5
Updated: 2010-11-23
Created: 2010-07-24
Creator: Anonymous
Private: No

This was first reported in the Ubuntu bugtracker: https://bugs.launchpad.net/ubuntu/+source/unzip/+bug/609254

If a file name in a zip archive ends with a ";", the extracted file name loses this character, e.g.
"this is a new file&22;" is extracted as "this is a new file&22".
The Ubuntu bug contains a test.zip with the above-mentioned file name: http://launchpadlibrarian.net/52387252/test.zip

This is with "UnZip 6.00 of 20 April 2009"

Discussion

  • Steven Schweda
    2010-07-25

    This is almost a feature, but not quite. The code in mapname()
    (in various system-specific .c files) which strips VMS versions (";nnn")
    from file names has been interpreting ";" (with no digits) as a VMS
    version, which it really isn't. (While a VMS user can specify something
    like "name.type;" to refer to the highest version of "name.type", a real
    VMS version in an archive will always have at least one digit.)

    A simple fix for this specific case looks like this (in most places):

    ALP $ gdiff -u unix.c_orig unix.c
    --- unix.c_orig 2009-01-23 17:31:26 -0600
    +++ unix.c 2010-07-24 18:47:41 -0500
    @@ -672,10 +672,12 @@
         /* if not saving them, remove VMS version numbers (appended ";###") */
         if (!uO.V_flag && lastsemi) {
             pp = lastsemi + 1;
    -        while (isdigit((uch)(*pp)))
    -            ++pp;
    -        if (*pp == '\0') /* only digits between ';' and end: nuke */
    -            *lastsemi = '\0';
    +        if (*pp != '\0') { /* At least one digit is required. */
    +            while (isdigit((uch)(*pp)))
    +                ++pp;
    +            if (*pp == '\0') /* only digits between ';' and end: nuke */
    +                *lastsemi = '\0';
    +        }
         }

         /* On UNIX (and compatible systems), "." and ".." are reserved for

    This should avoid the misinterpretation for names like "fred;", but a
    name like "fred;123" will still be interpreted as having a VMS version
    number, which may not give the desired result.

    A work-around would be to specify "-V" ("retain VMS version numbers")
    on the UnZip command line. One might argue that the default should have
    been to leave names like this as-is on non-VMS systems, but the original
    expectation almost certainly was that no one would be using ";" in a
    file name, except on VMS. It's not obvious that changing this now would
    be a good idea, but I'm open to a good argument (either way).
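
The rule described above can be tried outside UnZip with a small standalone sketch. This is not the UnZip source: `strip_vms_version` and its `keep_versions` parameter are hypothetical names introduced here for illustration, with `keep_versions` standing in for `uO.V_flag` (the "-V" option). It also assumes it is handed a single path component, whereas mapname() tracks the last ";" while walking the whole path.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper (not part of UnZip): strip a trailing VMS version
 * suffix (";" followed by one or more digits) from a single path
 * component, in place.  keep_versions stands in for uO.V_flag ("-V").
 * Returns 1 if a suffix was removed, 0 if the name was left as-is.
 */
static int strip_vms_version(char *name, int keep_versions)
{
    char *lastsemi;
    const char *pp;

    assert(name != NULL);

    if (keep_versions)              /* "-V": retain VMS version numbers */
        return 0;

    lastsemi = strrchr(name, ';');
    if (lastsemi == NULL)           /* no ';' at all: nothing to do */
        return 0;

    pp = lastsemi + 1;
    if (*pp == '\0')                /* bare ";": at least one digit is required */
        return 0;

    while (isdigit((unsigned char)*pp))
        ++pp;

    if (*pp != '\0')                /* non-digit after ';': not a VMS version */
        return 0;

    *lastsemi = '\0';               /* only digits between ';' and end: strip */
    return 1;
}
```

Under these assumptions, "fred;" and "fred;12a" are left alone, "fred;123" becomes "fred", and a nonzero `keep_versions` leaves every name untouched, which matches the "-V" work-around.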

     
  • Ed Gordon
    2010-11-23

    I believe this should make the next UnZip beta, which should be going out shortly.

     
  • Ed Gordon
    2010-11-23

    • status: open --> open-fixed