
#3265 stat ino, nlink fields not set

obsolete: 8.5a4
closed-fixed
5
2005-10-23
2005-10-13
Anonymous
No

When support for file links was added to Tcl, the
'stat' command was not updated on Windows to add
unix-like support for the ino and nlink fields.

Here's a relevant email transcript:

Jeff Hobbs
to Oscar, me
Sep 28
Hi Oscar,

You may well be right that the 'file stat' stuff was not
updated to properly reflect the status of hard links. I
am cc'ing Vince Darley, author of that TIP to comment on
whether that was an oversight or intentional.

Regards,

Jeff

Oscar Bonilla wrote:
> On Sep 27, 2005, at 6:16 PM, Jeff Hobbs wrote:
>
> > Oscar Bonilla wrote:
> >
> >> I was talking with Larry about hard links on Windows and how we don't
> >> support them even though NTFS supports them, and he brought up that
> >> ActivePython supports them.
> >>
> >> I was looking at the ftp site of activestate, and it seems the newest
> >> source is for version 2.3.1 from Nov 2003. Do you know where I could
> >> get the latest source?
> >>
> >> Or better yet, do you know how they implemented the stat(2) syscall
> >> (in terms of which win32 APIs)? Does Tcl handle this?
> >>
> >
> > Tcl supports hard links as well:
> >
> >
> > http://aspn.activestate.com/ASPN/docs/ActiveTcl/tcl/TclCmd/file.htm#M20
>
> Well, unless I'm reading
> http://cvs.sourceforge.net/viewcvs.py/tcl/tcl/win/tclWinFile.c?rev=1.77&view=auto
> incorrectly, 'file stat' always returns st_ino = 0 and st_nlink = 1
> on Windows.
>
> So even though you can create hard links, you can't tell if a file is
> hard linked to another or if it has more than one link. Right?
>
> Cygwin seems to get away with using nFileIndexHigh | nFileIndexLow
> for the inode and you can get the real hard link count from
> GetFileInformationByHandle() in the nNumberOfLinks field.
>
> > You would get the latest python from SourceForge.
>
> I couldn't find the hard link stuff there... Tcl source seems more
> organized ;-)
>
> Regards,
>
> -Oscar


Vince Darley
to Jeff, Oscar
Sep 28
Oscar,

Indeed 'ino' and 'nlink' were not updated when I added support for
hard links to Tcl. This was simply an oversight due to my lack of use
of these fields (and it would seem most people's lack of use of them,
given yours is the first bug report on this!). It would be good to
make these as similar as possible to their Unix interpretations.

Can you perhaps provide a patch and/or new tests for the test suite?

Discussion

  • Vince Darley

    Vince Darley - 2005-10-13

    Logged In: YES
    user_id=32170

    Added first attempt at a patch.

    Vince.

     
  • Vince Darley

    Vince Darley - 2005-10-13

    Logged In: YES
    user_id=32170

    And here's a better patch with tests.

     
  • Vince Darley

    Vince Darley - 2005-10-13

    second patch

     
  • Oscar Bonilla

    Oscar Bonilla - 2005-10-14

    Logged In: YES
    user_id=219610

    Make sure st_ino is at least defined as an unsigned. I don't know where
    you're getting your definition of struct stat from, but the one from MS's
    headers and the one in MSYS default st_ino to short, and both
    nFileIndexHigh and nFileIndexLow are DWORDs. You can imagine all
    the collisions you can get by casting a quad word to a short ;-)
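The collision risk described above can be sketched in portable C. This is only an illustration: the function names are hypothetical, and combining the two DWORDs into a single 64-bit value is an assumption about how one might build an ID from nFileIndexHigh and nFileIndexLow, not a description of what Tcl or Cygwin actually does.

```c
#include <stdint.h>

/* Combine the two 32-bit halves of a file index into one 64-bit ID.
   The parameters stand in for nFileIndexHigh and nFileIndexLow. */
uint64_t file_index64(uint32_t index_high, uint32_t index_low)
{
    return ((uint64_t)index_high << 32) | index_low;
}

/* Hypothetical helper: what is left of the ID once it is stored in a
   short-sized st_ino. Only the low-order bits survive the conversion. */
unsigned short truncate_to_short_ino(uint64_t index)
{
    return (unsigned short)index;
}
```

Two files whose indexes differ only above the low 16 bits, for example (1, 0x00012345) and (7, 0x00ff2345), would end up with the same truncated value even though their full 64-bit IDs are different.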

    The other thing you guys might want to consider (although it might be
    too much work and you might just decide to punt on it) is that you can
    actually fake the st_nlink for directories and have it behave just like
    Unix. I wrote a little proc that goes something like:

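    #include <string.h>
    #include <windows.h>

    /* Emulate a Unix-style st_nlink for a directory: 2 + subdirectories. */
    int
    dirEntries(const char *dir)
    {
        WIN32_FIND_DATAA f;
        HANDLE h;
        char buf[MAX_PATH];
        size_t len;
        int count = 0;

        /* Leave room for the wildcard suffix and the terminating NUL. */
        len = strlen(dir);
        if (len + 3 > sizeof(buf)) return (0);
        strcpy(buf, dir);
        if (len == 0 || buf[len - 1] == '\\' || buf[len - 1] == '/') {
            strcat(buf, "*");
        } else {
            strcat(buf, "/*");
        }

        h = FindFirstFileA(buf, &f);
        if (h == INVALID_HANDLE_VALUE) return (0);
        count++;    /* first entry, normally "." */
        while (FindNextFileA(h, &f)) {
            /* ".." and every subdirectory each add one link */
            if (f.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) count++;
        }
        FindClose(h);
        return (count);
    }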

    You might want to tweak it to conform to Tcl-style and not use the *A
    API which is just for ASCII and doesn't handle Unicode. As I said, I
    don't know how useful this would be... it allows some optimizations
    when walking directory structures, but I don't know if anyone else uses
    those...

     
  • Vince Darley

    Vince Darley - 2005-10-14

    Logged In: YES
    user_id=32170

    'st_ino' will come from whatever the compile environment
    sys/stat.h or sys/types.h says, so, as you correctly point
    out, it's "unsigned short". Nothing Tcl can do about that, I
    believe.

    I don't know how 'nlink' is defined on unix, but no 'man
    stat' that I've found talks about nlink for directories as
    the number of files within it, so I'd be interested in a
    clear definition (does it include subdirs, symlinks?).

     
  • Oscar Bonilla

    Oscar Bonilla - 2005-10-14

    Logged In: YES
    user_id=219610

    There should be a warning in the release notes or somewhere that st_ino
    can have collisions on Windows, so it should not be used for
    determining whether a file is the same as another. A typical idiom (in C)
    for determining if two files are links to the same file is:

    if (stat(filea, &sa) || stat(fileb, &sb)) return (error);
    if ((sa.st_ino == sb.st_ino) && (sa.st_dev == sb.st_dev) &&
        (sa.st_nlink >= 2)) return (same);

    In Tcl it would be more verbose, but it would follow the same pattern.
    This is dangerous because on Windows that code could very well say
    yeah, they're the same when in fact they are not. Obviously, the current
    code just returns 0 for st_ino, so it would fail that test...
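For reference, the idiom above could be wrapped into a self-contained POSIX function roughly like this. It is a minimal sketch: the name same_file and the 1/0/-1 return convention are made up for illustration, and it is only as trustworthy as the st_ino values the platform reports.

```c
#define _XOPEN_SOURCE 700   /* expose POSIX declarations under strict -std modes */
#include <sys/stat.h>

/* Returns 1 if both paths are hard links to the same file, 0 if they are
   not, and -1 if either stat() call fails. Follows the idiom quoted above,
   including its st_nlink >= 2 check. */
int same_file(const char *filea, const char *fileb)
{
    struct stat sa, sb;

    if (stat(filea, &sa) != 0 || stat(fileb, &sb) != 0) return -1;
    return (sa.st_ino == sb.st_ino) && (sa.st_dev == sb.st_dev) &&
           (sa.st_nlink >= 2);
}
```

With a truncated or constant st_ino, the first comparison stops being meaningful, which is exactly the hazard described here.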

    In Unix, the '.' and '..' directory entries are links to self and parent
    respectively. So if you create an empty directory, it will have 2 links
    (its own '.' entry and the link from the parent directory). If the directory
    doesn't have any subdirectories, it will have 2 links (same as before). If
    you create a subdirectory, that subdirectory also has . (self) and .. (to
    directory), so now our directory will have 3 links.

    $ mkdir foobar
    $ ls -lad foobar
    drwxr-xr-x 2 ob ob 68 Oct 14 10:34 foobar
    $ mkdir foobar/baz
    $ ls -lad foobar
    drwxr-xr-x 3 ob ob 102 Oct 14 10:34 foobar
    $

    A common optimization for determining whether you have to recurse
    into a directory or not is to check its hard link count to see if the
    directory has any subdirectories or not... but I have to admit it's kind
    of a special case... maybe not worth implementing.
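The optimization mentioned above might look something like the following on a POSIX system. It is a sketch under assumptions: the function name may_have_subdirs is hypothetical, and the fallback for link counts below 2 is there because some filesystems are reported not to follow the "2 + subdirectories" convention.

```c
#define _XOPEN_SOURCE 700   /* expose POSIX declarations under strict -std modes */
#include <sys/stat.h>

/* Returns 1 if dir may contain subdirectories, 0 if its link count shows
   that it has none, and -1 if stat() fails or dir is not a directory. */
int may_have_subdirs(const char *dir)
{
    struct stat sb;

    if (stat(dir, &sb) != 0 || !S_ISDIR(sb.st_mode)) return -1;
    /* A link count below 2 means the filesystem does not use the Unix
       convention, so nothing can be concluded: tell the caller to recurse. */
    if (sb.st_nlink < 2) return 1;
    /* 2 links = "." plus the parent's entry; each subdirectory adds one. */
    return sb.st_nlink > 2;
}
```

A directory walker would skip the descent only when this returns 0, and fall back to reading the directory in every other case.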

     
  • Vince Darley

    Vince Darley - 2005-10-23

    Logged In: YES
    user_id=32170

    Committed fix to cvs head.

     
  • Vince Darley

    Vince Darley - 2005-10-23
    • status: open --> closed-fixed