Unable to read files from network drive

2010-04-22
2013-04-18
  • Hello,

    I'm using Staden package 2.0.0b6 (iolib-1.12.2) on an openSUSE 11.2 Linux box. The programs (Trev, Pregap4, Gap4) fail to read any files located on a shared network drive mounted using cifs. When copied to my local hard drive, the files can be processed without problems. Reading from and writing to the mounted network drive with other software works fine as well. The user account running Staden definitely has read, write and execute permissions in the network folder.

    The error messages are:

    • Trev: Unable to load /mynetworkdrive/myfile.ab1 with format any

    • Pregap4: Failed files:  {/mynetworkdrive/myfile.ab1} (UNK) 'init: Unknown file type'

    • Gap4: Thu 22 Apr 13:14:32 2010 Database not found

    Can anybody give me some advice on what the problem might be?

    Thanks in advance,
    Matthias

     
  • James Bonfield
    2010-04-23

    This has me baffled as it shouldn't be able to tell them apart. Are you using the binary release or did you build from source? Do you know how the cifs mount is implemented? I assume it's a proper unix mount, but if it's some LD_PRELOAD hack or library diversion of some sort then it's possible a prebuilt binary wouldn't work. Seems unlikely though and I'm grasping at straws here.

    It would be good to know whether the code can "see" the files or whether it is simply an inability to read them that is at fault. If you have the strace program installed, you can use it to list the system calls generated and their return values. E.g.:

      strace -s 500 -o tr -f trev xb95b11.s1.ztr     [and then exit it]
      egrep xb95b11.s1.ztr tr
      => 5400  stat("xb95b11.s1.ztr", {st_mode=S_IFREG|0644, st_size=9363, ...}) = 0
            5400  open("xb95b11.s1.ztr", O_RDONLY)  = 6
    

    If you could send me the full 'tr' trace output it'd help a lot, thanks. If it's too large then just a grep for your filename is a sufficient start.
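
The same "can it see the file, can it read the file" question can also be checked outside Staden with a small standalone program. The following is only a minimal sketch, assuming a POSIX system. The function name probe_file and the macro PROBE_FILE_MAIN are hypothetical and not part of Staden or io_lib.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of Staden or io_lib.
 * Returns 0 if the file can be stat'ed, opened and read,
 * 1 if stat() fails, 2 if open() fails, 3 if read() fails.
 */
int probe_file(const char *path)
{
    struct stat st;
    char buf[16];
    ssize_t n;
    int fd;

    assert(path != NULL);

    /* Step 1: is the file visible at all? */
    if (stat(path, &st) != 0) {
        fprintf(stderr, "stat: %s\n", strerror(errno));
        return 1;
    }
    fprintf(stderr, "stat ok: st_mode=%o st_size=%lld\n",
            (unsigned int)st.st_mode, (long long)st.st_size);

    /* Step 2: can it be opened read-only? */
    fd = open(path, O_RDONLY);
    if (fd < 0) {
        fprintf(stderr, "open: %s\n", strerror(errno));
        return 2;
    }

    /* Step 3: can the first few bytes be read? */
    n = read(fd, buf, sizeof buf);
    if (n < 0) {
        fprintf(stderr, "read: %s\n", strerror(errno));
        close(fd);
        return 3;
    }
    fprintf(stderr, "read ok: %ld bytes\n", (long)n);
    close(fd);
    return 0;
}

#ifdef PROBE_FILE_MAIN /* hypothetical macro: build with -DPROBE_FILE_MAIN */
int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 64;
    }
    return probe_file(argv[1]);
}
#endif
```

Running it once on a file on the network drive and once on a local copy shows at which of the three steps, if any, the two cases differ.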

     
  • First I tried a build from source and then, after encountering the problem, I also tried the binary release. It makes no difference. The cifs mount is a standard unix mount made at system startup, and the libraries should be as shipped with openSUSE 11.2, with no LD hacks.

    I did the tracing as you suggested. The grep gave:

    12637 execve("/usr/local/bin/trev", , ) = 0
    12637 execve("/usr/bin/tclsh", , ) = 0
    12637 stat64("Platte1375_G07_026.ab1", {st_mode=S_IFREG|0755, st_size=248624, …}) = 0
    12637 stat64("Platte1375_G07_026.ab1.gz", 0xbf989bf0) = -1 ENOENT (No such file or directory)
    12637 stat64("Platte1375_G07_026.ab1.bz2", 0xbf989bf0) = -1 ENOENT (No such file or directory)
    12637 stat64("Platte1375_G07_026.ab1.sz", 0xbf989bf0) = -1 ENOENT (No such file or directory)
    12637 stat64("Platte1375_G07_026.ab1.Z", 0xbf989bf0) = -1 ENOENT (No such file or directory)
    12637 stat64("Platte1375_G07_026.ab1.bz2", 0xbf989bf0) = -1 ENOENT (No such file or directory)
    12637 stat64("Platte1375_G07_026.ab1", {st_mode=S_IFREG|0755, st_size=248624, …}) = 0
    12637 stat64("Platte1375_G07_026.ab1.gz", 0xbf989bf0) = -1 ENOENT (No such file or directory)
    12637 stat64("Platte1375_G07_026.ab1.bz2", 0xbf989bf0) = -1 ENOENT (No such file or directory)
    12637 stat64("Platte1375_G07_026.ab1.sz", 0xbf989bf0) = -1 ENOENT (No such file or directory)
    12637 stat64("Platte1375_G07_026.ab1.Z", 0xbf989bf0) = -1 ENOENT (No such file or directory)
    12637 stat64("Platte1375_G07_026.ab1.bz2", 0xbf989bf0) = -1 ENOENT (No such file or directory)
    12637 write(2, "'Platte1375_G07_026.ab1': couldn't open\n", 40) = 40

    st_size=248624 is the actual file size as I see it with ls -l.
    (The tr file is 890 kB, so I think I shouldn't include it here, but if it would help I can send it to you by email.)

    I have one more finding which might help here: Some time ago I wanted my apache webserver to access documents on the network shares and it failed. The solution was to put the directive "EnableSendfile Off" in the httpd.conf section for the respective directory.  See https://issues.apache.org/bugzilla/show_bug.cgi?id=42751 for details on this. I don't know if the Staden software makes use of something similar to sendfile or memory-mapping.

    Thanks for your help!

    Matthias
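
Whether memory-mapped reads behave differently on the share can be tested independently of Staden. The sketch below assumes POSIX mmap(). The function name mmap_probe is hypothetical, and the example says nothing about whether Staden itself uses sendfile or mmap.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of Staden or io_lib.
 * Maps the file read-only and stores its first byte in *first.
 * Returns 0 on success, 1 if open() fails, 2 if fstat() fails or
 * the file is empty, 3 if mmap() fails.
 */
int mmap_probe(const char *path, unsigned char *first)
{
    struct stat st;
    void *p;
    int fd;

    assert(path != NULL && first != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        fprintf(stderr, "open: %s\n", strerror(errno));
        return 1;
    }

    /* mmap() needs a non-zero length, so an empty file is reported too. */
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        fprintf(stderr, "fstat failed or file is empty\n");
        close(fd);
        return 2;
    }

    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        fprintf(stderr, "mmap: %s\n", strerror(errno));
        close(fd);
        return 3;
    }

    /* Touch the mapping so the data really is fetched from the share. */
    *first = *(const unsigned char *)p;

    munmap(p, (size_t)st.st_size);
    close(fd);
    return 0;
}
```

If this returns 0 for a file on the network drive, memory-mapping as such is unlikely to be the culprit.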

     
  • James Bonfield
    2010-04-23

    (Mutter mutter sourceforge just dropped my reply on the floor.)

    Thanks for the strace output, it's quite interesting. I also think
    that this page may be of help for you:

    http://lists.samba.org/archive/linux-cifs-client/2009-December/005445.html

    Your problem occurs in io_lib, so one of the smaller programs within
    that, such as extract_seq, would probably be a better one to
    debug. Can you verify that extract_seq also fails, and if so send me
    (jkb at sanger ac uk) a copy of the strace output please?

    The strace you quote above is interesting. My gut feeling is that the
    stat64 worked, but was somehow interpreted as a 32-bit stat struct
    (or vice versa) leading to incorrect interpretation. The code
    responsible for this is find_file_dir() in io_lib/open_trace_file.c
    and is_file() in io_lib/files.c:

    int is_file(char * fn)
    {
        struct stat buf;
        if ( stat(fn,&buf) ) return 0;
        return S_ISREG(buf.st_mode);
    }
    /* ... */
    static mFILE *find_file_dir(char *file, char *dirname) {
        char path[PATH_MAX+1], path2[PATH_MAX+1];
        size_t len = strlen(dirname);
        char *cp;
        if (dirname[len-1] == '/')
            len--;
        /* Special case for "./" or absolute filenames */
        if (*file == '/' || (len==1 && *dirname == '.'))
            sprintf(path, "%s", file);
        else
            sprintf(path, "%.*s/%s", (int)len, dirname, file);
        if (is_file(path)) {
            return mfopen(path, "rb");
        }
      /* ... */
    

    I don't know how familiar you are with coding or debugging, but I'm
    running out of ideas for testing this. If you're comfortable with
    debugging then I'd suggest rebuilding io_lib with CFLAGS=-g to enable
    debugging and then seeing what happens in the is_file function. The
    fact that the strace shows the stat occurs, the file exists, but then
    we do not try to open it implies that is_file thinks it's not a
    regular file (i.e. it's a directory or something more exotic). This is
    clearly not true, but it's why I'm starting to suspect a 32-bit vs
    64-bit issue.

    James
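
As quoted, is_file() returns 0 both when stat() itself fails and when the path is not a regular file, so the two cases cannot be told apart from its return value. A diagnostic variant that reports errno and the type sizes may help narrow this down without a debugger. This is only a sketch: the name is_file_verbose and the macro IS_FILE_VERBOSE_MAIN are hypothetical and not part of io_lib. One possibility worth ruling out on a 32-bit build is EOVERFLOW, which POSIX allows stat() to return when the file size, block count or file serial number does not fit in the structure.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical diagnostic variant of is_file(); not part of io_lib.
 * Returns 1 for a regular file, 0 for any other file type, and
 * -1 if stat() itself failed (errno is preserved for the caller).
 */
int is_file_verbose(const char *fn)
{
    struct stat buf;

    assert(fn != NULL);

    if (stat(fn, &buf) != 0) {
        int saved = errno; /* fprintf() may change errno */
        fprintf(stderr, "stat(\"%s\") failed: errno=%d (%s)\n",
                fn, saved, strerror(saved));
        errno = saved;
        return -1;
    }

    fprintf(stderr, "stat(\"%s\") ok: st_mode=%o st_size=%lld st_ino=%llu\n",
            fn, (unsigned int)buf.st_mode, (long long)buf.st_size,
            (unsigned long long)buf.st_ino);

    /* The sizes show whether this build uses 32-bit or 64-bit fields. */
    fprintf(stderr, "sizeof(struct stat)=%zu sizeof(off_t)=%zu sizeof(ino_t)=%zu\n",
            sizeof(struct stat), sizeof(off_t), sizeof(ino_t));

    return S_ISREG(buf.st_mode) ? 1 : 0;
}

#ifdef IS_FILE_VERBOSE_MAIN /* hypothetical macro: build with -DIS_FILE_VERBOSE_MAIN */
int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 64;
    }
    return is_file_verbose(argv[1]) == 1 ? 0 : 1;
}
#endif
```

Compiling it with the same compiler flags as io_lib and running it on a file on the cifs mount should show whether stat() fails outright or succeeds with an unexpected st_mode.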