Looks like in the latest git code it is "ab" instead of "wb+".  Could you test this out?

Nils

On Mon, Feb 7, 2011 at 8:07 AM, Jonathan Hoser <jonathan.hoser@helmholtz-muenchen.de> wrote:
Hi guys,

this is not entirely on topic, but my guess is that it is at, or close to, the root cause:

I just spent a few hours debugging before I resolved my issue: BFAST wouldn't create a gzipped temp file;
e.g. in the match step, it tries to write the brg file to such a temp file.

My uname -a
###########################
Linux xxx 2.6.35.6-45.fc14.x86_64 #1 SMP Mon Oct 18 23:57:44 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
############################
>yum info zlib
############################
Loaded plugins: langpacks, presto, refresh-packagekit, rhnplugin
Adding en_US to language list
*Note* Red Hat Network repositories are not listed below. You must run this command as root to access RHN repositories.
Installed Packages
Name        : zlib
Arch        : i686
Version     : 1.2.5
Release     : 2.fc14
Size        : 163 k
Repo        : installed

Name        : zlib
Arch        : x86_64
Version     : 1.2.5
Release     : 2.fc14
Size        : 173 k
Repo        : installed
############################


It turns out that a single '+' was the culprit in bfast/BLib.c:

*** bfast/Blib.c.orig    2011-02-07 16:54:01.521542982 +0100
--- bfast/BLib.c    2011-02-07 16:54:12.110478505 +0100
***************
*** 547,553 ****
      /* Copy over the tmp name */
      strcat((*tmpFileName), BFAST_TMP_TEMPLATE);
     
!     if(-1 == (fd = mkstemp((*tmpFileName))) || NULL == (fp = gzdopen(fd, "wb+"))) {
          /* Check if the fd was open */
          if(-1 != fd) {
              /* Remove the file and close */
--- 547,553 ----
      /* Copy over the tmp name */
      strcat((*tmpFileName), BFAST_TMP_TEMPLATE);
     
!     if(-1 == (fd = mkstemp((*tmpFileName))) || NULL == (fp = gzdopen(fd, "wb"))) {
          /* Check if the fd was open */
          if(-1 != fd) {
              /* Remove the file and close */

As the zlib documentation (zlib.net/manual.html) says:

ZEXTERN gzFile ZEXPORT gzopen OF((const char *path, const char *mode));
Opens a gzip (.gz) file for reading or writing. The mode parameter is as in fopen ("rb" or "wb") but can also include a compression level ("wb9") or a strategy: 'f' for filtered data as in "wb6f", 'h' for Huffman-only compression as in "wb1h", 'R' for run-length encoding as in "wb1R", or 'F' for fixed code compression as in "wb9F". (See the description of deflateInit2 for more information about the strategy parameter.) Also "a" can be used instead of "w" to request that the gzip stream that will be written be appended to the file. "+" will result in an error, since reading and writing to the same gzip file is not supported.

gzopen can be used to read a file which is not in gzip format; in this case gzread will directly read from the file without decompression.

gzopen returns NULL if the file could not be opened, if there was insufficient memory to allocate the gzFile state, or if an invalid mode was specified (an 'r', 'w', or 'a' was not provided, or '+' was provided). errno can be checked to determine if the reason gzopen failed was that the file could not be opened.


ZEXTERN gzFile ZEXPORT gzdopen OF((int fd, const char *mode));
gzdopen() associates a gzFile with the file descriptor fd. File descriptors are obtained from calls like open, dup, creat, pipe or fileno (if the file has been previously opened with fopen). The mode parameter is as in gzopen. The next call of gzclose on the returned gzFile will also close the file descriptor fd, just like fclose(fdopen(fd, mode)) closes the file descriptor fd. If you want to keep fd open, use fd = dup(fd_keep); gz = gzdopen(fd, mode);. The duplicated descriptor should be saved to avoid a leak, since gzdopen does not close fd if it fails.

gzdopen returns NULL if there was insufficient memory to allocate the gzFile state, if an invalid mode was specified (an 'r', 'w', or 'a' was not provided, or '+' was provided), or if fd is -1. The file descriptor is not used until the next gz* read, write, seek, or close operation, so gzdopen will not detect if fd is invalid (unless fd is -1).

So, removing the + makes bfast work on current Fedora 14 systems.
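
To make the mode rule quoted above concrete, here is a minimal sketch of a pre-check that could run before calling gzdopen. The function name gz_mode_looks_valid is hypothetical; it is not part of zlib or BFAST. It only mirrors the documented rule (at least one of 'r', 'w' or 'a', and no '+'); zlib does its own parsing and may reject a mode for other reasons.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of zlib or BFAST): returns true when `mode`
 * satisfies the rule quoted from the zlib manual, i.e. it names at least one
 * of 'r', 'w' or 'a' and contains no '+'. */
bool gz_mode_looks_valid(const char *mode)
{
    if (mode == NULL) {
        return false;
    }
    if (strchr(mode, '+') != NULL) {
        return false; /* "wb+", "rb+", ...: documented as an error */
    }
    return strpbrk(mode, "rwa") != NULL;
}
```

With this check, "wb" and "ab" pass while "wb+" is rejected, so a caller could print a clear message instead of only seeing a NULL return from gzdopen.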

Sorry for the lengthy email, but I wanted to add an explanation,

Best
-Jonathan






On 02/03/2011 05:37 PM, Brent Pedersen wrote:
yes, and hg19 works for me on other machines. but i think something
about the fseek is not portable to all systems.

On Thu, Feb 3, 2011 at 9:06 AM, Jonathan Hoser
But then again, I'm running BFAST 0.6.4e,
indexing a merger of a small genome and hg19 in colorspace, with all the
'usual' masks as recommended.

I'm running Fedora 14, 64bit, my parameters:

bfast index -f $REF -w 14 -A 1 -n 32 -m $MASK -t -i $MASKID

It's terribly slow even using local disks (6.5h, using 32 threads,
reading from local SATA RAID0), but it's working.

Best
-Jonathan

On 02/03/2011 04:48 PM, Brent Pedersen wrote:
just to clarify, on the same cluster as nathan, i had the same errors
but was able to get it to run with the minimal changes (found by
nathan) of converting any fseek(FP, 0, SEEK_SET) to:

fclose(FP);
FP = fopen(FileName, "rb");

diff here:
https://gist.github.com/809634

some googling shows that even on a 64bit system with _FILE_OFFSET_BITS
64, there are problems with fseek and ftell on files over 2gb.
changing the fseek calls to fseeko didn't help.
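
For anyone who wants to check whether their system handles offsets beyond 2 GB at all, here is a small diagnostic sketch. The function name seek_and_tell is made up for illustration, and this is not a fix for the problem above (fseeko did not help on our cluster); it only shows whether fseeko/ftello round-trip a large offset.

```c
#define _FILE_OFFSET_BITS 64    /* must come before any #include to take effect */
#define _POSIX_C_SOURCE 200809L /* declares fseeko/ftello under strict -std modes */
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical diagnostic: seek to `target` in an empty temporary file and
 * return the position that ftello() reports, or -1 on any failure. If off_t
 * is too narrow for `target`, the result will not equal `target`. */
long long seek_and_tell(long long target)
{
    FILE *fp = tmpfile();
    if (fp == NULL) {
        return -1;
    }
    long long pos = -1;
    if (fseeko(fp, (off_t)target, SEEK_SET) == 0) {
        pos = (long long)ftello(fp);
    }
    fclose(fp);
    return pos;
}
```

Calling seek_and_tell(3000000000LL) and comparing the result with the argument tells you whether a 3 GB offset survives; nothing is written, so no disk space is needed.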

-brentp

On Wed, Feb 2, 2011 at 3:45 PM, Nils Homer<nilshomer@gmail.com>  wrote:
So even with your changes it is still not working?  Sorry, I am a bit
confused,

Nils

On Wed, Feb 2, 2011 at 11:42 AM, Kummer, Nathan<KummerN@njhealth.org>
wrote:
The problem occurs after performing the "fread" operation following the
"fseek" operation.
The fseek operation reports success, and it does not set the errno, but
the file stream is no longer readable.

I am including a diff of RGIndex.c between the original version (no changes
made between 0.6.4e and dev) and the version that I have compiled and
executed to identify additional error codes and debugging information.  In
case the diff is difficult to read, here is a cleaned-up list
of the commands that I have added:
        fflush(tmpLowerFP);
        fclose(tmpLowerFP);
        tmpLowerFP = fopen(tmpLowerFileName, "rb+");
        if(tmpLowerFP == NULL) {
            fprintf(stderr, "Failed to open file:  %s\n", tmpLowerFileName);
        }
These changes now make tmpLowerFP available for reading.  I did not
make any changes to the input stream for the tmp upper file, so the code
advances to tmp upper and fails on that file.  I am going to make the above
change for the tmp upper file as well and re-run.  The original 0.6.4e
version works well when I process a single chromosome or when I concatenate
some chromosomes together (22 and Y) and execute the same command.  It just
gives me what appears to be an EPERM error when I try to process the entire
human genome.
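
The close-and-reopen workaround could be wrapped in one helper so that it is applied the same way to both temp files. This is only a sketch; reopen_for_reading is a hypothetical name, not a function in BFAST. It reopens with "rb" on the assumption that the stream is only read afterwards; use "rb+" instead if later writes are needed.

```c
#include <stdio.h>

/* Hypothetical helper: close *fp (fclose flushes buffered output itself),
 * then reopen `path` read-only so the next fread() starts at offset 0.
 * Returns 0 on success and -1 on failure; on failure *fp is NULL. */
int reopen_for_reading(FILE **fp, const char *path)
{
    if (*fp != NULL) {
        int rc = fclose(*fp);
        *fp = NULL;
        if (rc != 0) {
            perror("fclose"); /* buffered data may not have reached the disk */
            return -1;
        }
    }
    *fp = fopen(path, "rb");
    if (*fp == NULL) {
        perror(path);
        return -1;
    }
    return 0;
}
```

A call such as reopen_for_reading(&tmpLowerFP, tmpLowerFileName) would then take the place of the fseek(tmpLowerFP, 0, SEEK_SET) line, and likewise for the tmp upper file.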

[nkummer@genetics-compute bfast]$ diff bfast-0.6.4e/bfast/RGIndex.c bfast-dev/bfast/bfast/RGIndex.c
1357a1358
>       fflush(tmpLowerFP);
1359,1360c1360,1371
<         fseek(tmpLowerFP, 0 , SEEK_SET);
<         fseek(tmpUpperFP, 0 , SEEK_SET);
---
>       fprintf(stderr, "\ntmpLowerFP before fseek ferror():  %i\n", ferror(tmpLowerFP));
>       fprintf(stderr, "tmpUpperFP before fseek ferror():  %i\n", ferror(tmpUpperFP));
>       fprintf(stderr, "fseek(tmpLowerFP, 0, SEEK_SET):  %i\n", fseek(tmpLowerFP, 0, SEEK_SET));
>       fprintf(stderr, "tmpLowerFP after fseek ferror():  %i\n", ferror(tmpLowerFP));
>       fprintf(stderr, "fseek(tmpUpperFP, 0, SEEK_SET):  %i\n", fseek(tmpUpperFP, 0, SEEK_SET));
>       fprintf(stderr, "tmpUpperFP after fseek ferror():  %i\n", ferror(tmpUpperFP));
>
>       fclose(tmpLowerFP);
>       tmpLowerFP = fopen(tmpLowerFileName, "rb+");
>       if(tmpLowerFP == NULL) {
>               fprintf(stderr, "Failed to open file:  %s\n", tmpLowerFileName);
>       }
1364,1367c1375,1388
<
<         if(1!=fread(&tmpLowerPosition, sizeof(uint32_t), 1, tmpLowerFP) ||
<                         1!=fread(&tmpLowerContig_8, sizeof(uint8_t), 1, tmpLowerFP)) {
<                 PrintError(FnName, NULL, "Could not open file: " + tmpLowerFileName, Exit, ReadFileError);
---
>       int freadLowerPosition = fread(&tmpLowerPosition, sizeof(uint32_t), 1, tmpLowerFP);
>       int feof1 = feof(tmpLowerFP);
>       int ferror1 = ferror(tmpLowerFP);
>       int freadLowerContig = fread(&tmpLowerContig_8, sizeof(uint8_t), 1, tmpLowerFP);
>       int feof2 = feof(tmpLowerFP);
>       int ferror2 = ferror(tmpLowerFP);
>       if(1!=freadLowerPosition ||
>                       1!=freadLowerContig) {
>               fprintf(stderr, "FReadLowerPosition:  %i %s\n", freadLowerPosition, tmpLowerFileName);
>               fprintf(stderr, "FReadLowerContig:  %i %s\n", freadLowerContig, tmpLowerFileName);
>               fprintf(stderr, "\tfeof():  %i\n", feof1);
>               fprintf(stderr, "\tferror():  %i\n", ferror1);
>               fprintf(stderr, "\tfeof():  %i\n", feof2);
>               fprintf(stderr, "\tferror():  %i\n", ferror2);
Here is the output produced by my changed file:
Sorting by thread...
100.000 percent complete
tmpLowerFP before fseek ferror():  0
tmpUpperFP before fseek ferror():  0
fseek(tmpLowerFP, 0, SEEK_SET):  0
tmpLowerFP after fseek ferror():  0
fseek(tmpUpperFP, 0, SEEK_SET):  0
tmpUpperFP after fseek ferror():  0
************************************************************
In function "RGIndexMergeHelperFromDiskContig_8": Fatal
Error[ReadFileError]. Message: Could not read in tmp upper.
The file stream error was:: Bad file descriptor
  ***** Exiting due to errors *****
************************************************************
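
Since the failure only shows up at the fread, it may help to separate "end of file" from "stream error" at each read. The following is a sketch only; read_exact and the read_status enum are hypothetical names, not BFAST code. Note that errno is just a hint here, because ISO C does not require fread to set it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

enum read_status { READ_OK = 0, READ_EOF = 1, READ_ERROR = 2 };

/* Hypothetical helper: read exactly `size` bytes (size must be nonzero) into
 * `buf` and classify a short read as end-of-file or as a stream error.
 * `name` is only used in the diagnostic message. */
enum read_status read_exact(void *buf, size_t size, FILE *fp, const char *name)
{
    errno = 0;
    if (fread(buf, size, 1, fp) == 1) {
        return READ_OK;
    }
    if (ferror(fp)) {
        fprintf(stderr, "%s: read error: %s\n", name,
                errno != 0 ? strerror(errno) : "unknown");
        return READ_ERROR;
    }
    fprintf(stderr, "%s: unexpected end of file\n", name);
    return READ_EOF;
}
```

For example, read_exact(&tmpLowerPosition, sizeof(uint32_t), tmpLowerFP, tmpLowerFileName) would report which file failed and whether the stream hit EOF or an error.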


Nathan

NOTICE: This email message is for the sole use of the intended
recipient(s) and may contain confidential and privileged information. Any
unauthorized review, use, disclosure or distribution is prohibited. If you
are not the intended recipient, please contact the sender by reply email and
destroy all copies of the original message.


------------------------------------------------------------------------------
Special Offer-- Download ArcSight Logger for FREE (a $49 USD value)!
Finally, a world-class log management solution at an even better
price-free!
Download using promo code Free_Logger_4_Dev2Dev. Offer expires
February 28th, so secure your free ArcSight Logger TODAY!
http://p.sf.net/sfu/arcsight-sfd2d
_______________________________________________
Bfast-help mailing list
Bfast-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bfast-help



-- 
Jonathan Hoser
Institute of Bioinformatics and System Biology
Phone: +49-89-3187-4556
Fax: +49-89-3187-3585
Email: jonathan.hoser@helmholtz-muenchen.de
WWW: http://mips.helmholtz-muenchen.de
Helmholtz Zentrum München
German Research Center for Environmental Health (GmbH)
Ingolstaedter Landstr. 1
D-85764 Neuherberg, Germany
www.helmholtz-muenchen.de
Chairman of Supervisory Board: MinDir'in Bärbel Brumme-Bothe
Board of Directors: Prof. Dr. Günther Wess and Dr. Nikolaus Blum
Register of Societies: Amtsgericht München HRB 6466

------------------------------------------------------------------------------
The modern datacenter depends on network connectivity to access resources
and provide services. The best practices for maximizing a physical server's
connectivity to a physical network are well understood - see how these
rules translate into the virtual world?
http://p.sf.net/sfu/oracle-sfdevnlfb
_______________________________________________
Bfast-devel mailing list
Bfast-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bfast-devel