From: Alec W. <al...@br...> - 2013-10-21 15:12:42
Hi Peter,

When MarkDuplicates is accumulating information about mate pairs, it writes that information to temporary files, one per reference sequence. If the reference has many sequences, there may be many temporary files. MAX_FILE_HANDLES_FOR_READ_ENDS_MAP controls how many of these files may be open at once. If this number is low, the program has to close and re-open files more frequently. This slows execution, but the results should not change.

-Alec

On Oct 17, 2013, at 2:35 AM, Peter Johansson <tr...@gm...> wrote:

> Hello,
>
> We've experienced some crashes in our pipeline recently with the error message:
> "/etc/profile: fork: retry: Resource temporarily unavailable". I am
> quite convinced this is due to too many open files and that the files
> are opened by MarkDuplicates. I suppose I should ask my admin to
> increase the limit on open files, but before doing so I wanted to ask
> here whether anyone has experience setting the number of open files with
> the option 'MAX_FILE_HANDLES_FOR_READ_ENDS_MAP' and how it affects
> performance. What is the advantage of having many FDs, and what is the
> role of each of them? Writing reads for each chromosome, or pair of
> chromosomes, or...? Any input that can help me understand is greatly
> appreciated.
>
> Thanks,
> Peter Johansson
>
> _______________________________________________
> Samtools-help mailing list
> Sam...@li...
> https://lists.sourceforge.net/lists/listinfo/samtools-help
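
The trade-off Alec describes (a lower handle limit means more closing and re-opening, but the same output) can be illustrated with a small Python sketch. This is not Picard's code. It is a generic example that caps the number of open file handles and closes the least recently used one when the cap is reached. The class name `BoundedAppendFiles`, the helper `run`, and the per-key `.tmp` file naming are all hypothetical and chosen only for illustration.

```python
import os
import tempfile
from collections import OrderedDict


class BoundedAppendFiles:
    """Append to many files while keeping at most max_open handles open.

    Hypothetical illustration only; max_open must be at least 1.
    """

    def __init__(self, directory, max_open):
        self.directory = directory
        self.max_open = max_open
        self.handles = OrderedDict()  # key -> open file, least recently used first
        self.reopen_count = 0         # times a previously closed file was reopened
        self._seen = set()            # keys whose file has been opened at least once

    def path(self, key):
        return os.path.join(self.directory, f"{key}.tmp")

    def append(self, key, line):
        handle = self.handles.pop(key, None)
        if handle is None:
            # At the limit: close the least recently used handle first.
            if len(self.handles) >= self.max_open:
                _, oldest = self.handles.popitem(last=False)
                oldest.close()
            if key in self._seen:
                self.reopen_count += 1
            self._seen.add(key)
            handle = open(self.path(key), "a")
        self.handles[key] = handle  # re-insert as most recently used
        handle.write(line + "\n")

    def close(self):
        for handle in self.handles.values():
            handle.close()
        self.handles.clear()


def run(max_open, keys):
    """Write one record per key occurrence; return (file contents, reopen count)."""
    with tempfile.TemporaryDirectory() as tmp:
        files = BoundedAppendFiles(tmp, max_open)
        for i, key in enumerate(keys):
            files.append(key, f"record {i}")
        files.close()
        contents = {}
        for key in set(keys):
            with open(files.path(key)) as fh:
                contents[key] = fh.read()
        return contents, files.reopen_count


if __name__ == "__main__":
    keys = ["chr1", "chr2", "chr3"] * 4
    low_contents, low_reopens = run(1, keys)
    high_contents, high_reopens = run(3, keys)
    print("same output:", low_contents == high_contents)
    print("reopens with limit 1:", low_reopens)
    print("reopens with limit 3:", high_reopens)
```

In this sketch, a limit of 1 forces a close and re-open on almost every write, while a limit of 3 keeps all three files open for the whole run. The file contents are identical either way, so only the amount of open/close work differs.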