Hi there,
The run_MaxBin.pl script completed the FragGeneScan step, but it failed when running the command below. I understand that a segmentation fault is likely memory-related and will depend on my server, but I was hoping you could advise on which part of the script uses the most memory. The machine has 128 GB of RAM, the .abund files produced by run_MaxBin are 40-50 MB each, and the FragGeneScan output is 1.1 GB, so I'd be surprised if it were a RAM issue. The machine is configured with limited space on the root partition, where /tmp/ stores files, so that could be another cause.
The command that breaks is as follows:
/home/roli/Software/MaxBin-2.2.1/src/MaxBin -fasta composite_Bin.contig.tmp -abund composite_Bin.contig.tmp.abund1 -abund2 composite_Bin.contig.tmp.abund2 -abund3 composite_Bin.contig.tmp.abund3 -abund4 composite_Bin.contig.tmp.abund4 -seed composite_Bin.seed -out composite_Bin -min_contig_length 1000 -thread 20
Loaded 100000 sequences
Loaded 200000 sequences
Loaded 300000 sequences
Loaded 400000 sequences
Loaded 500000 sequences
Loaded 600000 sequences
Loaded 700000 sequences
Loaded 800000 sequences
Loaded 900000 sequences
Loaded 1000000 sequences
Loaded 1100000 sequences
Segmentation fault
Dear Roland,
Thank you very much for trying MaxBin. I may need more information to resolve this issue. Would you mind letting me look at the first 20 lines of your sequence file and of one of your abundance files? Thank you.
Yu-Wei
My guess is that you used a delimiter other than tab (\t) in your abundance file, although I am not completely sure. In that case, though, the program should have aborted while it was processing the abundance files. Please let me know if this is not the case. I will be very happy to keep digging into the problem.
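A quick way to test the delimiter guess is to scan the abundance file for lines that are not "contig name, tab, number". The sketch below is only an illustration under that assumed layout; `find_bad_abundance_lines` is a hypothetical helper and not part of MaxBin, and the sample data is made up.

```python
import io

def find_bad_abundance_lines(handle, max_report=10):
    """Return (line_number, line) pairs that are not 'name<TAB>number'."""
    bad = []
    for number, raw in enumerate(handle, start=1):
        line = raw.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split("\t")
        ok = len(fields) == 2 and fields[0] != ""
        if ok:
            try:
                float(fields[1])  # the abundance must parse as a number
            except ValueError:
                ok = False
        if not ok:
            bad.append((number, line))
            if len(bad) >= max_report:
                break  # stop early; a few examples are enough to diagnose
    return bad

# Made-up example: the second line uses ':' instead of a tab.
sample = io.StringIO("contig_1\t12.5\ncontig_2:5.1\ncontig_3\t0\n")
bad_lines = find_bad_abundance_lines(sample)
print(bad_lines)
```

For a real file, pass `open("composite_Bin.contig.tmp.abund1")` in place of the in-memory sample.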
Hi Yu-Wei!
Thanks for your quick response. I've included the head of the contig file
(assembly.fa), the temporary output generated while run_MaxBin2 was
operating and a list of all the output files generated during the run.
I hope that helps get to the bottom of it! Don't hesitate to ask for any
additional information.
Roli
On Mon, May 8, 2017 at 12:35 PM, Yu-Wei Wu yuwwu@users.sf.net wrote:
I must admit that I do not know what caused your problem. I looked into your files but did not see any problems. Personally, I still suspect that the problem is in the abundance files. I have updated the program so that it will abort and print more messages if something is wrong with an abundance file. Could you please download the newest MaxBin package and run it again? I apologize for all the trouble you have encountered.
Thanks,
Yu-Wei
Hi Yu-Wei,
I have updated MaxBin2 and re-run the command. I received the following
error:
"Failed to get Abundance information for contig
[k141_3129010|uncontaminated.1001470] in file
[metabat/maxbin2/composite_Bin.contig.tmp.abund1]
Error encountered while running core MaxBin program. Error recorded in
metabat/maxbin2/composite_Bin.log.
Program Stop."
I will look into why there is no abundance information and correct the
problem. Unfortunately there is nothing written to the log file (is that
supposed to be the case?). Also, as a side note: I recommend changing the
error message: "Cannot write into specified output file/directory. Please
check your settings or disk space." to "Cannot write into specified output
file/directory. Please check your directory exists."
I received this error because I had mistyped the full directory path. I was
super confused looking for 'settings', and I had more than sufficient disk
space. Finally, I noticed the typo in the path to the output directory.
Also, why not have MaxBin2 create the output directory? That would prevent
any confusion altogether.
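Until something like that exists, a wrapper can create the directory before calling run_MaxBin.pl. This is a minimal sketch, assuming `-out` takes a path prefix as in the commands in this thread; `ensure_output_dir` is a hypothetical helper, not part of MaxBin, and the demo writes only to a temporary directory.

```python
import os
import tempfile

def ensure_output_dir(out_prefix):
    """Create the directory part of an output prefix if it is missing."""
    out_dir = os.path.dirname(os.path.abspath(out_prefix))
    os.makedirs(out_dir, exist_ok=True)  # no error if it already exists
    return out_dir

# Demo under a throwaway temporary directory.
base = tempfile.mkdtemp()
prefix = os.path.join(base, "maxbin2", "composite_Bin")
created = ensure_output_dir(prefix)
print(os.path.isdir(created))
```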
Thanks,
Roli
On Fri, May 12, 2017 at 7:39 AM, Yu-Wei Wu yuwwu@users.sf.net wrote:
I sincerely thank you for your valuable suggestions and for describing your difficulties to me. I have revised the code so that it prints exactly where the format goes wrong. I made some mock abundance files with wrong formats and successfully got the message. For your reference, the message will look like this (for some unknown reason, some of the following lines are mis-formatted by sourceforge.com; please ignore the formatting):
(omitted)
Failed to get abudance information from input file [low.out.contig.tmp.abund1].
The error lies in the following line:
===
k141_1809|NODE_9891_length_132_cov_1.931818:5.11827956989247
===
Please check your input abundance files.
Please see if this will possibly work for you. Again, I apologize for all difficulties you encountered.
As for your other comment about creating a folder, that is indeed another option for outputting files. For legacy reasons I did not consider it at the beginning of the program's development, but I will think seriously about changing the output format to a folder (or making it an option).
Thank you again for the time you spent helping me improve MaxBin!
Sincerely,
Yu-Wei
Hi Yu-Wei,
I'm happy to help, but now that we've clarified the error, I'm still not
sure how to solve it. The error reads:
Failed to get Abundance information for contig
[k141_3129010|uncontaminated.1001470] in file
[/home/roli/MAGs/maxbin2.foo/.contig.tmp.abund1]
The problem contig is found in the following MaxBin2 files:
.contig.tmp:>k141_3129010|uncontaminated.1001470
.contig.tmp.frag.faa:>k141_3129010|uncontaminated.1001470_1_1199_-
.contig.tmp.frag.faa:>k141_3129010|uncontaminated.1001470_1260_2321_-
.contig.tmp.frag.faa:>k141_3129010|uncontaminated.1001470_2461_3099_-
.contig.tmp.frag.faa:>k141_3129010|uncontaminated.1001470_3138_3572_-
.contig.tmp.frag.faa:>k141_3129010|uncontaminated.1001470_3637_4617_-
.contig.tmp.frag.faa:>k141_3129010|uncontaminated.1001470_4618_5631_-
.contig.tmp.frag.faa:>k141_3129010|uncontaminated.1001470_5669_6029_-
The problem contig is found in my contig file (fasta) and my input and all
of the bam files I feed for abundance calculations.
Moving forward, I could just remove that specific contig from all of these
files (which would be a pain), but I'm not confident that will solve the
problem. Do you have any suggestions?
Thanks,
Roli
On Mon, May 15, 2017 at 5:27 AM, Yu-Wei Wu yuwwu@users.sf.net wrote:
(I don't know what went wrong, but this is the third time I have sent this message via the ticket system. Sorry for the delay the system caused.)
I tried to find a universal solution for your problem but could not. Judging from your error message, the problem arose from the format of the abundance files, in which at least one contig has no abundance information. MaxBin requires that every contig have an abundance (coverage) value, and that the value be set to zero if no reads map to a contig at all. Could you please identify whether any contigs lack abundance values? Thanks.
Another option is to input the reads instead of generating the abundance files yourself. In that case MaxBin will automatically determine the abundances from the reads. Please see if this option fits your needs.
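To check the first point, the contig names in the FASTA file can be compared against the names in an abundance file, with a zero filled in wherever a contig is missing. This is a rough sketch only: it assumes the full header text after ">" is the contig name and that abundance lines are well-formed "name, tab, value" pairs. The helper names and sample data are hypothetical.

```python
import io

def read_contig_names(fasta_handle):
    """Collect FASTA header names (text after '>'), in file order."""
    return [line[1:].strip() for line in fasta_handle if line.startswith(">")]

def read_abundance(abund_handle):
    """Map contig name -> abundance string; assumes 'name<TAB>value' lines."""
    table = {}
    for line in abund_handle:
        line = line.rstrip("\r\n")
        if line:
            name, value = line.split("\t")
            table[name] = value
    return table

def fill_missing(contigs, abundance):
    """Return the contigs lacking a value, plus rows with zeros filled in."""
    missing = [c for c in contigs if c not in abundance]
    rows = [f"{c}\t{abundance.get(c, '0')}" for c in contigs]
    return missing, rows

# Made-up example: contig c2 has no abundance entry.
fasta = io.StringIO(">c1\nACGT\n>c2\nGGCC\n>c3\nTTAA\n")
abund = io.StringIO("c1\t3.5\nc3\t1.0\n")
missing, rows = fill_missing(read_contig_names(fasta), read_abundance(abund))
print(missing)
print("\n".join(rows))
```

Writing `rows` to a new file, one per line, gives an abundance file in which every contig has a value.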
No problem, Yu-Wei!
Thanks for persisting with your assistance. I will go with the option to
let MaxBin identify abundances.
Roli
On Fri, May 19, 2017 at 10:53 PM, Yu-Wei Wu yuwwu@users.sf.net wrote:
Hello,
I am also having problems running MaxBin2.
This is the error I get:
...
Loaded 100000 sequences
Loaded 200000 sequences
Error encountered while running core MaxBin program. Error recorded in /home/administrator/storage/pipelines_mcgr/das_tool/binning_outputs/maxbin/maxbin_HPmin_bins.log.
Program Stop.
This is my run command:
/home/administrator/storage/pipelines_mcgr/das_tool/maxbin2/MaxBin-2.2.4/run_MaxBin.pl -contig /home/administrator/storage/pipelines_mcgr/sra_mm/assembly/HPmin_scaffolds.fa -abund /home/administrator/storage/pipelines_mcgr/sra_mm/mapping_reads/HPmin_1_2_sam -thread 6 -out /home/administrator/storage/pipelines_mcgr/das_tool/binning_outputs/maxbin_HPmin_bins
My abund file:
$ head HPmin_1_2_sam
@HD VN:1.0 SO:unsorted
@SQ SN:scaffold_0 LN:158399
@SQ SN:scaffold_1 LN:152211
@SQ SN:scaffold_2 LN:152017
@SQ SN:scaffold_3 LN:127878
@SQ SN:scaffold_4 LN:118669
@SQ SN:scaffold_5 LN:116318
@SQ SN:scaffold_6 LN:114647
@SQ SN:scaffold_7 LN:113072
@SQ SN:scaffold_8 LN:106563
My contig file is a simple FASTA file with scaffolds generated using IDBA-UD.
Please, it would be really nice to use this tool in my work and I need help to proceed as I really can't find any similar error anywhere. Any help would be appreciated.
Thanks,
Rodolfo
Dear Rodolfo,
Thank you for trying MaxBin. Your problem is most likely caused by the abundance file format, which is described in the README file. MaxBin does not currently support SAM files as input. Please consider inputting the reads directly into MaxBin so that it can compute the abundance information automatically. Let me know if you need more information.
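If the reads are not at hand, a rough abundance table can in principle be derived from the SAM file itself. The sketch below is an approximation, not MaxBin's own method: it estimates coverage as the summed length of primary mapped reads divided by the reference length from the @SQ header, ignores CIGAR clipping, and assumes a standard tab-separated SAM file. `sam_to_abundance` is a hypothetical helper and the records are made up.

```python
import io

def sam_to_abundance(sam_handle):
    """Estimate per-reference coverage as aligned read bases / reference length."""
    lengths = {}  # reference name -> length from @SQ header lines
    bases = {}    # reference name -> summed length of primary mapped reads
    for line in sam_handle:
        fields = line.rstrip("\r\n").split("\t")
        if line.startswith("@"):
            if fields[0] == "@SQ":
                tags = dict(f.split(":", 1) for f in fields[1:])
                lengths[tags["SN"]] = int(tags["LN"])
                bases.setdefault(tags["SN"], 0)
            continue
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag, ref, seq = int(fields[1]), fields[2], fields[9]
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
        if flag & 0x904 or ref not in lengths or seq == "*":
            continue
        bases[ref] += len(seq)
    # References with no mapped reads come out as 0.0.
    return {ref: bases[ref] / lengths[ref] for ref in lengths}

# Made-up example: two reads on scaffold_0, one unmapped read, none on scaffold_1.
sam = io.StringIO(
    "@HD\tVN:1.0\tSO:unsorted\n"
    "@SQ\tSN:scaffold_0\tLN:10\n"
    "@SQ\tSN:scaffold_1\tLN:20\n"
    "r1\t0\tscaffold_0\t1\t60\t5M\t*\t0\t0\tACGTA\tIIIII\n"
    "r2\t16\tscaffold_0\t3\t60\t5M\t*\t0\t0\tACGTA\tIIIII\n"
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII\n"
)
coverage = sam_to_abundance(sam)
for name, value in coverage.items():
    print(f"{name}\t{value}")
```

The printed lines follow the tab-separated "name, value" layout discussed earlier in this thread; whether the values are close enough to what MaxBin computes from reads would need checking against the README.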
Thanks,
Yu-Wei