
#6 MaxBin doesn't find enough marker genes

Milestone: 2.0
Status: open
Owner: nobody
Labels: None
Updated: 2022-09-22
Created: 2019-09-10
Creator: Javier
Private: No

Hello

I have run into a problem when binning a large dataset with 1.5 M contigs. The MaxBin .log file says:

Searching against 107 marker genes to find starting seed contigs for [/media/mcm/jtamames/polares/Arctic/arctic_seqmerge_500/temp/bincontigs.fasta]...
Try harder to dig out marker genes from contigs.
Marker gene search reveals that the dataset cannot be binned (the medium of marker gene number <= 1). Program stop.

Even after discounting the contigs in the .tooshort file (1.2 M), there are still 200 K long contigs to work with. I cannot believe there are not enough markers on these. Any help would be much appreciated!

Best,

Javier
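
One quick sanity check before rerunning is to count how many contigs survive a given length cutoff. The following is a minimal sketch in plain Python, assuming an uncompressed FASTA file read line by line; the function names are hypothetical helpers and not part of MaxBin.

```python
def contig_lengths(fasta_lines):
    """Yield (name, length) for each record in an iterable of FASTA lines."""
    name, length = None, 0
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, length
            parts = line[1:].split()
            name = parts[0] if parts else ""  # first token of the header
            length = 0
        else:
            length += len(line)  # sequence may be wrapped over several lines
    if name is not None:
        yield name, length


def count_by_cutoff(fasta_lines, cutoffs=(200, 1000)):
    """Return {cutoff: number of contigs with length >= cutoff}."""
    lengths = [n for _, n in contig_lengths(fasta_lines)]
    return {c: sum(1 for n in lengths if n >= c) for c in cutoffs}
```

To use it on a real file, pass an open file handle, for example `count_by_cutoff(open("bincontigs.fasta"))`, and compare the counts with the size of the .tooshort file.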

Discussion

  • dgg32

    dgg32 - 2019-09-13

    Same problem. I have a 4.2 Mb contig fasta.

    I have dug a little:

    MaxBin uses a set of 106 marker genes, and according to MaxBin none of them could be identified in my contigs. However, when I run CheckM directly on the contig FASTA, it identifies 568 single-copy genes. The two marker sets share 16 genes, which means these 16 genes are detected by CheckM but not by MaxBin.

    So perhaps MaxBin occasionally has trouble identifying marker genes? I have only encountered this problem once in my many MaxBin runs.

    Greetings!

     

    Last edit: dgg32 2019-10-18
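
The comparison described in the comment above boils down to set arithmetic on marker identifiers. Here is a minimal sketch, assuming the marker IDs used by each tool and the IDs each tool reported as found have already been extracted into plain lists; the function name and every ID in the usage note are made-up placeholders, not real MaxBin or CheckM output.

```python
def compare_marker_hits(maxbin_set, checkm_set, maxbin_found, checkm_found):
    """Compare detections restricted to markers that both tools look for."""
    shared = set(maxbin_set) & set(checkm_set)
    maxbin_found = set(maxbin_found)
    checkm_found = set(checkm_found)
    return {
        # markers present in both tools' marker sets
        "shared": shared,
        # shared markers that CheckM detected but MaxBin did not
        "checkm_only": (shared & checkm_found) - maxbin_found,
        # shared markers that both tools detected
        "both": shared & checkm_found & maxbin_found,
    }
```

For example, `compare_marker_hits(["m1", "m2", "m3"], ["m2", "m3", "m4"], [], ["m2", "m4"])` reports `m2` under `checkm_only`. A non-empty `checkm_only` set is the situation described in the comment.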
  • dgg32

    dgg32 - 2019-10-18

    After another look, it is now clear to me that MaxBin has a default min_contig_length of 1000. In my case, I set it to 200 and now I get my genome at 99% completeness.

    @Javier, please try run_MaxBin.pl -min_contig_length 200.
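
For scripted reruns, the suggested option can be assembled into a full command line. This is a sketch only: `-min_contig_length` comes from the comment above, while `-contig`, `-abund` and `-out` are the usual MaxBin 2 inputs as far as I know and should be checked against the help text of your installed `run_MaxBin.pl`. The file names are hypothetical placeholders.

```python
import shlex


def maxbin_command(contigs, abund, out_prefix, min_contig_length=1000):
    """Build (but do not run) an argument list for run_MaxBin.pl."""
    if min_contig_length < 1:
        raise ValueError("min_contig_length must be a positive integer")
    return [
        "run_MaxBin.pl",
        "-contig", contigs,          # assumed flag: input contig FASTA
        "-abund", abund,             # assumed flag: contig abundance file
        "-out", out_prefix,          # assumed flag: output file prefix
        "-min_contig_length", str(min_contig_length),
    ]


# Placeholder file names, for illustration only.
cmd = maxbin_command("contigs.fasta", "abundance.txt", "mybins",
                     min_contig_length=200)
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files. Building the list first makes it easy to log the exact command used for each rerun.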

     
  • George Kitundu

    George Kitundu - 2022-09-22

    Hi, I am getting the same problem even after setting the contig length to 200.

     
