From: Adam P. <aph...@gm...> - 2014-05-22 18:49:55
Hi Davide,

dnadiff was primarily designed for microbial genome comparison and currently does not scale well to large genomes. The 'delta-filter' step is certainly one of the major bottlenecks. delta-filter scales with the number of matches it has to analyze, so you can speed things along by reducing the total number of matches. A few ways to do this:

1. Run nucmer in mum-reference mode to ignore repetitive alignment seeds.
2. Increase the minimum match length and minimum cluster length (this will reduce sensitivity to low-identity alignments).
3. Run delta-filter with the -l and -i options to filter alignments by length and identity (these filters are quick compared to -1/-m/-r/-q, which all require a dynamic programming step).

Once you have a filtered delta file using the above recommendations, you can pass it directly to dnadiff using the -d option, and it will skip the alignment phase and process your delta file directly, hopefully faster than before.

Best,
-Adam

On Tue, May 20, 2014 at 1:34 AM, Davide VERZOTTO (GIS) <ver...@gi...> wrote:

> Dear MUMmer users,
>
> We are trying to apply dnadiff for the analysis of breakpoints between our
> de novo Human genome assembly and the Reference genome, the latter divided
> into multiple chromosomes / separate files.
>
> We have already computed a NUCmer comparison between the two assemblies
> and the related delta file. After this, we tried to compare all our
> scaffolds versus hg19 chromosome 1 using dnadiff, and the tool lasted more
> than 12 days (1 single core used, peak of 24 Gb RAM) before crashing (for
> internal server reasons), without writing any temporary file (apart from
> the log line "Filtering alignments") and presumably just trying to run
> "delta-filter -1". Did you already face this problem? Is there a way or
> script to speed up dnadiff for the Human genome comparison?
>
> Thanks and regards,
> Davide
>
> _______________________________________________
> MUMmer-help mailing list
> MUM...@li...
> https://lists.sourceforge.net/lists/listinfo/mummer-help
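
As a rough illustration of the workflow described in the reply (restrict nucmer to mum-reference seeds, raise the match and cluster thresholds, pre-filter with delta-filter -i/-l, then hand the filtered delta file to dnadiff -d), the commands could be assembled as in the Python sketch below. The flag names are the ones documented for MUMmer 3.x (where --mumreference is already the default seed mode); the threshold values, file names, and helper function names are hypothetical and would need tuning for a human-scale comparison.

```python
import shlex
import subprocess


def nucmer_cmd(ref, qry, prefix, min_match=100, min_cluster=500):
    """nucmer with mum-reference seeds and raised thresholds.

    The values 100 and 500 are illustrative assumptions, not recommendations.
    """
    return ["nucmer", "--mumreference",
            "-l", str(min_match),      # minimum length of a single match
            "-c", str(min_cluster),    # minimum length of a cluster of matches
            "-p", prefix, ref, qry]    # writes <prefix>.delta


def delta_filter_cmd(delta, min_identity=95.0, min_length=1000):
    """Quick identity/length filters only (no -1/-m/-r/-q).

    delta-filter writes the filtered delta to standard output.
    """
    return ["delta-filter", "-i", str(min_identity),
            "-l", str(min_length), delta]


def dnadiff_cmd(filtered_delta, prefix):
    """dnadiff on an existing delta file, skipping the alignment phase."""
    return ["dnadiff", "-d", filtered_delta, "-p", prefix]


def run_pipeline(ref, qry, prefix):
    """Run the three steps in order; requires MUMmer on the PATH."""
    subprocess.run(nucmer_cmd(ref, qry, prefix), check=True)
    filtered = f"{prefix}.filtered.delta"
    with open(filtered, "w") as out:
        subprocess.run(delta_filter_cmd(f"{prefix}.delta"),
                       stdout=out, check=True)
    subprocess.run(dnadiff_cmd(filtered, f"{prefix}.diff"), check=True)


if __name__ == "__main__":
    # Print the commands instead of running them (hypothetical file names).
    for cmd in (nucmer_cmd("chr1.fa", "scaffolds.fa", "out"),
                delta_filter_cmd("out.delta"),
                dnadiff_cmd("out.filtered.delta", "out.diff")):
        print(shlex.join(cmd))
```

If a delta file already exists, as in the original question, only the delta-filter and dnadiff steps would be needed; the nucmer step applies when regenerating the alignment with stricter seeds.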