|
From: Adam P. <aph...@gm...> - 2014-05-27 20:27:32
|
Hi Davide,

I'm not aware of a converter between MUMmer formats and the UCSC format you referenced. However, all of the information required by that format is contained within the NUCmer delta format, so it would be relatively straightforward to write such a converter.

Best,
-Adam

On Thu, May 22, 2014 at 11:20 PM, Davide VERZOTTO (GIS) <ver...@gi...> wrote:

> Hi Adam,
>
> Thank you for your kind reply and your hint on mum-reference. I am
> testing it now, and it does indeed seem to dramatically reduce delta file
> sizes. I was already increasing the minimum match length, but not the
> minimum cluster length (what is the actual meaning of this field?). I also
> used the -l and -i options by applying delta-filter first, and the -d
> option in dnadiff. The slowness seems to come from the large number of
> small contigs that we have, since it does not really affect the big
> scaffolds.
>
> Just another question: is it possible to use the output of dnadiff (or of
> another tool in the MUMmer suite) to lift annotations over from the
> Reference genome to a de novo Human genome assembly with the UCSC liftOver
> tool, which first requires chaining the alignments found (see the chain
> format: https://genome.ucsc.edu/goldenPath/help/chain.html), or with other
> tools that you may know?
>
> Thanks and regards,
> Davide
>
>
> On May 23, 2014, at 2:49 AM, Adam Phillippy wrote:
>
> Hi Davide,
> dnadiff was primarily designed for microbial genome comparison and
> currently does not scale well for large genomes. The 'delta-filter' step is
> certainly one of the major bottlenecks. delta-filter scales by the number
> of matches it has to analyze, so you can speed things along by reducing the
> total number of matches. A few ways to do this:
>
> 1. Run nucmer in mum-reference mode to ignore repetitive alignment seeds
> 2. Increase the minimum match length and minimum cluster length (this will
> reduce sensitivity to low-identity alignments)
> 3. Run delta-filter with the -l and -i options to filter alignments by
> length and identity (these filters are quick, compared to -1/-m/-r/-q which
> all require a dynamic programming step)
>
> Once you have a filtered delta file using the above recommendations, you
> can pass it directly to dnadiff using the -d option and it will skip the
> alignment phase and process your delta file directly, hopefully faster
> than before.
>
> Best,
> -Adam
>
>
> On Tue, May 20, 2014 at 1:34 AM, Davide VERZOTTO (GIS) <
> ver...@gi...> wrote:
>
>> Dear MUMmer users,
>>
>> We are trying to apply dnadiff for the analysis of breakpoints between
>> our de novo Human genome assembly and the Reference genome, the latter
>> divided into multiple chromosomes / separate files.
>>
>> We have already computed a NUCmer comparison between the two assemblies
>> and the related delta file. After this, we tried to compare all our
>> scaffolds versus hg19 chromosome 1 using dnadiff, and the tool ran for more
>> than 12 days (1 single core used, peak of 24 GB RAM) before crashing (for
>> internal server reasons), without writing any temporary file (apart from
>> the log line "Filtering alignments") and presumably just trying to run
>> "delta-filter -1". Have you already faced this problem? Is there a way or
>> a script to speed up dnadiff for the Human genome comparison?
>>
>> Thanks and regards,
>> Davide
>>
>> -------------------------------
>> This e-mail and any attachments are only for the use of the intended
>> recipient and may be confidential and/or privileged. If you are not the
>> recipient, please delete it or notify the sender immediately. Please do not
>> copy or use it for any purpose or disclose the contents to any other person
>> as it may be an offence under the Official Secrets Act.
>> -------------------------------
>>
>> _______________________________________________
>> MUMmer-help mailing list
>> MUM...@li...
>> https://lists.sourceforge.net/lists/listinfo/mummer-help
|
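To illustrate the point in the top reply that a delta file already carries what the chain format needs, here is a minimal Python sketch of such a converter. It is not part of MUMmer or the UCSC tools; the function names `ungapped_blocks` and `delta_to_chains` are made up for this example. It assumes the reference coordinates of every alignment are on the forward strand, that each alignment begins and ends on an aligned column, and it uses the number of aligned columns as a placeholder for the chain score, since the delta format does not store one.

```python
def ungapped_blocks(deltas, ref_span):
    """Turn NUCmer delta integers into chain blocks [size, dt, dq].

    A positive delta d means d-1 aligned columns followed by one extra base
    in the reference; a negative delta means |d|-1 aligned columns followed
    by one extra base in the query.
    """
    blocks = []
    ref_used = 0
    for d in deltas:
        run = abs(d) - 1
        if run > 0 or not blocks:
            blocks.append([run, 0, 0])   # start a new ungapped block
            ref_used += run
        if d > 0:
            blocks[-1][1] += 1           # gap on the query side (dt grows)
            ref_used += 1
        else:
            blocks[-1][2] += 1           # gap on the reference side (dq grows)
    # Whatever is left of the reference span is the final ungapped block.
    blocks.append([ref_span - ref_used, 0, 0])
    return blocks


def delta_to_chains(delta_text):
    """Return one chain record (as a string) per alignment in a delta file."""
    lines = delta_text.strip().splitlines()
    chains = []
    i = 2  # skip the two header lines (input file paths and "NUCMER")
    while i < len(lines):
        line = lines[i].strip()
        i += 1
        if not line:
            continue
        if line.startswith(">"):
            # Sequence-pair header: >refID qryID refLen qryLen
            ref_name, qry_name, ref_len, qry_len = line[1:].split()
            ref_len, qry_len = int(ref_len), int(qry_len)
            continue
        # Alignment header: sR eR sQ eQ errors simErrors stopCodons
        s_r, e_r, s_q, e_q = (int(x) for x in line.split()[:4])
        deltas = []
        while i < len(lines):
            d = int(lines[i])
            i += 1
            if d == 0:                   # 0 terminates the delta list
                break
            deltas.append(d)
        blocks = ungapped_blocks(deltas, e_r - s_r + 1)
        if s_q <= e_q:
            strand, q_start, q_end = "+", s_q - 1, e_q
        else:
            # Chain coordinates on "-" refer to the reverse-complemented query.
            strand, q_start, q_end = "-", qry_len - s_q, qry_len - e_q + 1
        score = sum(b[0] for b in blocks)  # placeholder score (assumption)
        head = (f"chain {score} {ref_name} {ref_len} + {s_r - 1} {e_r} "
                f"{qry_name} {qry_len} {strand} {q_start} {q_end} {len(chains) + 1}")
        body = [f"{s} {dt} {dq}" for s, dt, dq in blocks[:-1]]
        body.append(str(blocks[-1][0]))
        chains.append("\n".join([head] + body))
    return chains
```

Calling `"\n\n".join(delta_to_chains(open("out.delta").read()))` would give chain-formatted text for a hypothetical file `out.delta`. Note that each alignment becomes its own chain here; merging collinear alignments into longer chains, which the question refers to as chaining, is a separate step that this sketch does not attempt.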