From: Adam P. <aph...@gm...> - 2014-06-18 20:10:34
Hi Davide,

In this context, "unique" means that there is only one alignment covering a region. Since multiple alignments can overlap one another, this option looks at a particular alignment and computes the fraction of its length where it is the *only* alignment that exists at that reference/query position.

Best,
-Adam

On Wed, Jun 18, 2014 at 3:55 AM, Davide VERZOTTO (GIS) <ver...@gi...> wrote:

> Hi Adam,
>
> May I kindly ask how the -u option in delta-filter actually works, in more detail than what is written in the manual (for example, what exactly do you mean by 'unique reference' and 'unique query')?
>
> Thanks and regards,
> Davide
>
> *From:* Adam Phillippy [mailto:aph...@gm...]
> *Sent:* Wednesday, May 28, 2014 4:27 AM
> *To:* Davide VERZOTTO (GIS)
> *Cc:* mum...@li...
> *Subject:* Re: [MUMmer-help] dnadiff very slow for Human
>
> Hi Davide,
>
> I'm not aware of a converter between Mummer formats and the UCSC format you referenced. However, all of the information required by that format is contained within the Nucmer delta format, so it would be relatively straightforward to write such a converter.
>
> Best,
> -Adam
>
> On Thu, May 22, 2014 at 11:20 PM, Davide VERZOTTO (GIS) <ver...@gi...> wrote:
>
> Hi Adam,
>
> Thank you for your kind reply and your hint on mum-reference. I am testing it now and it does indeed seem to dramatically reduce delta file sizes. I was already increasing the minimum match length, but not the minimum cluster length (what is actually the meaning of this field?). I also used the -l and -i options by applying a delta-filter first, and the -d option in dnadiff. The slowness problem seems to be with the large number of small contigs that we have, since it is not really affecting the big scaffolds.
>
> Just another question: is it possible to use dnadiff (or another MUMmer suite) output to lift annotations over from the Reference genome to a de novo Human genome assembly using the UCSC liftOver tool, which first requires chaining the alignments found (see the chain format: https://genome.ucsc.edu/goldenPath/help/chain.html), or other tools that you may know?
>
> Thanks and regards,
> Davide
>
> On May 23, 2014, at 2:49 AM, Adam Phillippy wrote:
>
> Hi Davide,
>
> dnadiff was primarily designed for microbial genome comparison and currently does not scale well for large genomes. The 'delta-filter' step is certainly one of the major bottlenecks. delta-filter scales by the number of matches it has to analyze, so you can speed things along by reducing the total number of matches. A few ways to do this:
>
> 1. Run nucmer in mum-reference mode to ignore repetitive alignment seeds
> 2. Increase the minimum match length and minimum cluster length (this will reduce sensitivity to low-identity alignments)
> 3. Run delta-filter with the -l and -i options to filter alignments by length and identity (these filters are quick, compared to -1/-m/-r/-q which all require a dynamic programming step)
>
> Once you have a filtered delta file using the above recommendations, you can pass it directly to dnadiff using the -d option and it will skip the alignment phase and process your delta file directly, hopefully faster than before.
>
> Best,
> -Adam
>
> On Tue, May 20, 2014 at 1:34 AM, Davide VERZOTTO (GIS) <ver...@gi...> wrote:
>
> Dear MUMmer users,
>
> We are trying to apply dnadiff for the analysis of breakpoints between our de novo Human genome assembly and the Reference genome, the latter divided into multiple chromosomes / separate files.
>
> We have already computed a NUCmer comparison between the two assemblies and the related delta file.
> After this, we tried to compare all our scaffolds versus hg19 chromosome 1 using dnadiff, and the tool ran for more than 12 days (a single core used, peak of 24 GB RAM) before crashing (for internal server reasons), without writing any temporary file (apart from the log line "Filtering alignments") and presumably just trying to run "delta-filter -1". Have you faced this problem before? Is there a way or script to speed up dnadiff for the Human genome comparison?
>
> Thanks and regards,
> Davide
>
> -------------------------------
> This e-mail and any attachments are only for the use of the intended recipient and may be confidential and/or privileged. If you are not the recipient, please delete it or notify the sender immediately. Please do not copy or use it for any purpose or disclose the contents to any other person as it may be an offence under the Official Secrets Act.
> -------------------------------
>
> _______________________________________________
> MUMmer-help mailing list
> MUM...@li...
> https://lists.sourceforge.net/lists/listinfo/mummer-help
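
To make the definition of "unique" in the reply at the top of this thread concrete, here is a minimal Python sketch. It is not delta-filter's actual implementation; it only illustrates the idea on a single axis (reference or query). The function name `unique_fraction` and the interval representation (1-based, inclusive `(start, end)` tuples with `start <= end`) are assumptions made for this example.

```python
def unique_fraction(target, others):
    """Return the fraction of `target` that no interval in `others` overlaps.

    Intervals are (start, end) tuples, 1-based and inclusive, with
    start <= end (reverse-strand coordinates must be swapped by the caller).
    This is a hypothetical helper for illustration only.
    """
    start, end = target
    length = end - start + 1

    # Keep only the parts of the other alignments that fall inside the target.
    clipped = sorted(
        (max(s, start), min(e, end))
        for s, e in others
        if s <= end and e >= start
    )

    # Sweep left to right, merging overlapping pieces so no base is counted twice.
    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = s, e
        else:
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start + 1

    return (length - covered) / length


if __name__ == "__main__":
    aln = (1, 100)
    overlapping = [(1, 10), (5, 20), (200, 300)]
    print(unique_fraction(aln, overlapping))
```

In this example, positions 1 to 20 of the alignment are shared with other alignments, so the call prints 0.8. A filter in the spirit of -u would presumably compare such a value (as a percentage) against the chosen threshold, on both the reference and the query coordinates.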