From: James B. <jk...@sa...> - 2014-11-18 09:38:30
On Tue, Nov 18, 2014 at 10:40:23AM +0800, Colin Hercus wrote:
> Hi James,
>
> I don't know if anyone uses these tags. I know a lot use the Novoalign
> option to report multiple mappings. We do our selves but we've never had
> the need to use the HI, IH, NH, CC or CP tags.

Thinking about algorithms, the only flag I can see that is absolutely
necessary is the one that doesn't exist! We need a way to know how
many records in total have the same identifier.

If we know that, then writing a read collation tool is trivial.
Without it we either have to make assumptions (2 only) or do a full
name sort in order to implement algorithms such as fixmates. A full
sort is very CPU intensive though compared to simple collation.

James

-- 
James Bonfield (jk...@sa...)    | Hora aderat briligi. Nunc et Slythia Tova
                                | Plurima gyrabant gymbolitare vabo;
  A Staden Package developer:   | Et Borogovorum mimzebant undique formae,
https://sf.net/projects/staden/ | Momiferique omnes exgrabure Rathi.

-- 
The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.
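
To illustrate the collation point in the message above, here is a minimal sketch of single-pass collation when every record carries the total number of records sharing its name. That count is exactly the field the message says does not exist, so the `total` value, the `(name, total, payload)` tuple layout, and the `collate` function are all hypothetical and are not part of any real SAM/BAM library.

```python
from collections import defaultdict


def collate(records):
    """Yield groups of records that share a name, in one pass, without sorting.

    Each record is a (name, total, payload) tuple. `total` is a HYPOTHETICAL
    per-record count of how many records in the file share that name.
    """
    pending = defaultdict(list)  # name -> records seen so far for that name
    for rec in records:
        name, total, _payload = rec
        group = pending[name]
        group.append(rec)
        # The declared count tells us when the group is complete, so it can
        # be emitted and its memory released immediately.
        if len(group) == total:
            yield pending.pop(name)
    if pending:
        raise ValueError(
            f"{len(pending)} name(s) had fewer records than declared"
        )


if __name__ == "__main__":
    reads = [
        ("a", 2, "a/1"),
        ("b", 1, "b/1"),
        ("c", 3, "c/1"),
        ("a", 2, "a/2"),
        ("c", 3, "c/2"),
        ("c", 3, "c/3"),
    ]
    for group in collate(reads):
        print([payload for _name, _total, payload in group])
```

Memory use here is bounded by the number of names whose groups are still incomplete, and no comparison sort is involved. Without the count, a tool must either assume a fixed group size (such as 2) or fall back to a full name sort.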