From: Arjun P. <ap...@ma...> - 2012-04-24 17:53:45
Hi,

I need to get a read-mapping with the actual read names for an assembly that was created based on FASTQ input sequences. I noticed the iidtouid file in the 9-terminator directory, but it has numbers for fragments rather than read names.

Looking at the reads from the 9-terminator/.frg file, I matched up some by sequence, and it looks like the FRG numbers are alternating reads from each of the paired ends, e.g.:

FRG 1        110000000001 - first entry from read 1
No FRG 2
FRG 3        110000000003 - 2nd entry from read 1
FRG 4        120000000003 - 2nd entry from read 2
FRG 5        110000000005 - 3rd entry from read 1
FRG 6        120000000005 - 3rd entry from read 2
FRG 100000   120000099999 - entry 50,000 from read 2

I'm guessing that I can figure out the read name to iid translation by counting into the fastq files by FRG # / 2.

Has anyone else done this? Did I correctly interpret what the FRG numbers mean? Are there any gotchas at input file boundaries?

Thanks,
Arjun

--
Genome Technology Branch
National Human Genome Research Institute
National Institutes of Health
5625 Fishers Lane             Phone: 301-594-9199
Room 5N-01L                   Fax: 301-435-6170
Rockville, MD 20892-9400      E-Mail: ap...@nh...
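
For illustration, the counting idea in the question could be sketched as below. This is only a guess based on the FRG examples listed in the post, not confirmed assembler behavior: it assumes odd FRG numbers come from the read 1 file and even ones from the read 2 file, that numbers are still reserved for missing fragments (as with FRG 2), that there is a single pair of input files, and that each FASTQ record is exactly four lines. The function names `iid_to_fastq_position` and `nth_read_name` are hypothetical.

```python
from itertools import islice


def iid_to_fastq_position(iid):
    """Map an FRG number (IID) to (mate, entry), both 1-based.

    Assumption inferred from the examples in the post: odd IIDs are
    read 1, even IIDs are read 2, and a pair shares one entry index.
    """
    if iid < 1:
        raise ValueError("FRG numbers are assumed to start at 1")
    mate = 1 if iid % 2 == 1 else 2
    entry = (iid + 1) // 2  # 1,2 -> 1; 3,4 -> 2; 99999,100000 -> 50000
    return mate, entry


def nth_read_name(fastq_lines, entry):
    """Return the read name of the entry-th (1-based) FASTQ record.

    Assumes strict 4-line records; wrapped sequence lines would break this.
    """
    header_index = 4 * (entry - 1)
    header = next(islice(fastq_lines, header_index, header_index + 1), None)
    if header is None or not header.startswith("@"):
        raise ValueError(f"no FASTQ header found for entry {entry}")
    # Drop the leading '@' and any description after the first whitespace.
    return header[1:].split()[0]
```

With an open file handle for the matching mate file, `nth_read_name(handle, entry)` would give the candidate read name for a given FRG number. Comparing a few sequences against the .frg file, as done in the post, would be a reasonable check before trusting the mapping, especially if more than one pair of input files was used.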