From: Michael J. C. <mic...@gm...> - 2010-04-12 20:48:41
Hi all,

That's right, it's time for another whole genome sequencing paper by the Nelson Lab All-Stars!

We have a summary of the SOLiD data here:
https://secure.genome.ucla.edu/index.php/1102_Sequencing#Standard_Libs

I didn't personally create those files, but there is a summary about PCR dup removal on that page. Does anyone know if that was done with samtools or Picard? If it was samtools rmdup (and I'm guessing it was), I want to re-run it using Picard MarkDuplicates (samtools misses some percent of the duplicates, and Picard is really the standard tool for it now).

The processed data is here:
/home/solexa/abi_datasets/Reports/1102_whole_genome_bfast_alignment/

I know there was a variant database created by Brian for 1102 a while back, but from what I understand the variant database server is gone now. Does that mean we should re-run the variant calling/queryengine analysis? I want to get together numbers such as the dbSNP intersect and the coding consequences and so forth for the whole 1102 data set.

Will the method I used previously to generate queryengine reports (from here: https://secure.genome.ucla.edu/index.php/Sequence_Analysis_HowTo#Step_4_-_Generating_Reports ) still work?

Thanks,
Mike