From: George G. <geo...@du...> - 2015-06-23 15:54:41
I am attempting to run Picard’s CollectRnaSeqMetrics from the command line with the following command (Java version 1.7.0_75):

java -jar programs/picard-tools-1.134/picard.jar CollectRnaSeqMetrics REF_FLAT=genes2.ref CHART_OUTPUT=tmp".pdf" STRAND=NONE INPUT=new.sam OUTPUT=new.metrics.txt

The issue I am encountering is that this results in bases being counted as “Ribosomal” and in strand bias information being reported; my understanding was that CollectRnaSeqMetrics would not report these values with the options I specified. I have checked the format of my refFlat file and my .sam file, and everything appears fine (ValidateSamFile reports no errors). I have provided a few excerpts of my files below.

Please help in any way possible.

George

REFFLAT file:

AT1G01010.1 AT1G01010.1 Chr1 + 3630 5899 3759 5630 6 3630,3995,4485,4705,5173,5438, 3913,4276,4605,5095,5326,5899,
AT1G01020.1 AT1G01020.1 Chr1 - 5927 8737 6914 8666 10 5927,6436,7156,7383,7563,7761,7941,8235,8416,8570, 6263,7069,7232,7450,7649,7835,7987,8325,8464,8737,
AT1G01020.2 AT1G01020.2 Chr1 - 6789 8737 7314 8666 8 6789,7156,7563,7761,7941,8235,8416,8570, 7069,7450,7649,7835,7987,8325,8464,8737,
AT1G01030.1 AT1G01030.1 Chr1 - 11648 13714 11863 12940 2 11648,13334, 13173,13714,
AT1G01040.1 AT1G01040.1 Chr1 + 23145 31227 23518 31079 20 23145,24541,24751,25040,25523,25824,26080,26291,26542,26861,27098,27371,27617,27802,28707,28889,29159,30146,30409,30901, 24451,24655,24962,25435,25743,25997,26203,26452,26776,27012,27281,27533,27713,28431,28805,29080,30065,30311,30816,31227,

SAM file:

@HD VN:1.0 SO:coordinate
@SQ SN:Chr1 LN:30427671
@SQ SN:Chr2 LN:19698289
@SQ SN:Chr3 LN:23459830
@SQ SN:Chr4 LN:18585056
@SQ SN:Chr5 LN:26975502
@RG ID:L1 PL:illumina LB:library SM:sample
@PG ID:TopHat VN:1.3.1 CL:/opt/apps/sdg/bin/tophat --no-novel-juncs -G ../../genes_new.name.gtf.txt --output-dir ./top ../../index/a_thaliana ./bt2/MM2_25_nocont.fq
SEQCORE-1795804:191:C6HVDANXX:8:1101:8707:62378 256 Chr1 4293 0 27M * 0 0 ATATATATATATATATATTTGAGGATA BBBBBFFFFFFFFFFFFFFFFFFBBFF NM:i:2 NH:i:14 CC:Z:= CP:i:10911936 HI:i:0 RG:Z:L1
SEQCORE-1795804:191:C6HVDANXX:7:1201:13268:85214 16 Chr1 8288 255 29M * 0 0 ACCTTTGGTCTGTGAAGGATTAAATCGAT FFFFFFFFF<FFFBFFFBFFBFFBBBBBB NM:i:0 NH:i:1 RG:Z:L1

metrics.txt:

# picard.analysis.CollectRnaSeqMetrics REF_FLAT=genes2.ref STRAND_SPECIFICITY=NONE CHART_OUTPUT=tmp.pdf INPUT=new.sam OUTPUT=new.metrics.txt REFERENCE_SEQUENCE=index/a_thaliana.fasta MINIMUM_LENGTH=500 RRNA_FRAGMENT_PERCENTAGE=0.8 METRIC_ACCUMULATION_LEVEL=[ALL_READS] ASSUME_SORTED=true STOP_AFTER=0 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
## htsjdk.samtools.metrics.StringHeader
# Started on: Tue Jun 23 11:39:48 EDT 2015
## METRICS CLASS picard.analysis.RnaSeqMetrics
PF_BASES PF_ALIGNED_BASES RIBOSOMAL_BASES CODING_BASES UTR_BASES INTRONIC_BASES INTERGENIC_BASES IGNORED_READS CORRECT_STRAND_READS INCORRECT_STRAND_READS PCT_RIBOSOMAL_BASES PCT_CODING_BASES PCT_UTR_BASES PCT_INTRONIC_BASES PCT_INTERGENIC_BASES PCT_MRNA_BASES PCT_USABLE_BASES PCT_CORRECT_STRAND_READS MEDIAN_CV_COVERAGE MEDIAN_5PRIME_BIAS MEDIAN_3PRIME_BIAS MEDIAN_5PRIME_TO_3PRIME_BIAS SAMPLE LIBRARY READ_GROUP
22349250 22349250 16608264 4455239 266041 1019706 0 0 0 0.743124 0.199346 0.011904 0.045626 0.94247 0.94247 0 3.008882 0.070426 0 0
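
One thing that may be worth checking: the metrics header above lists 25 columns, but the value row shows only 20 numbers. The sketch below is a rough way to read the metrics file while keeping empty fields attached to their column names. It assumes the file is tab-delimited and that the header row directly follows the "## METRICS CLASS" line. The function name `parse_picard_metrics` and the three-column `SAMPLE` text are hypothetical and are not taken from the real output.

```python
def parse_picard_metrics(text):
    """Return {column: value} for the first data row of a metrics file.

    Assumes tab-delimited rows, with the column header on the line
    right after '## METRICS CLASS' and the values on the line after that.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("## METRICS CLASS"):
            header = lines[i + 1].split("\t")
            values = lines[i + 2].split("\t")
            # Pad in case trailing empty fields were trimmed from the row.
            values += [""] * (len(header) - len(values))
            return dict(zip(header, values))
    raise ValueError("no '## METRICS CLASS' section found")


# Hypothetical sample: the middle column is deliberately left empty.
SAMPLE = (
    "## METRICS CLASS\tpicard.analysis.RnaSeqMetrics\n"
    "PF_BASES\tRIBOSOMAL_BASES\tCODING_BASES\n"
    "100\t\t60\n"
)

if __name__ == "__main__":
    for column, value in parse_picard_metrics(SAMPLE).items():
        print(column, repr(value))
```

Splitting on tabs keeps the empty field under RIBOSOMAL_BASES, whereas reading the same row as whitespace-separated text would slide 60 into that column. In the output pasted above, 16608264 / 22349250 is approximately 0.743124, the first percentage printed, so it may be worth opening new.metrics.txt with a tab-aware reader before concluding which column each number belongs to. This is only a suggestion for a check, not a confirmed explanation.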