From: Alec W. <al...@br...> - 2012-01-17 15:52:19
|
Picard release 1.60
17 January 2012

- Modified ExtractIlluminaBarcodes to handle dual barcodes. The metrics file contains barcode read bases separated by slashes; barcode.txt files concatenate all the barcode read bases with no separator. Removed the BUSTARD_DIR option, which has been deprecated in favor of BASECALLS_DIR.
- Eliminate SamFileHeaderMerger.addIterator, and instead add another ctor that allows for passing iterators in the ctor. Use LinkedHashMap instead of ArrayList in SamFileHeaderMerger for speed. Patch courtesy of Matt Hanna.
- FilteringIterator.java: Assert that reads are in SortOrder.queryname when filtering pairs.
- SamToFastq.java: Added a little progress logging to SamToFastq.

-Alec
|
From: Alec W. <al...@br...> - 2012-02-13 18:05:49
|
Picard release 1.62
12 February 2012

- IlluminaDataProvider now supports (and prefers) BCLs and other binary data types over QSeqs. If BCL, Pos, Locs, Clocs, or filter files are present, IlluminaDataProvider will use these file formats. However, if one type of data requested requires the use of QSeqs, then QSeqs will be used for all data provided (to avoid unnecessarily parsing extra files). This change affects ExtractIlluminaBarcodes and IlluminaBasecallsToSam. The READ_STRUCTURE argument is now required for both of these programs.
- AbstractAlignmentMerger.java: Improve error report if no sequence dictionary is found for the reference fasta.
- SAMRecord.java: Avoid String.intern in set(Mate)ReferenceName if the string can be obtained from the sequence dictionary, because intern is surprisingly slow.
- SAMSequenceRecord.java: Don't use regex to truncate sequence name at first whitespace, to improve speed.
- SAMTextReader.java: Don't use regex to validate read bases, to improve speed.
- CalculateHsMetrics.java: Minor change to allow direct setting of the bait set name via a command line parameter instead of inferring it from the filename. Inference is still used if an explicit name isn't passed.
- SamToFastq.java: Create zero-length FASTQ/SECOND_END_FASTQ files when there are zero reads per read group(s).
- DownsampleSam.java: Updated DownsampleSam to ignore all non-primary alignments so that pairs don't get mishandled.

-Alec
|
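To illustrate the SAMSequenceRecord item in release 1.62, here is a minimal sketch of truncating a name at the first whitespace character with a plain character scan instead of a regex. It is not the SAM-JDK source; the class and method names are hypothetical.

```java
final class SequenceNameTruncationSketch {
    /**
     * Returns the part of name before the first whitespace character,
     * or name itself if it contains no whitespace. No regex is compiled
     * or matched; the loop stops at the first hit.
     */
    static String truncateAtWhitespace(String name) {
        for (int i = 0; i < name.length(); i++) {
            if (Character.isWhitespace(name.charAt(i))) {
                return name.substring(0, i);
            }
        }
        return name;
    }

    public static void main(String[] args) {
        // Hypothetical fasta-style header with a description after the name.
        System.out.println(truncateAtWhitespace("chr1 some description"));
    }
}
```

A scan like this avoids the per-call cost of regex matching, which matters when it runs once per sequence or per record.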
From: Alec W. <al...@br...> - 2012-02-27 15:31:59
|
Picard release 1.63
27 February 2012

- CollectGcBiasMetrics.java: Make coordinate-sort requirement explicit.
- SamPairUtil.java: Fix off-by-one error in computation of insert size; and then: Fix an off-by-two bug introduced recently in calculation of insert size when read 1 has a higher mapped coordinate than read 2.
- SequenceUtil.java: Added areSequenceDictionariesEqual().
- Implemented a tiny custom subclass of GZIPOutputStream so that the compression level could be adjusted from the default compression level (5), and made it so that IoUtil.openFileForWriting() and friends that vector to our GZIP writing code override the default using Defaults.COMPRESSION_LEVEL, which is settable via system property.
- Added a new Defaults.BUFFER_SIZE, settable from system property samjdk.buffer_size, that replaces IOUtil.DEFAULT_BUFFER_SIZE. Default is still 128k.
- Small performance tweak to how we identify adapter dimer in CollectAlignmentSummaryMetrics.
- ExtractIlluminaBarcodes has been modified to support multi-threaded operation.
- Add OutputFilePrefix option to BamToBfq and RunMaq.

-Alec
|
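As an illustration of the GZIPOutputStream item in release 1.63, the following is a minimal sketch (not the Picard source) of a subclass that sets the compression level, with the level read from a system property. The class names, the helper methods, and the property name example.compression_level are hypothetical; only the fallback value of 5 is taken from the release note.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

final class LevelGzipSketch {
    /** GZIPOutputStream whose Deflater level is chosen by the caller. */
    static final class AdjustableGzipOutputStream extends GZIPOutputStream {
        AdjustableGzipOutputStream(OutputStream out, int level) throws IOException {
            super(out);
            // 'def' is the protected Deflater inherited from DeflaterOutputStream.
            def.setLevel(level);
        }
    }

    /** Reads the level from a (hypothetical) system property, falling back to 5. */
    static int configuredLevel() {
        return Integer.getInteger("example.compression_level", 5);
    }

    /** Compresses data at the given level and returns the compressed size in bytes. */
    static int compressedSize(byte[] data, int level) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream gz = new AdjustableGzipOutputStream(sink, level)) {
            gz.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.size();
    }

    public static void main(String[] args) {
        byte[] zeros = new byte[100000]; // highly compressible input
        System.out.println("level 0: " + compressedSize(zeros, 0) + " bytes");
        System.out.println("configured level: " + compressedSize(zeros, configuredLevel()) + " bytes");
    }
}
```

Running with -Dexample.compression_level=1 would trade compression ratio for speed in this sketch.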
From: Alec W. <al...@br...> - 2012-03-12 14:12:54
|
Picard release 1.64
12 March 2012

- Modified MeanQualityByCycle so that it does not try to generate an R plot if there is no data in the metrics file.
- IlluminaUtil.java: Change delimiter for multiple-barcode concatenation from slash to hyphen.
- CommandLineProgram.java: Adding documentation links
  * CommandLinePrograms now print web documentation link in Usage
  * The FAQ link is printed when exceptions are thrown
- Fixed NPE in SAMReadGroupRecord.getRunDate().
- CollectAlignmentSummaryMetrics.java: Fix for problem where PF_ALIGNED_BASES and PF_MISMATCH_RATE were being calculated on all aligned bases, not just PF.
- IlluminaUtil.java: Added a set of "dual indexed" adapters.
- IlluminaDataProvider and dependent classes (IlluminaBasecallsToSam, ExtractIlluminaBarcodes) will no longer load/copy files and data for skipped cycles.
- IlluminaDataProvider no longer parses position files to get tile/lane information. It gets this from any available parser (and therefore avoids unnecessary parsing of position files).
- SamToFastq.java: Fix for bug whereby the FastqWriters were being closed multiple times if not outputting per read group and multiple read groups were present in the input sam/bam file.

-Alec
|
From: Alec W. <al...@br...> - 2012-03-26 14:18:50
|
Picard release 1.65
26 March 2012

- Major refactoring of ClippingUtility underlying IlluminaBasecallsToSam. All existing adapterTrim... methods are deprecated and no longer return any warning strings. ClippingUtility.AdapterPair has been removed and now there is an AdapterPair interface (in the util package) that is implemented by IlluminaAdapterPair. In IlluminaBasecallsToSam the MARK_ADAPTER parameter has been removed and replaced with ADAPTERS_TO_CHECK, which is a list of IlluminaAdapterPairs. If any adapter pairs are provided, adapter marking is done. New adapter pairs have been added and the defaults in use have changed.
- IlluminaBasecallsToSam.java: Support multiple barcode reads.
- Remove validation that complains about CIGAR starting with P operator.
- MergeSamFiles.java: Fix hang when USE_THREADING=true and exception reading input.
- BAMFileReader.java: Print file name when error in BAMFileReader when possible.
- Pass filename through to BAMRecordCodec to improve error messages.
- Accept VN:1.4 in SAM header.

-Alec
|
From: Alec W. <al...@br...> - 2012-04-09 15:08:41
|
Picard release 1.66
9 April 2012

- BlockCompressedInputStream.java: Fix long-standing bug in read() method in which reading a valid 0xff from the stream was indistinguishable from EOF. Fortunately this method is not used very much. Fix courtesy of Bob Handsaker.
- CollectGcBiasMetrics.java: Improve error message about absence of SO:coordinate in header.
- IlluminaBasecallsToSam.java: Initialize ADAPTERS_TO_CHECK to a mutable array so it can be changed on the command line.
- Made the random seed truly optional in DownsampleSam.
- Add program CheckIlluminaDirectory.
- Fix multithreading bug in IlluminaBasecallsToSam.
- Make SamToFastq work if there are no read groups in the input.
- Add standard options -h, -H, --version to HTML doc.
- Speed up SamFileHeaderMerger in cases where we have many collisions.
- Added an IOUtil.newTempFile() method that defaults the amount of space needed free on the tmp filesystem.
- Make CloseableIterators on BAM files idempotent on close().

-Alec
|
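The BlockCompressedInputStream item in release 1.66 describes a symptom that commonly arises in Java when a signed byte is returned from read() without masking. The sketch below shows that general pitfall and its usual fix; it is an assumption that this was the mechanism in Picard, and the class here is hypothetical rather than the SAM-JDK code.

```java
final class UnsignedReadSketch {
    private final byte[] buffer;
    private int pos = 0;

    UnsignedReadSketch(byte[] buffer) {
        this.buffer = buffer;
    }

    /** Buggy variant: the byte 0xff widens to the int -1, which callers read as EOF. */
    int readBuggy() {
        if (pos >= buffer.length) {
            return -1; // genuine end of stream
        }
        return buffer[pos++]; // sign-extended: 0xff becomes -1
    }

    /** Fixed variant: mask to the range 0..255 so that only EOF yields -1. */
    int readFixed() {
        if (pos >= buffer.length) {
            return -1;
        }
        return buffer[pos++] & 0xFF;
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xff};
        System.out.println("buggy: " + new UnsignedReadSketch(data).readBuggy());
        System.out.println("fixed: " + new UnsignedReadSketch(data).readFixed());
    }
}
```

The contract of java.io.InputStream.read() is to return a value in 0..255 or -1 at end of stream, which is why the mask is needed.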
From: Alec W. <al...@br...> - 2012-04-23 14:08:50
|
Picard release 1.67
23 April 2012

- ReadStructure.java: Fix ArrayOutOfBounds exceptions when ReadStructure has a skip before its last ReadDescriptor.
- Log.java: Added a synchronized block around series of writes to System.out so that messages won't get interleaved when multiple threads log simultaneously.
- Small changes to ExtractIlluminaBarcodes to allow the use of quality scores to perform more stringent barcode calling.
- ClippingUtility.java: Fix bug in which one overload of adapterTrimIlluminaPairedReads was ignoring minMatchBases and maxErrorRate arguments and using default values.

-Alec
|
From: Alec W. <al...@br...> - 2012-05-07 15:02:22
|
Picard release 1.68

- FilterSamReads.java: Changed command-line options: FILTER, READ_LIST_FILE.
- ExtractIlluminaBarcodes bug fix for handling multiple barcodes.
- QseqParser.java: Added more descriptive error message for the case of fewer bases available in Qseqs compared to cycles found in the read structure of a run.
- BAMFileWriter.java: Allow writing a BAM file to a stream that is not associated with a File.

-Alec
|
From: Alec W. <al...@br...> - 2012-05-21 15:22:31
|
Picard release 1.69
21 May 2012

- SamToFastq.java: Close output files before throwing exception due to unpaired mates.
- Fixed bug when dealing with dual barcodes. IlluminaBasecallsToSam now expects NO dashes (-) in matchedBarcode.
- Add support for adding arbitrary header tags to read groups via IlluminaBasecallsToSam.java.

-Alec
|
From: Alec W. <al...@br...> - 2012-06-04 14:23:30
|
Picard release 1.70
4 June 2012

- Add option PRIMARY_ALIGNMENT_STRATEGY to MergeBamAlignment, which enables selection of the strategy for selecting the primary alignment for a read if there are multiple alignments marked as primary, or if none of the alignments for a read are marked as primary. In addition to the existing strategy, BestMapq, add a new strategy, EarliestFragment, which only works for fragments and selects the alignment that maps the earliest base in a read.
- Performance fix to EstimateLibraryComplexity. Previously it would attempt to find duplicates within all groups of reads with a common prefix, which would bog down if a technical sequence like adapter dimer dominated the library. The fix is just to ignore groups that have more than some large multiple of the expected average number of read pairs per group. The default is 500 times over the mean.
- Switch SortingLongCollection from memory-mapped I/O to DataInput/OutputStream, in order to avoid problems due to JVM's delayed release of mapped memory.

-Alec
|
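The EstimateLibraryComplexity fix in release 1.70 can be pictured with the rough sketch below. It assumes the "expected average" is simply the arithmetic mean of the observed group sizes, which the release note does not state; the class, method, and parameter names are hypothetical and this is not the Picard implementation.

```java
import java.util.ArrayList;
import java.util.List;

final class OversizedGroupFilterSketch {
    /**
     * Returns the group sizes that would still be examined for duplicates:
     * every group no larger than maxMultiple times the mean group size.
     */
    static List<Integer> groupsToExamine(List<Integer> groupSizes, int maxMultiple) {
        List<Integer> kept = new ArrayList<>();
        if (groupSizes.isEmpty()) {
            return kept;
        }
        long total = 0;
        for (int size : groupSizes) {
            total += size;
        }
        // Assumption: the mean of the observed sizes stands in for the expected average.
        double mean = (double) total / groupSizes.size();
        for (int size : groupSizes) {
            if (size <= mean * maxMultiple) {
                kept.add(size);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Integer> sizes = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            sizes.add(2); // ordinary groups of read pairs sharing a prefix
        }
        sizes.add(5_000_000); // one group dominated by a technical sequence
        System.out.println("groups examined: " + groupsToExamine(sizes, 500).size());
    }
}
```

Skipping the oversized group avoids the pairwise comparison cost that grows quickly with group size, at the price of not counting duplicates inside it.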
From: Alec W. <al...@br...> - 2012-06-18 15:10:23
|
Picard release 1.71
18 June 2012

- Abstract metric accumulation functionality into MultiLevelCollector and refactor all classes that use MultilevelMetrics.
- CreateSequenceDictionary.java: Complain if OUTPUT already exists.

-Alec
|
From: Alec W. <al...@br...> - 2012-07-02 14:14:24
|
Picard release 1.72
2 July 2012

- Clearer error message when paired reads but SECOND_END_FASTQ is not specified.
- HsMetricCollector.java: Fix NullPointerException caused by broken copy ctor for PerUnitHsMetricCollector.

-Alec
|
From: Alec W. <al...@br...> - 2012-07-16 14:47:07
|
Picard release 1.73
16 July 2012

- AddOrReplaceReadGroups.java: Add ability to set DT tag in new read group.

-Alec
|
From: Alec W. <al...@br...> - 2012-07-30 14:13:56
|
Picard release 1.74
30 July 2012

- Added a new "ProgressLogger" class that facilitates more useful and standard progress logging for any program that iterates through a stream of SAMRecords. Adapted most command line programs to use it.
- Add support for targetedPcrMetrics and collected common HsMetrics and TargetedPcrMetrics behavior into TargetMetricsCollector.
- New program CollectTargetedPcrMetrics.
- MultiHitAlignedReadIterator.java: Handle case where an alignment record has no cigar elements that consume both the read and the reference (e.g. the read is all soft-clipped).

-Alec
|
From: Alec W. <al...@br...> - 2012-08-13 13:41:51
|
Picard release 1.75
13 August 2012

- Added to SAMUtils a method to make a read unmapped and a method to check whether any cigar element consumes both read and reference bases; used these methods in AbstractAlignmentMerger to catch reads with cigar strings that don't consume any read or reference bases (e.g. all soft-clipped) and make them unmapped.
- Augmented plots for GC Bias Metrics, Mean Quality by Cycle, Quality Score Distribution, and RNA Sequence Coverage. Specifically, if the plot was generated from a read group .bam, the title of the plot includes the read group's library.
- IlluminaDataProvider.java: Produce more useful error output by printing out the class of the parser that's failing.
- Support SeekableStream BAM index.
- build.xml: Added single-jar packaging target tasks. Consolidated package-commands task.

-Alec
|
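The cigar check mentioned in release 1.75 can be sketched on a plain CIGAR string as follows. In the SAM format, the operators M, = and X consume both read and reference bases, whereas S, I, D, N, H and P do not consume both. This standalone version is hypothetical and is not the SAMUtils method, which operates on SAM-JDK objects rather than strings.

```java
final class CigarConsumptionSketch {
    /**
     * Returns true if the CIGAR string contains at least one element of
     * non-zero length whose operator consumes both read and reference bases.
     */
    static boolean consumesReadAndReference(String cigar) {
        int length = 0;
        for (int i = 0; i < cigar.length(); i++) {
            char c = cigar.charAt(i);
            if (Character.isDigit(c)) {
                length = length * 10 + (c - '0'); // accumulate the element length
            } else {
                if (length > 0 && (c == 'M' || c == '=' || c == 'X')) {
                    return true;
                }
                length = 0; // start the next element
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("10S40M: " + consumesReadAndReference("10S40M"));
        System.out.println("50S: " + consumesReadAndReference("50S"));
    }
}
```

A read for which this returns false, such as one that is entirely soft-clipped, has no aligned bases, which is why treating it as unmapped is reasonable.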
From: Alec W. <al...@br...> - 2012-08-28 12:36:00
|
Picard release 1.76
28 August 2012

- IntervalListTools.java: Added scatter (to support scatter-gather parallelism) to IntervalList.
- ProgressLogger.java: Synchronized record() method to make class thread-safe.
- Significant code refactoring in IlluminaBasecallsToSam; overhauled threading model to improve CPU utilization; added test for Tile comparator.
- Set SortingCollection.destructiveIteration true by default (as was the original intent), in order to enable earlier GC of no-longer-needed SAMRecords.

-Alec
|
From: Alec W. <al...@br...> - 2012-09-10 14:20:27
|
Picard release 1.77
10 September 2012

- CleanSam.java: Add additional clean-up: set MAPQ to 0 if a read is unmapped.
- MergeBamAlignment.java: Add option ALIGNER_PROPER_PAIR_FLAGS to MergeBamAlignment to tell it to keep the aligner's notion of proper pair flag rather than overwriting it with MergeBamAlignment's notion.

-Alec
|
From: Alec W. <al...@br...> - 2012-10-09 13:04:45
|
Picard release 1.78
9 October 2012

- IlluminaBasecallsToSam bug fixes:
  - Fix hang if a worker thread ran out of memory.
  - Fix bug in which failure of worker thread would result in successful program exit status.
  - Bug fix: Throw exception when barcodes that were not provided in library_params were read from tile.
- Fluidigm:
  - IlluminaUtil.java: Added adapters for Fluidigm access array.
  - IlluminaBasecallsToSam.java: Added Fluidigm access array adapters to the default set of adapter sequences to look for and trim.
- Make SamRecordIntervalIterator.close() idempotent.
- SAM header version number:
  - Change version number output in SAM text header to 1.4.
  - Change ViewSam so that it outputs whatever was the version number in the input, rather than always replacing with CURRENT_VERSION.

-Alec
|
From: Alec W. <al...@br...> - 2012-10-22 13:51:14
|
Picard release 1.79
22 October 2012

- TargetMetricsCollector.java: Avoid overflowing short storing depth to avoid getting negative MEAN_TARGET_COVERAGE.

-Alec
|
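The overflow behind the release 1.79 fix is easy to reproduce in isolation: incrementing a Java short past Short.MAX_VALUE wraps to a negative number, and negative per-base depths can then pull a mean below zero. The sketch below shows the wrap-around and one possible guard (saturating at the maximum). The release note does not say which remedy Picard chose, so the saturating variant is an assumption and the names are hypothetical.

```java
final class DepthOverflowSketch {
    /** Unchecked increment: wraps from Short.MAX_VALUE to Short.MIN_VALUE. */
    static short incrementUnchecked(short depth) {
        return (short) (depth + 1);
    }

    /** Saturating increment: stays at Short.MAX_VALUE instead of wrapping. */
    static short incrementSaturating(short depth) {
        return depth == Short.MAX_VALUE ? depth : (short) (depth + 1);
    }

    public static void main(String[] args) {
        short deep = Short.MAX_VALUE;
        System.out.println("unchecked: " + incrementUnchecked(deep));
        System.out.println("saturating: " + incrementSaturating(deep));
    }
}
```

Saturation underestimates very deep coverage but keeps every stored depth non-negative; widening the counter to int is the other obvious option, at the cost of more memory per base.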
From: Alec W. <al...@br...> - 2012-11-19 14:11:44
|
Picard release 1.80
19 November 2012

- Updated SamFileValidator to check for records that have no read group id, or that have a read group id that is not found in the header.
- FormatUtil.java: Bug fix: "-?" is now recognized as NaN.
- Minor change to ProgressLogger to support things that are not SAMRecord.
- IntervalListTools.java: Add padding to index portion of scattered directory names so that they are always 4 digits long.
- SAMFileReader.java: Improve javadoc for queryMate to explain peculiar way it must be used.

-Alec
|
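The zero-padding mentioned for IntervalListTools in release 1.80 can be sketched with String.format. The scatter_ prefix below is a made-up placeholder, since the release note does not give the actual directory naming scheme; only the four-digit padding comes from the note.

```java
import java.util.Locale;

final class ScatterDirNameSketch {
    /** Builds a directory name whose index is left-padded with zeros to at least 4 digits. */
    static String scatterDirName(int index) {
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "scatter_%04d", index);
    }

    public static void main(String[] args) {
        System.out.println(scatterDirName(7));
        System.out.println(scatterDirName(123));
    }
}
```

With a fixed width, lexicographic order of the names matches numeric order of the indices for up to 9999 directories, which keeps directory listings and glob expansions in scatter order.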
From: Alec W. <al...@br...> - 2012-12-03 15:58:12
|
Picard release 1.81
3 December 2012

- IlluminaUtil.java: Added TruSeq small RNA adapter sequences.

-Alec
|
From: Alec W. <al...@br...> - 2012-12-17 14:13:49
|
Picard release 1.82
17 December 2012

- Speed up parsing of SAM header by precompiling regexes, by using StringUtil.split rather than String.split, and by avoiding throwing exception in non-exceptional situation in ISO-8601 date parser.
- FastqRecord.java: Allow null to be passed to any ctor arg.
- Added utility for automatic detection of quality encodings of fastqs and bams; incorporated into FastqToSam so no format need be specified; incorporated into ValidateSamFile so that a warning is generated if the BAM is not Standard/Phred-encoded.
- SAMFileWriterFactory.java: Fix bug in which setting maxRecordsInRam caused an exception when creating a BAM writer.
- Refactor SAMTextReader to allow a String in SAM text format to be parsed independently by new class SAMLineParser.
- MarkDuplicates now adds PG records (chaining and uniquifying appropriately) by default. The values for all the attributes of the PG record have reasonable default values. In order to get the previous behavior of not adding PG record, pass command-line argument PROGRAM_RECORD_ID=null.

-Alec
|
From: Alec W. <al...@br...> - 2013-01-02 16:04:59
|
Picard release 1.83
2 January 2013

- SAMLineParser.java: Make getters public.
- MergeBamAlignmentTest.java: Test that clipping of FR reads for fragments shorter than read length happens only when it should.
- IntervalListTools.java: Fixed bug that caused more than the requested number of scatter directories to be created when the last sequence in the interval list was particularly long.

-Alec
|
From: Alec W. <al...@br...> - 2013-01-14 16:03:23
|
Picard release 1.84
14 January 2013

- IlluminaBasecallsToSam.java: Shut down background GC task at end of doWork().
- Tribble library ( http://code.google.com/p/tribble/ ) has been moved into Picard repository. Currently it is built independently, but will eventually be incorporated into SAM-JDK.jar.
- Move various Seekable*Stream classes to their own package net.sf.samtools.seekablestream.
- CommandLineParser now supports nested options objects. This functionality should be considered not yet final, so external developers who choose to use it should be aware that it might change.

-Alec
|
From: Alec W. <al...@br...> - 2013-02-07 15:26:25
|
Picard release 1.85
7 February 2013

- Bug fix: LittleEndianInputStream would hang in infinite loop if EOF was encountered before null terminator.
- Include tribble build/tests in main build.xml, some refactoring. Use sam version number for tribble.
- Move org.broadinstitute.variant to the Picard repo
  - Modified build.xml to build and package the variant classes into a separate jar
  - Made the necessary changes to run our unit tests with Contracts for Java enabled. These now run as part of a normal "ant test". Note that our tests are not as "chatty" as yours, so there will be a short pause in the output while the variant tests run.
  - Add the required cofoja (Contracts for Java) and Apache commons dependencies to lib/
- Moving JEXLMap from a top level member of file VariantJEXLContext to its own class.
- CollectGcBiasMetrics.java: Create command line argument (short name BS) to handle bisulfite sequenced reads. CollectGcBiasMetrics will use the bisulfite-aware countMismatches() if true.
- Adding Merge- and SplitVcfs, two utilities that merge VCF files and split them into files of only SNPs or indels. Also adding MergingIterator and a comparator for VariantContext.

-Alec
|