The asymptotic approach of version 2.0 is now one step closer. Alas, we still haven't had time to overhaul the documentation, so this release is still labelled Beta.
The primary changes are to Gap5, mainly involving stability and bug fixing although several new features have arrived too. Release notes follow.
Find Haplotypes function for use within the contig editor. This
replaces the old Sort By Sequence mode as well as generating lists
of read IDs per haplotype. It also creates a master "haplotypes"
list of lists.
Disassemble Readings now handles lists of lists (in addition to
normal lists of readings), for use with the haplotype output.
Improvements to Shuffle Pads. It can now use cutoff/soft-clip
data to detect concordant soft-clips (regions where all the
soft-clip data agrees with one another). It auto-detects likely
adapter sequences to avoid extending these.
It also now detects short tandem repeats and handles these more
carefully. An STR overlapping a heterozygous indel can be clipped
back if the read doesn't span the entire STR, in order to not cause
bias in the copy numbers.
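The STR handling described above can be pictured with a rough sketch. This is not Gap5's actual algorithm: the function names, the `max_unit` and `min_copies` parameters and the "spans" test are all assumptions made for illustration. It finds runs of a short repeated unit and checks whether a read covers the whole run with at least one flanking base on each side. A read that fails this check cannot give a reliable copy number, which is the case where clipping back is described.

```python
def find_strs(seq, max_unit=6, min_copies=3):
    """Return (start, end, unit) for each tandem repeat run in seq.

    Illustrative only: max_unit and min_copies are hypothetical
    thresholds, not values taken from Gap5.
    """
    runs = []
    i = 0
    n = len(seq)
    while i < n:
        best = None
        for unit_len in range(1, max_unit + 1):
            unit = seq[i:i + unit_len]
            if len(unit) < unit_len:
                break  # ran off the end of the sequence
            copies = 1
            # Count consecutive copies of the unit starting at i.
            while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
                copies += 1
            if copies >= min_copies:
                span = copies * unit_len
                if best is None or span > best[1] - best[0]:
                    best = (i, i + span, unit)
        if best:
            runs.append(best)
            i = best[1]  # continue after the run
        else:
            i += 1
    return runs


def read_spans_str(read_start, read_end, str_start, str_end):
    """True if the read covers the STR plus a flanking base on each side.

    Coordinates are 0-based, half-open.
    """
    return read_start < str_start and read_end > str_end
```

With these definitions, `find_strs("ACGTCACACACAGGT")` reports a single run of the unit `CA` covering positions 4 to 12. Note that with `min_copies=3` a homopolymer of three or more bases is also reported, with a one-base unit.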
The Realign Selection option now brings up a user interface if invoked
from the main menu, but not when right clicked via the popup menu
(unless Control key is held down while right-clicking).
Export Contigs has an additional depadded option to use original
reference sequence coordinates. Note this isn't 100% robust as it
depends on which edits have happened, in particular joining and
Export SAM, BAM and CRAM should now be much faster in excessively
deep regions with soft-clips.
The quality values can now be overridden when importing FASTQ or the
default quality changed when importing FASTA.
Gap5_cmd has gained Contig Extend and Contig Trim commands.
Gap5_cmd auto_join now honours the -min_overlap and -max_overlap
Improved the consensus discrepancy score by taking into account a
per-base difference rather than joining all together.
Check Assembly text results are now more verbose, for easier
comparison between databases.
The current selected reading list in the Contig Editor now shows the
number of items in that list.
Highlight Disagreements for cutoff data now defaults to displaying
foreground coloured differences.
The consensus algorithm has been tweaked to cope with different
overcall and undercall likelihoods. This is tied to sequencing
technology, for which the default can be adjusted in the Options
menu. This uses the @RG "PL"atform heading in BAM files. The List
Libraries window permits editing of this field after import.
Small tweaks to Find Internal Joins block alignments.
Save Consensus can now optionally emit ambiguity codes for
Added back Mask/Mark consensus filtering modes to Find Internal
Read name indexing should use less temporary disk space in
The multi-column reading list viewer now has Template Status and
Library Name columns for display/sorting.
The Template Display now has hot-keys for configurations. Use
Shift-Function-Key to remember the current settings and Function-Key
to switch to that setting. The File -> Save Settings option will
write these to your ~/.gap5rc.
Fixed a bug where using the contig selector / contig list to fill
out a dialogue box without then doing anything else at all could
cause the name to be considered invalid.
Removed use of some uninitialised data in Shuffle Pads.
Functions setting sequence clip points could potentially set clips
beyond the sequence ends.
Fixed a bin corruption caused by updates to read pairs spanning two
contigs, eg with one editor open on each end. (See r3996 for the
Auto-scrolling in the template display now takes into account the
contig start location.
Many bug fixes to Shuffle Pads, but there is also a lot of new code
in there too (hopefully without new bugs!).
Tg_index -g mode was erroneously adding columns of pads in some
cases. Also fixed an issue with tg_index -a -g not working in some
Fixed rounding issues when zooming the template display quality
Some more (rather obscure) gap5 database corruptions have been
Improved robustness in the presence of database corruptions. While
not foolproof, the program shouldn't crash as often.
Import Reads no longer clears the "readings" list.
Corrected the (reversed) insertion / deletion percentages in List
Quality value "*" in BAM is no longer treated as quality -1.
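For background on the "*" quality fix: the SAM/BAM specification stores an absent quality string (shown as "*" in SAM) as a run of 0xFF bytes in BAM, and a 0xFF byte read as a signed 8-bit value is -1. The sketch below shows one way an importer might guard against that. It is an assumption about the general technique, not Gap5's code, and `decode_bam_qual` and `default_qual` are hypothetical names.

```python
def decode_bam_qual(raw, default_qual=None):
    """Decode a BAM quality byte string into a list of Phred values.

    BAM marks missing qualities with 0xFF bytes. Here they are mapped
    to default_qual (a hypothetical fallback) rather than passed on
    as a numeric quality.
    """
    if raw and all(b == 0xFF for b in raw):
        return [default_qual] * len(raw)
    return list(raw)


# How the -1 can arise: 0xFF interpreted as a signed byte.
signed_view = int.from_bytes(b"\xff", "little", signed=True)  # -1
```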
Fixed a potential buffer overrun in querying individual items in the
Overhauled the base insertion and deletion code in the Contig
Editor, fixing several problems (in particular with Undo).
Various fixes to keep the contig start / end coordinates correct,
removing bin errors.
Fixed issues with annotation positions after using the editor
commands to move sequences. Similarly consensus tags at the ends of
contigs should be more robust when inserting and removing consensus
SAM/BAM export should now correct the mate-pair flags and positions.
Fixed exporting CRAM in padded mode. It was outputting a depadded
consensus, causing CRAM to work excessively hard in encoding reference
Added back fsync to main .g5d file. This should resolve some
database corruptions caused by running gap5 on a cluster, saving the
gap5 database (but not closing gap5) and then manually copying the
database from a different cluster node.
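The fsync change concerns when written data actually reaches storage. As a minimal sketch of the general pattern (not the Gap5 implementation, which is not shown here; `write_and_sync` is a hypothetical helper), a program can flush its own buffers and then ask the operating system to commit the file before anything else reads it:

```python
import os


def write_and_sync(path, data):
    """Write data to path and force it to stable storage before returning."""
    with open(path, "wb") as fh:
        fh.write(data)
        fh.flush()             # push Python's buffer to the OS
        os.fsync(fh.fileno())  # ask the OS to commit to the device
```

Without the `os.fsync` call, the data may still sit in the writing node's cache, so a copy made from another node of a cluster can see a stale or partial file.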
The contig editor should now work better with newer X11 window managers.
Fixed a bug in showing the reading frame translations. It
overflowed the buffer by 1 byte (a ~2002 bug).
Applied patches from Debian
The usual round of compiler warning fixes; mostly missing
prototypes, casts and unused variables.
This also includes two bug fixes:
sam_export_seq() could calculate the wrong index bin number when
exporting BAM if the contig started at position <= 0, due to the
offset being added to the sequence end position twice.
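To show why a wrong end position changes the bin, here is the standard bin calculation from the SAM/BAM specification (`reg2bin`), transcribed into Python. The coordinates and the offset of 500 in the example are made up for illustration; the actual values inside sam_export_seq() are not shown in these notes.

```python
def reg2bin(beg, end):
    """BAM index bin for the 0-based, half-open interval [beg, end)."""
    end -= 1
    if beg >> 14 == end >> 14:
        return 4681 + (beg >> 14)  # ((1 << 15) - 1) // 7
    if beg >> 17 == end >> 17:
        return 585 + (beg >> 17)   # ((1 << 12) - 1) // 7
    if beg >> 20 == end >> 20:
        return 73 + (beg >> 20)    # ((1 << 9) - 1) // 7
    if beg >> 23 == end >> 23:
        return 9 + (beg >> 23)     # ((1 << 6) - 1) // 7
    if beg >> 26 == end >> 26:
        return 1 + (beg >> 26)     # ((1 << 3) - 1) // 7
    return 0


correct_bin = reg2bin(16000, 16100)        # 4681
shifted_bin = reg2bin(16000, 16100 + 500)  # 585: end pushed past 16384
```

An end position that is too large can cross a 16 kb boundary, so the record lands in a coarser bin than the one its true interval belongs to.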
Added a debugging tool to replay events listed in gap5 log files