MiModD v0.1.9 - Release Notes
=============================
changes in 0.1.9:
-----------------
Due to a switch in our versioning scheme, this is a maintenance release, not
a feature release. It fixes compatibility issues with certain versions of
SnpEff, the latest Galaxy release and the upcoming version 3.7 of Python.
If you currently have MiModD 0.1.8 installed and working, there should be no
need for you to upgrade.
bug fixes and maintenance measures:
...................................
- fixed a bug that prevented MiModD from using SnpEff versions 4.1b-4.3i
- the MiModD.enablegalaxy command can now integrate the package into Galaxy
instances that use a galaxy.yaml instead of a galaxy.ini file (Galaxy
releases >= 18.01)
- ensured compatibility with the upcoming Python version 3.7 by rebuilding
pysam with Cython 0.28 (see https://github.com/cython/cython/issues/1955);
used the opportunity to upgrade pysam from 0.7.6 to 0.7.8
- changed MiModD versioning to follow a standard Major.Minor.Patch scheme
- it is now possible to use the --region and --no-indels/--indels-only filters
of the vcf-filter tool on VCF input without genotype fields
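With a plain Major.Minor.Patch scheme, release numbers can be compared
component by component. A minimal Python sketch, assuming purely numeric,
dot-separated version strings; the helper name parse_version is hypothetical
and not part of MiModD:

```python
def parse_version(version):
    """Split a dotted version string into a tuple of ints.

    Hypothetical helper for illustration; not part of the MiModD API.
    Assumes every component is a plain non-negative integer.
    """
    return tuple(int(part) for part in version.split("."))


# Tuples compare element by element, so 0.1.9 sorts after 0.1.8 and also
# after the older four-component release 0.1.7.3.
is_newer = parse_version("0.1.9") > parse_version("0.1.8")
```

Comparing tuples of integers instead of raw strings avoids surprises once a
component reaches two digits (as strings, "0.1.10" would sort before "0.1.9").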
changes in 0.1.8:
-----------------
This new feature release brings further enhancements to the NacreousMap engine
for linkage analysis and better support for variant annotation and reporting.
enhancements:
.............
- a new rebase tool has been added that allows variant coordinates in a vcf
file to be ported to a different reference coordinate system based on a
UCSC chain file-provided mapping
- the old annotate tool has been split into an annotate/varreport pair of
tools; each new tool now has a simpler command line interface; in addition,
there are the following advantages to this separation:
- variant annotation can now be carried out with idempotent external tools
like SnpEff itself or through its official Galaxy wrappers, while the
varreport can still be used on the resulting output
- variants can now be ported (using the new rebase tool) to a different ref
coordinate system before reporting them
- variants could be filtered based on annotations before reporting them
(though MiModD currently provides no support for doing so)
- variant reports generated by the varreport tool are now much richer than
those produced by the old annotate tool
- the annotate tool has a new option to specify a Codon Table to use in the
annotation
- a new mapping mode, "Variant Allele Contrast Mapping" has been added to the
map tool; in this mode, the tool lets you discover regions in the genome, in
which two of your samples are maximally divergent; use this to map causative
variants based on two samples, one selected for, the other against a given
phenotype
- map tool scatter plots show tessellated data for a much improved visual
impression
- map tool histograms feature an additional kernel density estimate (kde) that
is independent of the user-configured bin sizes; the kde maximum is reported
in each contig plot
- map tool histograms are now always normalized so that binned values of a
given width add up to one across contigs
- indexing enhancements:
the in-development index tool has been promoted to a regular tool and the old
snap-index tool has been merged into it; the new index tool can now be used to
generate snap indices, but also samtools-style bam and fasta indices; an
--index-files option has been added to the varcall and delcall tools to make
use of precalculated bam indices; when used through their Galaxy tool
wrappers these tools now make use of bam indices created and managed by
Galaxy instead of calculating these indices again; this behavior can be
disabled in the package configuration
- migration to samtools 1.x for internal use has been completed
- it is now possible to configure MiModD to use the runtime working directory
as the temporary files directory
- simplified Galaxy integration: the enablegalaxy tool will try to expose the
bundled versions of samtools/bcftools to Galaxy
- most Galaxy tool wrappers have been enhanced and updated to make use of the
latest features offered by the Galaxy tool interface
- a first-run configuration wizard helps users to choose reasonable
package-wide settings
- lots of documentation enhancements: improved overall structure, revised
tutorials and updated other sections
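The histogram normalization mentioned for the map tool above can be pictured
roughly as follows. This is only a sketch under assumptions: the data layout
(a dict of per-contig bin counts) and the function name are hypothetical, and
the actual NacreousMap code may work differently:

```python
def normalize_histograms(counts_by_contig):
    """Scale bin counts so that all bins across all contigs sum to one.

    counts_by_contig maps a contig name to a list of bin counts for one
    bin width. Hypothetical data layout, chosen for illustration only.
    """
    total = sum(sum(bins) for bins in counts_by_contig.values())
    if total == 0:
        # nothing to scale; return a copy of the all-zero counts
        return {contig: list(bins)
                for contig, bins in counts_by_contig.items()}
    return {
        contig: [count / total for count in bins]
        for contig, bins in counts_by_contig.items()
    }
```

Dividing by the total across contigs (rather than per contig) keeps bar
heights comparable between contig plots.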
bug fixes (selection):
......................
- the varextract --pre option now works correctly with named samples in the
input VCF file
- a bug preventing use of the snap-batch tool with the -f option has been fixed
- the snap index -O option is now exposed in the snap-index command line
interface and in the Galaxy SNAP tool wrapper
- varcall works now with an arbitrary number of contigs in its input files
(previously the tool could hit the file system limit for open files if there
was a large number of contigs)
- varcall works reliably now with more than one BAM input file
- varcall will consistently reject BAM/fasta combinations, in which sequences
mentioned in the BAM header cannot be identified (directly or through MD5 sum
comparison) in the reference fasta
- the --max-depth option of the varcall tool now works correctly and has been
properly documented
- parsing of custom hyperlink formatter files finally works as advertised
- the --chr option of the annotate tool, which turned out to be completely
dysfunctional, has been removed; the --minC and --minQ options that were
redundant with vcf-filter tool functionality and not compatible with newer
versions of SnpEff have also been removed
- the package now compiles with gcc version 7
- the snap tool works now on the latest versions of macOS
other changes:
..............
- Python 3.2 compatibility has been dropped
changes in 0.1.7.3:
-------------------
Issues addressed in this bug-fix release revolve around better compatibility
with different platforms, compiler and Python versions. In addition, the
command line help system has been improved.
enhancements:
.............
- a new command line help system is now available through:
mimodd help [subcommand or topic]
it is more powerful and flexible than the old mimodd [subcommand] --help
(which is still available though) and, e.g., can provide help for
administrative and in-development commands through the same interface.
- the upgrade tool has been made significantly more robust and should now work
on all supported platforms, always report correctly the version upgraded to
and, except for Python 3.2, does not require pip anymore
bug fixes:
..........
see also https://sourceforge.net/p/mimodd/tickets/milestone/0.1.7.3
- incompatibilities in the snap source code that prevented installation from
source with gcc 6.x have been fixed.
- MiModD is now compatible with Python 3.5.2
- Python 3.2 compatibility is restored (again)
- Python 3.3 compatibility fix
- the info tool can handle bcf files again (was broken in v0.1.7.2 for Python
versions < 3.5)
- several bugs in the annotate tool have been fixed and yeast genome browser
links generated by the tool are functional again
- the info tool and all format-specific file parsers consistently recognize and
raise FormatParseError with empty input files
- the config tool accepts a bare --snpeff option (without a path) and
interprets it as SnpEff not being installed
- the enablegalaxy tool has been simplified and now requires write permissions
only for the Galaxy configuration file.
- VCF file parsing can now tolerate missing sample-specific fields
- a bug in the vcf-filter tool has been fixed that caused certain variants to
be dropped when the tool was used to report just a subset of the samples in
the original file
- the map tool no longer treats -i/--infer as a plotting option that requires
rpy2
changes in 0.1.7.2:
-------------------
The major issue addressed by this second bug-fix release in the 0.1.7 series is
explicit rules for and handling of text encoding. In addition, some minor bugs
have been fixed and the sanitize tool has grown a new option.
enhancements:
.............
- throughout MiModD input/output encoding follows clearly defined rules:
all output is UTF-8 encoded, input is expected to be UTF-8 encoded, but where
permissible (fasta sequence titles, SAM/BAM comment lines), undecodable lines
will trigger a fallback to a one character per byte encoding (latin-1).
- a new -t/--truncate-at option has been added to the sanitize tool, which can
be used to provide a string at the first occurrence of which fasta sequence
titles should be truncated.
- the config tool will warn about non-existing directories in settings and tools
relying on the tmpfiles directory will fail with a clear error message if the
folder does not exist.
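The decoding rule described above (UTF-8 first, latin-1 as the fallback) can
be sketched in a few lines of Python. The function name is hypothetical and
the snippet only illustrates the rule; MiModD's own implementation may differ:

```python
def decode_with_fallback(raw_line):
    """Decode bytes as UTF-8, falling back to latin-1 if that fails.

    Hypothetical helper for illustration; not part of the MiModD API.
    latin-1 maps every possible byte to exactly one character, so the
    fallback itself can never raise a decoding error.
    """
    try:
        return raw_line.decode("utf-8")
    except UnicodeDecodeError:
        return raw_line.decode("latin-1")
```

Because latin-1 is a one character per byte encoding, the fallback preserves
the original bytes and they can be written back unchanged if needed.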
bug fixes:
..........
- installation does not fail with ASCII-incompatible platform encoding settings
anymore.
- fixed a typo in the map tool command line interface.
- fixed a bug in the NacreousMap engine affecting incompletely determined
genotypes.
changes in 0.1.7.1:
-------------------
This first bug-fix release for the 0.1.7 series mostly addresses problems
with the new map tool introduced in v0.1.7.0.
enhancements:
.............
- the map tool, when run in VAF mode, used to count only variants that appeared
"pure" (i.e., no evidence for another allele was allowed) for the predominant
related sample allele, but this strict behavior of discarding nearly pure
variants caused poor performance when only few variants were available for
mapping. The algorithm has now been changed to weigh variants by their
pureness making it possible to use also information from not quite pure sites
(though they contribute less to an analysis). This has hardly any effect on
analyses with many thousands of variants, but significantly improves results
for analyses relying on fewer markers.
- the map tool in VAF mode now skips plots of contigs for which there is no
data
- improved y-axis scaling for linkage plots produced by the map tool
- in VAF mode a Loess span of zero can now be specified to indicate that no
Loess lines should be calculated
- the varcall tool gives nicer status messages
bug-fixes:
..........
see also https://sourceforge.net/p/mimodd/tickets/milestone/0.1.7.1
- restored Python 3.2 compatibility accidentally broken in previous releases
- made MiModD compatible with OS X 10.11 El Capitan
- made from source installation compatible with gcc/g++ versions 4.9 and higher
- fixed a bug in the vcf-filter tool that caused a division-by-zero error with
the --af option when any of the samples filtered for had zero coverage for
any variant
- the delcall tool verifies the identity of the sequence dictionaries provided
by the bam and bcf input files (accidentally combining aligned reads with
unrelated variant calls, thus, gives an early and clear error message)
- tools relying on pysamtools.header give an appropriate error if their SAM/BAM
input file does not exist
- the map tool no longer fails with an error for empty vcf inputs or when the
maximum bin count in a histogram plot is zero
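The division-by-zero problem fixed for the --af option above is a typical
zero-coverage edge case. A minimal sketch of a guarded calculation, assuming
an allelic fraction is simply allele-supporting reads divided by total reads
at a site; the function name and the choice to return None are hypothetical
and do not describe the actual vcf-filter code:

```python
def allelic_fraction(allele_reads, total_reads):
    """Return allele_reads / total_reads, or None at zero coverage.

    Hypothetical helper for illustration; returning None lets the caller
    decide how to treat sites without any coverage.
    """
    if total_reads == 0:
        return None
    return allele_reads / total_reads
```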
changes in 0.1.7:
-----------------
The single outstanding enhancement in this release is the incorporation of
linkage plotting functionality into the package.
new features/enhancements:
..........................
- the previous cloudmap tool has been refactored and modified extensively.
The resulting new map tool incorporates new code for producing linkage plots
that look like (but feel much better than) those produced by CloudMap.
To this end, we have entirely rewritten and restructured the original
CloudMap EMS Variant Density and Hawaiian Variant Mapping tools code. The
result of merging this code with MiModD's previous linkage analysis code
is a new engine for mapping-by-sequencing analysis which we call NacreousMap.
Graphical output from the map tool requires R and rpy2, but in the absence of
these dependencies the tool can produce text-based output for plotting with
other software or on servers (e.g., http://cloudmap.vm.uni-freiburg.de:8080/).
bug-fixes:
..........
see also https://sourceforge.net/p/mimodd/tickets/milestone/0.1.7
- the upgrade tool does not try to update the mimodd executable anymore,
which could previously disrupt the association of the executable script with
the Python interpreter
- the copy of the mimodd executable in the package bin folder now gets created
with a correct shebang line
- the covstats tool no longer skips the last contig in its report
- the sanitize tool can now be used as expected with the -b option
- the snap tool has been fixed to parse the -C/--clipping option correctly
changes in 0.1.6.1:
-------------------
This is the first bug-fix release for the 0.1.6 series, but also brings one
small new feature.
new features/enhancements:
..........................
- the vcf-filter tool has acquired a new --af option enabling variant filtering
by allelic fractions
bug-fixes:
..........
see also https://sourceforge.net/p/mimodd/tickets/milestone/0.1.6.1/
- the internal version of the SNAP aligner bundled with MiModD can finally deal
with fasta reference files written under Windows; the longstanding issue with
Windows-style line endings has been fixed
- realigning previously aligned data to a reference genome with different
sequence names is now possible
- sam/bam header parsing is now less strict than before: if any standard header
line contains an unknown tag the header is no longer considered invalid, but
the unknown tag is silently dropped; the GO tag has been added to the known
tags of the @HD record type
- vcf meta section handling and parsing has been improved eliminating a bug
with RG information containing characters with special meaning inside vcf
- sequence names in reference genomes or user-provided through the reheader
tool are now checked for characters incompatible with any analysis steps and
the improved sanitize tool can be used to correct reference genome sequence
names automatically
- filtering vcfs now works even if some records do not specify all of the
specified filter fields
- sorting SAM files directly from the Galaxy interface has been enabled
- the reheader Galaxy tool can now replace several read group IDs and sequence
names at a time; a bug in the tool wrapper prevented this in previous
versions
- lots of error messages have been improved
changes in 0.1.6:
-----------------
Lots of novel features have made it into this new release making MiModD more
efficient and simple to use.
new features/enhancements:
..........................
- split-on-rgs has been added as a new parameter to the convert tool and
enables by-read-group splitting into multiple output files during conversion
- conversion from sam/bam files to fastq has been added to the convert tool
- the cloudmap tool has been reworked: the user interface is more powerful now
allowing for three-sample mapping analyses and the underlying algorithm for
the detection of informative variants has been improved resulting in more
robust analysis results
- filtering out anomalous overlapping read pairs from paired-end alignments has
been improved: in the snap tool, the old --max-mate-overlap option has been
dropped in favour of a new --discard-overlapping-mates option, which allows
more fine-grained control over which types of overlapping mates should be
removed
- aligning with the snap tool has been implemented more efficiently:
runs should be between 10% and 40% faster depending on the system, with more
homogeneous usage of cpu time and less temporary disk space usage
- the command line parser has been cleaned up to include only fully tested
tools for data analysis.
Tools for managing the installation (the former subcommands config and
enable-galaxy and the new upgrade tool) are now accessible exclusively via:
python3 -m MiModD.<tool name>
In addition, we are going to use this new command space to try out new tools
before adding them to the analysis tool interface. The index tool added in
the previous release is an example for such a tool and this version features:
- a novel in-development tool: sanitize
available as python3 -m MiModD.sanitize, this tool is intended as a format
sanitizer for the various input file formats supported by MiModD. Currently,
its only functionality is to rewrite fasta files to ensure they are
compatible with MiModD
- MiModD now uses samtools 1.2 for variant calling resulting in somewhat more
reliable calls
- we have started to rework the exception hierarchy in the package, which
should result in nicer and clearer error messages in the long run (you may
observe some of the effects already)
bug fixes:
..........
- we have substantially increased test coverage for the package
- this has led to the discovery of a number of minor bugs, which have been
fixed
changes in 0.1.5.2:
-------------------
v0.1.5.2 is the second bug-fix release for the 0.1.5 series.
new features/enhancements:
..........................
- bam files can be indexed using:
python3 -m MiModD.index <input bam>
bug-fixes:
..........
see also https://sourceforge.net/p/mimodd/tickets/milestone/0.1.5.2/
- all tools can now deal with whitespace-containing arguments and the temporary
files directory may now contain whitespace (and other special characters) in
their path
- added compatibility with SnpEff version 4.1
- snap tool single-end reads alignment was broken in v0.1.5.1 and is now
working again
- fixed a bug in varextract that caused a rare error with pre-calculated vcf
input
- slightly more robust vcf header parsing and writing
changes in 0.1.5.1:
-------------------
v0.1.5.1 is our first bug-fix release for the 0.1.5 series.
In addition, it brings a major enhancement in the cloudmap tool.
new features/enhancements:
..........................
the cloudmap tool has been redesigned:
- the former separate modes "VARIANT" and "HAWAIIAN" have been unified into a
single "VAF" (variant allele frequency) mode. The underlying algorithms have
also been merged to perform variant and Hawaiian mapping analysis
simultaneously. The new tool is more powerful than the previous version and
easier to use.
- the former "EMS" mode has been renamed to "SVD" (simple variant density) mode,
accordingly.
bug fixes:
..........
see https://sourceforge.net/p/mimodd/tickets/milestone/0.1.5.1/
changes in 0.1.5:
-----------------
v0.1.5 is the first STABLE release of MiModD!
new features/enhancements:
..........................
completely refactored variant calling:
- the former varcall variant calling tool functionality is now distributed over
three tools (varcall, varextract, covstats)
- varcall now produces bcf output (including every reference base), from which
varextract generates a vcf of just variant sites. covstats and delcall both
use the bcf to extract coverage information and the former cov format has
been removed
- this change reduces IO overhead, increases individual tool speed and
increases the overall pipeline efficiency because varextract, covstats and
delcall can all work in parallel
- the redesign also simplifies mapping-by-sequencing approaches and goes hand
in hand with
improved CloudMap support through the new cloudmap tool (replacing the old
cm-seqdict)
extended info tool
- the tool can now report not only samples, but also most other
meta-information encoded in supported formats
- these formats now include bcf and fasta
improved Galaxy interoperability
- all MiModD tools now respect GALAXY_SLOTS settings
- MiModD can now be configured via an environmental variable, which is very
helpful for Galaxy Tool Shed installations (see the updated installation
instructions for details)
- enable-galaxy now works with old (universe.wsgi.ini-based) and new
(config/galaxy.ini-based) versions of Galaxy and does not copy its xml
wrappers into Galaxy anymore, but links them in the .ini file
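For illustration, respecting GALAXY_SLOTS usually amounts to reading that
environment variable, which Galaxy sets for its jobs, and using it as the
thread count. A minimal sketch; the function name and the fallback behavior
are assumptions and not taken from the MiModD source:

```python
import os


def get_thread_count(default=1):
    """Return the number of threads a tool should use.

    Hypothetical helper for illustration. Reads the GALAXY_SLOTS
    environment variable and falls back to `default` when the variable
    is unset or does not hold a positive integer.
    """
    value = os.environ.get("GALAXY_SLOTS", "")
    try:
        slots = int(value)
    except ValueError:
        return default
    return slots if slots > 0 else default
```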
SnpEff v4 compatibility
bug fixes (incomplete list):
............................
- snap bug that prevented the tool from working correctly on OS X Yosemite
- a severe bug prevented the reheader tool from keeping its promise to produce
valid output under all circumstances
- header tags are now sorted when printing SAM headers
changes in 0.1.4.1:
-------------------
new features/enhancements:
..........................
a version subcommand has been added
annotated variants support hyperlinks for several new species
improved command line interface for the new reheader tool
changes in 0.1.4:
-----------------
v0.1.4 is the most significant "minor" release of MiModD yet and brings
several enhancements and numerous bug-fixes.
new features/enhancements:
..........................
simplified installation (see the updated installation instructions for details):
- MiModD is now pip installable
- samtools, bcftools and snap have become integrated and need not be installed
separately anymore
up-to-date variant calling and SAM/BAM handling:
- the variant calling engine has been upgraded to samtools/bcftools version 1.0
- transition of SAM/BAM handling from samtools 0.1.19 to 1.0 has been initiated
(during the transition phase MiModD ships with both samtools versions)
substantially improved SAM/BAM header operations:
- SAM/BAM header validation has been enhanced and error messages have become
clearer
- the reheader tool has been reworked completely to offer complementary
functionality to samtools reheader
- the header, convert and reheader tools all guarantee consistency of their
results files including consistency between header and body sections.
In other words, it should be impossible to generate unprocessable SAM/BAM
output from within MiModD
- this fixes issue:
.................
#21 using custom headers with convert/reheader can result in inconsistent
SAM/BAM files
bug fixes (incomplete list):
............................
- a bug in the deletion caller has been fixed that caused a sporadically
occurring error during insert size sampling
- SAM files generated by the header tool can now be converted to BAM format
- issue #23: failure to autodetect the SnpEff data folder has been solved, so
SnpEff integration finally works as advertised
- issue #18: substitute temporary file names in tool output has been addressed
making temporary file handling by MiModD truly transparent
changes in 0.1.3.1:
-------------------
This version makes MiModD fully compatible with systems with only the standard getopt() functionality (i.e. MacOS X).
Earlier versions failed to run samtools view commands with file redirection.
fixed issue:
............
#20 pysamtool.view generates non-canonical command line
changes in 0.1.3:
-----------------
new features:
.............
snap tool: now writes MD5 tags for all reference contigs to aligned reads file
and this is used by the varcall tool to verify reference identity for variant calling.
convert tool: can now convert fastq.gz input to SAM/BAM also from within Galaxy
can convert multi-part fastq or fastq.gz input to single SAM/BAM in one step
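As an illustration of the MD5-based reference check, the sketch below
computes a checksum for a sequence following the convention of the SAM
specification's M5 tag (sequence in uppercase, whitespace removed). Whether
the snap and varcall tools compute their checksums exactly this way is an
assumption, and the function name is hypothetical:

```python
import hashlib


def sequence_md5(sequence):
    """Return the MD5 hex digest of a sequence, ignoring case and whitespace.

    Hypothetical helper for illustration; not part of the MiModD API.
    """
    normalized = "".join(sequence.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()
```

Two references can then be treated as identical for a contig if their
checksums match, regardless of line wrapping or soft-masking in the fasta file.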
enhancements:
.............
snap tool: faster decompression of fastq.gz or bam input
fixed issues:
.............
#13: delcall bugs
#10: header tool compatibility with samtools
changes in 0.1.2:
-----------------
added new option to varcall / Variant Calling tool to generate a report on
coverage statistics
changes in 0.1.1:
-----------------
fixed issues:
#1: temporary file management caused error when input files and temporary file
directory are on different physical devices
#2: TMPFILES_PATHs set via mimodd config -c --tmpfiles are now interpreted
relative to the current working directory, i.e.,
mimodd config -c --tmpfiles tmp is extrapolated to TMPFILES_PATH : cwd/tmp
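The resolution rule from issue #2 corresponds to what Python's
os.path.abspath does for relative paths. A minimal sketch; the function name
is hypothetical and the snippet is not taken from the MiModD source:

```python
import os


def resolve_tmpfiles_path(path):
    """Interpret a relative path relative to the current working directory.

    Hypothetical helper for illustration. Absolute paths are returned
    unchanged (apart from normalization).
    """
    return os.path.abspath(path)
```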