NAME
mid_adjust - Add or extend the MIDs and primers of sequences intended
for Pyrotagger or QIIME
SYNOPSIS
mid_adjust -m mapping.txt -f seqs.fa -q seqs.qual -o out
DESCRIPTION
If you have sequence datasets multiplexed with MIDs of different length
and want to analyze them using Pyrotagger, you need to extend the
shorter MIDs so that all MIDs have the same length. MID Adjust takes a
FASTA, QUAL and mapping file and does just that!
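To make the idea of "extending" concrete, here is a minimal Python sketch. It is not MID Adjust's actual algorithm, which is not described here: it assumes that a short MID is lengthened by prepending a fixed padding base, and that the same change is applied to the start of each read. The function names and the padding base are hypothetical.

```python
def extend_mids(mids, pad_base="A"):
    """Left-pad every MID with pad_base so all MIDs share the longest length.

    Assumption: left-padding with one fixed base; the real tool may differ.
    """
    target = max(len(mid) for mid in mids.values())
    extended = {
        sample: pad_base * (target - len(mid)) + mid
        for sample, mid in mids.items()
    }
    # Padding must not make two different samples share the same barcode.
    if len(set(extended.values())) != len(extended):
        raise ValueError("padding produced duplicate MIDs")
    return extended


def extend_read(read, old_mid, new_mid):
    """Replace the leading old_mid of a read with its extended version."""
    if not read.startswith(old_mid):
        raise ValueError("read does not start with the expected MID")
    return new_mid + read[len(old_mid):]


mids = {"Sample1": "CTACT", "Sample2": "CTCGCAG"}
extended = extend_mids(mids)
print(extended)
print(extend_read("CTACTacgggcgg", "CTACT", extended["Sample1"]))
```

Any bases added to a read also need matching entries in its QUAL record, since a QUAL file carries one score per base.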
Another scenario is the case where you have a FASTA file containing
multiple samples without MIDs that you want to analyze through
Pyrotagger or QIIME. You can use MID Adjust to add arbitrary MIDs to the
sequences from each sample prior to Pyrotagger or QIIME analysis. Note
that you must still provide a mapping file, but omit the MID barcodes
from this file.
Yet another scenario is when you have FASTA files for multiple
samples but no QUAL files, and you want to analyze the sequences
through Pyrotagger or QIIME. MID Adjust will add arbitrary MIDs and
primers to the sequences, concatenate them into a single file, assign
them fake quality scores, and generate a mapping file.
Note that the output of MID Adjust is always a mapping file (in
Pyrotagger or QIIME format), a single gzipped FASTA file and a single
gzipped QUAL file. The sequences in these files always contain
same-length MID barcodes and a primer sequence.
REQUIRED ARGUMENTS
-m <mapping_file>
Tab-delimited mapping file formatted for Pyrotagger or QIIME.
The Pyrotagger format is described at
<http://pyrotagger.jgi-psf.org/cgi-bin/index.pl>. A Pyrotagger file
should contain sample IDs in a first column and a fusion primer, i.e.
MID (uppercase) and primer (lowercase), in a second column. For
example:
Sample1 CTACTacgggcggtgtgtrc
Sample2 CTCGCacgggcggtgtgtyc
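As an illustration of the uppercase/lowercase convention, the following Python sketch splits a Pyrotagger mapping line into sample ID, MID and primer. The function name is hypothetical and this is not the parser MID Adjust uses; it only assumes the tab-delimited, two-column layout described above.

```python
import re

# Uppercase run (the MID, possibly empty) followed by a lowercase run (the primer).
FUSION_RE = re.compile(r"([A-Z]*)([a-z]+)")


def parse_pyrotagger_line(line):
    """Split one tab-delimited mapping line into (sample_id, mid, primer)."""
    sample_id, fusion = line.rstrip("\n").split("\t")
    match = FUSION_RE.fullmatch(fusion)
    if match is None:
        raise ValueError("unexpected fusion primer: " + fusion)
    mid, primer = match.groups()
    return sample_id, mid, primer


print(parse_pyrotagger_line("Sample1\tCTACTacgggcggtgtgtrc"))
```

Because the MID part is allowed to be empty, the same sketch accepts a mapping file from which the MIDs have been omitted.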
A QIIME file should contain four columns: a sample ID, MID, primer
and description. See
<http://qiime.org/documentation/file_formats.html>.
If you want to add arbitrary MIDs to sequences without MIDs, omit
the MIDs from this mapping file. As a special case, if you simply
pass the value 'pyrotagger' or 'qiime', a mapping file will be
created, and arbitrary MIDs and an arbitrary primer will be added.
-f <fasta_file>...
FASTA files containing the sequences with the MIDs to adjust. When
adding entirely arbitrary MIDs to the sequences, make sure you
choose the -s <sample_id> method that matches your FASTA files.
OPTIONAL ARGUMENTS
-q <quality_file>
Quality file containing the quality scores. If you have no quality
scores for the input sequences, fake quality scores will be
generated for you. Note that you can also generate fake quality
scores independently using the included script mid_adjust_fake_qual.
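For readers unfamiliar with the QUAL format, this Python sketch shows what a fake quality record can look like: a FASTA-style header followed by one space-separated score per base. The constant score of 40 and the function name are assumptions made for illustration; the value that MID Adjust or mid_adjust_fake_qual actually assigns is not stated here.

```python
def fake_qual_record(seq_id, sequence, score=40, width=60):
    """Return a QUAL-format record giving every base the same score.

    Assumption: score=40 is an arbitrary placeholder, not the tool's value.
    """
    scores = [str(score)] * len(sequence)
    lines = [">" + seq_id]
    # Wrap the scores so that each line holds at most `width` values.
    for start in range(0, len(scores), width):
        lines.append(" ".join(scores[start:start + width]))
    return "\n".join(lines) + "\n"


print(fake_qual_record("Sample1_1", "ACGTAC"), end="")
```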
-s <sample_id>
When adding entirely arbitrary MIDs, specify which sequences belong
to which sample using one of two methods: 1) 'fname': each FASTA
file holds the sequences of one sample, whose name is the basename
of the file; 2) 'seqid': each read has an ID of the form
'>$SAMPLEID_$READNUM', which identifies the sample it comes from
(the included script mid_adjust_rename_by_sample can help you put
your sequence IDs in this format). Default: fname
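The two methods can be pictured with a short Python sketch. It is an illustration rather than the tool's implementation, and it makes two assumptions: the sample ID is everything before the last underscore of the read ID, and the file basename is used as-is (whether a file extension is stripped is not specified here). The function names are hypothetical.

```python
import os


def sample_from_fname(path):
    """'fname' method: the sample name is the basename of the FASTA file.

    Assumption: the extension is left in place.
    """
    return os.path.basename(path)


def sample_from_seqid(header):
    """'seqid' method: read IDs look like '>SAMPLEID_READNUM'."""
    fields = header.lstrip(">").split()
    if not fields:
        raise ValueError("empty FASTA header")
    # Split at the last underscore so that sample IDs may contain underscores.
    sample_id, sep, read_num = fields[0].rpartition("_")
    if not sep or not sample_id or not read_num:
        raise ValueError("read ID is not of the form SAMPLEID_READNUM")
    return sample_id


print(sample_from_fname("/data/run1/Sample2.fa"))
print(sample_from_seqid(">Sample1_42"))
```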
-l <mid_length>
Specify a desired MID length. When extending existing MIDs, by
default, all MIDs are set to the length of the longest MID. When
adding MIDs to sequences that do not have any, the default is to
generate the shortest possible MIDs. This option lets you force
MIDs longer than the default.
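As a rough guide to what "shortest possible" means, the Python sketch below computes the smallest length at which the four bases yield at least one distinct MID per sample, then enumerates that many MIDs. It assumes every base combination is an acceptable barcode; MID Adjust may apply further constraints that are not described here, so treat the helper names and results as illustrative.

```python
from itertools import islice, product


def shortest_mid_length(n_samples, alphabet="ACGT"):
    """Smallest length giving at least n_samples distinct MIDs."""
    length = 1
    while len(alphabet) ** length < n_samples:
        length += 1
    return length


def arbitrary_mids(n_samples, length=None, alphabet="ACGT"):
    """Return n_samples distinct MIDs, optionally longer than the minimum."""
    minimum = shortest_mid_length(n_samples, alphabet)
    if length is None:
        length = minimum
    if length < minimum:
        raise ValueError("MID length too short for this many samples")
    combos = product(alphabet, repeat=length)
    return ["".join(combo) for combo in islice(combos, n_samples)]


print(shortest_mid_length(20))    # 3, because 4**2 = 16 < 20 <= 4**3 = 64
print(arbitrary_mids(5))          # five MIDs of length 2
print(arbitrary_mids(5, length=4))
```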
-p <primer_seq>
When adding an arbitrary primer, specify the primer sequence to use.
Default: ACGGGCGGTGAGTGC
-o <output_prefix>
Prefix to use for the name of the output files. The output directory
will be created if necessary. Note that the FASTA and QUAL files
will be compressed with gzip. Default: mid_adjusted/all_samples
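Since the FASTA output is gzip-compressed, downstream scripts need to decompress it while reading. The Python sketch below shows one way to do that with the standard gzip module. The exact names of the output files are not given here, so the demo writes and reads a small temporary file that merely stands in for real MID Adjust output; the function name is hypothetical.

```python
import gzip
import os
import tempfile


def read_gzipped_fasta(path):
    """Yield (sequence_id, sequence) pairs from a gzip-compressed FASTA file."""
    seq_id, chunks = None, []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if seq_id is not None:
                    yield seq_id, "".join(chunks)
                # Keep only the ID, dropping any description after a space.
                seq_id, chunks = line[1:].split()[0], []
            elif line:
                chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


# Demo on a temporary file (a stand-in, not actual MID Adjust output).
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.fa.gz")
    with gzip.open(demo, "wt") as out:
        out.write(">Sample1_1\nCTACTacgg\ngcgg\n>Sample2_1\nCTCGCacgg\n")
    records = list(read_gzipped_fasta(demo))

print(records)
```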
INSTALLATION
Dependencies
You need to install these dependencies first:
* Perl
<http://www.perl.com/download.csp>
* make
Many systems have make installed by default. If your system does
not, you should install the implementation of make of your choice,
e.g. GNU make: <http://www.gnu.org/s/make/>
The following CPAN Perl modules are dependencies that will be installed
automatically for you:
* Algorithm::Combinatorics
* Bioperl (>= 1.6.902)
* Getopt::Euclid (>= 0.3.4)
* Method::Signatures
* PerlIO::eol
* PerlIO::via::gzip
Procedure
To install Pyrotagger MID Adjust globally on your system, run the
following commands in a terminal or command prompt:
On Linux, Unix, MacOS:
perl Makefile.PL
make
And finally, with administrator privileges:
make install
On Windows, run the same commands but with nmake instead of make.
No administrator privileges?
If you do not have administrator privileges, Pyrotagger MID Adjust needs
to be installed in your home directory.
First, follow the instructions to install local::lib at
<http://search.cpan.org/~apeiron/local-lib-1.008004/lib/local/lib.pm#The_bootstrapping_technique>.
After local::lib is installed, every Perl module that you install
manually or through the CPAN command-line application will be
installed in your home directory.
Then, install Pyrotagger MID Adjust by following the instructions
detailed in the "Procedure" section.
AUTHOR
Florent Angly <florent.angly@gmail.com>
BUGS
There are undoubtedly bugs lurking somewhere in this code. Bug reports
and other feedback are most welcome.
COPYRIGHT
Copyright 2011-2012, Florent Angly
This program is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation, either version 3 of the License, or (at your
option) any later version.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
Public License for more details.
You should have received a copy of the GNU General Public License along
with this program. If not, see <http://www.gnu.org/licenses/>.
SEE ALSO
mid_adjust_rename_by_sample
A script to rename sequences according to sample name.
mid_adjust_fake_qual
A script to generate fake quality scores for sequences without any.