It runs a barcode pipeline to assemble Sanger (.AB1), FASTQ or FASTA files.
If you use PIPEBAR or OverlapPER, please cite:
Oliveira, R. R. M., Nunes, G. L., de Lima, T. G. L., Oliveira, G., & Alves, R. (2018). PIPEBAR and OverlapPER: tools for a fast and accurate DNA barcoding analysis and paired-end assembly. BMC Bioinformatics, 19(1). doi:10.1186/s12859-018-2307-y
To make PIPEBAR easier to use, we created a Docker image that lets you run PIPEBAR without
installing its dependencies. All required tools are already bundled in the image, ready to be
used alongside PIPEBAR.
sudo apt-get install docker.io
sudo apt-get install wget
sudo docker --version
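Before continuing, it can help to confirm that both tools are actually on the PATH. The snippet
below is only a minimal sketch of such a check; the helper name `require_cmd` is hypothetical
and is not part of PIPEBAR.

```shell
#!/bin/sh
# Hypothetical helper (not part of PIPEBAR): report whether a command is on the PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Check the two tools installed above. Missing tools are reported rather than
# aborting, so the snippet is safe to paste into an interactive shell.
for tool in docker wget; do
    require_cmd "$tool" || echo "install $tool before continuing" >&2
done
```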
In this step, you will download the script, available on SourceForge, that automates the
Pipebar pipeline. To download the script, enter:
wget https://sourceforge.net/projects/pipebar/files/pipebarScript.sh
After downloading the script, you can run the pipeline. With superuser permission, type:
sudo sh pipebarScript.sh path/to/forward/reads path/to/reverse/reads
You need to pass two parameters: the path to the forward reads and the path to the reverse
reads. Once you enter the above command, you will see output about the creation of the Pipebar
container.
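Since the script takes two paths, a quick check that both exist can catch typos before the
container is created. The following is a sketch under assumptions: `check_reads_path` is a
hypothetical helper, the paths are the same placeholders used in the command above, and the
final command is only printed, not executed.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of pipebarScript.sh): make sure a
# reads path exists before handing it to the script.
check_reads_path() {
    if [ -e "$1" ]; then
        return 0
    fi
    echo "reads path not found: $1" >&2
    return 1
}

# Placeholder paths, as in the command above; replace them with your own.
FORWARD="path/to/forward/reads"
REVERSE="path/to/reverse/reads"

if check_reads_path "$FORWARD" && check_reads_path "$REVERSE"; then
    # Print the command instead of running it, so nothing starts by accident.
    echo "sudo sh pipebarScript.sh $FORWARD $REVERSE"
else
    echo "fix the paths above, then rerun" >&2
fi
```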
At this point you can run the pipeline as follows.
./pipebar --format <"ab1", "fastq" or "fastaqual"> --sep <separator_of_forward/reverse_reads> --mo <min_overlap> --ms <min_similarity> --phred <phred_offset> -q <phred_threshold> --coding <"1" for coding sequences or "0" for non-coding sequences> --gcode <"1" for Standard Code, "2" for Vertebrate Mitochondrial Code, "5" for Invertebrate Mitochondrial Code or "11" for Bacterial, Archaeal and Plant plastid code> --rep <"full" or "fast" report>
ex.: ./pipebar --format ab1 --sep _ --mo 25 --ms 0.9 --phred 33 -q 20 --coding 1 --gcode 1 --rep fast
Options:
-h|--help
Show this output.
-V|--version
Show version information.
--format <string>
Input format. Can be "ab1", "fastq" or "fastaqual".
--sep <string>
The IDs of both the forward and reverse reads must contain a separator.
Ex: 001read_forward and 001read_reverse have "_" (default) as
separator.
--mo <integer>
Length of the minimum overlap between the paired reads (default is 25).
--ms <float>
Minimum accepted similarity, given as a fraction, in the overlap region of
two paired reads (default is 0.9).
--phred <integer>
The offset of the PHRED quality codes used.
Can be 33 or 64 (default is 33).
-q <integer>
The minimum quality value for trimming and
filtering steps (default is 20).
--coding <integer>
Indicates whether the barcode sequences to be analyzed are from
coding (e.g. rbcL, matK) or non-coding (e.g. ITS, atpF-trnH) regions.
Use "1" for coding or "0" for non-coding sequences (default is 1).
--gcode <integer>
The genetic code to be used when translating the nucleotide
sequences into protein, for coding regions. It can be
"1" for Standard Code, "2" for Vertebrate Mitochondrial Code,
"5" for Invertebrate Mitochondrial Code or "11" for Bacterial, Archaeal
and Plant plastid code.
--rep <string>
A full report generates a graphical quality report for each
barcode sequence analyzed, while a fast report generates an overview
of the analyzed barcodes in a single report (default is "fast").
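Several of the options above accept only a fixed set of values, so checking them before calling
pipebar can save a failed run. The sketch below is an illustration only: `build_pipebar_cmd` is
a hypothetical wrapper, not part of PIPEBAR. It validates four of the options against the
choices listed above, fills in the documented defaults for the rest, and prints the command
line instead of executing it.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of PIPEBAR): validate option values against
# the choices listed above and print the resulting command line.
# Arguments: format phred_offset genetic_code report_type
build_pipebar_cmd() {
    format=$1
    phred=$2
    gcode=$3
    rep=$4

    case $format in
        ab1|fastq|fastaqual) ;;
        *) echo "bad --format: $format" >&2; return 1 ;;
    esac
    case $phred in
        33|64) ;;
        *) echo "bad --phred: $phred" >&2; return 1 ;;
    esac
    case $gcode in
        1|2|5|11) ;;
        *) echo "bad --gcode: $gcode" >&2; return 1 ;;
    esac
    case $rep in
        full|fast) ;;
        *) echo "bad --rep: $rep" >&2; return 1 ;;
    esac

    # --sep, --mo, --ms, -q and --coding use the defaults documented above.
    echo "./pipebar --format $format --sep _ --mo 25 --ms 0.9 --phred $phred -q 20 --coding 1 --gcode $gcode --rep $rep"
}

# Reproduces the example command shown earlier.
build_pipebar_cmd ab1 33 1 fast
```

An unsupported value, such as a PHRED offset of 50, makes the function print a message to
standard error and return a non-zero status without printing any command.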
When the pipeline finishes, exit the Pipebar environment by entering:
exit
The Pipebar script saves the results in the ResultPipebar folder, located in the directory
from which the script was called. The resulting files are:
You will need to download the following packages and install them:
At this point you can run the pipeline with the same ./pipebar command and options described
above for the Docker image.
The resulting files are: