PBSuite Documentation
Software for Long-Read Sequencing Data from PacBio
Brought to you by:
acenglishbcm
PBJelly is a highly automated pipeline that aligns long sequencing reads (such as PacBio RS reads or long 454 reads in fasta format) to high-confidence draft assemblies. PBJelly fills or reduces as many captured gaps as possible to produce upgraded draft genomes. Each step in PBJelly’s workflow can be run on a cluster, thus parallelizing the gap-filling process for rapid turnaround, even for very large eukaryotic genomes.
Jelly Documentation -- v13.10
== CONTENTS ==
I. Using This README
II. Requirements
III. Installation
IV. Quick Start
V. Running Jelly
VI. Extras
== I. Using This README ==
Toy Data:
Provided with this distribution of Jelly is a toy example
inside the docs/jellyExample directory. Use it once you've
set up Jelly to test that everything is working as expected
and to become familiar with the software.
Commands:
All commands are presented in the format
> commandToExecute
where commandToExecute would be the actual command.
== II. Requirements ==
Blasr (https://github.com/PacificBiosciences/blasr)
Version 1.3.1.127046 is fully vetted as compatible with
Jelly. Other versions may run into problems. Use
> blasr -version
to figure out what you have. Blasr must be in your environment
path.
Python 2.7
Python must be in your environment path and executable with
the commands:
> python
> /usr/bin/env python
Networkx v1.1
Versions past v1.1 have been shown to have many issues. This will
be updated in the future. To check your version, type the following
in an interactive Python session:
> import networkx
> networkx.version
If you get an error saying the attribute isn't found, you don't have
version 1.1.
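The checks above can be sketched as a small script. This is only a
rough illustration, not part of Jelly: the helper names find_on_path
and check_requirements are hypothetical, the networkx test simply
mirrors the attribute check described above, and the blasr version
is not inspected (use "blasr -version" for that).

```python
import os


def find_on_path(program, path=None):
    """Return the full path of an executable named `program`, or None.

    `path` defaults to the PATH environment variable.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def check_requirements(path=None):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for program in ("blasr", "python"):
        if find_on_path(program, path) is None:
            problems.append("%s not found on PATH" % program)
    try:
        import networkx
    except ImportError:
        problems.append("networkx is not importable")
    else:
        # Mirrors the README's interactive check: a missing 'version'
        # attribute is taken to mean this is not networkx v1.1.
        if not hasattr(networkx, "version"):
            problems.append("networkx has no 'version' attribute")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

Run it with the same python that will run Jelly, so the networkx
import is tested against the right installation.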
== III. Installation ==
1) Edit setup.sh and change $SWEETPATH to the full directory where
you've placed the package.
2) To automatically place the package into your environment, add
> source <path to>/setup.sh
to your .bash_profile.
Be sure to source your .bash_profile (or just setup.sh) before
using Jelly.
== IV. Quick Start ==
For more details on each step in the pipeline, see Section V
below. If, however, you'd like to just run the program do the
following.
1) Create your Protocol.xml
To run the lambdaExample dataset provided, edit the paths in
the <reference> and <outputdir> tags and the baseDir attribute
of the inputs tag so that they point to the full path of the
directory in which lambdaExample is sitting.
See Section V.1 for details about creating Protocol.xml.
2) Run each stage
Execute the stages sequentially. One stage must finish executing
before you continue to the next. To run a stage, use the command
> Jelly.py <stage> yourProtocol.xml
3) Passing Parameters through Jelly.py
If you would like to pass a parameter to the stage you are running, use
"-x". For example, when running the support stage, if you only wanted
Jelly to attempt to fill captured gaps (i.e. no inter-scaffold gaps), and
you wanted to require that a read have a mapping QV of at least 250 to
support a gap, you'd use the command:
> Jelly.py support Protocol.xml -x "--capturedOnly --minMapqv=250"
All parameters you add need to be enclosed in double quotes after the -x.
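Steps 2 and 3 above can be sketched as a small driver script. Treat
this as a sketch under assumptions, not as part of Jelly: the helper
names are hypothetical, the stage names are inferred from the Section V
headings (only "support" appears verbatim in a command in this README),
and it assumes Jelly.py is on your path and that a stage is finished
when the Jelly.py command returns, which may not hold if the stage
submits jobs to a cluster.

```python
import subprocess

# Assumed stage names, inferred from the Section V headings.
STAGES = ["setup", "mapping", "support", "extraction", "assembly", "output"]


def build_jelly_command(stage, protocol, extra_params=None):
    """Build the argument list for one Jelly.py stage."""
    command = ["Jelly.py", stage, protocol]
    if extra_params:
        # All extra parameters travel as ONE argument after -x, which is
        # what the double quotes achieve on the command line.
        command += ["-x", " ".join(extra_params)]
    return command


def run_stages(protocol, stages=STAGES, extras=None, runner=subprocess.check_call):
    """Run the stages in order; check_call raises if a command fails,
    which stops the loop before any later stage starts."""
    extras = extras or {}
    for stage in stages:
        runner(build_jelly_command(stage, protocol, extras.get(stage)))
```

For example, build_jelly_command("support", "Protocol.xml",
["--capturedOnly", "--minMapqv=250"]) yields the same arguments as the
support command shown above.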
== V. Running Jelly ==
= Pre-Processing =
= Create Your Protocol =
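As described in the Quick Start, running the example means pointing
the reference, the output directory and the baseDir attribute at your
own paths. A minimal sketch of doing that edit programmatically is
below. It is not part of Jelly: the function name is hypothetical, the
sample fragment is made up for illustration and is not a complete
Protocol.xml, and tag names are matched case-insensitively because
only "reference", "outputdir" and "baseDir" are named in this README.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for illustration only; a real Protocol.xml
# may contain other elements and a different root.
SAMPLE = (
    '<protocol>'
    '<reference>old/ref.fasta</reference>'
    '<outputDir>old/out</outputDir>'
    '<input baseDir="old/reads"></input>'
    '</protocol>'
)


def set_protocol_paths(xml_text, reference, output_dir, base_dir):
    """Return xml_text with the three path settings replaced."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        name = element.tag.lower()
        if name == "reference":
            element.text = reference
        elif name == "outputdir":
            element.text = output_dir
        # Any element carrying a baseDir attribute gets the new value.
        if "baseDir" in element.attrib:
            element.set("baseDir", base_dir)
    return ET.tostring(root).decode("ascii")


if __name__ == "__main__":
    print(set_protocol_paths(SAMPLE, "/data/ref.fasta", "/data/out", "/data/reads"))
```

Use full paths for all three values, as the Quick Start asks.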
= Setup your files =
= Mapping your data =
= Support The Gaps =
= Extract Useful Reads =
= Assemble The Gaps =
= Output Your Results =
== VI. Extras ==
blasrToBed.py
This script will convert blasr's .m4 or .m5 format into a
BED Format file ( http://genome.ucsc.edu/FAQ/FAQformat.html#format1 )
If you would like to visualize the alignments, I
recommend using IGB ( http://bioviz.org/igb/index.html ).
bedToCoverageWig.py
This script will turn a BED file with alignments into a depth-of-coverage
WIG Format file ( http://genome.ucsc.edu/FAQ/FAQformat.html#format6 ).
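To make the BED-to-coverage idea concrete, here is a rough sketch of
computing depth of coverage from BED intervals and writing it as
variableStep WIG lines. It is not the code of bedToCoverageWig.py; the
function names are hypothetical, and the per-base loop is chosen for
clarity rather than speed. BED starts are 0-based with an exclusive
end, while WIG positions are 1-based, hence the +1 on output.

```python
from collections import defaultdict


def bed_to_depth(bed_lines):
    """Return {chrom: {0-based position: number of intervals covering it}}."""
    depth = defaultdict(lambda: defaultdict(int))
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and BED header/comment lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        for position in range(start, end):
            depth[chrom][position] += 1
    return depth


def depth_to_wig(depth):
    """Return variableStep WIG lines (1-based positions) for the depth map."""
    lines = []
    for chrom in sorted(depth):
        lines.append("variableStep chrom=%s" % chrom)
        for position in sorted(depth[chrom]):
            lines.append("%d %d" % (position + 1, depth[chrom][position]))
    return lines
```

Two alignments covering chr1 0-3 and chr1 2-5 overlap at the third
base only, so that position gets depth 2 and the others depth 1.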
== VII. FAQ ==
Please report your issues to the SourceForge ticketing system.
Last edit: Adam English 2013-10-30