PBSuite Documentation
Software for Long-Read Sequencing Data from PacBio
Brought to you by:
acenglishbcm
PBJelly is a highly automated pipeline that aligns long sequencing reads (such as PacBio RS reads or long 454 reads in fasta format) to high-confidence draft assemblies. PBJelly fills or reduces as many captured gaps as possible to produce upgraded draft genomes. Each step in PBJelly’s workflow can be run on a cluster, thus parallelizing the gap-filling process for rapid turnaround, even for very large eukaryotic genomes.
Jelly Documentation -- v13.10
== CONTENTS ==
I. Using This README
II. Requirements
III. Installation
IV. Quick Start
V. Running Jelly
VI. Extras
== I. Using This README ==
Toy Data:
Provided with this distribution of Jelly is a toy example
inside the docs/jellyExample directory. Use this once you've
set up Jelly to test that everything is working as expected
and to become familiar with the software.
Commands:
All commands are presented in the format
> commandToExecute
where commandToExecute would be the actual command.
== II. Requirements ==
Blasr (https://github.com/PacificBiosciences/blasr)
Version 1.3.1.127046 is fully vetted as compatible with
Jelly. Other versions may run into problems. Use
> blasr -version
to figure out what you have. Blasr must be in your environment
path.
Python 2.7
Python must be in your environment path and executable with
the commands:
> python
> /usr/bin/env python
Networkx v1.1
Versions past v1.1 have been shown to have many issues. This will
be updated in the future. To check your version, type the following
in a Python interactive terminal:
> import networkx
> networkx.version
If you get an error saying the attribute isn't found, you don't have
version 1.1.
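The checks above can be gathered into one small script. This is only a
sketch, not part of Jelly: the function names find_on_path and
check_requirements are made up for illustration, and the networkx test
only applies the rule stated above (a missing "version" attribute means
you do not have v1.1). It does not verify the Blasr version number.

```python
import os
import sys


def find_on_path(program, path=None):
    """Return the full path of an executable named `program`, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def check_requirements():
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    # Blasr must be in your environment path.
    if find_on_path("blasr") is None:
        problems.append("blasr was not found in your environment path")
    # Jelly expects Python 2.7.
    if sys.version_info[:2] != (2, 7):
        problems.append("Python %d.%d found, 2.7 expected" % sys.version_info[:2])
    # Per the README: no 'version' attribute means this is not networkx v1.1.
    # Having the attribute does not by itself prove the version is 1.1.
    try:
        import networkx
    except ImportError:
        problems.append("networkx is not installed")
    else:
        if not hasattr(networkx, "version"):
            problems.append("networkx has no 'version' attribute, so it is not v1.1")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

An empty result only means none of these basic checks failed. You should
still confirm the Blasr version by hand with the command shown above.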
== III. Installation ==
1) Edit setup.sh and change $SWEETPATH to the full directory where
you've placed the package.
2) To automatically place the package into your environment, add
> source <full path to the package>/setup.sh
to your .bash_profile.
Be sure to source your .bash_profile (or just setup.sh) before
using Jelly.
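A quick way to confirm that setup.sh has been sourced is sketched below.
It assumes that setup.sh exports SWEETPATH into the environment, which
this README does not state explicitly. The function name check_sweetpath
is hypothetical and not part of Jelly.

```python
import os


def check_sweetpath(environ=None):
    """Return None if SWEETPATH looks usable, otherwise a short message."""
    if environ is None:
        environ = os.environ
    # Assumption: setup.sh exports SWEETPATH when it is sourced.
    sweetpath = environ.get("SWEETPATH")
    if not sweetpath:
        return "SWEETPATH is not set; source setup.sh first"
    if not os.path.isdir(sweetpath):
        return "SWEETPATH does not point to a directory: %s" % sweetpath
    return None


if __name__ == "__main__":
    message = check_sweetpath()
    print(message if message else "SWEETPATH looks fine")
```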
== IV. Quick Start ==
For more details on each step in the pipeline, see Section V
below. If, however, you'd like to just run the program do the
following.
1) Create your Protocol.xml
To run the lambdaExample dataset provided, edit the paths in
the <reference> and <outputDir> tags and the baseDir attribute
of the inputs tag so that they give the full path of the
directory in which lambdaExample is sitting.
See Section V.1 for details about creating Protocol.xml
2) Run each stage
Sequentially execute each stage. One stage must finish executing
before continuing to the next. To run a stage, use the command
> Jelly.py <stage> yourProtocol.xml
3) Passing Parameters through Jelly.py
If you would like to pass a parameter to the stage you are running, use
"-x". For example, when running the support stage, if you only wanted
Jelly to attempt to fill captured gaps (i.e. no inter-scaffold gaps), and
you wanted to require that a read have a mapping QV of at least 250 to
support a gap, you'd use the command:
> Jelly.py support Protocol.xml -x "--capturedOnly --minMapqv=250"
All parameters you add need to be enclosed in double quotes after the -x.
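Steps 2 and 3 can be sketched as a small driver script. This is only an
illustration under stated assumptions: apart from "support", the stage
names in STAGES are inferred from the Section V headings and are not
confirmed in this README, and the sketch assumes that Jelly.py does not
return until its stage has finished (which may not hold when stages are
submitted to a cluster). The names build_command and run_pipeline are
hypothetical.

```python
import subprocess

# Assumed stage names, inferred from the Section V headings.
# Only "support" is named explicitly in this README.
STAGES = ["setup", "mapping", "support", "extraction", "assembly", "output"]


def build_command(stage, protocol, extra_params=None):
    """Build the argument list for `Jelly.py <stage> yourProtocol.xml`."""
    command = ["Jelly.py", stage, protocol]
    if extra_params:
        # Passed as ONE argument, the same effect as the double quotes
        # after -x on the command line.
        command += ["-x", extra_params]
    return command


def run_pipeline(protocol, stages=STAGES, extra=None, runner=subprocess.check_call):
    """Run each stage in order, stopping at the first stage that fails."""
    extra = extra or {}
    for stage in stages:
        # check_call raises CalledProcessError on a non-zero exit status,
        # so a later stage never starts after a failed one.
        runner(build_command(stage, protocol, extra.get(stage)))


if __name__ == "__main__":
    run_pipeline("Protocol.xml",
                 extra={"support": "--capturedOnly --minMapqv=250"})
```

Because the command is built as a list, no shell quoting is needed: the
whole parameter string reaches Jelly.py as a single argument after -x.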
== V. Running Jelly ==
= Pre-Processing =
= Create Your Protocol =
= Setup your files =
= Mapping your data =
= Support The Gaps =
= Extract Useful Reads =
= Assemble The Gaps =
= Output Your Results =
== VI. Extras ==
blasrToBed.py
This script will convert blasr's .m4 or .m5 format into a
BED Format file ( http://genome.ucsc.edu/FAQ/FAQformat.html#format1 )
If you would like to visualize the alignments, I
recommend using IGB ( http://bioviz.org/igb/index.html ).
bedToCoverageWig.py
Turn a bed file with alignments into a depth of coverage
WIG Format file ( http://genome.ucsc.edu/FAQ/FAQformat.html#format6 ).
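To show what "depth of coverage" means here, the sketch below counts how
many BED intervals cover each base and prints the result as variableStep
WIG lines. It is not the implementation used by bedToCoverageWig.py; the
function name bed_to_wig_lines is hypothetical. It relies only on the
UCSC format definitions linked above: BED starts are 0-based with an
exclusive end, and WIG positions are 1-based.

```python
from collections import defaultdict


def bed_to_wig_lines(bed_lines):
    """Return variableStep WIG lines giving per-base depth of coverage."""
    # deltas[chrom][pos] is the change in depth at 0-based position pos.
    deltas = defaultdict(lambda: defaultdict(int))
    for line in bed_lines:
        fields = line.split()
        # Skip blank lines, comments, and track/browser header lines.
        if len(fields) < 3 or fields[0] in ("track", "browser") or fields[0].startswith("#"):
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        deltas[chrom][start] += 1   # an alignment begins covering here
        deltas[chrom][end] -= 1     # and stops covering at its exclusive end
    out = []
    for chrom in sorted(deltas):
        out.append("variableStep chrom=%s" % chrom)
        depth = 0
        points = sorted(deltas[chrom])
        for here, nxt in zip(points, points[1:]):
            depth += deltas[chrom][here]
            if depth > 0:
                for pos in range(here, nxt):
                    # Convert the 0-based BED position to a 1-based WIG position.
                    out.append("%d %d" % (pos + 1, depth))
    return out


if __name__ == "__main__":
    example = ["chr1\t0\t3", "chr1\t2\t5"]
    print("\n".join(bed_to_wig_lines(example)))
```

Writing one line per covered base keeps the sketch short but produces
large files for big genomes.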
== VII. FAQ ==
Please report your issues to the sourceforge ticketing system.
Last edit: Adam English 2013-10-30