Grid Deconvolution README

# ------------------------------------------------------------------------------
# Overview
# ------------------------------------------------------------------------------

This is a software library to run the JCVI barcode deconvolution pipeline using 
Sun Grid Engine, or optionally without the use of a grid.

This software deconvolves a FASTA, FASTQ, or SFF file by barcode sequence, 
using fuzznuc to find the best hit of each read to the barcode sequences. The 
results are a report of trim points and a set of per-barcode FASTA files, in 
which each entry is a read sequence whose unambiguous best hit is to that 
barcode.  The barcode sequences are trimmed from the reads by default, unless 
otherwise specified.
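
The "unambiguous best hit" rule above can be illustrated with a minimal Python 
sketch. It is not taken from the pipeline: the function name, the input layout, 
and the tie-breaking rule (lowest mismatch count wins, ties are discarded) are 
assumptions made for illustration only.

```python
def assign_reads(hits):
    """Assign each read to the barcode that is its single best hit.

    hits maps a read id to a list of (barcode_id, mismatches) tuples.
    ASSUMPTION: "best" means fewest mismatches, and a read whose best
    score is shared by two or more barcodes is left unassigned.
    """
    assigned = {}
    for read_id, read_hits in hits.items():
        if not read_hits:
            continue  # no barcode hit at all
        best = min(m for _, m in read_hits)
        best_barcodes = {b for b, m in read_hits if m == best}
        if len(best_barcodes) == 1:
            assigned[read_id] = best_barcodes.pop()
    return assigned


example_hits = {
    "read1": [("BC01", 0), ("BC02", 1)],  # BC01 is the clear best hit
    "read2": [("BC01", 1), ("BC02", 1)],  # tie, so ambiguous
    "read3": [],                          # no hits
}
print(assign_reads(example_hits))  # {'read1': 'BC01'}
```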

This software is open-source and available free of charge subject to the GNU 
General Public License, version 2.

# ------------------------------------------------------------------------------
# Getting Started
# ------------------------------------------------------------------------------

# 1. Check out the Grid and FileIO libraries from the deconvolver svn repository
     svn co https://deconvolver.svn.sourceforge.net/svnroot/deconvolver deconvolver

# 2. Run the test script and verify that the tests complete successfully
     deconvolver/Grid/trunk/t/grid_deconvolve.t
     
# 3. Read the help menu for the deconvolution command-line API
     Grid/trunk/bin/grid-deconvolve.pl --help

# ------------------------------------------------------------------------------
# Standard mode for custom DNA barcode protocols
# ------------------------------------------------------------------------------

In standard mode, grid deconvolution searches for the barcode pattern on both 
strands, across the entire length of each input sequence, allowing an optional 
user-specified number of mismatches to the pattern.
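
As a rough illustration of this search, the sketch below scans a read for a 
barcode on both strands and reports every window within the mismatch limit. It 
is a plain-Python stand-in, not fuzznuc and not the pipeline's code. The 
function names are hypothetical, and it counts substitutions only.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_barcode(read, barcode, max_mismatches=0):
    """Return (start, strand, mismatches) for each window within the limit.

    "+" hits match the barcode as given; "-" hits match its reverse
    complement. Start positions are 0-based offsets into the read.
    """
    read = read.upper()
    barcode = barcode.upper()
    hits = []
    for strand, pattern in (("+", barcode), ("-", reverse_complement(barcode))):
        for start in range(len(read) - len(pattern) + 1):
            n = mismatches(read[start:start + len(pattern)], pattern)
            if n <= max_mismatches:
                hits.append((start, strand, n))
    return hits


print(find_barcode("TTACGTACGGA", "ACGTACG"))
# [(2, '+', 0)]
print(find_barcode("TTACGTACGGA", "ACGTACG", max_mismatches=1))
# [(2, '+', 0), (3, '-', 1)]
```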

# ------------------------------------------------------------------------------
# sfffile_mode
# ------------------------------------------------------------------------------

Grid/bin/grid-deconvolve.pl --sfffile_mode <OPTIONS>

The release of the deconvolution code includes an "sfffile_mode" option that 
enables the deconvolution process to work similarly to the sfffile deconvolver 
for any input dataset (.sff, .fasta, .fastq).  This option strictly searches for 
the key sequence adjacent to the barcode in the 5'-to-3' direction. 

This option was added to provide a single software solution that handles both 
standard Roche MID barcoded data and custom barcode design protocols.  

Running in sfffile_mode does the following:

1)   Automatically sets and/or validates the key sequence.
     a.   For .sff files, it sets the key sequence using sffinfo, or validates 
          a user-provided key sequence.
     b.   For non-sff files, it requires that the user provide the key sequence.

2)   Searches for the barcode pattern only on the positive strand of the input 
     sequence files.
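
The key-handling rules above can be sketched in Python as follows. This is 
illustrative only and is not the logic in grid-deconvolve.pl. The function and 
parameter names are hypothetical, the SFF header key is passed in rather than 
read with sffinfo, raising an error on a key conflict is an assumption, and the 
sketch assumes the key immediately precedes the barcode on the read.

```python
import os


def resolve_key(path, user_key=None, sff_header_key=None):
    """Pick the key sequence to use for one input file.

    sff_header_key stands in for the key that would be read from an SFF
    header; obtaining it is outside the scope of this sketch.
    """
    is_sff = os.path.splitext(path)[1].lower() == ".sff"
    if is_sff:
        if sff_header_key is None:
            raise ValueError("no key sequence available from the SFF header")
        # ASSUMPTION: a user-provided key that disagrees with the header
        # is treated as an error.
        if user_key is not None and user_key.upper() != sff_header_key.upper():
            raise ValueError("user-provided key does not match the SFF header key")
        return sff_header_key.upper()
    if user_key is None:
        raise ValueError("a key sequence is required for non-SFF input")
    return user_key.upper()


def sfffile_mode_pattern(key, barcode):
    """Build the forward-strand pattern: key followed directly by barcode."""
    return key.upper() + barcode.upper()


# Example values only; TCAG and the barcode below are placeholders.
key = resolve_key("reads.fasta", user_key="tcag")
print(sfffile_mode_pattern(key, "ACGAGTGCGT"))  # TCAGACGAGTGCGT
```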

Note that the results of running sfffile and of running grid-deconvolve.pl in 
sfffile_mode may differ slightly.  The key differences are that sfffile 
tolerates gaps and does not count them against the allowable mismatch 
threshold, and that sfffile looks for a sequence alignment only at the very 
beginning of the read sequence.  grid-deconvolve.pl uses fuzznuc from the 
EMBOSS tools to search the entire read for an allowable match. 
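
The positional difference can be seen in a small Python sketch. Neither 
function reproduces sfffile or fuzznuc; both are hypothetical helpers that only 
contrast a match anchored at the start of the read with a search over the 
whole read. Gap handling is not modeled.

```python
def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def match_at_start(read, pattern, max_mismatches=0):
    """Accept the read only if the pattern matches at position 0."""
    window = read[:len(pattern)]
    return len(window) == len(pattern) and mismatches(window, pattern) <= max_mismatches


def match_anywhere(read, pattern, max_mismatches=0):
    """Accept the read if the pattern matches at any position."""
    return any(
        mismatches(read[i:i + len(pattern)], pattern) <= max_mismatches
        for i in range(len(read) - len(pattern) + 1)
    )


# Key plus barcode preceded by two extra bases (example values only).
read = "GGTCAGACGAGTGCGTAAAA"
pattern = "TCAGACGAGTGCGT"
print(match_at_start(read, pattern))   # False
print(match_anywhere(read, pattern))   # True
```

A read like this one would be picked up by a whole-read search but missed by a 
start-anchored one, which is one way the two tools can disagree.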

There is a test case included in the repository to validate sfffile_mode:
     Grid/t/grid_deconvolve_sfffile_mode.t


# ------------------------------------------------------------------------------
# Author
# ------------------------------------------------------------------------------

Nelson Axelrod
J. Craig Venter Institute
http://www.jcvi.org
naxelrod@jcvi.org

Source: README.txt, updated 2010-08-11