File                 Date        Author        Commit
ReferenceFreeSource  2012-04-17  John Harting  [c4a85b] initial commit
phylokmer            2012-04-17  John Harting  [c4a85b] initial commit
ABYSS                2012-04-17  John Harting  [c4a85b] initial commit
AssembleGroups.py    2012-04-17  John Harting  [c4a85b] initial commit
AssembleTypeI.py     2012-04-17  John Harting  [c4a85b] initial commit
Makefile             2012-04-17  John Harting  [c4a85b] initial commit
README.txt           2012-04-17  John Harting  [a30473] Added README
ReadsSelector        2012-04-17  John Harting  [c4a85b] initial commit

Read Me

Inside the zip file:
AssembleTypeI.py
AssembleGroups.py
ReferenceFreeSource (directory)
	ReferenceFree package
	ReadsSelector.cpp
phylokmer (directory)
	various, see below

ReferenceFreeSource is the package containing code for both Python applications. (Author: John Harting)
AssembleTypeI.py is an application for creating TypeI contigs. (Author: John Harting)
AssembleGroups.py is an application for creating group contigs. (Author: John Harting)
ReadsSelector.cpp is the C++ source for ReadsSelector (Author: Ye Chengxi)
phylokmer is a directory containing several programs (C, Perl) for creating shared kmer files (Author: Jue Ruan)


Dependencies:
Perl
Python 2.6 or a later 2.x release (NOT Python 3.0+).
Biopython package for Python
ReadsSelector
ABYSS
g++/gcc compilers
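
A quick check that the dependencies above are present can save a failed build later. The following is a minimal sketch and is not part of the distribution: the helper names (python_is_supported, find_on_path, missing_tools) are hypothetical, and the executable names it looks for (perl, g++, gcc) are assumptions about how these tools are named on your system. It is written to run under Python 2.6+ as well as Python 3, so it can report an unsupported interpreter instead of crashing on one.

```python
import os
import sys


def python_is_supported(version_info):
    """Return True for Python 2.6 or a later 2.x release, False otherwise."""
    major, minor = version_info[0], version_info[1]
    return major == 2 and minor >= 6


def find_on_path(name, path=None):
    """Return the full path of an executable found on PATH, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(names, path=None):
    """Return the subset of names that cannot be found on PATH."""
    return [name for name in names if find_on_path(name, path) is None]


if __name__ == "__main__":
    if not python_is_supported(sys.version_info):
        print("Warning: Python 2.6+ (2.x) is required, found %d.%d"
              % sys.version_info[:2])
    try:
        import Bio  # top-level package provided by Biopython
    except ImportError:
        print("Warning: Biopython does not appear to be installed")
    # Executable names below are assumptions; adjust for your system.
    for tool in missing_tools(["perl", "g++", "gcc"]):
        print("Warning: %s not found on PATH" % tool)
```

ReadsSelector and ABYSS are not checked here because they are built or placed alongside the Assemble*.py apps during installation rather than found on PATH.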

Installation:
1.  Unpack into some directory and cd into that directory

3. Compile ReadsSelector and make Assemble*.py files executable by typing 'make' at the command line.

make

4. Compile phylokmer programs (used to build shared kmer files).

cd phylokmer
make 

5. Compile ABYSS into the same directory as the Assemble*.py apps.  http://www.bcgsc.ca/platform/bioinfo/software/abyss
   (In the future we will add the ability to use other assemblers, but for now ABYSS is the only supported assembler.)

(You can also compile ABYSS elsewhere and put a symbolic link pointing to it in the ReferenceFree directory with the Python apps)

6. Install the ReferenceFree package into python.

cd ReferenceFreeSource
sudo python setup.py install

*If you do not wish to install the package into the system-wide Python site-packages directory using sudo, you can install it under another directory, provided that directory's site-packages location is on your PYTHONPATH:

python setup.py install --prefix=/yourdir

**If you do not have admin access and/or want to set up a virtual python interpreter with local package directory (e.g. on a university computing grid), you can do something like the following which will set up in /mydirectory (see http://pypi.python.org/pypi/virtualenv for more info):

wget https://raw.github.com/pypa/virtualenv/master/virtualenv.py
python virtualenv.py /mydirectory
source /mydirectory/bin/activate
cd /path/to/ReferenceFreeSource
/mydirectory/bin/python setup.py install

Then, when you execute AssembleTypeI.py and/or AssembleGroups.py, make sure you invoke the virtual python interpreter explicitly:

/mydirectory/bin/python AssembleTypeI.py [args] [options] 
/mydirectory/bin/python AssembleGroups.py [args] [options]

(You can also add /mydirectory/bin to your PATH variable by adding a line in your .profile or .bashrc (or whatever shell initialization file you use) to make this the 'default' python interpreter)  

-------------------------------------------

Help menus for phylokmer programs can be displayed by typing the executable name at the terminal.  The following is an example set of commands to create a shared kmer file for analysis:

perl /path/to/phylokmer/phylokmer.pl -l 21 -n 3 -d /path/to/sequence_data -f FA -j 1 -o /path/to/outputdata/pkdat/somegroupname_l21_n3_j1.shared.pkdat
mkdir /path/to/outputdata/pkdat/somegroupname_21l/
mv /path/to/outputdata/pkdat/*pkdat /path/to/outputdata/pkdat/somegroupname_21l/
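
If you build shared kmer files for several groups, the three commands above can be scripted. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the flags are copied verbatim from the example (check the phylokmer help menu for their meaning), and the file and directory naming simply mirrors the example rather than anything phylokmer requires.

```python
import glob
import os
import shutil
import subprocess


def shared_kmer_commands(phylokmer_dir, data_dir, out_dir, group,
                         l=21, n=3, j=1, fmt="FA"):
    """Build the phylokmer.pl argument list plus the output locations.

    Names follow the example: <group>_l<l>_n<n>_j<j>.shared.pkdat for the
    output file and <group>_<l>l for the directory it is moved into.
    """
    out_file = os.path.join(
        out_dir, "%s_l%d_n%d_j%d.shared.pkdat" % (group, l, n, j))
    group_dir = os.path.join(out_dir, "%s_%dl" % (group, l))
    command = [
        "perl", os.path.join(phylokmer_dir, "phylokmer.pl"),
        "-l", str(l), "-n", str(n),
        "-d", data_dir, "-f", fmt, "-j", str(j),
        "-o", out_file,
    ]
    return command, out_file, group_dir


def build_shared_kmer_file(phylokmer_dir, data_dir, out_dir, group):
    """Run phylokmer.pl, then move the resulting *pkdat files (mkdir + mv)."""
    command, _, group_dir = shared_kmer_commands(
        phylokmer_dir, data_dir, out_dir, group)
    subprocess.check_call(command)  # raises CalledProcessError on failure
    if not os.path.isdir(group_dir):
        os.makedirs(group_dir)
    for path in glob.glob(os.path.join(out_dir, "*pkdat")):
        shutil.move(path, group_dir)
```

Passing the command as a list rather than a single string avoids shell quoting problems if any of the paths contain spaces.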

NOTE:  The /path/to/sequence_data directory contains one directory of fa/fq files for each taxon/individual.  The directories should be labelled 'Genus_species' (or more generally 'identifier1_identifier2'), e.g.:

/path/to/sequence_data/genusA_species1
/path/to/sequence_data/genusA_species2
/path/to/sequence_data/genusB_species1
...
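
The directory layout described above can be checked before running phylokmer. This is a minimal sketch, not part of the package: check_layout is a hypothetical helper, the pattern assumes exactly two identifiers separated by a single underscore, and the .fa/.fq extensions are an assumption about how your sequence files are named.

```python
import os
import re

# 'identifier1_identifier2', e.g. 'genusA_species1'
# (assumption: exactly one underscore separating two non-empty parts)
NAME_PATTERN = re.compile(r"^[^_]+_[^_]+$")
# Assumed file extensions for the fa/fq sequence files
SEQ_EXTENSIONS = (".fa", ".fq")


def check_layout(data_dir):
    """Return a list of human-readable problems found under data_dir."""
    problems = []
    for entry in sorted(os.listdir(data_dir)):
        full = os.path.join(data_dir, entry)
        if not os.path.isdir(full):
            problems.append("%s: not a directory" % entry)
            continue
        if not NAME_PATTERN.match(entry):
            problems.append(
                "%s: name is not of the form identifier1_identifier2" % entry)
        files = [f for f in os.listdir(full)
                 if f.lower().endswith(SEQ_EXTENSIONS)]
        if not files:
            problems.append("%s: no .fa/.fq files found" % entry)
    return problems
```

An empty list means every entry under the data directory is a suitably named directory holding at least one sequence file.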

-------------------------------------------

Help menus for both Python applications can be displayed by the following at the terminal:

./AssembleTypeI.py -h
./AssembleGroups.py -h

There are lots of options, but most of them have 'reasonable' defaults.  Each help menu has a set of 'minimum' arguments to run the steps in the applications.  Assuming you already have your shared kmer file and reads available, a typical run of one of the programs completing all steps would use the command:

./AssembleTypeI.py all -s /path/to/sharedkmerfile -i /path/to/kmer/dir -r /path/to/reads 

Each step can take a little while, depending on the dataset size, so it's also possible to run steps independently (see help).

Data outputs go to the specified or generated directories (see options).  The stdout and stderr output from ReadsSelector and ABYSS is also captured and written to separate text files in the folder containing the code.
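
If you wrap other external tools in your own scripts and want their output kept in text files in a similar way, the usual approach with Python's subprocess module looks roughly like this. It is a sketch only and not the code used by the Assemble*.py applications; run_and_capture and the file names passed to it are hypothetical.

```python
import subprocess


def run_and_capture(command, stdout_path, stderr_path):
    """Run command, writing stdout and stderr to separate files.

    Returns the exit code of the command.
    """
    # Nested 'with' blocks keep this compatible with Python 2.6.
    with open(stdout_path, "wb") as out:
        with open(stderr_path, "wb") as err:
            return subprocess.call(command, stdout=out, stderr=err)
```

A non-zero return value indicates the tool reported a failure, in which case the stderr file is the first place to look.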