Home
Name Modified Size InfoDownloads / Week
README 2011-12-01 2.9 kB
probmask.tar.gz 2011-12-01 112.4 kB
zorro_linux_x86_64 2011-12-01 126.4 kB
zorro_mac 2011-12-01 67.4 kB
Totals: 4 Items   309.2 kB 1
ZORRO : A probabilistic alignment masking program

Authors: Martin Wu, Sourav Chatterji and Jonathan Eisen

1. Prerequisites 

Compiling ZORRO needs the C compiler gcc (preferably version 4.1.2 or later)
Running ZORRO needs perl

2. Installation
You can either download the precompiled executables from http://sourceforge.net/projects/probmask/files/
	zorro_linux_x86_64 for Linux 
	zorro_mac for Mac

or you can download the source code probmask.tar.gz and compile it yourself
To compile, on the command line, type 
	tar -zxvf probmask.tar.gz
	cd probmask/trunk/
	./autogen.sh
	./configure --prefix=$HOME
	make install
	
The binary "zorro" will be produced and installed to $HOME/bin directory. 

Optional : To easily access ZORRO from the command line, add the directory containing ZORRO ($HOME/bin) to your environment variable $PATH. 

Note: If you download the executables, make sure first change the file permission by typing "chmod +x zorro_mac' or 'chmod +x zorro_linux_x86_64'. You also need to have FastTree installed and in your system's executable path.

3. Running ZORRO

ZORRO is run as: "zorro [options] inputfile > outputfile"

Input : Here the input file is multiple sequence alignment in multi fasta format. More details about this format can be found at http://en.wikipedia.org/wiki/FASTA_format

An example input file is provided in: Example/family_15833_5_prank2_2_cDNAid

Output : The output is a plain text file with confidence scores of one column per line, in order from column 1, 2, 3 ... etc. The confidence score is a measure of the reliability of a column according to a pair-HMM model.  It has a value between 0 and 10. 

An example output fle is provided in: Example/family_15833_5_prank2_2_cDNAid.mask

Options: ZORRO has several options which are listed below and can also be accessed by typing "zorro -h".

-sample          : Sampling pairs to calculate alignment reliabilty [Off By Default]
-nosample        : No Sampling i.e. using every pair to calculate alignment reliabilty [On By Default]
-noweighting     : Using sum of pairs instead of weighted sum of pairs to calculate column confidence [Off By Default]
-ignoregaps      : Ignore pair-gaps in columns when calculating column confidences [Off By Default]
-Nsample NUMBER  : Tells ZORRO to sample #NUMBER pairs when sampling, automatically turns on -sample option [Samples 10*Nseq sequences By Default]
-treeprog PROGRAM: Tells ZORRO to use PROGRAM instead of the default FastTree to create guide tree [FastTree By Default]
-guide treefile  : User provided guide tree

4. Usage example

	zorro -sample ./Example/family_15833_5_prank2_2_cDNAid > family_15833_5_prank2_2_cDNAid.mask

5. Contact

Martin Wu (mw4yv@virginia.edu)
Sourav Chatterji (souravc@gmail.com)
Jonathan Eisen (jaeisen@ucdavis.edu)

6. Other credits
ZORRO makes use of FastTree, a GPL program available from:
http://www.microbesonline.org/fasttree/
Source: README, updated 2011-12-01