File Date Author Commit
 .gitignore 2012-05-28 czc czc [dea800] new file: .gitignore
 Makefile 2015-07-28 Chong Chong [d83ead] On branch master
 Makefile.bak 2015-07-28 Chong Chong [d83ead] On branch master
 README.txt 2015-07-28 Chong Chong [d83ead] On branch master
 aln_cigar.h 2012-05-14 czc czc [5f2464] rainbow1.1
 asm_R2.c 2012-07-05 czc czc [110782] modified: asm_R2.c
 asm_R2.h 2012-06-09 czc czc [a66f4e] merge and assemble along tree and hash all leaves
 bitvec.h 2012-05-14 czc czc [5f2464] rainbow1.1
 bloom_filter.h 2012-06-18 czc czc [684d4b] new file: bloom_filter.h
 cluster.c 2015-07-28 Chong Chong [d83ead] On branch master
 divide.c 2012-09-18 czc czc [1d4bf1] make a few changes compatible with Ubuntu
 dna.h 2012-05-14 czc czc [5f2464] rainbow1.1
 ezmsim.c 2012-05-15 czc czc [5b0b3d] added ezmsim module
 file_reader.c 2012-05-14 czc czc [5f2464] rainbow1.1
 file_reader.h 2012-05-14 czc czc [5f2464] rainbow1.1
 hashset.h 2012-06-18 czc czc [684d4b] new file: bloom_filter.h
 heap.h 2012-05-14 czc czc [5f2464] rainbow1.1
 list.h 2012-09-18 czc czc [1d4bf1] make a few changes compatible with Ubuntu
 main.c 2015-07-28 Chong Chong [d83ead] On branch master
 mergecontig.c 2012-05-29 czc czc [594128] prepare ctgs to merge done
 mergecontig.h 2012-05-24 czc czc [4460b0] modified: divide.c
 mergectg.c 2012-09-18 czc czc [1d4bf1] make a few changes compatible with Ubuntu
 mergectg.h 2012-07-05 czc czc [110782] modified: asm_R2.c
 mergetag.c 2012-05-14 czc czc [5f2464] rainbow1.1
 rainbow.h 2012-07-05 czc czc [110782] modified: asm_R2.c
 rbasm_main.c 2012-09-18 czc czc [1d4bf1] make a few changes compatible with Ubuntu
 select_all_rbcontig.pl 2012-09-18 czc czc [6ad525] add select_all_rbcontig.pl to fetch all contigs...
 select_best_rbcontig.pl 2012-05-14 czc czc [5f2464] rainbow1.1
 select_best_rbcontig2.pl unknown
 select_best_rbcontig_plus_read1.pl 2015-07-28 Chong Chong [d83ead] On branch master
 select_sec_rbcontig.pl 2012-07-05 czc czc [d86b80] modified: README.txt
 simp_asm.h 2012-05-14 czc czc [5f2464] rainbow1.1
 sort.h 2012-05-14 czc czc [5f2464] rainbow1.1
 stdaln.c 2012-05-14 czc czc [5f2464] rainbow1.1
 stdaln.h 2012-05-14 czc czc [5f2464] rainbow1.1
 string.h 2012-05-14 czc czc [5f2464] rainbow1.1
 vector.h 2012-05-14 czc czc [5f2464] rainbow1.1

Read Me

Rainbow v2.0.4

Description
===========
The Rainbow package consists of several programs for clustering and
de novo assembly of RAD-seq reads.

Installation
============
Type 'make' to compile the Rainbow package. You can then copy the executables
and scripts to a location of your choice (e.g. a directory in your $PATH), or
add this directory to your PATH environment variable.


Usage of Rainbow package
========================
EXAMPLE: a typical use of Rainbow step by step

	rainbow cluster -1 1.fq  -2 2.fq > rbcluster.out 2> log
	rainbow div -i rbcluster.out -o rbdiv.out
	rainbow merge -o rbasm.out -a -i rbdiv.out -N500
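
The three steps above can also be driven from a script. The following is a
minimal Python sketch, assuming the 'rainbow' executable is on your PATH. The
function names build_pipeline and run_pipeline and the 'prefix' parameter are
hypothetical helpers, not part of Rainbow; only the command-line flags shown
in the example above are used.

```python
import subprocess


def build_pipeline(read1, read2, prefix="rb"):
    """Return (argv, stdout_path) pairs for the cluster, div and merge steps.

    stdout_path is None when the step writes its own output file via '-o'.
    File names follow the README example (rbcluster.out, rbdiv.out, rbasm.out).
    """
    cluster_out = f"{prefix}cluster.out"
    div_out = f"{prefix}div.out"
    asm_out = f"{prefix}asm.out"
    return [
        (["rainbow", "cluster", "-1", read1, "-2", read2], cluster_out),
        (["rainbow", "div", "-i", cluster_out, "-o", div_out], None),
        (["rainbow", "merge", "-o", asm_out, "-a", "-i", div_out, "-N500"], None),
    ]


def run_pipeline(steps, log_path="log"):
    """Run each step in order and stop at the first failure."""
    for argv, stdout_path in steps:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            # Mirrors '> rbcluster.out 2> log' from the shell example.
            with open(stdout_path, "w") as out, open(log_path, "w") as log:
                subprocess.run(argv, stdout=out, stderr=log, check=True)
```

Calling run_pipeline(build_pipeline("1.fq", "2.fq")) would execute the same
three commands as the shell example.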

The output file of 'rainbow merge -a' is based on the final merged
clusters, each of which has been locally assembled by 'rainbow merge -a'. For
each cluster, rainbow outputs one record containing all assembled contigs,
separated by '//':
E clusterID
C contigID1
L length
S sequence
N #reads
R readIDs
//
C contigID2
L length
S sequence
N #reads
R readIDs
.
.
.
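
The record layout above can be read with a few lines of code. This is a
minimal Python sketch, not the parser used by the bundled Perl scripts. It
assumes each line starts with a one-letter tag followed by whitespace and the
value, and that read IDs on the 'R' line are whitespace-separated. The
function name parse_rbasm and the dictionary keys are hypothetical.

```python
def parse_rbasm(lines):
    """Parse 'rainbow merge -a' output into a list of cluster dictionaries."""
    clusters = []
    contig = None
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and the '//' contig separators.
        if not line or line.startswith("//"):
            continue
        parts = line.split(None, 1)
        tag = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if tag == "E":
            # A new cluster record starts here.
            clusters.append({"id": value, "contigs": []})
            contig = None
        elif tag == "C":
            if not clusters:
                raise ValueError("contig line found before any cluster line")
            contig = {"id": value, "length": None, "seq": "",
                      "n_reads": None, "reads": []}
            clusters[-1]["contigs"].append(contig)
        elif contig is not None:
            if tag == "L":
                contig["length"] = int(value)
            elif tag == "S":
                contig["seq"] = value
            elif tag == "N":
                contig["n_reads"] = int(value)
            elif tag == "R":
                contig["reads"] = value.split()
    return clusters
```

For a real file, pass an open file handle, e.g. parse_rbasm(open("rbasm.out")).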

We also provide four simple Perl scripts for extracting assembly
information: select_all_rbcontig.pl, select_best_rbcontig.pl, select_sec_rbcontig.pl, select_best_rbcontig_plus_read1.pl

select_all_rbcontig.pl extracts all the assembled contigs, i.e., all the
records.

select_best_rbcontig.pl extracts the longest contig for each final cluster;
select_sec_rbcontig.pl extracts the longest and the second longest contigs
for each final cluster.
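
The selection rule used by these two scripts can be illustrated as follows.
This is a sketch in Python rather than the Perl code itself; the function name
select_longest and the (contig_id, sequence) tuple layout are hypothetical,
and how the real scripts break ties between contigs of equal length is not
documented here (this sketch keeps the input order).

```python
def select_longest(contigs, n=1):
    """Return the n longest (contig_id, sequence) pairs, longest first.

    n=1 corresponds to picking the best contig only; n=2 to picking the
    longest plus the second longest. sorted() is stable, so contigs of
    equal length keep their original relative order.
    """
    return sorted(contigs, key=lambda c: len(c[1]), reverse=True)[:n]
```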

select_best_rbcontig_plus_read1.pl, like select_best_rbcontig.pl, extracts the longest contig for each cluster, and additionally outputs read1. If read1 overlaps the contig, the two are joined into a single sequence. If it does not, read1 and the contig are joined with a pad of 10 'X' characters, producing one long contig.
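
The joining rule can be sketched as below. This is only an illustration in
Python under several assumptions that the README does not state: read1 is
placed before the contig, both are on the same strand, the overlap must match
exactly, and the minimum overlap is 5 bases. The function name
join_read1_contig and the min_overlap parameter are hypothetical; only the
pad of 10 'X' characters comes from the description above.

```python
PAD = "X" * 10  # pad used when read1 and the contig do not overlap


def join_read1_contig(read1, contig, min_overlap=5):
    """Join read1 and contig, merging on the longest exact overlap.

    Tries the longest suffix of read1 that equals a prefix of contig.
    If no overlap of at least min_overlap bases exists, the two
    sequences are joined with the 'X' pad instead.
    """
    longest = min(len(read1), len(contig))
    for k in range(longest, max(min_overlap, 1) - 1, -1):
        if read1[-k:] == contig[:k]:
            return read1 + contig[k:]
    return read1 + PAD + contig
```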

----------------------------------------------------------------------------------
rainbow 2.0.3 -- <ruanjue@gmail.com, chongzechen@gmail.com>
Usage: rainbow <cmd> [options]

 cluster
  Input  File Format: paired fasta/fastq file(s)
  Output File Format: <seqid:int>\t<cluster_id:int>\t<read1:string>\t<read2:string>
  -1 <string> Input fasta/fastq file, supports multiple '-1'
  -2 <string> Input fasta/fastq file, supports multiple '-2' [null]
  -l <int>    Read length, default: 0 variable
  -m <int>    Maximum mismatches [4]
  -e <int>    Exactly matching threshold [2000]
  -L          Low level of polymorphism
 div
  Input File Format: <seqid:int>\t<cluster_id:int>\t<read1:string>\t<read2:string>
  Output File Format: <seqid:int>\t<cluster_id:int>\t<read1:string>\t<read2:string>[\t<pre_cluster_id:int>]
  -i <string> Input rainbow cluster output file [stdin]
  -o <string> Output file [stdout]
  -k <int>    K_allele, min variants to create a new group [2]
  -K <int>    K_allele, divide regardless of frequency when num of variants exceed this value [50]
  -f <float>  Frequency, min variant frequency to create a new group [0.2]
 merge
  Input File Format: <seqid:int>\t<cluster_id:int>\t<read1:string>\t<read2:string>[\t<pre_cluster_id:int>]
  -i <string> Input rainbow div output file [stdin]
  -a          output assembly 
  -o <string> Output file [stdout]
  -N <int>    Maximum number of divided clusters to merge [300]
  -l <int>    Minimum overlap when assembling two reads (valid only with '-a') [5]
  -f <float>  Minimum fraction of similarity when assembling (valid only with '-a') [0.90]
  -r <int>    Minimum number of reads to assemble (valid only with '-a') [5]
  -R <int>    Maximum number of reads to assemble (valid only with '-a') [300]
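
The tab-separated format shared by 'cluster', 'div' and 'merge' is easy to
load for inspection. The following Python sketch groups reads by cluster ID.
It assumes paired-end data with at least four tab-separated columns per line,
as in the format strings above; the function name read_clusters and the tuple
layout are hypothetical.

```python
from collections import defaultdict


def read_clusters(lines):
    """Group 'rainbow cluster' or 'rainbow div' output lines by cluster ID.

    Returns {cluster_id: [(seqid, read1, read2, pre_cluster_id), ...]}.
    pre_cluster_id is None when the optional fifth column is absent.
    """
    clusters = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            raise ValueError(f"expected at least 4 columns, got {len(fields)}")
        seqid, cluster_id, read1, read2 = fields[:4]
        pre_cluster_id = int(fields[4]) if len(fields) > 4 else None
        clusters[int(cluster_id)].append(
            (int(seqid), read1, read2, pre_cluster_id))
    return dict(clusters)
```

For example, the sizes of all clusters in a file can be listed with
{cid: len(reads) for cid, reads in read_clusters(open("rbdiv.out")).items()}.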

----------------------------------------------------------------------------------
rbasm: a greedy assembler that locally assembles each cluster produced by rainbow or other
tools. It has been integrated into the merge module, so please always enable the '-a' option when running
'rainbow merge'.
Locally assembles fragments around restriction sites.
NOTE: the input file format should be: <seqid:int>\t<cluster_id:int>\t<read1:string>\t<read2:string>[\t<pre_cluster_id:int>]
Usage: rbasm [options]
 -i <string> Input file [STDIN] 
 -o <string> Output file [STDOUT]
 -l <int>    Minimum length of overlap [5]
 -s <float>  Minimum similarity of overlap [0.90]
 -r <int>    Minimum reads to execute assembly [5]
 -R <int>    Maximum reads to execute assembly [200]
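
To make the '-l' and '-s' thresholds concrete, here is a Python sketch of an
overlap test between two reads. It is an illustration only and is not taken
from the rbasm source: it assumes an ungapped suffix-prefix comparison and
defines similarity as the fraction of matching positions in the overlap. The
function name best_overlap is hypothetical; the defaults 5 and 0.90 are the
ones listed above.

```python
def best_overlap(left, right, min_len=5, min_sim=0.90):
    """Return the longest acceptable suffix-prefix overlap, or 0 if none.

    An overlap of k bases is acceptable when k >= min_len and the
    fraction of identical positions between left[-k:] and right[:k]
    is at least min_sim.
    """
    longest = min(len(left), len(right))
    for k in range(longest, max(min_len, 1) - 1, -1):
        matches = sum(a == b for a, b in zip(left[-k:], right[:k]))
        if matches / k >= min_sim:
            return k
    return 0
```

A greedy assembler would typically merge the pair with the best such overlap
first and repeat; the exact strategy used by rbasm is not described here.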

----------------------------------------------------------------------------------
<Obsolete> rbmergetag: a program that merges divided results to evaluate clustering performance.
                  Users should skip this program when de novo assembling RAD-seq reads.
Usage: rbmergetag [options]
Options:
 -i <string>    Input file name [stdin]
 -o <string>    Output file name [stdout]
 -j <cns|merge> Job type, cns: consensus, merge: merging, [merge]
 -m <int>       Maximum mismatches to merge two groups [1]
 -h             Show this document

----------------------------------------------------------------------------------

Change log:
===========
v2.0.1: README and usage information updated
v2.0.2: 'merge' options were enriched. The 'merge' assembly step can now be customized like rbasm. Thanks to Ross Whetten at NCSU for suggesting this.
v2.0.3: renamed the script 'select_best_rbcontig2.pl' to 'select_best_rbcontig_plus_read1.pl' and documented it.
v2.0.4: fixed a bug that prevented rainbow from compiling on Mac OS