vipR Code
Status: Beta
Brought to you by:
andrealtmann
| File | Date | Author | Commit |
|---|---|---|---|
| PosCount.java | 2011-04-12 | andrealtmann | [r11] updated readme |
| README.txt | 2013-02-23 | andrealtmann | [r15] fixed small coding bug |
| pileup2vipr.java | 2011-01-12 | andrealtmann | [r3] speedup for java version, and support for .gz i... |
| pileup2vipr.py | 2011-01-12 | andrealtmann | [r2] added Java implementation for pileup2vipr |
| skellam.R | 2014-08-08 | andrealtmann | [r16] added skellam.R |
| vipR.R | 2014-08-08 | andrealtmann | [r16] added skellam.R |
Welcome to vipR!
This is the current beta version of vipR. vipR
comprises two parts:
I) conversion of the simple pileup file generated by samtools
into vipR's input format
II) detection of sequence variants.
For part I) we offer two implementations:
a) pileup2vipr.py is a Python implementation, and a bit slow
b) pileup2vipr.java is a much faster Java implementation. Prior
to execution, the Java code must be compiled via
> javac pileup2vipr.java
NOTE: some Java flavours might require 'javac -C pileup2vipr.java' instead.
The program can then be executed using
> java pileup2vipr
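
To give a feeling for what the conversion in part I) has to deal with, here is a minimal Python sketch that counts the bases observed at one position of a samtools pileup line. It is NOT the code of pileup2vipr and does not produce vipR's input format; the function names count_bases and parse_pileup_line are made up for this illustration, and the handling of indels and read markers is deliberately simple.

```python
import re
from collections import Counter


def count_bases(ref_base, bases):
    """Count A/C/G/T/N observations in the read-bases column of one pileup line."""
    counts = Counter()
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # Read-start marker: the following character encodes mapping quality.
            i += 2
            continue
        if ch in "+-":
            # Indel such as "+2AG" or "-1T": skip the length digits and the sequence.
            m = re.match(r"\d+", bases[i + 1:])
            if m is None:
                i += 1
                continue
            i += 1 + len(m.group()) + int(m.group())
            continue
        if ch in ".,":
            # "." and "," mean "same as reference" on forward/reverse strand.
            counts[ref_base.upper()] += 1
        elif ch.upper() in "ACGTN":
            counts[ch.upper()] += 1
        # "$" (read end), "*" (deleted base) and other symbols are ignored here.
        i += 1
    return counts


def parse_pileup_line(line):
    """Split one tab-separated pileup line and count its bases."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, ref, depth, bases = fields[:5]
    return chrom, int(pos), ref, int(depth), count_bases(ref, bases)
```

Such a function would be applied line by line to the pileup file; strand information and base qualities, which a real converter may need, are dropped in this sketch.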
Currently vipR can only handle the simple pileup format produced by samtools.
For best results, use the latest version of samtools and execute the
following command:
samtools mpileup -d 1000000 -s -f REFERENCE.fasta INPUTFILE
The option '-s' ensures that the 'simple' pileup format is generated; the option
'-d 1000000' sets the read depth limit to 1 million. The default limit is 250,
which may be too low for targeted resequencing experiments.
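
If you prefer to drive samtools from a script, the command above can be wrapped as in the following Python sketch. The helper names build_mpileup_command and run_mpileup are hypothetical and not part of vipR; the flags are exactly the ones given above, and samtools is assumed to be on your PATH.

```python
import subprocess


def build_mpileup_command(reference, alignment, max_depth=1000000):
    """Return the samtools call from this README as an argument list."""
    return [
        "samtools", "mpileup",
        "-d", str(max_depth),   # read depth limit
        "-s",                   # option recommended in this README
        "-f", reference,        # reference FASTA
        alignment,              # input alignment file
    ]


def run_mpileup(reference, alignment, out_path, max_depth=1000000):
    """Run samtools and write the pileup to out_path (requires samtools)."""
    with open(out_path, "w") as out:
        subprocess.run(
            build_mpileup_command(reference, alignment, max_depth),
            stdout=out,
            check=True,  # raise if samtools exits with an error
        )
```

For example, run_mpileup("REFERENCE.fasta", "INPUTFILE", "pool1.pileup") would write the pileup for one pool to the (made-up) file name pool1.pileup.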
For part II) we offer an R/S script. To execute it, you need a
working R/S environment installed on your computer.
Simply execute 'source("vipR.R")' in your R/S environment, and
do the variant calling using the commands
vipr.loadData and vipr.run for loading your data and searching
for variants, respectively.
Basically, using vipR.R is a three-step process:
> source("vipR.R")
> my.data <- vipr.loadData(c("POOL1","POOL2",...,"POOLn"))
> vipr.run(my.data, nhap = NHAP, ofname="myoutputfile.vcf")
where POOL1,...,POOLn correspond to the vipR input files generated in part I), and
NHAP is the number of haplotypes per pool.
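
For batch runs it can be handy to generate the three R commands from a script. The following Python sketch only builds the text of such an R script; the helper names r_string and make_vipr_script and the example file names are assumptions for illustration, while the calls to vipr.loadData and vipr.run follow the usage shown above.

```python
def r_string(s):
    """Quote a Python string as an R string literal."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def make_vipr_script(pool_files, nhap, out_vcf, vipr_path="vipR.R"):
    """Return R code that loads vipR, reads the pools and calls variants."""
    pools = ", ".join(r_string(p) for p in pool_files)
    lines = [
        "source(%s)" % r_string(vipr_path),
        "my.data <- vipr.loadData(c(%s))" % pools,
        "vipr.run(my.data, nhap = %d, ofname = %s)" % (nhap, r_string(out_vcf)),
    ]
    return "\n".join(lines) + "\n"
```

The returned text can be saved to a file and executed with Rscript, assuming vipR.R is in the working directory.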
If you have questions, feel free to contact me:
altmann@mpipsykl.mpg.de or
altmann@stanford.edu
Have fun!
Andre