From: Jean-Philippe V. <jpv...@gm...> - 2017-06-06 20:23:14
Hi,

Sorry for the multiple copies... I ran into problems using USeq's ChIPSeq app. UPDATE: maybe it's related to this: https://support.bioconductor.org/p/83850/ , caused by a later version of DESeq2. So how do I recompile all the executables? Could you give a hand with this? Thanks!

Command:

java -Xmx4G -jar /home/jean-philippe.villemin/bin/USeq_8.9.6/Apps/ChIPSeq -y sam -v H_sapiens_Dec_2013 -t /home/jean-philippe.villemin/data/data/USEQ/T1/K4ME1 -c /home/jean-philippe.villemin/data/data/USEQ/T1/INPUT -s /home/jean-philippe.villemin/CHIPSEQ_2017_1_ALL/USEQ -r /home/jean-philippe.villemin/bin/anaconda3/lib/R/bin/R

In the output:

....
Calculating individual peak shifts for top 100...
90 Median peak shift
246 Mean peak shift
310.4 Stnd dev peak shifts
Could not robustly identify the peak shift in your data, using the default 150bp peak shift and 250bp window size.

Scanning chromosomes for differential enrichment/reduction with MultipleReplicaScanSeqs...
Stats and parameters...
92704636 Treatment Observations
39079825 Control Observations
150 Peak shift
250 Window size
15 Minimum number reads in window

java.io.IOException: DESeq2 R results file doesn't exist. Check temp files in save directory for error.
	at edu.utah.seq.analysis.MultipleReplicaScanSeqs.executeDESeq2(MultipleReplicaScanSeqs.java:337)
	at edu.utah.seq.analysis.MultipleReplicaScanSeqs.run(MultipleReplicaScanSeqs.java:139)
	at edu.utah.seq.analysis.MultipleReplicaScanSeqs.<init>(MultipleReplicaScanSeqs.java:99)
	at edu.utah.seq.analysis.ChIPSeq.scanSeqs(ChIPSeq.java:174)
	at edu.utah.seq.analysis.ChIPSeq.<init>(ChIPSeq.java:86)
	at edu.utah.seq.analysis.ChIPSeq.main(ChIPSeq.java:442)

In the temp files, in RScript.txt.Rout:

Warning message:
package ‘RColorBrewer’ was built under R version 3.3.2
> countTable = read.delim('/home/jean-philippe.villemin/mount_archive2/commun.luco/EMT_2015_SEQ_data/EMT_CHIPSEQ_FILES/2017/Luco/USEQ/MultipleReplicaScanSeqs/TempRDir_XQKXQ8D/matrixFile.txt', header=FALSE)
> rownames(countTable) = countTable[,1]
> countTable = countTable[,-1]
> sampleInfo = data.frame(condition=as.factor(c('T','C')))
> rownames(sampleInfo) = c('T0','C0')
> cds = DESeqDataSetFromMatrix(countData=countTable, colData=sampleInfo, design = ~condition)
Error in FUN(X[[i]], ...) : assay colnames() must be NULL or equal colData rownames()
Calls: DESeqDataSetFromMatrix ... SummarizedExperiment -> SummarizedExperiment -> .local -> vapply -> FUN
Execution halted

--
Jean-Philippe Villemin - Bioinformatics, Software Engineer
IGH (Institute of Human Genetics)
Team: Splicing and Epigenetics
141 rue de la Cardonille Montpellier France 4094 cedex 5
jpv...@gm...
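For what it's worth, the R error in that log is a SummarizedExperiment check that newer DESeq2 releases enforce: the count matrix's column names must match the sample table's row names. In the script above, read.delim(header=FALSE) likely left auto-generated column names (V2, V3 after dropping the first column), while the colData rows are named T0 and C0. A minimal sketch of the invariant, in Python for illustration (the V2/V3 names are my assumption about R's defaults):

```python
def check_count_matrix(count_colnames, sample_rownames):
    """Mimic the SummarizedExperiment check: assay colnames()
    must be NULL (absent) or equal colData rownames()."""
    if count_colnames is None:
        return True
    return list(count_colnames) == list(sample_rownames)

# The failing case from the log: auto-named count columns vs. T0/C0.
assert not check_count_matrix(["V2", "V3"], ["T0", "C0"])
# Renaming the count columns to the sample names first satisfies it,
# e.g. in R: colnames(countTable) = rownames(sampleInfo)
assert check_count_matrix(["T0", "C0"], ["T0", "C0"])
```

So the generated RScript needs one extra line assigning the sample names to the count-matrix columns before DESeqDataSetFromMatrix, which is consistent with the bioconductor thread the poster links.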
From: Franziska <fra...@kc...> - 2015-03-26 11:57:05
Dear all,

I've been getting an error running MultipleReplicaScanSeqs:

java.io.IOException: DESeq2 R results file doesn't exist. Check temp files in save directory for error.

When I check the temp file, it tells me that it cannot find the function "rlog". I suppose this is some kind of version compatibility issue? I have both the DESeq and DESeq2 packages installed. My input command was as follows:

java -jar /media/sf_BioLinux/Useq_8.8.9/Apps/MultipleReplicaScanSeqs -t /media/sf_BioLinux/bam_chr17_35-36K/point_data/SNL -c /media/sf_BioLinux/bam_chr17_35-36K/point_data/SHAM -s /media/sf_BioLinux/bam_chr17_35-36K/point_data/SNLvsSHAM_MRS -p 260 -w 500

Any advice would be much appreciated!

Thank you and best wishes,
Franziska
From: Biocyberman <bio...@gm...> - 2015-03-13 08:33:32
I am new to the tool and have no idea how to judge the significance of the warnings in the command output. My questions:

1. What is the consequence of the "seqs null" messages below?
2. What are the consequences of the WARNING below? In other words, how does it affect the outcome of downstream analysis?

I tried increasing the -n value to 120K and 240K, but still got similar warnings, and with higher values of -n the tool takes a really long time to finish. I actually had to kill the process at -n 240K.

Vang

-------------------------------
[4 Mar 2015 18:12] USeq_8.8.9 Arguments: -f ../chromosomes/canonical_chrs -u ./refFlat.canonical.txt -r 69 -n 60000
Transcript table ./refFlat.canonical.txt
Fasta directory ../chromosomes/canonical_chrs
Splice radius 69
Max # splices per gene 60000
Max minutes per splice 10
Skip duplicate splice junctions false
Saving fasta files to .
Processing...
chr20 ....seqs length 74423 max num 60000 seqs null .
chr10 ..seqs length 82234 max num 60000 seqs null seqs null ............
chr11 ....
chr12 ....
chr13 .....
chr5 ..........
chr6 .......
chrY
chr7 .seqs length 126083 max num 60000 .........
chr8 ..........
chr9 ......
chrX .....
chr15 .....
chr14 .....
chr17 ....
chr16 ....
chr1 .......................
chr19 ....
chr18 ....
chr2 ..........
chr3 ..............
chr4 ...........

WARNING: splice junctions for the following genes were only partially made. They exceeded the maximum number (N) or minutes (M) thresholds. Only those present in the transcripts are guarenteed to be present in the fasta file.

Brap:4N Tesb:4N Lace1:4N Col2a1:1N Larp7:4N Rbm25:4N Col3a1:1N Ccdc63:4N Col16a1:1N Fras1:1N Arntl2:4N Nmbr:4N Aldh1l1:4N Vom2r80:4N Hook3:4N Oas3:4N Fosl2:4N Col7a1:1N Olr1610:4N Pld2:4N Col9a1:1N Dennd1c:4N Col5a1:1N Col18a1:1N Col5a2:1N Col5a3:1N Slc12a9:4N Loxl2:4N Col17a1:1N Lrp1b:1N Map3k5:4N Pcca:4N Ift172:1N Arhgef12:1N Atp2a1:1N Thbs1:1N Col11a1:1N Col11a2:1N Rbm24:4N Ces5a:4N Col13a1:1N Trdn:1N Dync2h1:1N Si:1N Sema4b:1N Adap2:4N Copb2:4N Rtn1:4N Adra1b:4N Col27a1:1N Snx17:4N Usp34:1N Itpr2:1N Ggt1:4N Plb1:1N Prph2:4N Sspo:1N Lamc3:4N Fry:1N Ryr2:1N Ttc30a:1N Kdm2b:4N Hspa1a:1N Dmbt1:4N Abcc4:1N Abcc3:1N Ptprg:1N Col1a1:1N Csn1s1:4N Hpse:4N Wrap73:4N Col1a2:1N Acaa1b:4N Trpm7:1N March9:4N Tmem132b:1N Stab2:1N

18715 number transcripts created
9920691 number splices created
Done! 223 min
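The "seqs length ... max num 60000" messages and the WARNING list arise because the number of candidate splice-junction sequences grows combinatorially with the number of exon boundaries, so exon-rich genes (note the many collagen Col* genes flagged above) hit the -n cap quickly; raising -n just pushes the explosion further out, which matches the very long runtimes reported. An illustrative Python sketch of a lower bound on the junction count, not USeq's actual enumeration (which also varies isoforms and multi-exon skips):

```python
def min_junction_count(num_exons: int) -> int:
    """Lower bound: one junction per ordered exon pair (i < j).
    Real splice enumeration grows far faster than this, since it
    also covers alternative isoforms and multi-exon skips."""
    return num_exons * (num_exons - 1) // 2

# Even this lower bound passes the -n 60000 cap around 350 boundaries:
assert min_junction_count(350) > 60000
# whereas a modest gene stays tiny:
assert min_junction_count(10) == 45
```

This is why doubling or quadrupling -n only delays the warning for the worst genes while greatly inflating runtime.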
From: David A. N. <dav...@gm...> - 2015-03-12 12:08:28
Hello Vang,

Can't use bams for STP; the header from a transcriptome alignment can be huge, and the picard tools for reading bams load the entire header, causing a memory issue. A couple of options: why not use the -h option in STP to put on your own header? Then merge. I think there are picard utilities for replacing headers too. Without the -h option, STP creates a header based on the sequences present and the position of the last aligned read, thus these will differ between samples. You could also use the USeq MergeSams app. It's aware of differences in headers and creates a stripped composite.

-cheers, David

On Mar 12, 2015, at 4:05 AM, Biocyberman <bio...@gm...> wrote:

> Hello Useq community,
> I am trying to use Useq for RNAseq analysis with novoalignCS as the aligner. My questions: How can I avoid this manual fixing of the header? And how can I use the BAM file directly for STP?
>
> I produced each BAM file per lane for one sample. I used SamTranscriptomeParser to convert the BAM file to chromosome coordinates. Here come my problems:
>
> 1. I can't use the BAM file directly for the '-f' argument. I get around this by converting the BAM file to SAM and feeding it to STP. But I still wonder if there is a way to use the BAM file directly.
>
> 2. I can't merge the BAM files after step 1. For example: Sample1_Lane1.BAM, Sample1_Lane2.BAM, and Sample1_Lane3.BAM. The problem is that the @SQ lines of these BAM files are different, both in SN and LN values. With test data of a small number of reads, there are more differences in SN values because STP throws away all chromosomes that do not have reads.
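The "stripped composite" header David describes amounts to taking the union of the inputs' @SQ records so lane files with differing headers can be merged. A minimal sketch of that idea (function and structure names are mine, not USeq's MergeSams implementation):

```python
def composite_sq_header(headers):
    """Merge @SQ lines from several SAM headers into one composite.
    headers: list of lists of '@SQ\\tSN:...\\tLN:...' lines.
    Keeps the union of sequence names and raises if the same SN is
    declared with two different lengths (a genuine conflict)."""
    merged = {}  # SN -> LN
    for header in headers:
        for line in header:
            fields = dict(f.split(":", 1) for f in line.split("\t")[1:])
            sn, ln = fields["SN"], int(fields["LN"])
            if merged.setdefault(sn, ln) != ln:
                raise ValueError(f"conflicting lengths for {sn}")
    return [f"@SQ\tSN:{sn}\tLN:{ln}" for sn, ln in sorted(merged.items())]

lane1 = ["@SQ\tSN:chr1\tLN:282763074", "@SQ\tSN:chr2\tLN:266435125"]
lane2 = ["@SQ\tSN:chr2\tLN:266435125", "@SQ\tSN:chr3\tLN:177699992"]
merged = composite_sq_header([lane1, lane2])
assert len(merged) == 3
```

The key property is that per-lane headers which only differ because reads-absent chromosomes were dropped merge cleanly, while a true LN mismatch fails loudly rather than silently corrupting the merge.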
From: Biocyberman <bio...@gm...> - 2015-03-12 10:05:38
Hello Useq community,

I am trying to use Useq for RNAseq analysis with novoalignCS as the aligner. My questions: How can I avoid this manual fixing of the header? And how can I use the BAM file directly for STP?

Here is some information:

I produced each BAM file per lane for one sample. I used SamTranscriptomeParser to convert the BAM file to chromosome coordinates. Here come my problems:

1. I can't use the BAM file directly for the '-f' argument. I get around this by converting the BAM file to SAM and feeding it to STP. But I still wonder if there is a way to use the BAM file directly.

2. I can't merge the BAM files after step 1. For example: Sample1_Lane1.BAM, Sample1_Lane2.BAM, and Sample1_Lane3.BAM. The problem is that the @SQ lines of these BAM files are different, both in SN and LN values. With test data of a small number of reads, there are more differences in SN values because STP throws away all chromosomes that do not have reads.

Below are the first 30 lines from my RAW bam header, BEFORE the conversion:

@HD VN:1.0 SO:unsorted
@PG ID:novoalignCS PN:novoalignCS VN:V1.05.00 CL:novoalignCS -d rn6.rnaseq.n60k.cnx -f L03.xsq -F XSQ LT2 -o SAM -r Random
@SQ SN:chr1 LN:282763074 AS:rn6.rnaseq.cnx
@SQ SN:chr2 LN:266435125 AS:rn6.rnaseq.cnx
@SQ SN:chr3 LN:177699992 AS:rn6.rnaseq.cnx
@SQ SN:chr4 LN:184226339 AS:rn6.rnaseq.cnx
@SQ SN:chr5 LN:173707219 AS:rn6.rnaseq.cnx
@SQ SN:chr6 LN:147991367 AS:rn6.rnaseq.cnx
@SQ SN:chr7 LN:145729302 AS:rn6.rnaseq.cnx
@SQ SN:chr8 LN:133307652 AS:rn6.rnaseq.cnx
@SQ SN:chr9 LN:122095297 AS:rn6.rnaseq.cnx
@SQ SN:chr10 LN:112626471 AS:rn6.rnaseq.cnx
@SQ SN:chr11 LN:90463843 AS:rn6.rnaseq.cnx
@SQ SN:chr12 LN:52716770 AS:rn6.rnaseq.cnx
@SQ SN:chr13 LN:114033958 AS:rn6.rnaseq.cnx
@SQ SN:chr14 LN:115493446 AS:rn6.rnaseq.cnx
@SQ SN:chr15 LN:111246239 AS:rn6.rnaseq.cnx
@SQ SN:chr16 LN:90668790 AS:rn6.rnaseq.cnx
@SQ SN:chr17 LN:90843779 AS:rn6.rnaseq.cnx
@SQ SN:chr18 LN:88201929 AS:rn6.rnaseq.cnx
@SQ SN:chr19 LN:62275575 AS:rn6.rnaseq.cnx
@SQ SN:chr20 LN:56205956 AS:rn6.rnaseq.cnx
@SQ SN:chrX LN:159970021 AS:rn6.rnaseq.cnx
@SQ SN:chrY LN:3310458 AS:rn6.rnaseq.cnx
@SQ SN:Zbtb22:chr20:5478760-5478829_5479425-5479494 LN:138 AS:rn6.rnaseq.cnx
@SQ SN:Taf11:chr20:7477919-7477988_7482497-7482566 LN:138 AS:rn6.rnaseq.cnx
@SQ SN:Taf11:chr20:7477185-7477254_7478450-7478519 LN:138 AS:rn6.rnaseq.cnx
@SQ SN:Taf11:chr20:7478469-7478538_7480277-7480346 LN:138 AS:rn6.rnaseq.cnx
@SQ SN:Taf11:chr20:7477185-7477254_7477891-7477960 LN:138 AS:rn6.rnaseq.cnx
@SQ SN:Taf11:chr20:7480357-7480426_7482497-7482566 LN:138 AS:rn6.rnaseq.cnx

And below is the header produced by STP after the conversion:

@HD VN:1.4 SO:coordinate
@RG ID:2.LT2.3 SM:LT2 CN:AfMD LB:LT2_2 PL:SOLiD PU:2.LT2.3
@SQ SN:chr10 LN:112626471 AS:rn6.rnaseq.cnx
@SQ SN:chr11 LN:90463843 AS:rn6.rnaseq.cnx
@SQ SN:chr12 LN:52716770 AS:rn6.rnaseq.cnx
@SQ SN:chr13 LN:114033958 AS:rn6.rnaseq.cnx
@SQ SN:chr5 LN:173707219 AS:rn6.rnaseq.cnx
@SQ SN:chr6 LN:147991367 AS:rn6.rnaseq.cnx
@SQ SN:chrY LN:3310458 AS:rn6.rnaseq.cnx
@SQ SN:chr7 LN:145729302 AS:rn6.rnaseq.cnx
@SQ SN:chr8 LN:133307652 AS:rn6.rnaseq.cnx
@SQ SN:chr9 LN:122095297 AS:rn6.rnaseq.cnx
@SQ SN:chrX LN:159970021 AS:rn6.rnaseq.cnx
@SQ SN:chr15 LN:111246239 AS:rn6.rnaseq.cnx
@SQ SN:chr14 LN:115493446 AS:rn6.rnaseq.cnx
@SQ SN:chr17 LN:90843779 AS:rn6.rnaseq.cnx
@SQ SN:chr16 LN:90668790 AS:rn6.rnaseq.cnx
@SQ SN:chr19 LN:62275575 AS:rn6.rnaseq.cnx
@SQ SN:chr1 LN:282763074 AS:rn6.rnaseq.cnx
@SQ SN:chr18 LN:88201929 AS:rn6.rnaseq.cnx
@SQ SN:chr2 LN:266435125 AS:rn6.rnaseq.cnx
@SQ SN:chr3 LN:177699992 AS:rn6.rnaseq.cnx
@SQ SN:chr4 LN:184226339 AS:rn6.rnaseq.cnx
@PG ID:SamTranscriptomeParser CL: args [12 Mar 2015 9:53] USeq_8.8.9 -f LT2.2.LT2.3_unsorted.fifo.sam -a 900 -n 100 -u -s LT2.2.LT2.3_unsorted.tmp.bam
@PG ID:novoalignCS PN:novoalignCS VN:V1.05.00 CL:novoalignCS -d rn6.rnaseq.n60k.cnx -f L03.xsq -F XSQ LT2 -o SAM -r Random

As you may see, the canonical chromosomes got reordered and chromosome 20 is thrown out. If I had @RG lines, they would be thrown out also. I can extract the header I want to use from the raw bam file:

samtools view -H raw_alignment.bam | egrep -v "@SQ\s+SN:[A-Za-z0-9]+:.*$|@HD"

Then I can add the header with:

SamTranscriptomeParser -f test.sam -s test.bam -h head.txt

Thanks
Vang
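The egrep filter in that workaround keeps canonical @SQ records by dropping any whose SN contains a second ':'-delimited field (the gene:coords splice-junction entries), along with the @HD line. The same selection can be sketched in Python, mirroring the regex rather than any USeq behavior:

```python
import re

# Drop splice-junction @SQ records (SN contains an extra ':' field)
# and the @HD line, as in the egrep -v filter above.
drop = re.compile(r"@SQ\s+SN:[A-Za-z0-9]+:.*$|@HD")

header = [
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:chr1\tLN:282763074",
    "@SQ\tSN:Taf11:chr20:7477919-7477988_7482497-7482566\tLN:138",
]
kept = [line for line in header if not drop.search(line)]
assert kept == ["@SQ\tSN:chr1\tLN:282763074"]
```

Note the filtered header keeps the genome @SQ lines but loses @PG and @RG records too if they match nothing in the pattern only by luck; in practice you would whitelist the record types you want rather than blacklist.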
From: David A. N. <dav...@gm...> - 2014-12-18 17:10:29
Hello Gang,

Many thanks for the heads up. Major bug here. The filter for common regions was inadvertently disabled. Ugg! I've posted an update to the Sourceforge site, so download USeq_8.8.8 and disregard prior VCFComparator analysis. It's not clear when this error was introduced. It's easy to see if it hit your prior analysis: look at the number of test variants (or key variants) and make sure the pre and post shared-region numbers differ.

https://sourceforge.net/projects/useq/files/?source=navbar

I can't seem to reproduce the SNP issue. When I run without a -s or -n option, both snps and indels are compared and written to the match and noMatch files. Could you point me to your files and I'll try to reproduce.

-cheers, David

On Nov 12, 2014, at 11:27 AM, Peng,Gang <GP...@md...> wrote:

> Dear all,
>
> I am now using VCFComparator in USeq to compare two vcf files. According to the manual, "Only calls that fall in the common interrogated regions are compared." But in the result vcf files, many positions are not in the shared bed file. It seems that all the variants in the two vcf files are compared. And by default it should compare both SNPs and non-SNPs, but in the result vcf files there are only SNPs.
>
> Do I need to change some parameters to compare the vcfs only in the interrogated regions and compare both SNPs and non-SNPs?
>
> Thanks,
> Gang
From: Peng,Gang <GP...@md...> - 2014-11-12 18:27:24
Dear all,

I am now using VCFComparator in USeq to compare two vcf files:

java -jar -Xmx10g ~/soft/USeq_8.8.5/Apps/VCFComparator -a NISTIntegratedCalls_14datasets_131103_allcall_UGHapMerge_HetHomVarPASS_VQSRv2.18_all_nouncert_excludesimplerep_excludesegdups_excludedecoy_excludeRepSeqSTRs_noCNVs.vcf -b union13callableMQonlymerged_addcert_nouncert_excludesimplerep_excludesegdups_excludedecoy_excludeRepSeqSTRs_noCNVs_v2.18_2mindatasets_5minYesNoRatio.bed -c ../IonProton/IP.vcf -d ../IonProton/Ion-TargetSeq-b37_simp.bed -p ../IonProton/compare/

According to the manual, "Only calls that fall in the common interrogated regions are compared." But in the result:

2195078292 Interrogated bps in key
47302058 Interrogated bps in test
33755093 Interrogated bps in common
2915731 Key variants
2915731 Key variants in shared regions
2.128631378 Shared key variants Ti/Tv
126539 Test variants
126539 Test variants in shared regions
2.050087979 Shared test variants Ti/Tv

And in the result vcf files, many positions are not in the shared bed file. It seems that all the variants in the two vcf files are compared. And by default it should compare both SNPs and non-SNPs, but in the result vcf files there are only SNPs.

Do I need to change some parameters to compare the vcfs only in the interrogated regions and compare both SNPs and non-SNPs?

Thanks,
Gang
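The telltale sign of the disabled filter discussed in this thread is that the variant counts before and after restriction to the shared regions are identical (2915731 and 2915731 above). A working region filter is essentially an interval-membership test against the intersected bed regions; a minimal sketch of that test, not USeq's implementation:

```python
import bisect

def in_regions(pos, starts, ends):
    """True if pos falls in one of the sorted, non-overlapping
    half-open intervals [start, end)."""
    i = bisect.bisect_right(starts, pos) - 1
    return i >= 0 and pos < ends[i]

# Shared interrogated regions (already intersected key and test beds).
starts, ends = [100, 500], [200, 900]
variants = [150, 250, 600, 950]
shared = [v for v in variants if in_regions(v, starts, ends)]
assert shared == [150, 600]  # the count drops after filtering
```

With the filter working, "variants in shared regions" should be strictly smaller than the raw count whenever any call lies outside the common interrogated bps, which is exactly the check David suggests in the follow-up.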
From: David A. N. <dav...@gm...> - 2014-08-21 21:51:34
Hello Ashutosh,

Sorry for the delay in responding... it's been a crazy summer... let's see... Hmm, have you tried explicitly setting which R to use? By default it is going to /usr/bin/R. So try using the -r flag. See the cmd menu.

-cheers, David

nix-laptop:~ u0028003$ java -jar -Xmx4G ~/AppsUSeq/ChIPSeq

**************************************************************************************
** ChIPSeq: May 2014 **
**************************************************************************************

The ChIPSeq application is a wrapper for processing ChIP-Seq data through a variety of USeq applications. It:

1) Parses raw alignments (sam, eland, bed, or novoalign) into binary PointData
2) Filters PointData for duplicate alignments
3) Makes relative ReadCoverage tracks from the PointData (reads per million mapped)
4) Runs the PeakShiftFinder to estimate the peak shift and optimal window size
5) Runs the MultipleReplicaScanSeqs to window scan the genome generating enrichment tracks using DESeq2's negative binomial pvalues and B&H's FDRs
6) Runs the EnrichedRegionMaker to identify likely chIP peaks (FDR < 1%, >2x).

Options:
-s Save directory, full path.
-t Treatment alignment file directories, full path, comma delimited, no spaces, one for each biological replica. These should each contain one or more text alignment files (gz/zip OK) for a particular replica. Alternatively, provide one directory that contains multiple alignment file directories.
-c Control alignment file directories, ditto.
-y Type of alignments, either novoalign, sam, bed, or eland (sorted or export).
-v Genome version (e.g. H_sapiens_Feb_2009, M_musculus_Jul_2007), see UCSC FAQ, http://genome.ucsc.edu/FAQ/FAQreleases.
-r Full path to R, defaults to '/usr/bin/R'. Be sure to install DESeq2, gplots, and qvalue Bioconductor packages.

Advanced Options:
-m Combine any replicas and run single replica analysis (ScanSeqs), defaults to using DESeq2.
-a Maximum alignment score. Defaults to 60, smaller numbers are more stringent.
-q Minimum mapping quality score. Defaults to 13, bigger numbers are more stringent. This is a phred-scaled posterior probability that the mapping position of read is incorrect. Set to 0 for RNASeq data.
-p Peak shift, defaults to the PeakShiftFinder peak shift or 150bp. Set to 0 for RNASeq data.
-w Window size, defaults to the PeakShiftFinder peak shift + stnd dev or 250bp.
-i Minimum number reads in window, defaults to 10.
-f Filter bed file (tab delimited: chr start stop) to use in excluding intersecting windows while making peaks, e.g. satelliteRepeats.bed .
-g Print verbose output from each application.
-e Don't look for reduced regions.

Example: java -Xmx2G -jar pathTo/USeq/Apps/ChIPSeq -y eland -v D_rerio_Dec_2008 -t /Data/PolIIRep1/,/Data/PolIIRep2/ -c /Data/PolIINRep1/,/Data/PolIINRep2/ -s /Data/Results/WtVsNull -f /Anno/satelliteRepeats.bed

**************************************************************************************

On Jul 15, 2014, at 12:01 PM, Ashutosh Shukla <ash...@cc...> wrote:

> Dear All,
>
> I am trying to use the USeq package for ChIP-Seq analysis. I have installed the required packages DESeq2, gplots, and qvalue in a user local library ("~/R/x86_64-unknown-linux-gnu-library/3.1"). When I try to run ChIPSeq I get the error: "Cannot find the required R libraries. Did you install DESeq2, gplots, and qvalue?" ... Is this error because ChIPSeq only works with a global R library, but not with a library in my local folders?
From: Ashutosh S. <ash...@cc...> - 2014-07-15 18:21:01
Dear All,

I am trying to use the USeq package for ChIP-Seq analysis. I have installed the required packages DESeq2, gplots, and qvalue in a user local library ("~/R/x86_64-unknown-linux-gnu-library/3.1"). When I open R on the command line, I can see that library(DESeq2), library(gplots), and library(qvalue) all load. Now, when I try to run ChIPSeq using the command

$ java -Xmx2G -jar /home/pb/USeq_8.8.1/Apps/ChIPSeq -y novoalign -v S_cerevisiae_Apr_2011 -t /home/pb/Desktop/test/ -c /home/pb/Desktop/control/ -s /home/pb/Desktop/SAVE/

I get the following error:

Checking parameters...
The following problems were encountered when processing your parameter file. Correct and restart. ->
Error: Cannot find the required R libraries. Did you install DESeq2, gplots, and qvalue? Once installed, launch an R terminal and type 'library(DESeq2); library(qvalue); library(gplots)' to see if it is present. Error message:
Error in library(DESeq2) : there is no package called ‘DESeq2’
Execution halted

Is this error because ChIPSeq only works with a global R library, but not with a library in my local folders?

To check this, I even tried to install all the required packages and dependencies one after another in /usr/local/lib/R/site-library, but I got the following error:

Checking parameters...
The following problems were encountered when processing your parameter file. Correct and restart. ->
Error: Cannot find the required R libraries. Did you install DESeq2, gplots, and qvalue? Once installed, launch an R terminal and type 'library(DESeq2); library(qvalue); library(gplots)' to see if it is present. Error message:
Error in dyn.load(file, DLLpath = DLLpath, ...) : unable to load shared object '/usr/local/lib/R/site-library/RcppArmadillo/libs/RcppArmadillo.so': libRlapack.so: cannot open shared object file: No such file or directory
Error: package ‘RcppArmadillo’ could not be loaded
Execution halted

Am I doing something wrong with the commands and package installations? I am basically a biologist who recently started learning Linux and R, so please bear with me if I am asking a trivial question. Any help would be highly appreciated.

Thanks for the help

--
Ashutosh Shukla
Senior Research Fellow
W313, West wing, IInd floor,
Centre For Cellular & Molecular Biology
Hyderabad -500007
India.
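One likely explanation, and this is my assumption rather than anything confirmed in the thread: ChIPSeq launches R as a child process, and a per-user library under ~/R is only visible to that child if an environment variable like R_LIBS_USER reaches it, which is why an interactive R session can load DESeq2 while the wrapped one cannot. A Python sketch of the general point, that a child process only sees the environment it is launched with (the library path below is hypothetical):

```python
import os
import subprocess
import sys

# Hypothetical per-user library path, mirroring ~/R/... in the post.
user_lib = "/home/pb/R/x86_64-unknown-linux-gnu-library/3.1"

child_code = "import os; print(os.environ.get('R_LIBS_USER', 'MISSING'))"

# Launched with a stripped environment: the child cannot see the library.
bare = subprocess.run([sys.executable, "-c", child_code],
                      env={}, capture_output=True, text=True)
assert bare.stdout.strip() == "MISSING"

# Exporting the variable before launching fixes it.
env = dict(os.environ, R_LIBS_USER=user_lib)
fixed = subprocess.run([sys.executable, "-c", child_code],
                       env=env, capture_output=True, text=True)
assert fixed.stdout.strip() == user_lib
```

David's -r flag suggestion in the reply above attacks the same mismatch from the other side, by pointing USeq at the exact R whose library tree contains the packages.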
From: Jesse R. <jes...@u2...> - 2014-06-30 20:20:21
Hello,

I am trying to run VCFAnnotator locally and get the error below. Even though I have specified an annovar directory, it looks like it is pointing to a directory on the Tomato server. How can I correct this?

USeq_8.8.1 Arguments: -v 10764X1_RG.recal.vcf -o 10764X1_RG.recal.ann.vcf -p /Applications/annovar -t /Applications/RSeQC-2.3.3/lib/tabix

Using ALL 1K samples for annotation
Working on file chunk: 1

***************************************
* Starting: Ensembl Annotations
***************************************

IO error during annovar command execution: Cannot run program "/home/u0855942/annovar_version/annotate_variation.pl": error=2, No such file or directory
java.io.IOException: Cannot run program "/home/u0855942/annovar_version/annotate_variation.pl": error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1041)
	at edu.utah.seq.vcf.VCFAnnotator$AnnovarCommand.runCommand(VCFAnnotator.java:698)
	at edu.utah.seq.vcf.VCFAnnotator.<init>(VCFAnnotator.java:125)
	at edu.utah.seq.vcf.VCFAnnotator.main(VCFAnnotator.java:497)
Caused by: java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:135)
	at java.lang.ProcessImpl.start(ProcessImpl.java:130)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1022)
	... 3 more

thanks,
Jesse Rowley
Division of Pulmonary Medicine
University of Utah School of Medicine
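The trace shows a hard-coded server path winning out over the -p option before ProcessBuilder even launches. Whatever the cause inside VCFAnnotator, a pre-flight existence check on the annovar script gives a much clearer failure than the raw IOException; an illustrative Python sketch (names are mine, not USeq's):

```python
import os
import tempfile

def resolve_annovar(user_dir):
    """Prefer the user-supplied annovar directory and fail loudly if
    annotate_variation.pl is missing, instead of letting the
    subprocess launch blow up with a cryptic IOException."""
    script = os.path.join(user_dir, "annotate_variation.pl")
    if not os.path.isfile(script):
        raise FileNotFoundError(f"{script} not found; check the -p option")
    return script

with tempfile.TemporaryDirectory() as d:
    # Empty directory: the check reports the problem up front.
    try:
        resolve_annovar(d)
    except FileNotFoundError as e:
        assert "check the -p option" in str(e)
    # Once the script exists, the user path is used as-is.
    open(os.path.join(d, "annotate_variation.pl"), "w").close()
    assert resolve_annovar(d).endswith("annotate_variation.pl")
```

If the -p value is being ignored outright, as the log suggests, no check on the user's side helps and the fix belongs in the tool; this sketch only shows what the friendlier failure would look like.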
From: massimo a. <max...@gm...> - 2014-06-03 16:06:15
Hallo everybody,

I'm doing some vcf comparisons using the NIST guidelines. I found that some variants appear in both the matching and non-matching output files of VCFComparator, and I can't understand why. Any help would be appreciated.

Thank you in advance,
Massimo Acquaviva
Bioinformatician
Giannina Gaslini Institute
Genova, Italy
From: Jesse R. <jes...@u2...> - 2014-04-16 20:06:31
I noticed that changing the parameters -x and -e in DRDS alters the calculated FPKM. Is this because FPKM is calculated after removal of genes? If so, based on the definition of FPKM, shouldn't it be calculated per million mapped reads and not per million analyzed reads?

Jesse Rowley
Division of Pulmonary Medicine
University of Utah School of Medicine
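For reference, the usual definition is FPKM = fragments mapped to the gene / (transcript length in kb x total fragments / 10^6). The question above is about what goes into that total; a small sketch with made-up numbers shows how shrinking the denominator to "analyzed" reads inflates every value:

```python
def fpkm(gene_fragments, gene_length_bp, total_fragments):
    """FPKM = fragments / (kb of transcript * millions of fragments)."""
    return gene_fragments / ((gene_length_bp / 1e3) *
                             (total_fragments / 1e6))

gene_frags, gene_len = 500, 2_000   # a 2 kb gene with 500 fragments
mapped_total = 40_000_000           # all mapped fragments
analyzed_total = 30_000_000         # after -x/-e style gene filtering

assert fpkm(gene_frags, gene_len, mapped_total) == 6.25
assert fpkm(gene_frags, gene_len, analyzed_total) > 6.25  # inflated
```

So if the filters change the denominator, FPKMs are only comparable between runs that used identical filter settings, which is presumably why the poster sees the values shift.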
From: David N. <Dav...@hc...> - 2014-04-08 16:09:59
Hello Folks,

We've purchased an IPA license for the University that allows unlimited accounts and six months of unlimited concurrent users; afterward it drops to just one concurrent user, with options to purchase more if needed. We're asking $2K per lab group to help defray the $25.5K cost. If you are interested in accessing this software, please send a chart field account number and the email addresses and uNIDs of covered lab members by Friday the 11th for the first batch access. We'll be monitoring usage through the year to refine needs for 2015.

-cheers, David

--
David Austin Nix, PhD
Huntsman Cancer Institute
Dept. OncSci
University of Utah Bioinformatics Shared Resource
HCI 3165
(801) 587-4611
dav...@hc...

On Mar 14, 2014, at 3:25 PM, David Nix <Dav...@hc...> wrote:

> Hello Folks,
>
> Hopefully by now you have had a chance to work with the IPA trial; it ends Sunday. I would very much like to see access to this application and its underlying knowledge base at the U. ...
From: David N. <Dav...@hc...> - 2014-03-14 21:26:31
|
Hello Folks, Hopefully by now you have had a chance to work with the IPA trial; it ends Sunday. I would very much like to see access to this application and its underlying knowledge base at the U. We can purchase individual licenses ($10-13K/ year) or pool resources and purchase a campus wide license at a discount ($25K for 1 or $48K for 2 concurrent instances/ year, unlimited number of users). The more I use it (20hrs +) the more I feel it is a critical resource for interpreting almost every dataset coming off our genome sequencers. So the purpose of this email is to determine if we should pursue the campus wide license. If you would, answer the following by Tuesday PM. 1) Would you use the IPA software in the coming year? 2) If so, how many hours? Guesstimate: 0-4/month, 5-10/month, 11+/month? 3) How much would your lab contribute to purchasing an IPA campus license to pay for your guesstimated use? $2K, $4K, $6K? Thanks! David -- David Austin Nix, PhD Research Assistant Professor Dept. Oncological Sciences Co-Director Bioinformatics Shared Resource Huntsman Cancer Institute and the University of Utah 2000 Circle of Hope, Salt Lake City, UT 84112 Room HCI 3165 (801) 587-4611 dav...@hc... http://bioserver.hci.utah.edu |
From: David N. <Dav...@hc...> - 2014-03-10 18:55:36
|
Hello Peter, I see the issue. GQs are being parsed as integers and Java is seeing a float so it’s throwing an error. I’ve modified the classes to fix and will post the changes to sourceforge this PM or tomorrow AM, as USeq_8.7.8 -cheers, David -- David Austin Nix, PhD Huntsman Cancer Institute Dept. OncSci University of Utah Bioinformatics Shared Resource HCI 3165 (801) 587-4611 dav...@hc...<mailto:dav...@hc...> From: "White, Peter" <Pet...@na...<mailto:Pet...@na...>> Subject: The floating point in GQ is causing an error Date: March 6, 2014 at 4:17:37 AM MST To: "'use...@li...<mailto:use...@li...>'" <use...@li...<mailto:use...@li...>> I receive this error when running the following command: java -jar -Xmx22g USeq_8.7.6/Apps/VCFComparator -a GIAB_2.18.vcf -b GIAB_2.18.bed -c test.vcf -d GIAB_2.18.bed -p results It seems to be listing every record as having an issue with the GQ field: Skipping malformed VCF Record-> chr1 28563 . A G 747.81 TruthSensitivityTranche99.90to100.00 AC=2;AF=1.00;AN=2;DP=32;Dels=0.00;FS=0.000;HRun=0;HaplotypeScore=0.0000;MQ=27.68;MQ0=0;QD=23.37;SB=-344.16;VQSLOD=-4.1666;culprit=MQ GT:DP:GQ:PL 1/1:32:83.94:781,84,0 Error-> For input string: "83.94" Skipping malformed VCF Record-> chr1 28663 . T A 135.07 TruthSensitivityTranche99.90to100.00 AC=2;AF=1.00;AN=2;DP=7;Dels=0.00;FS=0.000;HRun=0;HaplotypeScore=0.0000;MQ=21.88;MQ0=0;QD=19.30;SB=-0.01;VQSLOD=-6.8165;culprit=DP GT:DP:GQ:PL1/1:7:18.02:168,18,0 Error-> For input string: "18.02" Aborting, problem parsing vcf file -> churchill.vcf java.lang.Exception: Too many malformed VCF Records. at edu.utah.seq.vcf.VCFParser.parseVCF(VCFParser.java:229) at edu.utah.seq.vcf.VCFParser.<init>(VCFParser.java:136) at edu.utah.seq.vcf.VCFComparator.parseFilterFiles(VCFComparator.java:436) at edu.utah.seq.vcf.VCFComparator.<init>(VCFComparator.java:69) at edu.utah.seq.vcf.VCFComparator.main(VCFComparator.java:515) Any ideas what might be causing it? Thanks, Peter Peter White, Ph.D. 
Principal Investigator, Center for Microbial Pathogenesis<http://www.nationwidechildrens.org/microbial-pathogens> Director, Biomedical Genomics Core<http://genomics.nchresearch.org/> Director of Molecular Bioinformatics, The Research Institute at Nationwide Children's Hospital<http://www.nationwidechildrens.org/pediatric-research> Assistant Professor of Pediatrics, The Ohio State University<http://pro.osumc.edu/profiles/white.1586/> |
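The fix David describes above — accepting float-valued GQ fields like "83.94" — can be sketched as follows. This is a guess at the approach, not the actual USeq code (which is Java):

```python
def parse_gq(field: str) -> int:
    """Parse a VCF GQ value leniently.

    The VCF spec types GQ as Integer, but some callers emit floats
    like "83.94"; a strict integer parse (Java's Integer.parseInt,
    or Python's int()) raises on those, so parse as float first and
    round to the nearest integer.
    """
    return int(round(float(field)))

# Both spec-conformant and float-valued GQ fields now parse:
print(parse_gq("84"), parse_gq("83.94"), parse_gq("18.02"))  # 84 84 18
```

Rounding versus truncating ("83.94" → 84 vs 83) is a judgment call; the point is that the strict integer parse is what throws `For input string: "83.94"` in the report above.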
From: David N. <Dav...@hc...> - 2014-03-06 14:16:26
|
Begin forwarded message: From: <use...@li...<mailto:use...@li...>> Subject: Useq-users post from pet...@na...<mailto:pet...@na...> requires approval Date: March 6, 2014 at 4:38:33 AM MST To: <use...@li...<mailto:use...@li...>> As list administrator, your authorization is requested for the following mailing list posting: List: Use...@li...<mailto:Use...@li...> From: pet...@na...<mailto:pet...@na...> Subject: The floating point in GQ is causing an error Reason: Post by non-member to a members-only list At your convenience, visit: https://lists.sourceforge.net/lists/admindb/useq-users to approve or deny the request. From: "White, Peter" <Pet...@na...<mailto:Pet...@na...>> Subject: The floating point in GQ is causing an error Date: March 6, 2014 at 4:17:37 AM MST To: "'use...@li...<mailto:use...@li...>'" <use...@li...<mailto:use...@li...>> I receive this error when running the following command: java -jar -Xmx22g USeq_8.7.6/Apps/VCFComparator -a GIAB_2.18.vcf -b GIAB_2.18.bed -c test.vcf -d GIAB_2.18.bed -p results It seems to be listing every record as having an issue with the GQ field: Skipping malformed VCF Record-> chr1 28563 . A G 747.81 TruthSensitivityTranche99.90to100.00 AC=2;AF=1.00;AN=2;DP=32;Dels=0.00;FS=0.000;HRun=0;HaplotypeScore=0.0000;MQ=27.68;MQ0=0;QD=23.37;SB=-344.16;VQSLOD=-4.1666;culprit=MQ GT:DP:GQ:PL 1/1:32:83.94:781,84,0 Error-> For input string: "83.94" Skipping malformed VCF Record-> chr1 28663 . T A 135.07 TruthSensitivityTranche99.90to100.00 AC=2;AF=1.00;AN=2;DP=7;Dels=0.00;FS=0.000;HRun=0;HaplotypeScore=0.0000;MQ=21.88;MQ0=0;QD=19.30;SB=-0.01;VQSLOD=-6.8165;culprit=DP GT:DP:GQ:PL1/1:7:18.02:168,18,0 Error-> For input string: "18.02" Aborting, problem parsing vcf file -> churchill.vcf java.lang.Exception: Too many malformed VCF Records. 
at edu.utah.seq.vcf.VCFParser.parseVCF(VCFParser.java:229) at edu.utah.seq.vcf.VCFParser.<init>(VCFParser.java:136) at edu.utah.seq.vcf.VCFComparator.parseFilterFiles(VCFComparator.java:436) at edu.utah.seq.vcf.VCFComparator.<init>(VCFComparator.java:69) at edu.utah.seq.vcf.VCFComparator.main(VCFComparator.java:515) Any ideas what might be causing it? Thanks, Peter Peter White, Ph.D. Principal Investigator, Center for Microbial Pathogenesis<http://www.nationwidechildrens.org/microbial-pathogens> Director, Biomedical Genomics Core<http://genomics.nchresearch.org/> Director of Molecular Bioinformatics, The Research Institute at Nationwide Children's Hospital<http://www.nationwidechildrens.org/pediatric-research> Assistant Professor of Pediatrics, The Ohio State University<http://pro.osumc.edu/profiles/white.1586/> |
From: White, P. <Pet...@na...> - 2014-03-06 11:38:32
|
I receive this error when running the following command: java -jar -Xmx22g USeq_8.7.6/Apps/VCFComparator -a GIAB_2.18.vcf -b GIAB_2.18.bed -c test.vcf -d GIAB_2.18.bed -p results It seems to be listing every record as having an issue with the GQ field: Skipping malformed VCF Record-> chr1 28563 . A G 747.81 TruthSensitivityTranche99.90to100.00 AC=2;AF=1.00;AN=2;DP=32;Dels=0.00;FS=0.000;HRun=0;HaplotypeScore=0.0000;MQ=27.68;MQ0=0;QD=23.37;SB=-344.16;VQSLOD=-4.1666;culprit=MQ GT:DP:GQ:PL 1/1:32:83.94:781,84,0 Error-> For input string: "83.94" Skipping malformed VCF Record-> chr1 28663 . T A 135.07 TruthSensitivityTranche99.90to100.00 AC=2;AF=1.00;AN=2;DP=7;Dels=0.00;FS=0.000;HRun=0;HaplotypeScore=0.0000;MQ=21.88;MQ0=0;QD=19.30;SB=-0.01;VQSLOD=-6.8165;culprit=DP GT:DP:GQ:PL1/1:7:18.02:168,18,0 Error-> For input string: "18.02" Aborting, problem parsing vcf file -> churchill.vcf java.lang.Exception: Too many malformed VCF Records. at edu.utah.seq.vcf.VCFParser.parseVCF(VCFParser.java:229) at edu.utah.seq.vcf.VCFParser.<init>(VCFParser.java:136) at edu.utah.seq.vcf.VCFComparator.parseFilterFiles(VCFComparator.java:436) at edu.utah.seq.vcf.VCFComparator.<init>(VCFComparator.java:69) at edu.utah.seq.vcf.VCFComparator.main(VCFComparator.java:515) Any ideas what might be causing it? Thanks, Peter Peter White, Ph.D. Principal Investigator, Center for Microbial Pathogenesis<http://www.nationwidechildrens.org/microbial-pathogens> Director, Biomedical Genomics Core<http://genomics.nchresearch.org/> Director of Molecular Bioinformatics, The Research Institute at Nationwide Children's Hospital<http://www.nationwidechildrens.org/pediatric-research> Assistant Professor of Pediatrics, The Ohio State University<http://pro.osumc.edu/profiles/white.1586/> |
From: David N. <Dav...@hc...> - 2014-03-05 18:19:37
|
Hello Ravi, Looks like the issue is with missing/no VCF records after filtering causing the app to abort. I’ve put in a bunch of catches for this. Give USeq_8.7.7 a try https://sourceforge.net/projects/useq/ It’s just going to abort with what I hope is an informative error message instead of the stack trace. Note for future reference, I’m hamstrung trying to debug using dummy datasets, see if you can post the real ones of interest next time. http://bioserver.hci.utah.edu/BioInfo/index.php/FAQ#How_do_I_report_a_problem_or_issue_with_a_USeq_application.3F -cheers, D -- David Austin Nix, PhD Huntsman Cancer Institute Dept. OncSci University of Utah Bioinformatics Shared Resource HCI 3165 (801) 587-4611 dav...@hc... On Mar 3, 2014, at 4:35 PM, Ravi Vijaya Satya - QIAGEN <Rav...@qi...> wrote: > I am using VCFComparator from the USeq_8.7.6 package. This tool is very useful in comparing with NIST GIB calls. However, there seems to be a bug in the code that is triggered by some bed files while others work fine. > > Exception in thread "main" java.lang.NegativeArraySizeException > at edu.utah.seq.vcf.VCFParser.filterVCFRecords(VCFParser.java:325) > at edu.utah.seq.vcf.VCFComparator.parseFilterFiles(VCFComparator.java:420) > at edu.utah.seq.vcf.VCFComparator.<init>(VCFComparator.java:69) > at edu.utah.seq.vcf.VCFComparator.main(VCFComparator.java:515) > > > In an effort to narrow the problem down, I supplied a VCF with no variant calls (but a header), and still got the error with this bed file. I am comparing against NIST 2.18 calls. 
My command line: > > > java -jar -Xmx22g /mnt/fdkbio05/rvijaya/software/USeq_8.7.6/Apps/VCFComparator -a ~/misc/NA12878_NIST/NISTIntegratedCalls_14datasets_131103_allcall_UGHapMerge_HetHomVarPASS_VQSRv2.18_all_nouncert_excludesimplerep_excludesegdups_excludedecoy_excludeRepSeqSTRs_noCNVs.vcf -b ~/misc/NA12878_NIST/union13callableMQonlymerged_addcert_nouncert_excludesimplerep_excludesegdups_excludedecoy_excludeRepSeqSTRs_noCNVs_v2.18_2mindatasets_5minYesNoRatio.bed -c dummy.vcf -d test.bed -p NIST_comparison > > Any pointers on fixing this problem? I am attaching the vcf and the bed files that I am using. > > Thanks, > Ravi<test.bed><dummy.vcf> _______________________________________________ > Useq-users mailing list > Use...@li... > https://lists.sourceforge.net/lists/listinfo/useq-users |
From: David N. <Dav...@hc...> - 2014-03-04 18:04:54
|
FYI Begin forwarded message: From: Janet Ellingson <jan...@ut...<mailto:jan...@ut...>> Subject: [chpc-hpc-users] CHPC Presentation TODAY - Using Python Date: March 4, 2014 at 10:59:07 AM MST To: "chp...@li...<mailto:chp...@li...>" <chp...@li...<mailto:chp...@li...>> Using Python for Scientific Computing March 4th, 2014, 1:00pm By: Wim Cardoen LOCATION: INSCC Auditorium (Room 110) TIME: 1:00 - 2:00 p.m. In this talk we will discuss several features which make Python a viable tool for scientific computing: * 1. strength and flexibility of the Python language * 2. mathematical libraries (numpy, scipy, ...) * 3. graphical libraries (matplotlib, ...) * 4. extending Python using C/C++ and Fortran |
From: Ravi V. S. - Q. <Rav...@qi...> - 2014-03-04 16:22:45
|
Hi David, Thanks for looking into this. If there are no GIB variants in my Bed regions, then all my calls should be reported as false positives, right? There might be individual bed intervals without GIB variants, but there are certainly some GIB variants in all my bed intervals put together. Thanks, Ravi -----Original Message----- From: David Austin Nix [mailto:dav...@gm...] Sent: Tuesday, March 04, 2014 11:17 AM To: Ravi Vijaya Satya - QIAGEN Cc: USeq Subject: Re: [Useq-users] Errors with VCF comparator Hello Ravi, I'll take a look. I suspect you don't have any variants after filtering. I'd first check if any variants are in your bed regions. -cheers, D On Mar 3, 2014, at 4:35 PM, Ravi Vijaya Satya - QIAGEN <Rav...@qi...> wrote: > I am using VCFComparator from the USeq_8.7.6 package. This tool very useful in comparing with NIST GIB calls. However, there seems to be a bug in the code that is triggered by some bed files while others work fine. > > Exception in thread "main" java.lang.NegativeArraySizeException > at edu.utah.seq.vcf.VCFParser.filterVCFRecords(VCFParser.java:325) > at edu.utah.seq.vcf.VCFComparator.parseFilterFiles(VCFComparator.java:420) > at edu.utah.seq.vcf.VCFComparator.<init>(VCFComparator.java:69) > at edu.utah.seq.vcf.VCFComparator.main(VCFComparator.java:515) > > > In an effort to narrow the problem down, I supplied a VCF with no variant calls (but a header), and still got the error with this bed file. I am comparing against NIST 2.18 calls. 
My command line: > > > java -jar -Xmx22g > /mnt/fdkbio05/rvijaya/software/USeq_8.7.6/Apps/VCFComparator -a > ~/misc/NA12878_NIST/NISTIntegratedCalls_14datasets_131103_allcall_UGHa > pMerge_HetHomVarPASS_VQSRv2.18_all_nouncert_excludesimplerep_excludese > gdups_excludedecoy_excludeRepSeqSTRs_noCNVs.vcf -b > ~/misc/NA12878_NIST/union13callableMQonlymerged_addcert_nouncert_exclu > desimplerep_excludesegdups_excludedecoy_excludeRepSeqSTRs_noCNVs_v2.18 > _2mindatasets_5minYesNoRatio.bed -c dummy.vcf -d test.bed -p > NIST_comparison > > Any pointers on fixing this problem? I am attaching the vcf and the bed files that I am using. > > Thanks, > Ravi<test.bed><dummy.vcf> |
From: David A. N. <dav...@gm...> - 2014-03-04 16:16:47
|
Hello Ravi, I’ll take a look. I suspect you don’t have any variants after filtering. I’d first check if any variants are in your bed regions. -cheers, D On Mar 3, 2014, at 4:35 PM, Ravi Vijaya Satya - QIAGEN <Rav...@qi...> wrote: > I am using VCFComparator from the USeq_8.7.6 package. This tool very useful in comparing with NIST GIB calls. However, there seems to be a bug in the code that is triggered by some bed files while others work fine. > > Exception in thread "main" java.lang.NegativeArraySizeException > at edu.utah.seq.vcf.VCFParser.filterVCFRecords(VCFParser.java:325) > at edu.utah.seq.vcf.VCFComparator.parseFilterFiles(VCFComparator.java:420) > at edu.utah.seq.vcf.VCFComparator.<init>(VCFComparator.java:69) > at edu.utah.seq.vcf.VCFComparator.main(VCFComparator.java:515) > > > In an effort to narrow the problem down, I supplied a VCF with no variant calls (but a header), and still got the error with this bed file. I am comparing against NIST 2.18 calls. My command line: > > > java -jar -Xmx22g /mnt/fdkbio05/rvijaya/software/USeq_8.7.6/Apps/VCFComparator -a ~/misc/NA12878_NIST/NISTIntegratedCalls_14datasets_131103_allcall_UGHapMerge_HetHomVarPASS_VQSRv2.18_all_nouncert_excludesimplerep_excludesegdups_excludedecoy_excludeRepSeqSTRs_noCNVs.vcf -b ~/misc/NA12878_NIST/union13callableMQonlymerged_addcert_nouncert_excludesimplerep_excludesegdups_excludedecoy_excludeRepSeqSTRs_noCNVs_v2.18_2mindatasets_5minYesNoRatio.bed -c dummy.vcf -d test.bed -p NIST_comparison > > Any pointers on fixing this problem? I am attaching the vcf and the bed files that I am using. > > Thanks, > Ravi<test.bed><dummy.vcf> |
From: Ravi V. S. - Q. <Rav...@qi...> - 2014-03-03 23:51:06
|
I am using VCFComparator from the USeq_8.7.6 package. This tool is very useful in comparing with NIST GIB calls. However, there seems to be a bug in the code that is triggered by some bed files while others work fine. Exception in thread "main" java.lang.NegativeArraySizeException at edu.utah.seq.vcf.VCFParser.filterVCFRecords(VCFParser.java:325) at edu.utah.seq.vcf.VCFComparator.parseFilterFiles(VCFComparator.java:420) at edu.utah.seq.vcf.VCFComparator.<init>(VCFComparator.java:69) at edu.utah.seq.vcf.VCFComparator.main(VCFComparator.java:515) In an effort to narrow the problem down, I supplied a VCF with no variant calls (but a header), and still got the error with this bed file. I am comparing against NIST 2.18 calls. My command line: java -jar -Xmx22g /mnt/fdkbio05/rvijaya/software/USeq_8.7.6/Apps/VCFComparator -a ~/misc/NA12878_NIST/NISTIntegratedCalls_14datasets_131103_allcall_UGHapMerge_HetHomVarPASS_VQSRv2.18_all_nouncert_excludesimplerep_excludesegdups_excludedecoy_excludeRepSeqSTRs_noCNVs.vcf -b ~/misc/NA12878_NIST/union13callableMQonlymerged_addcert_nouncert_excludesimplerep_excludesegdups_excludedecoy_excludeRepSeqSTRs_noCNVs_v2.18_2mindatasets_5minYesNoRatio.bed -c dummy.vcf -d test.bed -p NIST_comparison Any pointers on fixing this problem? I am attaching the vcf and the bed files that I am using. Thanks, Ravi |
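The NegativeArraySizeException above suggests the record array is sized after interval filtering without checking for zero survivors. A defensive sketch of that step (hypothetical; the real code is USeq's Java VCFParser, and its fix in USeq_8.7.7 may differ):

```python
def filter_records(records, intervals):
    """Keep records whose (chrom, pos) falls inside any interval.

    records:   iterable of (chrom, pos) tuples
    intervals: iterable of (chrom, start, end) tuples, half-open
    Returns the surviving records, failing loudly when none remain
    instead of crashing later with an opaque array-sizing error.
    """
    kept = [(c, p) for c, p in records
            if any(c == ic and s <= p < e for ic, s, e in intervals)]
    if not kept:
        raise ValueError("No VCF records remain after interval filtering; "
                         "check that the bed file overlaps the vcf.")
    return kept
```

The point is simply to detect the empty result (e.g. a header-only VCF, or bed regions with no overlapping calls) before any downstream array is sized from it.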
From: David A. N. <dav...@gm...> - 2014-02-25 16:03:13
|
Yes this is due to having different headers. You’ll need to replace the headers with the Picard ReplaceSamHeader.jar or re-run STP using the -h option. -cheers, D On Feb 24, 2014, at 5:52 PM, Joseph Whipple <jos...@bi...> wrote: > I'm having an issue running the GATK's RealignerTargetCreator on bam files generated using STP. I get the following error when I try to run the app: > > ##### ERROR MESSAGE: Input files reads and reference have incompatible contigs: Found contigs with the same name but different lengths: > ##### ERROR contig reads = chrI / 15172264 > ##### ERROR contig reference = chrI / 15072423. > ##### ERROR reads contigs = [chrX, chrI, chrV, chrIII, chrIV, chrM, chrII] > ##### ERROR reference contigs = [chrI, chrII, chrIII, chrIV, chrM, chrV, chrX] > > I saw in an earlier post that this error is caused by the bam file header having a different length than the reference fasta file. How would I change the header to fix this? Is the correct tool to use going to be picard ReplaceSamHeader.jar? Also, what are the appropriate values to change (i.e. should the length of each chr in the header be changed to match the length of each chr from the UCSC chr.fa)? > > Thanks for the help, > > Joe W |
From: Joseph W. <jos...@bi...> - 2014-02-25 01:27:32
|
I'm having an issue running the GATK's RealignerTargetCreator on bam files generated using STP. I get the following error when I try to run the app: ##### ERROR MESSAGE: Input files reads and reference have incompatible contigs: Found contigs with the same name but different lengths: ##### ERROR contig reads = chrI / 15172264 ##### ERROR contig reference = chrI / 15072423. ##### ERROR reads contigs = [chrX, chrI, chrV, chrIII, chrIV, chrM, chrII] ##### ERROR reference contigs = [chrI, chrII, chrIII, chrIV, chrM, chrV, chrX] I saw in an earlier post that this error is caused by the bam file header having a different length than the reference fasta file. How would I change the header to fix this? Is the correct tool to use going to be picard ReplaceSamHeader.jar? Also, what are the appropriate values to change (i.e. should the length of each chr in the header be changed to match the length of each chr from the UCSC chr.fa)? Thanks for the help, Joe W |
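What Picard's ReplaceSamHeader (or `samtools reheader`) must change here is the LN field of each @SQ header line, so it matches the reference the downstream tool uses. A minimal sketch operating on the text header (the chrI lengths are the ones from the error message; this is an illustration, not a replacement for those tools):

```python
def fix_sq_lengths(header_lines, ref_lengths):
    """Rewrite LN: in @SQ header lines to match a reference's lengths.

    header_lines: SAM header as a list of tab-delimited strings
    ref_lengths:  dict mapping sequence name -> length from the fasta
    @SQ lines are assumed well-formed (SN: and LN: fields present);
    sequences absent from ref_lengths pass through unchanged.
    """
    fixed = []
    for line in header_lines:
        if line.startswith("@SQ"):
            fields = line.split("\t")
            name = next(f[3:] for f in fields if f.startswith("SN:"))
            if name in ref_lengths:
                fields = [("LN:%d" % ref_lengths[name])
                          if f.startswith("LN:") else f
                          for f in fields]
            line = "\t".join(fields)
        fixed.append(line)
    return fixed

# chrI length from the BAM (15172264) corrected to the reference's 15072423
header = ["@HD\tVN:1.5\tSO:coordinate", "@SQ\tSN:chrI\tLN:15172264"]
print(fix_sq_lengths(header, {"chrI": 15072423}))
```

Note that patching LN only papers over the symptom: the ~100 kb discrepancy suggests the reads were aligned against a different build of the reference, so re-running STP with -h against the UCSC chr.fa (or realigning) is the safer fix.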
From: David A. N. <dav...@gm...> - 2013-12-19 16:25:25
|
Yes this is a common problem and why ChIPSeq detectors always need an input control. Better yet, compare two ChIPSeq samples. The problem is with overdispersion. With high read counts, minor changes in the counts between t and c will return a very significant p-value with a binomial test. Thus the popularity of a log2ratio threshold as well as a p-value/ FDR threshold. Better to use a negative binomial test with a robust estimation of the overdispersion. This necessitates replicas though, at least three for the chIP and input samples. That said my incorporation of the DESeq package into the USeq MultipleReplicaScanSeqs app didn't really solve the problem. ChIPSeq samples have such high variability that DESeq returns very few, if any, significantly enriched regions. So my current recommendations are to run the standard ScanSeqs and MultipleReplicaScanSeqs as well as the Liu lab's MACS package. The latter does some clever things to estimate the expected count (lambda) at each location instead of using a genome average. -cheers, D On Dec 16, 2013, at 5:25 PM, Noboru Jo Sakabe <ns...@uc...> wrote: > Hi David, analyzing a sample from my lab that seems to be a failed IP, I found a number of regions in the genome that have many reads, but are not real peaks. They are basically the same across samples, including inputs and failed IPs. One example region is mm9 chrX:166,393,669-166,477,668 (figure attached). > This is not a problem exclusive to USeq, but I was wondering if there's something that can be done to improve peak callers, since inputs are also "enriched" in these regions. I know that people have generated files containing regions that should be masked, but since input samples are also enriched, I wonder why they still come up as peaks. > Thank you. > > <fig.png><nsakabe.vcf> |
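David's point about the binomial test can be seen numerically: at high read depth, a fold change far below any biological threshold is still wildly "significant". A sketch under the simplifying null that a read is equally likely to come from treatment or control (i.e. equal library sizes are assumed, giving null p = 0.5; the counts are invented for illustration):

```python
import math

def binom_sf(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly via log-gamma."""
    log_p, log_q = math.log(p), math.log(1.0 - p)
    total = 0.0
    for i in range(k, n + 1):
        # log of C(n, i) * p^i * (1-p)^(n-i), exponentiated term by term
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_term)
    return total

t, c = 52000, 48000            # treatment/control reads in one window
pval = binom_sf(t, t + c)      # one-sided binomial test against p = 0.5
log2_ratio = math.log2(t / c)  # ~0.115, far below a typical 1.0 cutoff
# A ~8% count difference is astronomically significant at this depth,
# which is why a log2-ratio threshold is applied alongside the p-value.
```

With unequal library sizes the null p would be totalT/(totalT+totalC) rather than 0.5; the overdispersion argument is unchanged — a negative binomial model with replicate-estimated dispersion absorbs this biological variability, a plain binomial does not.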