Hi developers,
I'm trying to use qsv on my tumor data.
Here is the content of my test .ini file:
[general]
log = DT.1.log
loglevel = DEBUG
sample = DT.1
sv_analysis = both
output = /home/beta/tmp
reference = /home/beta/pub/genome/dna.toplevel.withMaskedY.SimpleID.fa
platform = illumina
[pair]
pairing_type = pe
mapper = bwa-mem
[clip]
blatpath = /home/beta/bin/x86_64/
blatserver = 127.0.0.1
blatport = 3456
[test]
name = DT.1
input_file = /data/projects/bwamap/DT.1.out/DT.1.srt.dedup.withdbsnprealign.withdbsnpNchiprecal.bam
[control]
name = DH.1
input_file = /data/projects/bwamap/DH.1.out/DH.1.srt.dedup.withdbsnprealign.withdbsnpNchiprecal.bam
When I run:
java -Xmx40g -jar /full/path/qsv-0.3.jar -ini /full/path/test.ini -tmp /tmp/directory
I get:
(some lines of the form "16:20:13.027 [main] INFO org.qcmg.qsv.QSV ..." are omitted here)
usage: qsv [OPTIONS] --ini [ini_file] --tmp [temporary_directory]
org.qcmg.qsv.QSVException: No insert sizes were provided in the ini file
at org.qcmg.qsv.QSVParameters.getISizesFromIniFile(QSVParameters.java:177)
at org.qcmg.qsv.QSVParameters.<init>(QSVParameters.java:125)
at org.qcmg.qsv.QSVPipeline.setQSVParameters(QSVPipeline.java:142)
at org.qcmg.qsv.QSVPipeline.<init>(QSVPipeline.java:110)
at org.qcmg.qsv.QSV.runQSV(QSV.java:89)
at org.qcmg.qsv.QSV.main(QSV.java:36)
16:20:13.094 [main] SEVERE org.qcmg.qsv.QSV - org.qcmg.qsv.QSVException: No insert sizes were provided in the ini file
at org.qcmg.qsv.QSVParameters.getISizesFromIniFile(QSVParameters.java:177)
at org.qcmg.qsv.QSVParameters.<init>(QSVParameters.java:125)
at org.qcmg.qsv.QSVPipeline.setQSVParameters(QSVPipeline.java:142)
at org.qcmg.qsv.QSVPipeline.<init>(QSVPipeline.java:110)
at org.qcmg.qsv.QSV.runQSV(QSV.java:89)
at org.qcmg.qsv.QSV.main(QSV.java:36)
16:20:13.095 [main] EXEC org.qcmg.qsv.QSV - StopTime 2017-02-06 16:20:13
16:20:13.095 [main] EXEC org.qcmg.qsv.QSV - TimeTaken 00:00:00
16:20:13.095 [main] EXEC org.qcmg.qsv.QSV - ExitStatus 1
Could you please tell me how to write the ini file so that qsv calculates the insert size automatically?
Thanks!
Xuan
Hi Xuan,
At present, qSV will not automatically detect the insert size.
It used to do this, but we found that the results were problematic.
I have updated the wiki page to more accurately reflect this.
There are a number of different ways of getting this information externally, e.g. using qProfiler or Picard's CollectInsertSizeMetrics (http://broadinstitute.github.io/picard/command-line-overview.html#CollectInsertSizeMetrics).
You will then need to update your ini file with the gathered isize information and try running qsv again.
Please let me know if you experience any further issues.
Thanks for using qSV!
Cheers,
Oliver Holmes
I tested the previous bug and found that the qprofiler output file must have the .xml suffix.
However, the html file may also have a bug. You use the systemsbiology-visualizations utility hosted on Google Code to show the graphs and information, but Google Code has been shut down and the project has moved to GitHub: https://github.com/IlyaLab/systemsbiology-visualizations
Please update the related q-software :)
Hello Oliver,
Thank you for helping me and sharing your tool!
Regarding insert size, CollectInsertSizeMetrics in Picard can't give results per read group.
ReadGroupProperties in GATK can only give the median isize per RG.
samtools stats -S RG the.bam can give the mean isize and SD per RG.
qProfiler seems OK, but its xml output is hard to read, and qvisualise doesn't work on my server. Log output:
11:05:04.749 [main] EXEC org.qcmg.qvisualise.QVisualise - Uuid fdb9da05_0d9c_4578_8b59_1b52de2e518f
11:05:04.749 [main] EXEC org.qcmg.qvisualise.QVisualise - StartTime 2017-02-09 11:05:04
11:05:04.749 [main] EXEC org.qcmg.qvisualise.QVisualise - OsName Linux
11:05:04.749 [main] EXEC org.qcmg.qvisualise.QVisualise - OsArch amd64
11:05:04.750 [main] EXEC org.qcmg.qvisualise.QVisualise - OsVersion 3.10.0-229.20.1.el7.x86_64
11:05:04.750 [main] EXEC org.qcmg.qvisualise.QVisualise - RunBy wangxuan
11:05:04.750 [main] EXEC org.qcmg.qvisualise.QVisualise - ToolName qvisualise
11:05:04.750 [main] EXEC org.qcmg.qvisualise.QVisualise - ToolVersion 0.1pre (118)
11:05:04.750 [main] EXEC org.qcmg.qvisualise.QVisualise - CommandLine qvisualise -i /home/wangxuan/tmp/qp.out -log /home/wangxuan/tmp/qp.log
11:05:04.750 [main] EXEC org.qcmg.qvisualise.QVisualise - JavaHome /home/wangxuan/software/jre1.8.0_111
11:05:04.750 [main] EXEC org.qcmg.qvisualise.QVisualise - JavaVendor Oracle Corporation
11:05:04.750 [main] EXEC org.qcmg.qvisualise.QVisualise - JavaVersion 1.8.0_111
11:05:04.750 [main] EXEC org.qcmg.qvisualise.QVisualise - host centaurus
11:05:04.793 [main] WARNING org.qcmg.qprofiler.QProfiler - qVisualise failed for qprofiler output: /home/wangxuan/tmp/qp.out
11:05:04.793 [main] EXEC org.qcmg.qprofiler.QProfiler - StopTime 2017-02-09 11:05:04
11:05:04.793 [main] EXEC org.qcmg.qprofiler.QProfiler - TimeTaken 00:21:09
11:05:04.793 [main] EXEC org.qcmg.qprofiler.QProfiler - ExitStatus 0
Could you please help me again?
BTW, what exactly do the "upper/lower" values mean? Which statistic are they based on: SD, MAD, or some percentile?
If I use qsv in my paper, which paper should I cite? This one?
http://www.biotechniques.com/BiotechniquesJournal/2014/July/A-workflow-to-increase-verification-rate-of-chromosomal-structural-rearrangements-using-high-throughput-next-generation-sequencing/biotechniques-352784.html
Cheers,
Xuan
Hi Xuan,
If you run Picard with the "METRIC_ACCUMULATION_LEVEL" option set to "READ_GROUP" then you should hopefully get readgroup specific isize metrics.
We are hoping to release a newer version of qprofiler and qvisualise soon, which should address the issues that you are experiencing.
The Kelly Quek paper that you linked to is fine for citing qsv, as qsv itself has yet to be published...
Hope this helps.
Cheers,
Oliver Holmes
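For batch use, the per-read-group metrics suggested above can be read from the Picard output file with a few lines of Python. This is only a minimal sketch: it assumes the usual Picard metrics layout (a "## METRICS CLASS" line, then a tab-separated header row and data rows, ended by a blank line) and assumes that rows for individual read groups have a non-empty READ_GROUP column. The column names and numbers in the example are illustrative, so please check them against your own file.

```python
import csv
import io


def parse_picard_metrics(text):
    """Return the metrics table of a Picard metrics file as a list of dicts."""
    lines = text.splitlines()
    # The table header is assumed to sit on the line after "## METRICS CLASS".
    start = next(i for i, line in enumerate(lines)
                 if line.startswith("## METRICS CLASS")) + 1
    table = []
    for line in lines[start:]:
        if not line.strip():  # a blank line ends the metrics table
            break
        table.append(line)
    return list(csv.DictReader(io.StringIO("\n".join(table)), delimiter="\t"))


def per_read_group(rows):
    """Keep only the rows that describe a single read group."""
    return {row["READ_GROUP"]: row for row in rows if row.get("READ_GROUP")}


# Invented example content, not real Picard output.
EXAMPLE = (
    "## METRICS CLASS\tpicard.analysis.InsertSizeMetrics\n"
    "MEDIAN_INSERT_SIZE\tMEAN_INSERT_SIZE\tSTANDARD_DEVIATION\tSAMPLE\tLIBRARY\tREAD_GROUP\n"
    "300\t305.2\t40.1\t\t\t\n"
    "298\t303.0\t39.5\tS1\tL1\trg1\n"
    "\n"
    "## HISTOGRAM\tjava.lang.Integer\n"
)

if __name__ == "__main__":
    for rg, row in per_read_group(parse_picard_metrics(EXAMPLE)).items():
        print(rg, row["MEAN_INSERT_SIZE"], row["STANDARD_DEVIATION"])
```

The values for each read group can then be copied into the ini file by hand or by a script.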
Oliver, could you please tell me the definition of the "upper/lower" insert size values? How should I set them when I get so many statistics from Picard?
Thanks a lot!
Hi Xuan,
You want to set the insert size bounds so that the majority of reads would be within the range.
For example, in the following plot, a lower value of 100, and an upper value of 450 would be reasonable.
https://duckduckgo.com/?q=picard+CollectInsertSizeMetrics&ia=images&iax=1&iai=http%3A%2F%2Fblog.amelieff.jp%2Fimages%2Finsertsize.png
Hope this helps,
Oliver Holmes
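If reading the bounds off a plot is not practical for many samples, the same idea can be applied to the insert size histogram directly. The sketch below is only one possible interpretation of "the majority of reads": it trims an equal fraction of read pairs from each tail, and the 98% default coverage is an assumption made here, not a value defined by qsv. The histogram in the example is invented.

```python
def isize_bounds(histogram, coverage=0.98):
    """Return (lower, upper) insert sizes enclosing the central `coverage`
    fraction of read pairs.

    `histogram` maps insert size -> number of read pairs.
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    total = sum(histogram.values())
    tail = total * (1 - coverage) / 2  # pairs to trim from each end
    sizes = sorted(histogram)

    lower = upper = None
    running = 0
    for size in sizes:  # walk up from the smallest insert size
        running += histogram[size]
        if running > tail:
            lower = size
            break
    running = 0
    for size in reversed(sizes):  # walk down from the largest insert size
        running += histogram[size]
        if running > tail:
            upper = size
            break
    return lower, upper


if __name__ == "__main__":
    example = {100: 1, 200: 10, 300: 78, 400: 10, 500: 1}  # invented counts
    print(isize_bounds(example, coverage=0.9))
```

A lower coverage gives tighter bounds and therefore more pairs classed as abnormal, so it is worth checking the result against the plot for at least one sample.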
Hi Oliver,
Visual estimation may not be very accurate or usable in batch, but I did that.
Now clip mode is okay, but pair mode still has a problem.
I use
bwa mem -a -M -R "my readgroup" ref.fa read1.fq read2.fq >out.sam
to align all of my HiSeq X Ten paired-end data. Is that okay for qsv?
Then I remove duplicates with Picard, realign indels and do BQSR with GATK. In the end, one of my reads looks like this:
ST-E00144:112:HF5FJCCXX:2:2123:14702:21895 163 1 135 0 4S146M = 193 208 AAAACATCTTACTTTTGAGAGTTGAGCTGACCCCCAGTCCCTCACAGTTCCACACTGCCTGCAGAGTGAGTTTCCCATGTCTTCACCAGAGACTTTTGCCAGAGGCTTCTGAGACGCAAGTTAACAATGCAGACATGGAGGGTATCTCCA =<,<;===>==:=<==<<=7=:;=:=;;==<:::==<:=6;>:=<==<=/;=<=<>=;:>;8====<===-7+<618<=<:<===<:==6=;<>====<=====<;>=9<=;>;=9698=<=>),:),4-)+9<)7=69=<=7;<0<89= MC:Z:150M BD:Z:DD>>IGKJGFDIGE==FHECDGEEGEIIHHGGAAAGHHHGAGEHDDHHFEGGDDDGHIGGHIIHDEHDHEHF=EGAGHIEHEEEHDGGHDEDGGE==FIGGHDEGGIEEEHHEEHGFJHEIGDEFEHFJJJIFIFJKJHHJDJLJMEEGG MD:Z:9G0C13C4T100C15 PG:Z:MarkDuplicates RG:Z:DT.1.novo.250_L2 BI:Z:GGDDGGHHFEFFHEBBFHEEEHFFHEHHGHEFCCCHHHGFCGEHDFHHFFFHDFDHGFFGGFHHEEHFHEIFCGGDIHHGHGFGIEGIIFFFFIFCCGGGIIFFHHIFGGHIFFFHGIHGJHHHGHIIIHJJGGIJJJJIKHKKKKGFGI NM:i:5 MQ:i:0 AS:i:121 XS:i:125
bwa mem doesn't output the SM, NH, X0 and XA fields anymore, so qsv stopped with this message:
13:22:45.443 [pool-1-thread-2] SEVERE org.qcmg.qsv.annotate.AnnotateFilterMT - org.qcmg.qsv.QSVException: No discordant pair records passed the filter.
at org.qcmg.qsv.annotate.AnnotateFilterMT.run(AnnotateFilterMT.java:182)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
usage: qsv [OPTIONS] --ini [ini_file] --tmp [temporary_directory]
org.qcmg.qsv.QSVException: Exception observed when annotating and filtering BAMs
at org.qcmg.qsv.QSVPipeline.annotateAndFilterBams(QSVPipeline.java:414)
at org.qcmg.qsv.QSVPipeline.runPipeline(QSVPipeline.java:237)
at org.qcmg.qsv.QSV.runQSV(QSV.java:90)
at org.qcmg.qsv.QSV.main(QSV.java:36)
13:22:45.444 [main] SEVERE org.qcmg.qsv.QSV - org.qcmg.qsv.QSVException: Exception observed when annotating and filtering BAMs
at org.qcmg.qsv.QSVPipeline.annotateAndFilterBams(QSVPipeline.java:414)
at org.qcmg.qsv.QSVPipeline.runPipeline(QSVPipeline.java:237)
at org.qcmg.qsv.QSV.runQSV(QSV.java:90)
at org.qcmg.qsv.QSV.main(QSV.java:36)
13:22:45.444 [main] EXEC org.qcmg.qsv.QSV - StopTime 2017-02-19 13:22:45
13:22:45.444 [main] EXEC org.qcmg.qsv.QSV - TimeTaken 00:27:25
13:22:45.445 [main] EXEC org.qcmg.qsv.QSV - ExitStatus 1
I think we need another filter query that is compatible with the widely used bwa-mem. What's your opinion?
Best wishes!
Xuan
Last edit: xuan wang 2017-02-19
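To see what a filter has to work with, it can help to decode the flag and list the optional tags of a record like the one quoted above. The following is a rough sketch that only follows the general SAM format (tab-separated fields, optional tags from column 12 onwards); it does not reproduce qsv's filter query. The example line reuses the flag, MAPQ, template length and three of the tags from the post, with a shortened read name, sequence and quality string.

```python
# Flag bits as defined in the SAM format specification.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def describe_sam_record(line):
    """Decode the flag, MAPQ, template length and optional tags of one SAM line."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    tags = {}
    for item in fields[11:]:  # optional fields look like TAG:TYPE:VALUE
        tag, _type, value = item.split(":", 2)
        tags[tag] = value
    return {
        "flags": [name for bit, name in FLAG_BITS.items() if flag & bit],
        "mapq": int(fields[4]),
        "tlen": int(fields[8]),
        "tags": tags,
    }


if __name__ == "__main__":
    # Shortened, illustrative record (not the full read from the post).
    example = "r1\t163\t1\t135\t0\t4S146M\t=\t193\t208\tACGT\tIIII\tNM:i:5\tAS:i:121\tXS:i:125"
    record = describe_sam_record(example)
    print(record["flags"], record["mapq"], sorted(record["tags"]))
```

For this record the flag decodes to a properly paired read with mapping quality 0, and none of SM, NH, X0 or XA is among its tags, so any filter that relies on those tags would have nothing to match.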
Hi Xuan,
I agree; you will need to adjust the filters to suit your BAM files.
Details on the filtering options are available on the wiki:
https://sourceforge.net/p/adamajava/wiki/qsv/#filter-options
Please let me know if you have any further questions.
Cheers,
Oliver Holmes