VirTools Wiki
a low frequency Virus Variant detection pipeline for Illumina data.
Brought to you by:
jokereumers,
yveswetzels
I am having some trouble installing VirVarSeq. I should clarify that I am working on a cluster where I don't have root access, but I think I have successfully installed all of the required modules and packages to my local library. I installed VirVarSeq following your instructions without getting any error messages, but when I test the installation I run into a problem. When I enter:
sh run.sh
It runs and creates a VirVarSeq.log file. When I look at the VirVarSeq.log file, I see the following:
run.sh: line 11: ./map_vs_ref.pl: Permission denied
run.sh: line 12: ./consensus.pl: Permission denied
run.sh: line 13: ./map_vs_consensus.pl: Permission denied
run.sh: line 14: ./codon_table.pl: Permission denied
run.sh: line 15: ./mixture_model.pl: Permission denied
Any insight as to why I am having trouble running these perl scripts? Is this an issue I can solve without root access?
Please check if the Perl scripts (and the run.sh) have the
execute
permission.You can make the scripts executable using the
chmod
commandJust type
to make all Perl scripts executable
Do the same for the
run.sh
scriptSee how the
x
has been added for the owner/groupI re-created and uploaded the newly created .tar.gz file with the modified rights.
Hi Radko
Do you have all pre-requisites in place? (BWA, R, ...)
What OS are you running?
Could you please e-mail me the complete logfile of the VirVar seq run?
Kind Regards
Yves
Dear Yves Wetzels
Thank you for your answer. I attached log file. I use Linux. I am quite new Linux user, however, seems that I managed to install: Statistics::Basic (which means that Perl should work), "rmgt" (which means that R should work). I installed also BWA (with no error messages, however I do not know should I specify unix environment variable of BWA for VirVarSeq). About Fortran - It seems to me that your VirVarSeq files already contained this - anyway I did not instal anything associated with Fortran.
Best wishes,
Radko Avi
Hi Radko
It seems there is something wrong with
bwa
.In the logfile you should see something like
You said however that
bwa
was installed?Can you execute
on your command prompt in your linux session and show me the result?
Kind Regards
Yves
Hi Yves!
Thank you! It helped. I reinstalled BWA and it started to work (with your HCV test data).
I tried to use may own HIV data, and seems that I managed to get codon, indel and stat table. However I got error message in log file. I add VirVarSeq.log file to this letter. If necessary I can send also my data files.
It would be very nice if you can comment this.
Best wishes,
Radko Avi
Hi Radko Avi
I can see an error in your VirVarSeq.txt logfile in the
refine gapped alignments
phaseCan you confirm the VirVarSeq runs with the HCV test data?
Kind Regards
Yves
Hi Yves
Thank you for your answer, however I cound not entirely understand it. I tried to run the program again with your HCV test data and with my HIV data. Seems that program runs correctly with your HCV test data in my computer, but using HIV data I got error message into log file (however generated codon table file seems to be ok). I add both HCV and HIV log files as attachment.
It would be very nice if you could comment it.
Kind Regards
Radko
Hi Radko
Since the pipeline runs on our test HCV data proofs all is in place.
If I look at the error message it still seems to be a
bug
orproblem
in the alignerbwa
.If I google the error message "bwa_refine_gapped_core: Assertion
re - rb == rlen' failed" you
ll find several hits saying it is a bug ...Maybe you can contact the people who wrote the
bwa
tool?Kind Regards
Yves
Hi,
I can comment on this one. This is a bug introduced in bwa aln after version 0.7, which very unpredictably sometimes pops up. If you ask the developer of bwa, Heng Li, he will comment that bwa aln is no longer supported, only bwa-mem
If you want to use our pipeline, it is better to install an older version of bwa (0.6.2).
best,
Joke
Hi,
how can I find the number of mapped reads
best regards
Luca
Hi Luca
The number of mapped reads is not available as output.
The only thing I think of is counting the lines in the .sam file?
Kind Regards
Yves
2015-04-14 19:15 GMT+02:00 luca carioti lcu75@users.sf.net:
--
Tarwestraat 5
B-2200 Noorderwijk
M (+32) 478 391 556
@yves.wetzels@wetzels-consulting.be
Hi Yves
Thanks you for your answer.
I converted .sam to .bam file and then extracted the numer of mapped reads by samtools
Can this be appropriate?
Best,
Luca
Hi Luca
Yes, I think this is correct.
Cheers
Yves
2015-04-27 15:22 GMT+02:00 luca carioti lcu75@users.sf.net:
--
Tarwestraat 5
B-2200 Noorderwijk
M (+32) 478 391 556
@yves.wetzels@wetzels-consulting.be
Thanks
Luca
hi im trying to run the virvarseq pipline its starts good but than in the ./map_vs_ref.pl script its doenwt recognize my sample name, there is '*' insted of '.'
see here:
-2017-01-31 10:09:24 ./map_vs_ref.pl 1.4.1 Locate reads ...
ls: cannot access /home/efrat/Downloads/VirVarSeq/miseq5/fastq/Sample_/R1gz: No such file or directory
ls: cannot access /home/efrat/Downloads/VirVarSeq/miseq5/fastq/Sample_/R2gz: No such file or directory
this is all the log file:
017-01-31 10:09:12 ./map_vs_ref.pl 1.0 Checking command line arguments ...
2017-01-31 10:09:12 ./map_vs_ref.pl 1.1 Checking input and output locations ...
2017-01-31 10:09:12 ./map_vs_ref.pl |--- Check fastq dir ...
2017-01-31 10:09:12 ./map_vs_ref.pl |--- Check map_vs_ref dir ...
2017-01-31 10:09:12 ./map_vs_ref.pl 1.2 Load in reference ...
2017-01-31 10:09:12 ./map_vs_ref.pl 1.3 Get sample names from file [/home/efrat/Downloads/VirVarSeq/miseq5/samples.txt]...
2017-01-31 10:09:12 ./map_vs_ref.pl 1.4 Do read mapping and save SAM file ...
2017-01-31 10:09:12 ./map_vs_ref.pl 1.4.1 Locate reads ...
2017-01-31 10:09:13 ./map_vs_ref.pl 1.4.2 Map reads for sample [110901_6_12_random_1] ...
2017-01-31 10:09:13 ./map_vs_ref.pl !---- Paired end mapping using /home/efrat/Downloads/VirVarSeq/miseq5/fastq/Sample_110901_6_12_random_1/110901_6_12_random_1_CTTGTA_L006_R1_001.fastq.gz /home/efrat/Downloads/VirVarSeq/miseq5/fastq/Sample_110901_6_12_random_1/110901_6_12_random_1_CTTGTA_L006_R2_001.fastq.gz ...
2017-01-31 10:09:13 ./map_vs_ref.pl Start aligning read
[bwa_read_seq] 0.0% bases are trimmed.
[bwa_aln_core] calculate SA coordinate... 36.70 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 33933 sequences have been processed.
[main] Version: 0.7.5a-r405
[main] CMD: bwa aln -t 8 -q 15 -n 12 -k 6 -f /home/efrat/Downloads/VirVarSeq/miseq5/results/map_vs_ref/110901_6_12_random_1.1.sai /home/efrat/Downloads/VirVarSeq/miseq5/ref/HXB2.fasta /home/efrat/Downloads/VirVarSeq/miseq5/fastq/Sample_110901_6_12_random_1/110901_6_12_random_1_CTTGTA_L006_R1_001.fastq.gz
[main] Real time: 4.898 sec; CPU: 36.804 sec
2017-01-31 10:09:17 ./map_vs_ref.pl End aligning read
2017-01-31 10:09:17 ./map_vs_ref.pl Start aligning read
[bwa_read_seq] 0.0% bases are trimmed.
[bwa_aln_core] calculate SA coordinate... 36.01 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 33933 sequences have been processed.
[main] Version: 0.7.5a-r405
[main] CMD: bwa aln -t 8 -q 15 -n 12 -k 6 -f /home/efrat/Downloads/VirVarSeq/miseq5/results/map_vs_ref/110901_6_12_random_1.2.sai /home/efrat/Downloads/VirVarSeq/miseq5/ref/HXB2.fasta /home/efrat/Downloads/VirVarSeq/miseq5/fastq/Sample_110901_6_12_random_1/110901_6_12_random_1_CTTGTA_L006_R2_001.fastq.gz
[main] Real time: 4.804 sec; CPU: 36.116 sec
2017-01-31 10:09:22 ./map_vs_ref.pl End aligning read
2017-01-31 10:09:22 ./map_vs_ref.pl Start generating paired alignment
[bwa_read_seq] 0.0% bases are trimmed.
[bwa_read_seq] 0.0% bases are trimmed.
[bwa_sai2sam_pe_core] convert to sequence coordinate...
[infer_isize] (25, 50, 75) percentile: (142, 219, 311)
[infer_isize] low and high boundaries: 251 and 649 for estimating avg and std
[infer_isize] inferred external isize from 6751 pairs: 361.615 +/- 89.942
[infer_isize] skewness: 0.950; kurtosis: 0.175; ap_prior: 1.00e-05
[infer_isize] inferred maximum insert size: 793 (4.80 sigma)
[bwa_sai2sam_pe_core] time elapses: 0.09 sec
[bwa_sai2sam_pe_core] changing coordinates of 0 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_paired_sw] 6614 out of 7236 Q17 singletons are mated.
[bwa_paired_sw] 0 out of 10410 Q17 discordant pairs are fixed.
[bwa_sai2sam_pe_core] time elapses: 1.01 sec
[bwa_sai2sam_pe_core] refine gapped alignments... 0.08 sec
[bwa_sai2sam_pe_core] print alignments... 0.10 sec
[bwa_sai2sam_pe_core] 33933 sequences have been processed.
[main] Version: 0.7.5a-r405
[main] CMD: bwa sampe -P -f /home/efrat/Downloads/VirVarSeq/miseq5/results/map_vs_ref/110901_6_12_random_1.sam /home/efrat/Downloads/VirVarSeq/miseq5/ref/HXB2.fasta /home/efrat/Downloads/VirVarSeq/miseq5/results/map_vs_ref/110901_6_12_random_1.1.sai /home/efrat/Downloads/VirVarSeq/miseq5/results/map_vs_ref/110901_6_12_random_1.2.sai /home/efrat/Downloads/VirVarSeq/miseq5/fastq/Sample_110901_6_12_random_1/110901_6_12_random_1_CTTGTA_L006_R1_001.fastq.gz /home/efrat/Downloads/VirVarSeq/miseq5/fastq/Sample_110901_6_12_random_1/110901_6_12_random_1_CTTGTA_L006_R2_001.fastq.gz
[main] Real time: 1.653 sec; CPU: 1.520 sec
2017-01-31 10:09:24 ./map_vs_ref.pl End generating paired alignment
2017-01-31 10:09:24 ./map_vs_ref.pl 1.5 End.
2017-01-31 10:09:24 ./map_vs_ref.pl 1.4.1 Locate reads ...
ls: cannot access /home/efrat/Downloads/VirVarSeq/miseq5/fastq/Sample_/R1gz: No such file or directory
ls: cannot access /home/efrat/Downloads/VirVarSeq/miseq5/fastq/Sample_/R2gz: No such file or directory
2017-01-31 10:09:24 ./map_vs_ref.pl 1.4.2 Map reads for sample [] ...
2017-01-31 10:09:24 ./map_vs_ref.pl !---- Paired end mapping using ...
2017-01-31 10:09:24 ./map_vs_ref.pl Start aligning read
Usage: bwa aln [options] <prefix> <in.fq>
Options: -n NUM max #diff (int) or missing prob under 0.02 err rate (float) [-1.00]
-o INT maximum number or fraction of gap opens [1]
-e INT maximum number of gap extensions, -1 for disabling long gaps [-1]
-i INT do not put an indel within INT bp towards the ends [5]
-d INT maximum occurrences for extending a long deletion [10]
-l INT seed length [32]
-k INT maximum differences in the seed [6]
-m INT maximum entries in the queue [2000000]
-t INT number of threads [8]
-M INT mismatch penalty [3]
-O INT gap open penalty [11]
-E INT gap extension penalty [4]
-R INT stop searching when there are >INT equally best hits [30]
-q INT quality threshold for read trimming down to 35bp [15]
-f FILE file to write output to instead of stdout
-B INT length of barcode
-L log-scaled gap penalty for long deletions
-N non-iterative mode: search for all n-difference hits (slooow)
-I the input is in the Illumina 1.3+ FASTQ-like format
-b the input read file is in the BAM format
-0 use single-end reads only (effective with -b)
-1 use the 1st read in a pair (effective with -b)
-2 use the 2nd read in a pair (effective with -b)
-Y filter Casava-filtered sequences
2017-01-31 10:09:24 ./map_vs_ref.pl End aligning read
2017-01-31 10:09:24 ./map_vs_ref.pl Start aligning read
somone know why?
thanks
efrat
I think you have some trouble between the file samples.txt and the name of the folder where you localize the fastq.gz file
IHTH
Last edit: Selene Zarate 2017-10-12
Hi,
I avving some issues when running ./map_vs_ref.pl, which instead of generating the .sam file it sends the content to the log file, which makes other scripts in the pipeline fail.
Any ideas,
Thanks,
Selene
Hi,
After running your software on the test dataset you provided, I get the output codon table with the following headers:
SAMPLE POSITION REF_CODON CODON REF_AA AA FWD_CNT FWD_DENOM REV_CNT REV_DENOM FWD_MEAN_MIN_QUAL REV_MEAN_MIN_QUAL FWD_FREQ REV_FREQ LOW_FREQ LOW_FREQ_DIRECTION FWD_STDDEV_MIN_QUAL REV_STDDEV_MIN_QUAL CNT DENOM FREQ
What do each of these headers mean? Where can I find the explanation of these headers?
Thanks
Rahil
Greetings,
I' am trying to run the program but i am facing some issues.
I'am getting this error:
Undefined subroutine &Files::openFile calles at ./map_vsconsensus.pl line 44
and after some prints
Could not open ./testdata/results/map_vs_consensus/110901_6_12_random_1.sam
and after some prints again...
Error in file(file,"rt") : Cannot open the connection
Calls: read.delim -> read.table -> file
The program seems to work fine in the beggining but after a bit it fails with the above error messages.
Any suggestions? Thanks! (I'll attach the .log file too)
Hi Anastasios
Could you to a listing of the files in the 'results' directory like
[ywetzel@awsabiirl1105 VirVarSeq]$ ls -lrt testdata/results/*
testdata/results/map_vs_ref:
total 1104880
-rw-r--r-- 1 ywetzel zusers 17515360 Jul 29 16:32
110901_6_12_random_1.1.sai
-rw-r--r-- 1 ywetzel zusers 17735008 Jul 29 16:33
110901_6_12_random_1.2.sai
-rw-r--r-- 1 ywetzel zusers 530443090 Jul 29 16:34 110901_6_12_random_1.sam
-rw-r--r-- 1 ywetzel zusers 17515360 Jul 29 16:35
110901_6_12_random_2.1.sai
-rw-r--r-- 1 ywetzel zusers 17735008 Jul 29 16:37
110901_6_12_random_2.2.sai
-rw-r--r-- 1 ywetzel zusers 530443090 Jul 29 16:37 110901_6_12_random_2.sam
testdata/results/consensus:
total 88
-rw-r--r-- 1 ywetzel zusers 9628 Jul 29 16:39
110901_6_12_random_1_consensus.fa
-rw-r--r-- 1 ywetzel zusers 9628 Jul 29 16:40
110901_6_12_random_2_consensus.fa
-rw-r--r-- 1 ywetzel zusers 9708 Jul 29 16:40
110901_6_12_random_1_consensus.fa.bwt
-rw-r--r-- 1 ywetzel zusers 2403 Jul 29 16:40
110901_6_12_random_1_consensus.fa.pac
-rw-r--r-- 1 ywetzel zusers 49 Jul 29 16:40
110901_6_12_random_1_consensus.fa.ann
-rw-r--r-- 1 ywetzel zusers 9 Jul 29 16:40
110901_6_12_random_1_consensus.fa.amb
-rw-r--r-- 1 ywetzel zusers 4856 Jul 29 16:40
110901_6_12_random_1_consensus.fa.sa
-rw-r--r-- 1 ywetzel zusers 9708 Jul 29 16:44
110901_6_12_random_2_consensus.fa.bwt
-rw-r--r-- 1 ywetzel zusers 2403 Jul 29 16:44
110901_6_12_random_2_consensus.fa.pac
-rw-r--r-- 1 ywetzel zusers 49 Jul 29 16:44
110901_6_12_random_2_consensus.fa.ann
-rw-r--r-- 1 ywetzel zusers 9 Jul 29 16:44
110901_6_12_random_2_consensus.fa.amb
-rw-r--r-- 1 ywetzel zusers 4856 Jul 29 16:44
110901_6_12_random_2_consensus.fa.sa
testdata/results/map_vs_consensus:
total 1102824
-rw-r--r-- 1 ywetzel zusers 17584120 Jul 29 16:42
110901_6_12_random_1.1.sai
-rw-r--r-- 1 ywetzel zusers 17805280 Jul 29 16:43
110901_6_12_random_1.2.sai
-rw-r--r-- 1 ywetzel zusers 529253201 Jul 29 16:44 110901_6_12_random_1.sam
-rw-r--r-- 1 ywetzel zusers 17584120 Jul 29 16:45
110901_6_12_random_2.1.sai
-rw-r--r-- 1 ywetzel zusers 17805280 Jul 29 16:47
110901_6_12_random_2.2.sai
-rw-r--r-- 1 ywetzel zusers 529253201 Jul 29 16:47 110901_6_12_random_2.sam
testdata/results/codon_table:
total 2356
-rw-r--r-- 1 ywetzel zusers 1034 Jul 29 16:51 110901_6_12_random_1.stat
-rw-r--r-- 1 ywetzel zusers 2198944 Jul 29 16:51 110901_6_12_random_1.codon
-rw-r--r-- 1 ywetzel zusers 206310 Jul 29 16:54 110901_6_12_random_1.indel
...
...
It seems the alignment (bwa) was not succeeded succefully?
Kind Regards
Yves
Op ma 29 jul. 2019 om 14:56 schreef Anastasios Mx anastmx@users.sourceforge.net:
--
Tarwestraat 5
B-2200 Noorderwijk
M (+32) 478 391 556
@yves.wetzels@wetzels-consulting.be
Mr Yves,
Thanks for the quick response, the ls -lrt testdata/results/* command returned me this:
results/map_vs_consensus:
total 0
results/codon_table:
total 0
results/mixture_model:
total 0
results/map_vs_ref:
total 1104884
-rw-r--r-- 1 ubagen-2 ubagen-2 17515360 Jul 29 16:02 110901_6_12_random_1.1.sai
-rw-r--r-- 1 ubagen-2 ubagen-2 17735008 Jul 29 16:05 110901_6_12_random_1.2.sai
-rw-r--r-- 1 ubagen-2 ubagen-2 530442673 Jul 29 16:05 110901_6_12_random_1.sam
-rw-r--r-- 1 ubagen-2 ubagen-2 17515360 Jul 29 16:09 110901_6_12_random_2.1.sai
-rw-r--r-- 1 ubagen-2 ubagen-2 17735008 Jul 29 16:12 110901_6_12_random_2.2.sai
-rw-r--r-- 1 ubagen-2 ubagen-2 530442673 Jul 29 16:13 110901_6_12_random_2.sam
results/consensus:
total 24
-rw-r--r-- 1 ubagen-2 ubagen-2 9628 Jul 29 16:31 110901_6_12_random_1_consensus.fa
-rw-r--r-- 1 ubagen-2 ubagen-2 9628 Jul 29 16:32 110901_6_12_random_2_consensus.fa
If i type on my terminal (>bwa), it prints the bwa info, soo bwa seems to be in PATH and work (?)
Thanks in advance!
Last edit: Anastasios Mx 2019-08-01
Hi Anastasios
Since I don't see any files in the 'map_vs_consensus' it is likely that
something went wrong with the 3 step 'map_vs_consensus.pl'.
Can you please modify the run.sh script and put all commands in comment and
only execute the 'mapping versus consensus' step?
You could remove the redirection to the logfile to see what the script is
doing.
for example:
[ywetzel@awsabiirl1105 VirVarSeq]$ more run.sh
indir=./testdata/fastq
outdir=./testdata/results
samples=./testdata/samples.txt
ref=./testdata/ref/1b_con1_AJ238799.NCBI.fa
startpos=3112
endpos=5531
region_start=3420
region_len=181
qv=0
./map_vs_ref.pl --samplelist $samples --ref $ref --indir $indir --outdir
$outdir --mapping paired
./consensus.pl --samplelist $samples --ref $ref --indir $indir --outdir
$outdir --start $startpos --end $endpos
./map_vs_consensus.pl --samplelist $samples --indir $indir --outdir $outdir
--mapping paired
./codon_table.pl --samplelist $samples --ref $ref --outdir $outdir
--start=$region_start --len=$region_len --trimming=0 --qual=$qv >> VirVa
rSeq.log 2>&1
./mixture_model.pl --samplelist $samples --outdir $outdir --ref $ref
--start=$region_start --len=$region_len --qual=$qv >> VirVarSeq.log 2>
&1
and then run the script:
[ywetzel@awsabiirl1105 VirVarSeq]$ ./run.sh
2019-08-07 16:06:17 ./map_vs_consensus.pl 3.0 Checking command line
arguments ...
2019-08-07 16:06:17 ./map_vs_consensus.pl 3.1 Get sample names from
file [./testdata/samples.txt]...
2019-08-07 16:06:17 ./map_vs_consensus.pl 3.2 Checking input and
output locations ...
2019-08-07 16:06:17 ./map_vs_consensus.pl 3.2.1 Create local
[map_vs_consensus] dir ...
2019-08-07 16:06:17 ./map_vs_consensus.pl |--- Check fastq dir ...
2019-08-07 16:06:17 ./map_vs_consensus.pl |--- Check map_vs_consensus
dir ...
2019-08-07 16:06:17 ./map_vs_consensus.pl 3.3 Do read mapping and
save SAM file ...
2019-08-07 16:06:17 ./map_vs_consensus.pl 3.3.1 Locate reads ...
2019-08-07 16:06:17 ./map_vs_consensus.pl 3.3.2 Locate consensus for
[110901_6_12_random_1] ...
2019-08-07 16:06:17 ./map_vs_consensus.pl 3.3.3 Map reads for sample
[110901_6_12_random_1] ...
2019-08-07 16:06:17 ./map_vs_consensus.pl !---- Paired end mapping
using
./testdata/fastq/Sample_110901_6_12_random_1/110901_6_12_random_1_CTTGTA_L006_R1_001.fastq.gz
./testdata/fastq/Sample_110901_6_12_random_1/110901_6_12_random_1_CTTGTA_L006_R2_001.fastq.gz
...
2019-08-07 16:06:17 ./map_vs_consensus.pl Start aligning read
...
...
Kind Regards
Yves
Op do 1 aug. 2019 om 16:20 schreef Anastasios Mx anastmx@users.sourceforge.net:
--
Tarwestraat 5
B-2200 Noorderwijk
M (+32) 478 391 556
@yves.wetzels@wetzels-consulting.be