Menu

Error running lofreq somatic and indel calling handler on normal tumor bam files

2015-01-14
2015-01-19
  • chris.cornor

    chris.cornor - 2015-01-14

    Hi Andreas,

    I am trying to use lofreq for somatic indel calling using the somatic and --call-indels handler for my normal and tumor bam files. I downloaded the version lofreq_star-2.1.0_linux-x86-64.tgz. I am using it on our institute cluster Linux version 3.2.0-4-amd64. I have just unpacked the software and then tried to run it. The command am using is below

    /scratch/GT/softwares/lofreq_star-2.1.0/bin/lofreq somatic --threads 8 -n /scratch/GT/vdas/pietro/exome_seq/results/N_S8980/N_S8980.realigned.recal.bam -t /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998.realigned.recal.bam -f /scratch/GT/vdas/test_exome/exome/hg19.fa --call-indels -l /scratch/GT/vdas/referenceBed/hg19/ss_v4/SureSelect_XT_Human_All_Exon_V4.bed -d /scratch/GT/vdas/test_exome/exome/databases/dbsnp_137.hg19.vcf -o /scratch/GT/vdas/pietro/exome_seq/results/lofreq_out/out_HG_

    N_S8980.realigned.recal.bam is my normal bam

    T_S7998.realigned.recal.bam is my tumor bam

    Since it is an exome data am using the bed file of the SureSelect V4 and also am using the dbSNP file to remove the variants from the output.

    I am getting the following error.

    WARNING [2015-01-14 17:09:00,626]: Using stringent normal for SQ which does not however contain indels!
    CRITICAL [2015-01-14 17:09:00,724]: The following command failed with code 1: lofreq call-parallel --pp-threads 8 -d 101000 -f /scratch/GT/vdas/test_exome/exome/hg19.fa --verbose --no-default-filter -b 1 -l /scratch/GT/vdas/referenceBed/hg19/ss_v4/SureSelect_XT_Human_All_Exon_V4.bed --call-indels -a 0.100000 --use-orphan -B -N -A /scratch/GT/vdas/pietro/exome_seq/results/N_S8980/N_S8980.realigned.recal.bam -o /scratch/GT/vdas/pietro/exome_seq/results/lofreq_out/out_HG_normal_relaxed.vcf.gz
    CRITICAL [2015-01-14 17:09:00,724]: Received the following on stderr:
    INFO [2015-01-14 17:09:00,653]: Using 8 threads with following basic args: lofreq call -d 101000 -f /scratch/GT/vdas/test_exome/exome/hg19.fa --verbose --no-default-filter -b 1 --call-indels -a 0.100000 --use-orphan -B -N -A /scratch/GT/vdas/pietro/exome_seq/results/N_S8980/N_S8980.realigned.recal.bam

    Traceback (most recent call last):

    File "/scratch/GT/softwares/lofreq_star-2.1.0/bin/lofreq2_call_pparallel.py", line 751, in <module>

    main()
    

    File "/scratch/GT/softwares/lofreq_star-2.1.0/bin/lofreq2_call_pparallel.py", line 563, in main

    bed_bins = [Region._make(x) for x in read_bed_coords(bed_file)]
    

    File "/scratch/GT/softwares/lofreq_star-2.1.0/bin/lofreq2_call_pparallel.py", line 106, in read_bed_coords

    (chrom, start, end) = line.split("\t")[0:3]
    

    ValueError: need more than 1 value to unpack

    CRITICAL [2015-01-14 17:09:00,724]: Somatic SNV caller failed. Exiting

    Can you please tell me where am getting wrong and how to get rid of this error. I guess the software I downloaded if fine for my system.

    Regards,

    Chris

     
    • Andreas Wilm

      Andreas Wilm - 2015-01-15

      Hi Chris,

      this looks like an unhandled error triggered in the bed reading
      function. Is your bed file in any way unusual? Would you mind to share
      it, either here or via PM (wilma@gis.a-star.edu.sg)? The problem
      should be easy to fix then.

      Andreas

      On 15 January 2015 at 01:45, chris.cornor vd4mmind@users.sf.net wrote:

      Hi Andreas,

      I am trying to use lofreq for somatic indel calling using the somatic and
      --call-indels handler for my normal and tumor bam files. I downloaded the
      version lofreq_star-2.1.0_linux-x86-64.tgz. I am using it on our institute
      cluster Linux version 3.2.0-4-amd64. I have just unpacked the software and
      then tried to run it. The command am using is below

      /scratch/GT/softwares/lofreq_star-2.1.0/bin/lofreq somatic --threads 8 -n
      /scratch/GT/vdas/pietro/exome_seq/results/N_S8980/N_S8980.realigned.recal.bam
      -t
      /scratch/GT/vdas/pietro/exome_seq/results/T_S7998/T_S7998.realigned.recal.bam
      -f /scratch/GT/vdas/test_exome/exome/hg19.fa --call-indels -l
      /scratch/GT/vdas/referenceBed/hg19/ss_v4/SureSelect_XT_Human_All_Exon_V4.bed
      -d /scratch/GT/vdas/test_exome/exome/databases/dbsnp_137.hg19.vcf -o
      /scratch/GT/vdas/pietro/exome_seq/results/lofreq_out/out_HG_

      N_S8980.realigned.recal.bam is my normal bam

      T_S7998.realigned.recal.bam is my tumor bam

      Since it is an exome data am using the bed file of the SureSelect V4 and
      also am using the dbSNP file to remove the variants from the output.

      I am getting the following error.

      WARNING [2015-01-14 17:09:00,626]: Using stringent normal for SQ which does
      not however contain indels!
      CRITICAL [2015-01-14 17:09:00,724]: The following command failed with code
      1: lofreq call-parallel --pp-threads 8 -d 101000 -f
      /scratch/GT/vdas/test_exome/exome/hg19.fa --verbose --no-default-filter -b 1
      -l
      /scratch/GT/vdas/referenceBed/hg19/ss_v4/SureSelect_XT_Human_All_Exon_V4.bed
      --call-indels -a 0.100000 --use-orphan -B -N -A
      /scratch/GT/vdas/pietro/exome_seq/results/N_S8980/N_S8980.realigned.recal.bam
      -o
      /scratch/GT/vdas/pietro/exome_seq/results/lofreq_out/out_HG_normal_relaxed.vcf.gz
      CRITICAL [2015-01-14 17:09:00,724]: Received the following on stderr:
      INFO [2015-01-14 17:09:00,653]: Using 8 threads with following basic args:
      lofreq call -d 101000 -f /scratch/GT/vdas/test_exome/exome/hg19.fa --verbose
      --no-default-filter -b 1 --call-indels -a 0.100000 --use-orphan -B -N -A
      /scratch/GT/vdas/pietro/exome_seq/results/N_S8980/N_S8980.realigned.recal.bam

      Traceback (most recent call last):

      File
      "/scratch/GT/softwares/lofreq_star-2.1.0/bin/lofreq2_call_pparallel.py",
      line 751, in <module>

      main()

      File
      "/scratch/GT/softwares/lofreq_star-2.1.0/bin/lofreq2_call_pparallel.py",
      line 563, in main

      bed_bins = [Region._make(x) for x in read_bed_coords(bed_file)]

      File
      "/scratch/GT/softwares/lofreq_star-2.1.0/bin/lofreq2_call_pparallel.py",
      line 106, in read_bed_coords

      (chrom, start, end) = line.split("\t")[0:3]

      ValueError: need more than 1 value to unpack

      CRITICAL [2015-01-14 17:09:00,724]: Somatic SNV caller failed. Exiting

      Can you please tell me where am getting wrong and how to get rid of this
      error. I guess the software I downloaded if fine for my system.

      Regards,

      Chris


      Error running lofreq somatic and indel calling handler on normal tumor bam
      files


      Sent from sourceforge.net because you indicated interest in
      https://sourceforge.net/p/lofreq/discussion/general/

      To unsubscribe from further messages, please visit
      https://sourceforge.net/auth/subscriptions/

      --
      Andreas Wilm
      andreas.wilm@gmail.com | mail@andreas-wilm.com | 0x7C68FBCC

       
  • chris.cornor

    chris.cornor - 2015-01-15

    I am not sure how I can share the bed file, can i host it up anywhere? its almost 180 mb I guess. it is the sure select bed file which I have been supplied with the kit. I also have a .list of the same file, can that be used or I have to use the .bed format itself?

     
  • chris.cornor

    chris.cornor - 2015-01-15

    I have used the S03723314_Covered.bed file which you can download from the agilent website. And this does not work. So you recommend to just create a custom bed file with coordinates and then run the command?

     
  • chris.cornor

    chris.cornor - 2015-01-15

    The format of the bed file looks like this

    head -5 S03723314_Covered.bed

    browser position chr1:762097-762270

    track name="Covered" description="Agilent SureSelect DNA - SureSelect #Human All Exon V4- Genomic regions covered by probes" color=0,0,128

    chr1 762097 762270 #ref|LINC00115,ens|ENST00000473798,ens|ENST00000536430,ref|NR_024321,mRNA|AK026292,mRNA|BC017762,mRNA|BC017762,mRNA|AK026292,ref|NR_024321,ens|ENST00000536430,ens|ENST00000473798,ref|LINC00115

    chr1 861281 861490 #ref|SAMD11,ccds|CCDS2.2,ens|ENST00000420190,ens|ENST00000437963,ens|ENST00000342066,ref|NM_152486,mRNA|AF161376,mRNA|AF161376,ref|NM_152486,ens|ENST00000342066,ens|ENST00000437963,ens|ENST00000420190,ccds|CCDS2.2,ref|SAMD11

    chr1 865591 865791 #ref|SAMD11,ccds|CCDS2.2,ens|ENST00000420190,ens|ENST00000437963,ens|ENST00000342066,ens|ENST00000341065,ref|NM_152486,mRNA|AF161376,mRNA|AF161376,ref|NM_152486,ens|ENST00000341065,ens|ENST00000342066,ens|ENST00000437963,ens|ENST00000420190,ccds|CCDS2.2,ref|SAMD11

    This is a standard bed format for the agilent kits.

     

    Last edit: chris.cornor 2015-01-15
    • Andreas Wilm

      Andreas Wilm - 2015-01-15

      Oh ok. That looks like an extension of the bed format. LoFreq (and
      samtools) expect something much simpler as defined here
      http://genome.ucsc.edu/FAQ/FAQformat.html#format1:
      1st column chromosome name
      2nd column region start (zero based)
      3rd column region end (inclusive)
      All other columns are optional.
      You have unexpected rows in your data needed for the UCSC track definition.

      Anyway, the following will produce a version of your file that should work:
      $ grep -v ^browser S03723314_Covered.bed | grep -v ^track > new.bed

      Try the created file called "new.bed" instead of the original bed
      file. Let me know if that works and I will make corresponding changes
      in the LoFreq code.

      Thanks,
      Andreas

      On 15 January 2015 at 19:15, chris.cornor vd4mmind@users.sf.net wrote:

      The format of the bed file looks like this

      head -5 S03723314_Covered.bed

      browser position chr1:762097-762270

      track name="Covered" description="Agilent SureSelect DNA - SureSelect Human
      All Exon V4- Genomic regions covered by probes" color=0,0,128
      chr1 762097 762270
      ref|LINC00115,ens|ENST00000473798,ens|ENST00000536430,ref|NR_024321,mRNA|AK026292,mRNA|BC017762,mRNA|BC017762,mRNA|AK026292,ref|NR_024321,ens|ENST00000536430,ens|ENST00000473798,ref|LINC00115
      chr1 861281 861490
      ref|SAMD11,ccds|CCDS2.2,ens|ENST00000420190,ens|ENST00000437963,ens|ENST00000342066,ref|NM_152486,mRNA|AF161376,mRNA|AF161376,ref|NM_152486,ens|ENST00000342066,ens|ENST00000437963,ens|ENST00000420190,ccds|CCDS2.2,ref|SAMD11
      chr1 865591 865791
      ref|SAMD11,ccds|CCDS2.2,ens|ENST00000420190,ens|ENST00000437963,ens|ENST00000342066,ens|ENST00000341065,ref|NM_152486,mRNA|AF161376,mRNA|AF161376,ref|NM_152486,ens|ENST00000341065,ens|ENST00000342066,ens|ENST00000437963,ens|ENST00000420190,ccds|CCDS2.2,ref|SAMD11#

      This is a standard bed format for the agilent kits.


      Error running lofreq somatic and indel calling handler on normal tumor bam
      files


      Sent from sourceforge.net because you indicated interest in
      https://sourceforge.net/p/lofreq/discussion/general/

      To unsubscribe from further messages, please visit
      https://sourceforge.net/auth/subscriptions/

      --
      Andreas Wilm
      andreas.wilm@gmail.com | mail@andreas-wilm.com | 0x7C68FBCC

       
  • chris.cornor

    chris.cornor - 2015-01-15

    Yes the tool is working now, I modified the bed file of the company to a general bed format as of mentioned in UCSC just keeping the chr start and end coordinates and it seem to run , am waiting for the output now. Just to ask you a thing, this vcf are zipped , to annotate I usually use VEP for retrieving maximum annotated variants but it does not work with pakced vcf. I will try to unzip it once the run is over. Just to ask you any annotator that can directly accept vcf zipped files?

     
  • chris.cornor

    chris.cornor - 2015-01-15

    I am watching 3 different outout, 2 is of snvs ( one is relaxed and other is stringent) and one stringent for the indel file. I uncompressed the stringent and relaxed file of my output to see how the contents are placed. In some SNVs the QUAL filter does not have any values and it is having (.) and in other they have values. Usually I chose variants having QUAL above 50. ALso here I see the stringent SNV file has over 20k variants. in the info column I could see something as CONSVAR, may I know what it stands for? It seems I need to filter the vcf files as well from the lofreq output. I could not find much details on these front in any of the documentation. If still I had missed I would ask for apology. Please let me know then shall I go for the filter handler to filter the vcf files that are output here from LoFreq?

     
  • chris.cornor

    chris.cornor - 2015-01-15

    Also I carefully noticed some log comments where I see these comments
    WARNING [2015-01-15 11:59:20,322]: Using stringent normal for SQ which does not however contain indels!
    CRITICAL [2015-01-15 14:44:52,604]: The following command failed with code 1: lofreq vcfset -a complement -2 /scratch/GT/vdas/test_exome/exome/databases/dbsnp_137.hg19.vcf -1 /scratch/GT/vdas/pietro/exome_s
    eq/results/lofreq_out/out_LG_somatic_final.snvs.vcf.gz --only-snvs -o /scratch/GT/vdas/pietro/exome_seq/results/lofreq_out/out_LG_somatic_final_minus-dbsnp.snvs.vcf.gz
    CRITICAL [2015-01-15 14:44:52,604]: Received the following on stderr:
    FATAL(lofreq_vcfset.c|main_vcfset:301): Couldn't load tabix index for /scratch/GT/vdas/test_exome/exome/databases/dbsnp_137.hg19.vcf

    CRITICAL [2015-01-15 14:44:52,604]: Somatic SNV caller failed. Exiting

    Does this mean that the tool did not work properly? do I have to index even te dbsnp file?

    Also in the log file which comes as an output I see the below comments which shows that the run was fine

    lofreq call-parallel --pp-threads 8 -d 101000 -f /scratch/GT/vdas/test_exome/exome/hg19.fa --verbose --no-default-filter -b 1 -l /scratch/GT/vdas/referenceBed/hg19/ss_v4/Exon_SSV4_clean.bed --call-indels

    -a 0.100000 --use-orphan -B -N -A /scratch/GT/vdas/pietro/exome_seq/results/N_S8980/N_S8980.realigned.recal.bam -o /scratch/GT/vdas/pietro/exome_seq/results/lofreq_out/out_HG_normal_relaxed.vcf.gz
    stderr: INFO [2015-01-15 11:59:13,416]: Using 8 threads with following basic args: lofreq call -d 101000 -f /scratch/GT/vdas/test_exome/exome/hg19.fa --verbose --no-default-filter -b 1 --call-indels -a 0.10
    0000 --use-orphan -B -N -A /scratch/GT/vdas/pietro/exome_seq/results/N_S8980/N_S8980.realigned.recal.bam
    stderr:
    stderr: INFO [2015-01-15 11:59:14,418]: Adding 27 commands to mp-pool
    stderr: Number of substitution tests performed: 17946093
    stderr: Number of indel tests performed: 424952
    stderr: INFO [2015-01-15 12:51:21,951]: Copying concatenated vcf file to final destination

    Can you please guide me where am going wrong? I mean we should not have so many SNVs or indels. Both are around 20k and 14k respectively after filtering for variants with QUAL score. So am in a fix now. I have already used this bam files on GATK and VarScan and I have got respectable counts for SNVs and indels but I want to make my analysis mroe strong using LoFreq indel caller to build a comprehensive list of genes that are indels. Also I have very few samples and I work individually on them.

     
    • Andreas Wilm

      Andreas Wilm - 2015-01-16

      Hi Chris,

      When you call somatic SNVs then you only need to look at the file that
      ends in "somatic_final.snvs.vcf.gz", or if you used dbsnp then
      "somatic_final_minus-dbsnp.snvs.vcf.gz". All the other files are more
      or less temporary and you should ignore them. See also
      http://csb5.github.io/lofreq/commands/#somatic

      You are getting a "tabix index" error because LoFreq expects your
      dbsnp file to be tabix indexed (see 'lofreq somatic --help'). This is
      to speed up things. Just run 'bgzip' and 'tabix' on the dbsnp vcf file
      to achieve this. This will also save you some disk space.

      Regarding the quality filtering: LoFreq filters variants for you
      already. Just use the files called final (see above). So there is no
      need for you to do that again. We call variants with more than 50%
      allele frequency consensus variants (CONSVAR) and don't assign a
      quality to them, which is why they get a "dot" (for missing) instead.
      These are "obvious" variants i.e. you could say they are high quality.
      You will want to keep them.

      You asked for a variant annotator that directly accepts zipped vcf
      files: I don't know if SNPeff etc. support this, but if it doesn't,
      then it's just a matter of running 'gunzip' on the file first.

      And finally the bed file issue: samtools actually does support the
      format you used and I fixed LoFreq to do the same. The fix will be
      included in the next version (2.1.2). For the sake of completeness,
      could you please send me a link to the bed file you used (you said you
      downloaded it from somewhere), so that I can test this thoroughly?

      Let me know if you need more information or if you have any more problems,
      Andreas

      On 16 January 2015 at 01:30, chris.cornor vd4mmind@users.sf.net wrote:

      Also I carefully noticed some log comments where I see these comments
      WARNING [2015-01-15 11:59:20,322]: Using stringent normal for SQ which does
      not however contain indels!
      CRITICAL [2015-01-15 14:44:52,604]: The following command failed with code
      1: lofreq vcfset -a complement -2
      /scratch/GT/vdas/test_exome/exome/databases/dbsnp_137.hg19.vcf -1
      /scratch/GT/vdas/pietro/exome_s
      eq/results/lofreq_out/out_LG_somatic_final.snvs.vcf.gz --only-snvs -o
      /scratch/GT/vdas/pietro/exome_seq/results/lofreq_out/out_LG_somatic_final_minus-dbsnp.snvs.vcf.gz
      CRITICAL [2015-01-15 14:44:52,604]: Received the following on stderr:
      FATAL(lofreq_vcfset.c|main_vcfset:301): Couldn't load tabix index for
      /scratch/GT/vdas/test_exome/exome/databases/dbsnp_137.hg19.vcf

      CRITICAL [2015-01-15 14:44:52,604]: Somatic SNV caller failed. Exiting

      Does this mean that the tool did not work properly? do I have to index even
      te dbsnp file?

      Also in the log file which comes as an output I see the below comments which
      shows that the run was fine

      lofreq call-parallel --pp-threads 8 -d 101000 -f
      /scratch/GT/vdas/test_exome/exome/hg19.fa --verbose --no-default-filter -b 1
      -l /scratch/GT/vdas/referenceBed/hg19/ss_v4/Exon_SSV4_clean.bed
      --call-indels

      -a 0.100000 --use-orphan -B -N -A
      /scratch/GT/vdas/pietro/exome_seq/results/N_S8980/N_S8980.realigned.recal.bam
      -o
      /scratch/GT/vdas/pietro/exome_seq/results/lofreq_out/out_HG_normal_relaxed.vcf.gz
      stderr: INFO [2015-01-15 11:59:13,416]: Using 8 threads with following basic
      args: lofreq call -d 101000 -f /scratch/GT/vdas/test_exome/exome/hg19.fa
      --verbose --no-default-filter -b 1 --call-indels -a 0.10
      0000 --use-orphan -B -N -A
      /scratch/GT/vdas/pietro/exome_seq/results/N_S8980/N_S8980.realigned.recal.bam
      stderr:
      stderr: INFO [2015-01-15 11:59:14,418]: Adding 27 commands to mp-pool
      stderr: Number of substitution tests performed: 17946093
      stderr: Number of indel tests performed: 424952
      stderr: INFO [2015-01-15 12:51:21,951]: Copying concatenated vcf file to
      final destination

      Can you please guide me where am going wrong? I mean we should not have so
      many SNVs or indels. Both are around 20k and 14k respectively after
      filtering for variants with QUAL score. So am in a fix now. I have already
      used this bam files on GATK and VarScan and I have got respectable counts
      for SNVs and indels but I want to make my analysis mroe strong using LoFreq
      indel caller to build a comprehensive list of genes that are indels. Also I
      have very few samples and I work individually on them.


      Error running lofreq somatic and indel calling handler on normal tumor bam
      files


      Sent from sourceforge.net because you indicated interest in
      https://sourceforge.net/p/lofreq/discussion/general/

      To unsubscribe from further messages, please visit
      https://sourceforge.net/auth/subscriptions/

      --
      Andreas Wilm
      andreas.wilm@gmail.com | mail@andreas-wilm.com | 0x7C68FBCC

       
  • chris.cornor

    chris.cornor - 2015-01-16

    Hi Andreas,
    Thank you for your reply. I would like to say that I had a successful run. Actually while it was runnnig the *final.vcf.gz were still not generated so I could not see them. I would look carefully now on them. One note I would like to say is I do not see any file with somatic_final_minus-dbsnp.snvs.vcf.gz. Is this because my dbsnp file is not indexed? Should I index it and run it once more? I will now go through the outputs but meanwhile would like to inform you that I have used -d dbsnps file but I do not see any output with final_minus-dbsnps. So shall I run again with the indexed dbsnp and then making a zip file?

    Also the bed file you are talking about is hosted in agilent website. You have to log in

    https://earray.chem.agilent.com/earray/
    If you are new to earray, you will first need to register and then it is free to use. In the top right corner of the screen choose Application type Sureselect Target enrichment. From there choose “Libraries” and then “Browse Libraries”. Choose the catalogue kit that you are interested in and click download. A list of the annotation files available will then appear, choose the file type you require and click download.

    If you are facing issues let me know I can put my file in dropbox and link it to you. Thanks.
    Chris

     

    Last edit: chris.cornor 2015-01-16
    • Andreas Wilm

      Andreas Wilm - 2015-01-19

      Hey Chris,

      yes, the file somatic_final_minus-dbsnp.snvs.vcf.gz is not there,
      because the dbsnp removal step failed, since your dbsnp was not
      indexed. I changed the code so that it will now complain directly.
      Anyway, you can just run bgzip and tabix on dbsnp and then rerun
      LoFreq somatic using the same parameters as before and add --continue.
      That will pick up things were they failed and continue. If you however
      unzipped some files while LoFreq was still running, then it's probably
      safest to remove all intermediate files and start all over again.

      Andreas

      On 16 January 2015 at 17:49, chris.cornor vd4mmind@users.sf.net wrote:

      Hi Andreas,
      Thank you for your reply. I would like to say that I had a successful run.
      Actually while it was runnnig the *final.vcf.gz were still not generated so
      I could not see them. I would look carefully now on them. One note I would
      like to say is I do not see any file with
      somatic_final_minus-dbsnp.snvs.vcf.gz. Is this because my dbsnp file is not
      indexed? Should I index it and run it once more? I will now go through the
      outputs but meanwhile would like to inform you that I have used -d dbsnps
      file but I do not see any output with final_minus-dbsnps. So shall I run
      again with the indexed dbsnp and then making a zip file?

      Also the bed file you are talking about is hosted in agilent website. You
      have to log in

      https://earray.chem.agilent.com/earray/
      If you are new to earray, you will first need to register and then it is
      free to use. In the top right corner of the screen choose Application type
      Sureselect Target enrichment. From there choose “Libraries” and then “Browse
      Libraries”. Choose the catalogue kit that you are interested in and click
      download. A list of the annotation files available will then appear, choose
      the file type you require and click download.

      If you are facing issues let me know I can put my file in dropbox and link
      it to you. Thanks.
      Chris


      Error running lofreq somatic and indel calling handler on normal tumor bam
      files


      Sent from sourceforge.net because you indicated interest in
      https://sourceforge.net/p/lofreq/discussion/general/

      To unsubscribe from further messages, please visit
      https://sourceforge.net/auth/subscriptions/

      --
      Andreas Wilm
      andreas.wilm@gmail.com | mail@andreas-wilm.com | 0x7C68FBCC

       
  • chris.cornor

    chris.cornor - 2015-01-19

    Hi Andreas,

    Thank you very much for the reply. I would like to let you know that in the final indel file (which is my primary requirement now), I have annotated and there I found some indels with rsid after annotation which says that they are indeed present in dbsnp database so I guess rerunning it wont be that necessary right? However sometimes some variants even after discounting we see they do not show any rsid at vcf level but on annotation they do get some ids. So do you think at this point it will be necessary to do it all over again. My idea is to find variants in tumor and in its reprogrammed clone and match them , provided if some of them are not novel it is not an issue as of now. When I will go for tumor driving variants then probably discounting the dbsnp variants will be useful. That can also be done filtering for annotated variants which has some #rsid after annotating them. Please let me know your views on this.

    Thanks,

    Chris

     

    Last edit: chris.cornor 2015-01-19
    • Andreas Wilm

      Andreas Wilm - 2015-01-19

      Hey Chris,

      if you get a final vcf file, then there is no need to rerun LoFreq.

      Whether you want to remove dbsnp or not ,depends a bit on the question
      you want to answer. The default is be to remove dbsnp, simply to avoid
      cases where a germline call was erroneously missed in the normal,
      which then produces a (false) somatic call. If in doubt, go with the
      default!

      Andreas

      On 19 January 2015 at 17:48, chris.cornor vd4mmind@users.sf.net wrote:

      Hi Andreas,

      Thank you very much for the reply. I would like to let you know that in the
      final indel file (which is my primary requirement now, I have not annotated
      them and there I found some indels with rsid after annotation which says
      that they are indeed present in dbsnp database so I guess rerunning it wont
      be that necessary right? However sometimes some variants even after
      discounting we see they do not show any rsid at vcf level but on annotation
      they do get some ids. So do you think at this point it will be necessary to
      do it all over again. My idea is to find variants in tumor and in its
      reprogrammed clone and match them , provided if some of them are not novel
      it is not an issue as of now. When I will go for tumor driving variants then
      probably discounting the dbsnp variants will be useful. That can also be
      done filtering for annotated variants which has some #rsid after annotating
      them. Please let me know your views on this.

      Thanks,

      Chris


      Error running lofreq somatic and indel calling handler on normal tumor bam
      files


      Sent from sourceforge.net because you indicated interest in
      https://sourceforge.net/p/lofreq/discussion/general/

      To unsubscribe from further messages, please visit
      https://sourceforge.net/auth/subscriptions/

      --
      Andreas Wilm
      andreas.wilm@gmail.com | mail@andreas-wilm.com | 0x7C68FBCC

       

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.