
JoinMap File conversion Segregation nnxnp gives nn, np, and pp classification type (Only nn and np are allowed).

Anonymous
2018-07-30
2018-12-20
  • Anonymous

    Anonymous - 2018-07-30

    Hi there,

    Good day to you. We’ve been exploring NGSEP a bit with our GBS data, and I love its overall simplicity for the end user, but we’ve run into a snag. We use JoinMap for our linkage analysis, and it looks like we are getting some erroneous classification types during the conversion from VCF to JoinMap, unless there is something we don’t understand about the allowable outputs of the program. For the segregation types nnxnp and lmxll, which should only produce the classifications nn/np and lm/ll respectively, we are getting occasional calls of mm and pp in the data; these should not exist and are flagged in JoinMap as incorrect codes. Could you please explain these occurrences and how we should interpret them? Thank you for your time.

    Best Regards,

    Jacob Snelling,

    Horticulture
    4160 Agriculture and Life Sciences Building
    Oregon State University
    Corvallis, OR 97331-7304


     
  • Jorge Duitama

    Jorge Duitama - 2018-07-31

    Hi Jacob

    Many thanks for your interest in NGSEP. We are glad to know that you found the software easy to use. My first guess on the issue you are describing is that the genotype calls that JoinMap flags as erroneous are actually erroneous in the VCF file. If possible, please share a filtered VCF file including one or more SNPs with erroneous genotype calls so that I can take a look. For different reasons, errors like this unfortunately happen in every variant calling pipeline. There are several ways to reduce them. The first I would recommend is to increase the minimum quality score using the filter functionality. If the percentage of SNPs affected by errors is not too big, you can simply remove these SNPs. Given that you have GBS data, you should have plenty of SNPs to take conservative filtering decisions and still build a dense genetic map. Also, JoinMap may have a specific function to remove only the erroneous data points or transform them into heterozygous calls, which would be the most likely correct genotype for an erroneous homozygous data point. Finally, if you see too many errors in one specific SNP, this may reflect a genotyping error in one of the parents for that SNP.
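    As a rough sketch of that first option (the file names are just placeholders, and the exact flag name should be checked against the help that FilterVCF prints when run without arguments), a minimum genotype quality filter would look something like this:

    java -jar NGSEPcore_3.2.0.jar FilterVCF -q 40 population.vcf > population_q40.vcf

    Low-confidence genotype calls are the first thing such a filter removes, so most of the spurious homozygous calls should disappear before the data reaches JoinMap.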

    Let me know how things go.

    Jorge

     
  • Anonymous

    Anonymous - 2018-07-31

    Hi Jorge,

    I believe you are right that this stems from the VCF file or, more likely it seems, from the actual conversion script. I've gotten the same errors from Stacks v2 VCF outputs as well as from NGSEP start to finish. I'm attaching an example of both the VCF and the corresponding JoinMap files. Just do a text search in the JoinMap file for pp or mm and then find the same variant in the VCF file. For CP-type crosses, the only allowable classifications for these segregation types in JoinMap should be nnxnp = nn, np, -- and lmxll = lm, ll, --; the pp and mm classifications should not be possible. Thank you for your help.

    Jacob

     
  • Jorge Duitama

    Jorge Duitama - 2018-08-02

    Thanks Jacob

    I went over your files and I actually found an error that swapped the genotype information of the parents. Fortunately this should not affect the construction of the genetic map, but it definitely looks weird. After giving some thought to the main issue, I also decided to change the behavior of the converter when an inconsistent homozygous genotype is found in the VCF file. Instead of exporting the error, NGSEP will now issue a warning in the log file and generate an unknown genotype call ("--"). This fix will formally appear in the next release. In the meantime, one quick fix would be to eliminate the SNPs with inconsistent genotypes, which can be done with the "-frs" option of FilterVCF. If you prefer to try the fixed (but still unstable) version already, you can clone the GitHub repository:

    git clone https://github.com/NGSEP/NGSEPcore

    Build the jar for version 3.2.1:

    cd /path/to/NGSEPcore
    make

    Then run the converter again with NGSEPcore_3.2.1.jar. For functionalities other than this one, please keep using the official jar of the previous release (NGSEPcore_3.2.0.jar). Although the probability of errors is currently not too big, version 3.2.1 has not formally passed through the sanity tests that ensure everything is still working fine.
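    In case it helps, the conversion step with the rebuilt jar would look roughly like the command below. The sample IDs and file names are only placeholders and I am writing the option names from memory, so please confirm them against the usage message printed by the jar:

    java -jar NGSEPcore_3.2.1.jar ConvertVCF -printJoinMap -p1 ParentA -p2 ParentB population.vcf population_out

    The two parent IDs should match sample names in the VCF header, and the last argument is the prefix used to name the generated output files.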

    Let me know how things go

    Jorge

     
  • Anonymous

    Anonymous - 2018-08-03

    Thank you Jorge. I'll let you know when I've made some progress. I'm having some issues with both compiling and Java after a recent system upgrade, so I haven't been able to test it out yet. We'll see which comes first: the next release, or working out the bugs in my system :)

    Jacob

     
  • Anonymous

    Anonymous - 2018-10-30

    Hi Jorge. I just wanted to let you know that the JoinMap conversion is working properly after the most recent update. Thanks for your help again.

     
  • Jorge Duitama

    Jorge Duitama - 2018-10-31

    Thanks Jacob. It is great for us to know that the new version worked for you. Feel free to write back if you have further questions or issues with NGSEP.

    Best regards

     
    • Julian Bello

      Julian Bello - 2018-12-17

      Hi Jorge!!
      I hope this email finds you well. I was wondering whether, among the options of NGSEP, there is a filtering option similar to QualByDepth in GATK.
      Thanks in advance,
      Julian.


       
      • Jorge Duitama

        Jorge Duitama - 2018-12-20

        Hi Julian

        First of all, sorry for the delayed answer. I took a look at the QualByDepth calculation and we definitely do not have a similar filter. Reading the documentation, normalizing QUAL by depth sounds counterintuitive to me, because in principle more evidence should translate into better quality. Their claim that "variants in regions with deep coverage can have artificially inflated QUAL scores" sounds more like an issue with the GATK model for calculating QUAL scores than an inherent aspect of the data.

        In NGSEP the QUAL field is always less than or equal to 255, and it is calculated as the maximum GQ value over the genotyped samples. I think this calculation is consistent with the definition of QUAL in the VCF format, which is basically the probability of existence of the variant encoded as a Phred score. The QualByDepth filter may be an indirect way to filter out some variants within duplications. For that case, in NGSEP you can use a catalog of repetitive elements to directly filter those regions. Moreover, if you have WGS data and the coverage distribution looks normal, you can call CNVs along with SNVs using the FindVariants command and then filter out SNVs (or small indels) within the predicted CNVs.
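        If you want to see that relationship in your own files, one quick spot check is possible with bcftools (not part of NGSEP, so this assumes you have it installed and that your VCF reports GQ for every sample); the printed QUAL should match the largest GQ on each line, capped at 255:

        bcftools query -f '%CHROM\t%POS\t%QUAL\t[%GQ ]\n' population.vcf | head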

        If the question is related to the thread above, you can always use the structure of the population to filter variants by MAF and observed heterozygosity.
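        As a sketch of that kind of population-level filter (the option names come from my memory of the FilterVCF help and the cutoffs are only placeholders that should be tuned to the segregation expected in your population), a combined MAF and heterozygosity filter could look like this:

        java -jar NGSEPcore.jar FilterVCF -minMAF 0.1 -maxOH 0.9 population.vcf > population_filtered.vcf

        In an F1 (CP) population, sites where nearly every individual is called heterozygous are often collapsed paralogs, which is why an upper bound on observed heterozygosity can be useful.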

        Let me know your thoughts or further questions on this matter.

        Jorge

         
