Multi Mapping Paired End Issue

2019-09-19
2019-09-21
  • Mohammad Erfatpour

    Hello Jorge,

    I am trying to analyze SkimGBS data from several bean genotypes, sequenced on an Illumina MiSeq, using the NGSEP program. The 'multi mapping single end' analysis works, but I can't run 'multi mapping paired end'. The reads were trimmed before the analysis and their quality looks good. Could you give me some advice on this? I really appreciate your support and consideration.

    Best,

    Mohammad Erfatpour
    PhD Candidate, Dry Bean Breeding & Genetics
    Department of Plant Agriculture, University of Guelph

     
  • Jorge Duitama

    Jorge Duitama - 2019-09-20

    Hi Mohammad

    No problem. Please share a log file of the failing process so that we can track down the error. Also make sure, in the first screen, that the read files are properly paired.

    Best regards

    Jorge

     
    • Mohammad Erfatpour

      Hello Jorge,

      Thanks for your prompt response. Please find attached the log files of two samples. For 'Mapping Paired End' I follow the instructions in the NGSEP manual: I load Read 1 and Read 2 of each sample and proceed with the analysis. I also tried the 'Mapping Single End' analysis on already paired reads exported from 'CLC Workbench Genomics Platform'. That worked fine, but I think each variant at a given position has its reverse complement variant at a nearby position. I would appreciate any comments or suggestions on this matter.

      Best,
      Mohammad



       
  • Jorge Duitama

    Jorge Duitama - 2019-09-20

    Hi Mohammad

    The last line of the log files points to the issue. The preprocessing step that you are doing with CLC is probably storing more reads in the first file than in the second file, and read aligners such as bowtie2 do not accept that. Because you are following a reference-guided pipeline, you could just try to process the raw reads without the CLC preprocessing. If you definitely want to do some sort of data cleaning, I think that open source software such as Trimmomatic (http://www.usadellab.org/cms/?page=trimmomatic) can do a much better job.

    Best regards

    Jorge
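
Editorial note: the mismatch described above can be confirmed before running the aligner. The following is a minimal sketch, assuming standard FASTQ files with exactly four lines per read; the function names (`open_fastq`, `count_records`, `pair_count_difference`) are hypothetical helpers and are not part of NGSEP, CLC, or bowtie2.

```python
import gzip
from typing import Iterable, TextIO


def open_fastq(path: str) -> TextIO:
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_records(lines: Iterable[str]) -> int:
    """Count FASTQ records, assuming exactly 4 lines per record."""
    n_lines = sum(1 for _ in lines)
    if n_lines % 4 != 0:
        raise ValueError(f"{n_lines} lines is not a multiple of 4")
    return n_lines // 4


def pair_count_difference(r1_lines: Iterable[str], r2_lines: Iterable[str]) -> int:
    """Return (#reads in R1) - (#reads in R2); 0 means the counts agree."""
    return count_records(r1_lines) - count_records(r2_lines)
```

With real files, the two handles returned by `open_fastq` can be passed directly, for example `pair_count_difference(open_fastq(r1_path), open_fastq(r2_path))`, where `r1_path` and `r2_path` are placeholders for one sample's files. A nonzero result means the pair is not usable as-is for paired-end mapping.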

     
    • Mohammad Erfatpour

      Jorge: You are absolutely right, the numbers of reads are not equal after trimming with CLC Genomics. However, the raw read counts of Read 1 and Read 2 are not equal for any of my samples in the first place. I will try Trimmomatic and hope that it fixes the issue, because trimming the raw data improves my alignment rate from about 50% to above 70%. Thanks for your help.

      Best,
      Mohammad



       
      • Jorge Duitama

        Jorge Duitama - 2019-09-20

        Hi Mohammad

        No problem. In that case, you may also want to check whether the second file was incompletely extracted from the sequencer, or whether the demultiplexing procedure is causing the problem (just in case, we also offer demultiplexing with an option to trim adapter contamination). If the read ids correspond, you can compare the number of lines in the two files with the "wc -l" command. If the difference is small, you can remove the extra lines at the end of the first file (using the "head" command, for example).

        Best regards

        Jorge
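
Editorial note: the check-and-truncate procedure suggested above can be sketched as follows. This is only an illustration under stated assumptions: four-line FASTQ records, read ids taken as the first whitespace-delimited token of the header, and an optional `/1` or `/2` mate suffix. The helper names are hypothetical and the code mirrors what `wc -l` and `head` would do, not any NGSEP functionality.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def read_ids(lines: Iterable[str]) -> Iterator[str]:
    """Yield the id of each 4-line FASTQ record, without the mate suffix."""
    for header in islice(lines, 0, None, 4):
        fields = header[1:].split()  # drop '@' and any comment after a space
        name = fields[0] if fields else ""
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        yield name


def common_prefix_records(r1_lines: Iterable[str], r2_lines: Iterable[str]) -> int:
    """Count the leading records whose ids agree between the two files."""
    n = 0
    for id1, id2 in zip(read_ids(r1_lines), read_ids(r2_lines)):
        if id1 != id2:
            break
        n += 1
    return n


def truncate_records(lines: Iterable[str], n_records: int) -> List[str]:
    """Keep the first n_records records, like `head -n` with 4 * n_records."""
    return list(islice(lines, 4 * n_records))
```

If `common_prefix_records` equals the number of reads in the shorter file, the extra reads sit only at the end of the longer file and truncating it is reasonable. If the ids diverge earlier, the files are out of sync and dropping trailing lines would not restore the pairing.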

         
        • Mohammad Erfatpour

          Jorge: I just remembered that I have undetermined R1 and R2 fastq files which definitely contain the missing pieces of this puzzle. Thanks again for your help.

          Regards,
          Mohammad



           
