I am trying to analyze SkimGBS data from some bean genotypes, sequenced on an Illumina MiSeq, using the NGSEP program. The 'multi mapping single end' analysis works, but I can't run 'multi mapping paired end'. The reads were trimmed before the analysis and their quality looks good. I would appreciate any advice you can give me on this matter.
Best,
Mohammad Erfatpour
PhD Candidate, Dry Bean Breeding & Genetics
Department of Plant Agriculture, University of Guelph
No problem. Please share a log file of the failing process so that we can track down the error. Also make sure in the first screen that the read files are properly paired.
Best regards
Jorge
Thanks for your prompt response. Please find attached the log files of two samples. For 'Mapping Paired End' I follow the instructions in the NGSEP manual: I select Read 1 and Read 2 of each sample and proceed with the analysis. I also tried a 'Mapping Single End' analysis of already paired reads exported from 'CLC Workbench Genomics Platform', which worked fine, but I think each variant at a position has its reverse-complement variant at a nearby position. I would appreciate any comments or suggestions on this matter.
The last line of the log files shows the issue. Probably, the preprocessing step that you are doing with CLC is storing more reads in the first file than in the second, and read aligners such as bowtie2 do not accept that. Because you are following a reference-guided pipeline, you could just try to process the raw reads without the CLC preprocessing. If you definitely want to do some sort of data cleaning, I think that open source software such as Trimmomatic (http://www.usadellab.org/cms/?page=trimmomatic) can do a much better job.
Best regards
Jorge
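For readers who hit the same mismatch, here is a minimal sketch of one way to put two FASTQ files back in sync after a trimmer has dropped reads from only one of them. It is not part of NGSEP, CLC or Trimmomatic. It assumes uncompressed, 4-line FASTQ records, mates that share the same read name (ignoring an optional /1 or /2 suffix), and an R2 file small enough to hold in memory. The function names are hypothetical.

```python
from itertools import islice


def read_id(header):
    """Read name from a FASTQ header line, without '@' or a /1, /2 suffix."""
    fields = header[1:].split()
    name = fields[0] if fields else ""
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def fastq_records(path):
    """Yield (read_id, four_lines) for each record of an uncompressed FASTQ file."""
    with open(path) as handle:
        while True:
            record = list(islice(handle, 4))
            if not record:
                break
            if len(record) < 4 or not record[0].startswith("@"):
                raise ValueError(f"truncated or malformed record in {path}")
            yield read_id(record[0]), record


def resync_pairs(r1_in, r2_in, r1_out, r2_out):
    """Write only the reads present in both inputs, in R1 order.

    Returns the number of pairs kept.
    """
    r2_records = dict(fastq_records(r2_in))  # assumption: R2 fits in memory
    kept = 0
    with open(r1_out, "w") as out1, open(r2_out, "w") as out2:
        for name, record in fastq_records(r1_in):
            mate = r2_records.get(name)
            if mate is not None:
                out1.writelines(record)
                out2.writelines(mate)
                kept += 1
    return kept
```

The two output files then contain the same number of reads in the same order, which is what paired-end aligners expect. Reads without a mate are simply discarded in this sketch.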
Jorge: You are absolutely right, the number of reads is not equal after trimming with CLC Genomics. However, the raw reads are not equal in Read 1 and Read 2 for any of my samples in the first place. I will try Trimmomatic and hope that it fixes the issue, because trimming the raw data improves my alignment rate from about 50% to above 70%. Thanks for your help.
No problem. In that case, you may also want to check whether the second file was not completely extracted from the sequencer or whether the demultiplexing procedure is creating the mess (just in case, we also offer demultiplexing with an option to trim adapter contamination). If the read ids correspond, you can just check the number of lines in the files with the "wc -l" command. If the difference is small, you can just remove the last lines of the first file (using the "head" command, for example).
Best regards
Jorge
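The check suggested above (count lines, confirm that the read ids correspond, then cut the extra reads from the end of the longer file) can be sketched in Python as follows. This is only an illustration under stated assumptions: uncompressed 4-line FASTQ records, extra reads located at the end of the longer file, and hypothetical function names. It does the same job as "wc -l" followed by "head".

```python
from itertools import islice


def count_reads(path):
    """Number of reads in an uncompressed FASTQ file (line count divided by 4)."""
    with open(path) as handle:
        lines = sum(1 for _ in handle)
    if lines % 4:
        raise ValueError(f"{path}: {lines} lines is not a multiple of 4")
    return lines // 4


def names_match(r1_path, r2_path, n_reads):
    """True if the first n_reads headers carry the same read name in both files."""
    def name(header):
        token = header[1:].split()[0]
        return token[:-2] if token.endswith(("/1", "/2")) else token

    with open(r1_path) as r1, open(r2_path) as r2:
        headers1 = islice(r1, 0, 4 * n_reads, 4)  # every 4th line is a header
        headers2 = islice(r2, 0, 4 * n_reads, 4)
        return all(name(a) == name(b) for a, b in zip(headers1, headers2))


def truncate_pair(r1_in, r2_in, r1_out, r2_out):
    """Copy both files, keeping only as many reads as the shorter one has.

    Returns the number of reads written to each output file.
    """
    n_reads = min(count_reads(r1_in), count_reads(r2_in))
    if not names_match(r1_in, r2_in, n_reads):
        raise ValueError("read names diverge; truncating would mispair reads")
    for src, dst in ((r1_in, r1_out), (r2_in, r2_out)):
        with open(src) as fin, open(dst, "w") as fout:
            fout.writelines(islice(fin, 4 * n_reads))
    return n_reads
```

If `names_match` fails, the missing reads are not at the end of the file and truncation would pair the wrong mates, so the sketch stops with an error instead of writing output.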
Jorge: I just remembered that I have undetermined R1 and R2 fastq files which definitely contain the missing pieces of this puzzle. Thanks again for your help.
From: Jorge Duitama jduitama@users.sourceforge.net
Sent: Friday, September 20, 2019 11:03 AM
To: [ngsep:discussion] faq@discussion.ngsep.p.re.sourceforge.net
Subject: [ngsep:discussion] Multi Mapping Paired End Issue