First of all, thanks for the tool. I have a couple of questions regarding the de novo assembler for GBS data:
When we select to ignore 5 nucleotides from the 5’ end and 5 nucleotides from the 3’ end, and we have paired-end ddRAD data, does this mean it will ignore the first 5 nucleotides of R2 (which is actually the 3’ end of the genomic fragment)?
The output VCF from an analysis where I selected 4x for ploidy only has diploid genotypes (x/x, instead of x/x/x/x like those from Freebayes, for example). Am I interpreting something wrong?
Thanks,
Edgardo
Hi Edgardo,
Thanks for your interest in NGSEP. Regarding the first question, the filtering is performed per read, not per fragment. In your example, 5 bp will be ignored from the 5' end of each of the two reads (that is, from both ends of the fragment) and 5 bp from the 3' end of each read (somewhere in the middle of the fragment).
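To make the per-read behaviour concrete, here is a minimal Python sketch. It is only an illustration of the idea, not NGSEP's actual code: the function name `trim_read`, its parameters, and the toy reads are all made up for this example.

```python
def trim_read(seq: str, ignore5: int, ignore3: int) -> str:
    """Drop the first ignore5 and the last ignore3 bases of a single read.

    Hypothetical helper for illustration; not part of NGSEP.
    """
    # Clamp the end so an over-long trim yields an empty string
    # instead of a negative slice index.
    end = max(ignore5, len(seq) - ignore3)
    return seq[ignore5:end]


# Toy 15 bp reads, invented for this example.
r1 = "AAAAACCCCCGGGGG"
r2 = "TTTTTGGGGGCCCCC"

# The same trimming is applied to each read independently,
# so R2 loses bases at its own 5' and 3' ends, just like R1.
for name, read in (("R1", r1), ("R2", r2)):
    print(name, trim_read(read, 5, 5))
```

Because each read is trimmed on its own, the 5' trim on R2 removes bases from the far end of the fragment, while the 3' trims on both reads fall somewhere inside it.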
For polyploids, after much back and forth, we decided to keep the GT field as if the sample were diploid (x/x) and to add the format field ACN, which holds the number of copies of each allele in the genotype call. For example, with format GT:ACN, a tetraploid heterozygous call with one copy of the alternative allele should look like 0/1:3,1.
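If a downstream tool needs the Freebayes-style x/x/x/x notation, the ACN values can be expanded into it. The following Python sketch shows one way to do so. It assumes that ACN lists the copy numbers in allele order (reference first, then each alternative allele), which matches the 0/1:3,1 example but should be checked against your own VCF. The function name `expand_polyploid_gt` is hypothetical.

```python
def expand_polyploid_gt(format_str: str, sample_str: str) -> str:
    """Build an explicit polyploid genotype string from the ACN field.

    Assumption: ACN holds one copy number per allele, ordered
    REF, ALT1, ALT2, ... (hypothetical helper, not part of NGSEP).
    """
    # Pair each FORMAT key with the matching value in the sample column.
    fields = dict(zip(format_str.split(":"), sample_str.split(":")))
    counts = [int(n) for n in fields["ACN"].split(",")]

    # Repeat each allele index as many times as it has copies.
    alleles = []
    for allele_index, copies in enumerate(counts):
        alleles.extend([str(allele_index)] * copies)
    return "/".join(alleles)


print(expand_polyploid_gt("GT:ACN", "0/1:3,1"))  # prints 0/0/0/1
```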
Let me know if you have further questions.
Jorge