Hello,
I have a VCF file containing genotypes from polyploid / mixed-ploid plant samples (obtained via freebayes --ploidy / --cnv-map). The VCF was filtered to obtain reproducible SNPs.
I was trying to convert the VCF data to STRUCTURE format using NGSEP VCFConverter, but it seems that VCFConverter produces only haploid data sets.
Is there any possibility to create e.g. tetraploid STRUCTURE input files from tetraploid VCF data?
Best,
Thanks for your interest in NGSEP. As a first comment, I would recommend reanalyzing the data with our variant caller because, at least based on our benchmarks, I think we do better than freebayes, especially at inferring allele dosages in polyploids.
Regarding the converter specifically, NGSEP produces a file with diploid genotype calls in the one-row-per-individual format. It is true, though, that our converter does not transfer the allele dosage information to the STRUCTURE input format. We can try to add this in future versions of NGSEP. Unfortunately, I also do not know of any tool that can generate STRUCTURE input including allele dosages. Depending on exactly how the data is encoded in the VCF, a simple command might do the job. Perhaps you could paste a couple of lines of your VCF so we can see whether there is an easy way to make the conversion.
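To illustrate what "how the data is encoded" means here: in a standard VCF, the GT field lists one allele index per chromosome copy, so a tetraploid call looks like 0/0/1/1 and the allele dosage can be read off by counting. The following is only a minimal Python sketch under that assumption; the function names `parse_gt` and `alt_dosage` are hypothetical helpers, not part of NGSEP or freebayes.

```python
import re


def parse_gt(gt):
    """Split a VCF GT string such as '0/0/1/1' or '0|1' into allele indices.

    Missing alleles ('.') are returned as None.
    """
    return [None if a == "." else int(a) for a in re.split(r"[/|]", gt)]


def alt_dosage(gt):
    """Return (ploidy, number of non-reference allele copies).

    The dosage is None if any allele in the call is missing.
    """
    alleles = parse_gt(gt)
    if any(a is None for a in alleles):
        return len(alleles), None
    return len(alleles), sum(1 for a in alleles if a != 0)
```

With this encoding, `alt_dosage("0/0/1/1")` reports a ploidy of 4 and a dosage of 2, while a diploid call in the same file simply has two entries, which is how mixed-ploid samples would show up.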
Best regards
Hello, thank you for your answer (and sorry for my delayed response).
I have added the first 100 lines of the original dDocent VCF file, which indicates that mixed-ploid samples have been analysed.
I can provide more details if necessary.
Best regards,
Thanks for sharing this file. Unfortunately, I just remembered that a simple command cannot do this conversion, mainly because the STRUCTURE format is transposed relative to VCF: STRUCTURE expects individuals in rows and loci in columns, whereas VCF has one row per variant and one column per sample. I have just added this feature to our roadmap as an improvement to the current NGSEP converter. I will get back to you as soon as it is released. My apologies for not being able to offer a quicker solution at this time.
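As a stopgap until then, the transposition itself can be sketched in a short Python script. This is not the NGSEP converter, only a rough illustration under several assumptions: the VCF is tab-separated with GT present in the FORMAT column, allele indices (0, 1, ...) are used directly as STRUCTURE allele codes, -9 is the missing-data value, each individual gets one row per chromosome copy, and lower-ploidy samples are padded with missing values up to the highest ploidy found. The function name `vcf_to_structure_rows` is hypothetical, and the output would still need to be checked against the STRUCTURE parameter files.

```python
import re

MISSING = -9  # assumed missing-data code; must match the STRUCTURE settings


def vcf_to_structure_rows(vcf_lines, missing=MISSING):
    """Transpose VCF genotype calls into STRUCTURE-style rows.

    Returns (max_ploidy, rows), where rows holds max_ploidy lines per
    sample, each of the form 'sample allele_locus1 allele_locus2 ...'.
    """
    samples = []
    calls = []  # calls[i] = list over loci of allele lists for sample i
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]
            calls = [[] for _ in samples]
            continue
        gt_index = fields[8].split(":").index("GT")
        for i, sample_field in enumerate(fields[9:]):
            gt = sample_field.split(":")[gt_index]
            alleles = [missing if a == "." else int(a)
                       for a in re.split(r"[/|]", gt)]
            calls[i].append(alleles)

    # Highest ploidy seen in any call; shorter calls are padded below.
    max_ploidy = max((len(a) for per_sample in calls for a in per_sample),
                     default=0)
    rows = []
    for name, per_sample in zip(samples, calls):
        for copy in range(max_ploidy):
            row = [name]
            for alleles in per_sample:
                value = alleles[copy] if copy < len(alleles) else missing
                row.append(str(value))
            rows.append(" ".join(row))
    return max_ploidy, rows
```

Calling it as `vcf_to_structure_rows(open("input.vcf"))` (a placeholder file name) holds all genotypes in memory, which is the price of the transposition; for a filtered SNP set this is usually acceptable.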
Best regards