It would be really nice if Scalpel declared the INFO, FILTER and FORMAT fields in the header of its VCF output, see http://www.1000genomes.org/node/101. This would improve compatibility with the GATK tools.
It would also be great if the sample fields were populated, providing the GT, DP and maybe other tags.
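To make the request concrete, here is a minimal sketch of how FORMAT declarations could be generated. The helper name `vcf_format_header` is hypothetical, and the IDs and descriptions (GT, AD, DP) follow common VCF conventions; they are an assumption, not necessarily the tags Scalpel emits.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Scalpel): builds VCF meta-information
# lines declaring the per-sample FORMAT fields.
sub vcf_format_header {
    my @fields = (
        # [ID, Number, Type, Description] -- assumed, conventional values
        [ 'GT', '1', 'String',  'Genotype' ],
        [ 'AD', '.', 'Integer', 'Allelic depths' ],
        [ 'DP', '1', 'Integer', 'Read depth' ],
    );
    my @lines;
    for my $f (@fields) {
        push @lines,
            sprintf('##FORMAT=<ID=%s,Number=%s,Type=%s,Description="%s">', @$f);
    }
    return @lines;
}

print "$_\n" for vcf_format_header();
```

INFO and FILTER declarations use the same `##KEY=<...>` shape, with FILTER taking only an ID and a Description.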
done!
Thanks! Looking at the code and running some examples, I noticed that "$altcov" seems to hold the reference allele depth and "$mincov" the alternative allele depth. Am I missing something? If not, I propose using my $format_val = "$gt:$altcov,$mincov:$totcov"; (instead of my $format_val = "$gt:$mincov,$altcov:$totcov";)
$mincov is the minimum coverage of the specific mutation described in the VCF line.
$altcov is the total coverage of all other alleles (reference + other mutations).
Makes sense now!
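For anyone else reading along, a small sketch of how the sample column is assembled under those definitions. The helper name `format_value` and the numbers are made up for illustration; the field order is the one in the existing assignment quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the quoted assignment:
#   $mincov = coverage of the mutation on this VCF line
#   $altcov = coverage of every other allele (reference + other mutations)
#   $totcov = total coverage at the site
sub format_value {
    my ($gt, $mincov, $altcov, $totcov) = @_;
    return "$gt:$mincov,$altcov:$totcov";
}

# Illustrative values only
print format_value('0/1', 12, 30, 42), "\n";   # prints 0/1:12,30:42
```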
Further, it would be nice if the FORMAT column were followed by the sample name, picked e.g. from the @RG SM field. For example, for the @RG line below the sample name would be 10-497-T:
@RG ID:2 PL:illumina PU:2_2014-02-18_tumor-paired SM:10-497-T
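A rough sketch of pulling the sample name out of such a line. The function name `sample_from_rg` is hypothetical, and it assumes the tags are separated by whitespace, as in the line above; real SAM headers are tab-delimited, so a value containing spaces would need a split on tabs instead.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the SM value of an @RG header line,
# or undef if the line is not an @RG record or has no SM tag.
sub sample_from_rg {
    my ($line) = @_;
    return unless $line =~ /^\@RG\b/;
    for my $field (split /\s+/, $line) {
        return $1 if $field =~ /^SM:(.+)$/;
    }
    return;
}

my $rg = '@RG ID:2 PL:illumina PU:2_2014-02-18_tumor-paired SM:10-497-T';
print sample_from_rg($rg), "\n";   # prints 10-497-T
```

The returned name could then be written as the column header after FORMAT on the #CHROM line.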