Sequencing technologies, such as Illumina sequencing, provide the sequences of short "reads" of DNA that come from random positions on the genome. These reads must then be assembled de novo into the original genome or, if a reference genome is available, mapped onto it. Both tasks, especially de novo assembly, become more difficult if the genome is heterozygous.
het-smooth is an experimental program that smooths out heterozygosity in DNA sequence reads by identifying isolated SNPs and replacing each one with a single one of its heterozygous variants. It is intended for use on sequence data from diploid genomes. It accepts input data in FASTA or FASTQ format, and for each input file it writes an output FASTA or FASTQ file containing the reads with a reduced rate of heterozygosity. See the README for more details.
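To make the idea of smoothing isolated SNPs concrete, here is a toy Python sketch. It is not het-smooth's actual algorithm, which the README describes. It assumes, purely for illustration, that the reads are equal-length strings already aligned to the same locus. The function name `smooth_isolated_snps` and the parameters `min_minor_fraction` and `window` are hypothetical.

```python
from collections import Counter


def smooth_isolated_snps(reads, min_minor_fraction=0.25, window=2):
    """Toy illustration only; NOT het-smooth's actual algorithm.

    Assumes `reads` are equal-length strings already aligned to the
    same locus (a simplifying assumption made for this sketch).
    """
    if not reads:
        return []
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("all reads must have the same length")

    # Count how often each base occurs in every column.
    columns = [Counter(read[i] for read in reads) for i in range(length)]

    def is_heterozygous(counts):
        # A column looks heterozygous if its second most common base
        # is supported by a large enough fraction of the reads.
        top_two = counts.most_common(2)
        if len(top_two) < 2:
            return False
        return top_two[1][1] / len(reads) >= min_minor_fraction

    het_sites = {i for i, counts in enumerate(columns) if is_heterozygous(counts)}

    smoothed = [list(read) for read in reads]
    for i in sorted(het_sites):
        # "Isolated" here means no other heterozygous column lies
        # within `window` positions on either side.
        neighbours = range(i - window, i + window + 1)
        if any(j in het_sites for j in neighbours if j != i):
            continue
        # Keep one variant: the most frequent base, ties broken alphabetically.
        keep = min(columns[i], key=lambda base: (-columns[i][base], base))
        # Every read gets the kept base at this column.
        for row in smoothed:
            row[i] = keep
    return ["".join(row) for row in smoothed]


if __name__ == "__main__":
    example = ["ACGTACGT", "ACGTACGT", "ACGAACGT", "ACGAACGT"]
    print(smooth_isolated_snps(example))
```

In this sketch, heterozygous columns that sit close together are left untouched, since only isolated SNPs are meant to be smoothed. Real read data is not pre-aligned like this, so the sketch only shows the selection rule, not a practical pipeline.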
There is no official release yet; please check out the code from the git repository.