prepare quality file for LoRDEC corrected fasta file

Beide
2016-03-29
2016-05-12
  • Beide

    Beide - 2016-03-29

    Hi everyone,
    I'm assembling a genome from contigs and PacBio data. The contigs were assembled from Illumina paired-end reads. The PacBio data are in a FASTA file and were corrected with the Illumina reads using LoRDEC v0.6. As a result, the bases in the FASTA file are a mixture of uppercase nucleotides (ATCG) and lowercase nucleotides (atcg): the uppercase bases were corrected, and the lowercase bases were not corrected by the Illumina reads. An example follows:

    >m151127_093359_42266_c100917902550000001823205604301685_s1_p0/8/0_9757
    aaaagagagaggatatAAAGTTTGGATACACCTTCTAATTCAAAGAGTTTGCTTTATTTTCATGACTATG
    GCAATTGTAGATTCACACTGAAAAACTCTGAATTAACACATGTGGAATTATTATATGGAATTATATACAT
    AACAAAAAAGTGTGATAAACTGAAAATATGTCATTTTTGTAGGTTCTTCAAAGTAGCTACCTATTGCTTT
    GATTACTGCTTTGCACACTCTTGTCTTTCTCTTGATGAGCTTCAAGGGGTATTCACCTGAAATGGTCTTC

    Now I want to prepare a quality file for this corrected FASTA file, but I found that fakeQuals.py can only assign the same quality to every nucleotide in a FASTA file. Is there any software that can assign different qualities to the uppercase and lowercase nucleotides in one FASTA file? Thanks
    Beide

     

    Last edit: Beide 2016-03-29
    • Beide

      Beide - 2016-03-29

      I've written a shell script to do the job above. But which values should I give the uncorrected nucleotides (atcg) and the corrected nucleotides (ATCG)?

       
  • Adam English

    Adam English - 2016-05-12

    PBJelly doesn't rely on fastq base qualities, so whatever values you fake for your sequences will be fine.
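
For anyone who still wants case-dependent qualities, here is a minimal Python sketch of the idea discussed above. It is not part of PBJelly or LoRDEC. It assumes the output should be a FASTA-style .qual file (a `>` header line followed by space-separated integer scores), and the two scores (40 for uppercase, 10 for lowercase) are arbitrary placeholders, since per the reply above the actual values should not matter to PBJelly. The function names and file names are hypothetical.

```python
import sys

# Hypothetical placeholder scores; the thread does not prescribe any values.
CORRECTED_Q = 40    # uppercase bases (corrected by LoRDEC)
UNCORRECTED_Q = 10  # lowercase bases (left uncorrected)


def case_quals(seq, upper_q=CORRECTED_Q, lower_q=UNCORRECTED_Q):
    """Return one integer score per base, chosen by letter case."""
    return [upper_q if base.isupper() else lower_q for base in seq]


def fasta_to_qual(fasta_lines):
    """Yield .qual-style lines: headers pass through, sequence lines become scores."""
    for line in fasta_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            yield line  # keep the read name unchanged
        else:
            yield " ".join(str(q) for q in case_quals(line))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example call (file names are assumptions):
    #   python case_quals.py reads.fasta > reads.qual
    with open(sys.argv[1]) as handle:
        for out_line in fasta_to_qual(handle):
            print(out_line)
```

Each sequence line is converted independently, so the line wrapping of the .qual output mirrors the wrapping of the input FASTA file.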

     