I'm using PBJelly to improve my genome assembly with long sequence reads.
The run completes successfully, and I'm now trying to understand what has
been changed in the assembly and why. This has led to a few questions
about the generated output:
1a) How, and why, does PBJelly assign the new reference names? For
example, I have three reference scaffolds that, after renaming by
PBJelly, have names that look like:
2) In the gap_fill_status.txt file I see that a gap gets filled in
What do the .1 and .2 mean in ref0158577.1e3_ref0158577.2e5?
How do I figure out which of the three original scaffolds (which all
have a new name containing 'ref0158577') has been filled?
3) In jelly.out.fasta the scaffolds have new names such as Contig0,
etc. Is there somewhere I can look up which of the original scaffolds
each contig is derived from?
4) In gap_fill_status.txt I see the following types of changes:
-What does 'overfilled' mean? Is this an inter-scaffold connection? If not,
how can I see which inter-scaffold connections have been made (if any)?
-In jelly.out.fasta the smallest contig is ~300 nt, while in my input file
the smallest scaffold was 1000 nt. This suggests that PBJelly has split
something up. Where can I find out which scaffolds have been split up,
and why?
-Could you give a short explanation of what a line like this in the
gapInfo.bed file means:
na na ref0155877_0_0 3
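
To narrow down the question about split scaffolds, the sequences in the
output that are shorter than the smallest input scaffold can be listed with
a small script. This is only a sketch: the file name jelly.out.fasta and the
1000 nt threshold come from the message above, the function names
fasta_lengths and shorter_than are hypothetical, and nothing here relies on
PBJelly itself, only on the plain FASTA format.

```python
def fasta_lengths(lines):
    """Yield (name, length) for each record in an iterable of FASTA lines."""
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, length
            # Keep only the first word of the header as the sequence name.
            header = line[1:].split()
            name, length = (header[0] if header else ""), 0
        elif name is not None:
            # Sequence lines may be wrapped, so lengths are accumulated.
            length += len(line)
    if name is not None:
        yield name, length


def shorter_than(lines, threshold=1000):
    """Return (name, length) pairs below the threshold, shortest first."""
    short = [(n, l) for n, l in fasta_lengths(lines) if l < threshold]
    return sorted(short, key=lambda item: item[1])


# Example use (assumes the output file is in the working directory):
#
#     with open("jelly.out.fasta") as handle:
#         for name, length in shorter_than(handle):
#             print(name, length, sep="\t")
```

The resulting names only identify which output contigs are short; mapping
them back to the original scaffolds still depends on the answers to
questions 3 and 4.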
Sorry for all the questions, and thanks a lot for your help!
With kind regards,
---------- Forwarded message ----------
From: Nikkie van bers email@example.com
Date: 15 July 2014 17:11
Subject: questions on pbjelly output