GUI vs XMFA Text File
Hello all,
I have aligned two whole genomes: a reference FASTA provided by NCBI and our own de novo assembled FASTA. In the GUI, the alignment shows a large insertion (~3000 bp) at around 2.7 Mb. When we pull the positions for this insertion from the text file, the insertion is at 3.7 Mb. Why do these positions differ, and which one should we consider correct? (See the attachments for screenshots of both the text file and the GUI.)
Any help is appreciated!
Thank you,
Sarah Ramirez-Busby
Biomedical and Medical Informatics
San Diego State University
Hi Sarah, the text screenshot does not appear to correspond to an XMFA file. Could that be from the .backbone or .bbcols file instead? If the numbers resulted from some processing of the XMFA data on your end, then I would suggest checking the code which processed the XMFA and/or posting that code here so we can better understand the issue.
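One way to sanity-check a third-party XMFA parser is to list the coordinates that the XMFA file itself declares for each alignment block and compare them with the script's output. The sketch below assumes the usual XMFA entry header of the form `> seq:start-end strand file`, with `=` closing each block; the function name `xmfa_block_intervals` and the demo lines are made up for illustration and are not part of Mauve.

```python
import re

# Matches an XMFA entry header such as "> 1:1-10 + ref.fasta"
# (assumed layout: sequence number, start-end, strand).
HEADER = re.compile(r"^>\s*(\d+):(\d+)-(\d+)\s+([+-])")

def xmfa_block_intervals(lines):
    """Return one list per alignment block of (seq, start, end, strand) tuples."""
    blocks, current = [], []
    for line in lines:
        line = line.strip()
        if line.startswith("="):          # "=" ends the current block
            if current:
                blocks.append(current)
            current = []
            continue
        match = HEADER.match(line)
        if match:
            seq, start, end = (int(match.group(i)) for i in (1, 2, 3))
            current.append((seq, start, end, match.group(4)))
    if current:                           # file without a trailing "="
        blocks.append(current)
    return blocks

# Hypothetical two-genome block, not taken from the thread's data.
demo = [
    "#FormatVersion Mauve1",
    "> 1:1-10 + ref.fasta",
    "ACGTACGTAC",
    "> 2:101-110 - asm.fasta",
    "ACGTACGTAC",
    "=",
]
print(xmfa_block_intervals(demo))
```

If the script's indel positions fall outside the intervals printed for the matching block, or ignore the strand column, that would be a likely place to look for the discrepancy.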
You're correct: the text is the output of code I found in a
publication. The code parses the XMFA file and extracts indels. It is
attached, along with the corresponding publication.
We need the positions of indels, so if you have any suggestions for how
to extract them from the XMFA file, we'd appreciate it. I found the code
by searching Google for what others have done to find indels.
Thanks for any help!
Sarah
Sarah Ramirez-Busby
Master's Candidate, Bioinformatics
San Diego State University
Valafar Lab, GMCS 403
On Thu, Jan 15, 2015 at 5:27 PM, Aaron Darling koadman@users.sf.net wrote:
Thanks Sarah, I'll check out that code.
In the meantime, I suggest you have a look at the .backbone file which progressiveMauve generates by default. It contains information on indels and I believe would serve your purpose.
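As a rough illustration of how the .backbone file could be used for this, here is a minimal sketch. It assumes the file is tab-delimited with one left/right coordinate pair per genome on each row, that a `0 0` pair means the segment is absent from that genome, and that a header row may be present. The function names and the demo data are invented for the example.

```python
import io

def read_backbone(handle):
    """Parse backbone text into rows of (left, right) integer pairs, one pair per genome."""
    rows = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        try:
            values = [int(f) for f in fields]
        except ValueError:
            continue  # non-numeric row, e.g. the seq0_leftend ... header
        rows.append([(values[i], values[i + 1]) for i in range(0, len(values) - 1, 2)])
    return rows

def unique_segments(rows, min_len=1):
    """Yield (genome_index, start, end, length) for segments found in exactly one genome."""
    for row in rows:
        present = [i for i, pair in enumerate(row) if pair != (0, 0)]
        if len(present) != 1:
            continue
        genome = present[0]
        left, right = row[genome]
        # abs() drops the sign, which is assumed to encode orientation only.
        start, end = sorted((abs(left), abs(right)))
        length = end - start + 1
        if length >= min_len:
            yield genome, start, end, length

# Hypothetical backbone content, not the file from this thread.
demo = (
    "seq0_leftend\tseq0_rightend\tseq1_leftend\tseq1_rightend\n"
    "1\t1000\t1\t1000\n"
    "1001\t4000\t0\t0\n"
    "4001\t5000\t1001\t2000\n"
)
print(list(unique_segments(read_backbone(io.StringIO(demo)))))
```

Rows that survive the filter are candidate indels: sequence present in one genome with no aligned counterpart in the other. The `min_len` argument is only a convenience for skipping very short segments.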
Hi Aaron,
I've been looking over the .backbone file (attached) but I'm seeing
negative numbers in the first sequence (our de novo assembled genome). I'm
not sure how to interpret these.
Two questions:
1. Would lines 5-7 in sequence 0 be considered deleted from sequence 1, or
inserted into sequence 0?
2. Would line 8 be considered deleted from sequence 0, or inserted into sequence 1?
Thank you for your help!
Sarah
Sarah Ramirez-Busby
Master's Candidate, Bioinformatics
San Diego State University
Valafar Lab, GMCS 403
On Thu, Jan 15, 2015 at 6:29 PM, Aaron Darling koadman@users.sf.net wrote:
Hi Sarah, your attachment didn't show up. I would suggest reading this page to understand the format of the file, the meaning of negative numbers, and so on:
http://darlinglab.org/mauve/user-guide/files.html
See the section entitled "The Progressive Mauve backbone file"
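For readers without the guide at hand, the helper below sketches one way to interpret a single coordinate pair from a backbone row. It rests on two assumptions that should be checked against the section named above: negative coordinates mark a segment aligned in reverse-complement orientation, and a `0 0` pair means the segment is absent from that genome. The name `describe_interval` is made up for this example.

```python
def describe_interval(left, right):
    """Interpret one (left, right) pair from a backbone row.

    Returns (status, start, end), where status is "absent", "+" or "-".
    Assumes 0/0 means absent and a negative sign means reverse orientation.
    """
    if left == 0 and right == 0:
        return ("absent", None, None)
    strand = "-" if left < 0 or right < 0 else "+"
    # Report positive coordinates with start <= end regardless of orientation.
    start, end = sorted((abs(left), abs(right)))
    return (strand, start, end)

print(describe_interval(-2000, -1001))
print(describe_interval(0, 0))
print(describe_interval(5, 10))
```

Note that a segment present in one genome and absent from the other shows only that the two genomes differ there. Whether it counts as an insertion in one or a deletion from the other cannot be decided from a pairwise alignment alone; that usually needs a third, outgroup genome.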