Miguel Coelho - 2016-07-26

Hi all

I have used SoftSV to detect translocations in the yeast genome, and I would like to know if anyone knows a simple way to convert its output (file.txt) into the Circos links file format:

1- We need to calculate the size of each translocation by screening the list and finding the two translocation calls between the same two chromosomes:

chromosome1 Position1 Chromosome2 Position2 Support(Paired-End) Support(Split-read) Breakpoint-sequence1 Breakpoint-sequence2

chrref|NC_001133| 212848 chrref|NC_001140| 533392 5 4 CATACTAGTTTCATCCTTAGTAGATTATGTGGAGCAATTACGTATTTTTCCCATATCAGCTTTGTTTTTACCCAACTTTAATA
chrref|NC_001133| 224419 chrref|NC_001140| 551087 6 4 GTGCCTACAAGACCGCTGCTGTGACCGTTTCCAATACGGAAAGGAAACGGTACTGGGA

2- Then convert each pair of calls into two lines, one per chromosome:

Duplication_name yeast_chromosome1 Position1 Position2
Duplication_name yeast_chromosome2 Position1 Position2

segdup01 NC_001133 212848 224419
segdup01 NC_001140 533392 551087
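
A minimal Python sketch of the conversion shown above, under a few assumptions that are not confirmed by SoftSV itself: the file is whitespace-separated, the chromosome accession is the last non-empty part of names like chrref|NC_001133|, and calls between the same two chromosomes can be paired two at a time after sorting by Position1. The function name softsv_to_links and the segdup prefix are only illustrative, and the breakpoint sequences in the example are shortened.

```python
from collections import defaultdict


def clean_chrom(name):
    """Turn 'chrref|NC_001133|' into 'NC_001133' (assumed naming scheme)."""
    parts = [p for p in name.split("|") if p]
    return parts[-1]


def softsv_to_links(lines, prefix="segdup"):
    """Pair calls between the same two chromosomes and return link lines."""
    calls = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Skip blank lines, the header, and rows without numeric positions.
        if len(fields) < 4 or not (fields[1].isdigit() and fields[3].isdigit()):
            continue
        chrom1, chrom2 = clean_chrom(fields[0]), clean_chrom(fields[2])
        calls[(chrom1, chrom2)].append((int(fields[1]), int(fields[3])))

    out = []
    n = 0
    for (chrom1, chrom2), hits in sorted(calls.items()):
        hits.sort()
        # Assumption: consecutive calls (ordered by Position1) form one event.
        # A leftover unpaired call is ignored.
        for i in range(0, len(hits) - 1, 2):
            (a1, b1), (a2, b2) = hits[i], hits[i + 1]
            n += 1
            name = f"{prefix}{n:02d}"
            out.append(f"{name} {chrom1} {a1} {a2}")
            out.append(f"{name} {chrom2} {min(b1, b2)} {max(b1, b2)}")
    return out


if __name__ == "__main__":
    example = [
        "chrref|NC_001133| 212848 chrref|NC_001140| 533392 5 4 CATACTAG",
        "chrref|NC_001133| 224419 chrref|NC_001140| 551087 6 4 GTGCCTAC",
    ]
    for row in softsv_to_links(example):
        print(row)
```

For a real run, an open file can be passed instead of the list, e.g. softsv_to_links(open("file.txt")), and the returned lines written to the links file. The chromosome names in that file have to match the ones used in the Circos karyotype file, so the clean_chrom step may need adjusting.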

This will then be used in Circos to draw the links and make the nice visual plots of my yeast genome rearrangements.

I am using circos-0.69-3.

I will try to write a script to 1) pair duplications, 2) re-order the data, and 3) output a links.conf file. I have very limited programming skills, so if anyone has an idea how to do this, that info is most welcome!

thanks

Miguel