Walkthrough_toy

Robert Kofler

Introduction

Despite our best efforts, it may sometimes be unclear how exactly our DSL for describing TE landscapes behaves. In that case you can either contact the author or test the behaviour yourself using toy examples.

While you are always welcome to contact me (Robert Kofler), evaluating the behaviour of SimulaTE with toy examples has several key advantages:

  • even authors of tools tend to forget details of the implementation, so a response is not necessarily correct
  • you can answer any question immediately, so there is no need to wait for the author's response
  • you gain confidence in the tool (because you know its behaviour from first-hand experience)

Walkthrough: behaviour of TSD algorithm

the questions

In this walkthrough we are interested in the behaviour of the algorithm generating the TSD (target site duplication). In particular we want to find out i) which bases are duplicated and ii) whether child TEs inherit the TSD of the parent TEs. (Of course these details are also explained in the manual; we use this walkthrough solely to demonstrate the generation of toy examples.)

the toy example

To answer these questions we create the following pgd file (toy.pgd):

chassis="123456789"
parent="TTT"+2bp
child=parent-{2:"CC"}
3 parent *
7 * child

Note: we used a chassis (reference genome) of "123456789" to make it easy to identify the exact insertion site. Outside of toy examples, the chassis should of course consist solely of the bases A, T, C and G.
Note: the parent has the sequence TTT and a TSD of 2bp is specified.
Note: the child has the sequence of the parent plus a nested insertion (CC) at position 2.
Note: we generate two sequences, one with the TE parent inserted at position 3 and one with the TE child inserted at position 7.
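
Before running the tool, it can help to write down what the child sequence should look like. The following is only a minimal sketch of the nested insertion; the helper `nest` is hypothetical and not part of SimulaTE, and it assumes that "position 2" means the insertion is placed directly after the second base of the parent.

```python
def nest(te, position, insert):
    """Return `te` with `insert` placed after its first `position` bases.

    Assumption: positions are counted from 1 and the nested sequence
    goes directly after the given base.
    """
    return te[:position] + insert + te[position:]


parent = "TTT"                 # sequence of the parent TE in the toy pgd file
child = nest(parent, 2, "CC")  # models child=parent-{2:"CC"}
print(child)
```

If the tool places the nested insertion elsewhere, the toy example will reveal it, which is exactly the point of testing with toy data.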

We generate the population genome file using:

python build-population-genome.py --pgd toy.pgd --output toy.pg
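
Once the command has finished, the resulting file can also be inspected from Python rather than by eye. This is a minimal sketch that assumes toy.pg is a plain FASTA file (headers starting with ">"); the function name `read_fasta` is our own and not part of SimulaTE.

```python
import os


def read_fasta(path):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    records = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip empty lines
            if line.startswith(">"):
                name = line[1:]
                records[name] = ""
            elif name is not None:
                records[name] += line  # sequences may span several lines
    return records


# toy.pg is the output file name used in the command above;
# nothing is printed if the file has not been generated yet.
if os.path.exists("toy.pg"):
    for name, seq in read_fasta("toy.pg").items():
        print(name, seq)
```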

the result

The output looks as follows (view it with e.g. less toy.pg):

>hg1
123TTT23456789
>hg2
1234567TTCCT6789

These two sequences demonstrate that:

  • the bases 5' of the insertion site are used for the TSD (the sequence 23 is duplicated)
  • the child inherits the TSD of the parent (the sequence 67 is duplicated, although we did not specify a TSD for the child TE)
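
The observed behaviour can be summarised as a small model. This is a sketch inferred from the toy output only, not SimulaTE's actual implementation; the function `insert_with_tsd` and its parameters are hypothetical names chosen for illustration.

```python
def insert_with_tsd(chassis, position, te, tsd_length):
    """Insert `te` after base `position` (counted from 1) of `chassis`.

    The `tsd_length` bases immediately 5' of the insertion site are
    repeated directly after the TE, so they flank it on both sides.
    """
    tsd = chassis[position - tsd_length:position]
    return chassis[:position] + te + tsd + chassis[position:]


chassis = "123456789"
hg1 = insert_with_tsd(chassis, 3, "TTT", 2)
print(hg1)  # 123TTT23456789
```

Calling the same function with position 7 and the parent's TSD length of 2 repeats the bases 67 after the TE, which matches the inheritance seen in hg2.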

