Name | Modified | Size | Downloads / Week |
---|---|---|---|
code | 2023-05-05 | | |
annotation | 2023-05-05 | | |
data | 2023-05-05 | | |
reference | 2023-05-05 | | |
supplement | 2023-05-05 | | |
README.txt | 2023-05-05 | 3.5 kB | |
Totals: 6 Items | | 3.5 kB | 0 |
#Anna Maria Langmueller
#Vetmeduni Vienna
#May 2023
#Project: The genomic distribution of transposable elements is driven by spatially variable purifying selection

####################
#      README      #
####################

This text document describes the source code provided for the project "The genomic distribution of transposable elements is driven by spatially variable purifying selection."

###############
# annotation  #
###############

Contains the D. simulans annotation used for the analysis of the experimental populations in this work (see also Palmieri et al., 2015).

############
#   code   #
############

PFlH_AML_20230203_code_prep-TEcalls.R:
R code that annotates individual P-element insertion sites for 5 annotation features, as well as for ORCs and shared sites (i.e., P-element insertion sites shared between D. simulans and D. melanogaster, see also Kofler et al., 2015). P-element insertion sites were called as described in the Materials and Methods section of this work. The necessary annotation tracks (5UTR, exon, mRNA, CDS, 3UTR) were extracted beforehand from the D. simulans annotation (annotation folder) using rtracklayer and stored as .bed files. ORCs and shared sites are provided as Supplementary Files (Supplementary File 6 & 7).

PFlH_AML_20230207_code_analysis-ms.Rmd:
R Markdown document that generates all main & supplementary figures, except for Figure S1 (see Supplementary File 1 for raw data) and Figure S2 (see Supplementary File 3 for raw data), and conducts all statistical tests of this work. Requires the following input files (see section Input Data in the R Markdown document for more details):

1. annotated P-element insertion sites (Supplementary File 2 (joint analysis) or 4 (separate analysis))
2. P-element counts inside/outside ORCs/shared sites (generated by PFlH_AML_20230203_code_prep-TEcalls.R)
3. type of P-element call (use 1 for "joint", and 3 for "separate")
4. annotation tracks of the D. simulans reference genome (annotation folder) in .bed format
5. main chromosome lengths (data/genome.mainChr.sizes)
6. ORC .bed file (Supplementary File 6)
7. D. simulans recombination rate (data/dsim.rr_LDjump-LOESS-0.1-sex-mimicree2.txt, MimicrEE2 format)
8. shared sites .bed file (Supplementary File 7)
9. level of infraspecific polymorphisms (Supplementary File 8)
10. interspecific conservation scores (Supplementary File 9)
11. persistent P-elements mapped to the D. melanogaster reference genome v6 (data/PFlH_AML_20220512_data_Psim-in-Dmel-v6.txt (joint analysis), data/PFlH_AML_20230305_data_Psim-in-Dmel-v6-repCall.txt (separate analysis))
12. enrichment values (observed:expected) for P-element insertion sites across 5 annotation features + ORCs/shared sites of a natural D. simulans population (data/PFlH_AML_20230126_res_natPop-enrichment.txt)
13. frequency estimates for phase-1 P-element insertion sites (Supplementary File 3)

##########
#  data  #
##########

Contains additional files necessary to run PFlH_AML_20230207_code_analysis-ms.Rmd (see section Input Data in the R Markdown document for more details).

#############
# reference #
#############

Contains the D. simulans reference genome used for the analysis of the experimental populations in this work (see also Palmieri et al., 2015).

##############
# supplement #
##############

Contains the supplementary files accompanying this work, some of which are required to run PFlH_AML_20230207_code_analysis-ms.Rmd (see section Input Data in the R Markdown document for more details).
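
The code section notes that the annotation tracks (5UTR, exon, mRNA, CDS, 3UTR) were extracted with rtracklayer and stored as .bed files before the scripts were run. For readers without an R setup, that step can be sketched in Python. This is an illustration only, not the authors' code: the function name gff_to_bed_rows is hypothetical, the input is assumed to be a tab-separated GFF/GTF-style file, and the feature-type string passed in must match whatever the annotation file actually uses in its third column. The detail worth getting right is the coordinate shift, since GFF/GTF intervals are 1-based and inclusive while BED intervals are 0-based and half-open.

```python
def gff_to_bed_rows(gff_lines, feature_type):
    """Yield BED6 rows (chrom, start, end, name, score, strand) for one feature type.

    gff_lines    -- iterable of GFF/GTF text lines (assumed tab-separated, 9 columns)
    feature_type -- value expected in column 3, e.g. "exon" (assumed label)
    """
    for line in gff_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Skip malformed lines and features of other types.
        if len(fields) < 9 or fields[2] != feature_type:
            continue
        chrom = fields[0]
        start = int(fields[3])
        end = int(fields[4])
        strand = fields[6]
        # 1-based inclusive -> 0-based half-open: shift the start, keep the end.
        yield (chrom, start - 1, end, feature_type, 0, strand)


def write_bed(rows, path):
    """Write BED rows to a tab-separated file at `path`."""
    with open(path, "w") as handle:
        for row in rows:
            handle.write("\t".join(str(value) for value in row) + "\n")
```

A call such as write_bed(gff_to_bed_rows(open(annotation_path), "exon"), "exon.bed") would produce one track, where annotation_path stands for the annotation file in the annotation folder. Repeating the call for each feature type gives the five tracks listed as input 4.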