Name | Modified | Size |
---|---|---|
Figures | 2021-01-15 | |
Stochastic Simulations | 2021-01-15 | |
NB_Software | 2020-09-10 | |
RESTAMP_Analysis | 2020-09-07 | |
PS_Analysis | 2020-09-07 | |
README.txt | 2021-01-15 | 5.4 kB |

The purpose of the scripts contained in "Code_Mahmutovic_et_al_CSBJ_2020/" is to reproduce the plots in the manuscript Mahmutovic et al., RESTAMP - Rate estimates by sequence tag analysis of microbial populations, CSBJ, 2020. In addition, the software for analyzing next-generation sequencing data to produce founder population size values is provided.

All scripts are free software: you can redistribute them and/or modify them under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

"NB Software/"
Contains the software for analyzing next-generation sequencing data to produce founder population size values. The main file is estimateFounderPop.m, which must be called from the MATLAB command window to start the program. On startup, the user is asked to provide information in a sequence of GUI windows, in the following order:

1) Select the strain barcode identifier corresponding to the DNA sequence for the strain barcode region. The mapping between the strain barcode identifier and the actual sequence can be found and changed at the top of estimateFounderPop.m.
2) Select the folder containing the sequence files. A test data set is given under "Test_Dataset". Note that the gzipped sequence files need to be named 001, 002, etc.
3) Select the reference columns. The founder population size values are calculated relative to the initial state of the culture at t=0 (see the manuscript for more information). Typically the reference state is sampled more than once. If these samples correspond to 001, 002 and 003, then the reference columns are 1,2,3.
4) Enter the number of expected barcodes.

The user should first run the software on the test data set to check that all the required MATLAB packages are installed. For the test data set, the strain barcode is 2710, the number of unique barcodes is 1000 and the reference columns are 1,2,3.

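The role of the reference columns in step 3 can be illustrated with a small sketch. It is written in Python rather than MATLAB and is not taken from estimateFounderPop.m. It assumes the barcode tally is a table of read counts with one row per tag and one column per sample; the function name and the toy counts are hypothetical. It computes, for each tag, the mean proportion across the reference columns.

```python
def mean_reference_proportions(counts, reference_columns):
    """Mean per-tag read proportion across the reference samples.

    counts: list of rows, one per tag; each row holds read counts per sample
            (assumed layout, not confirmed by the README).
    reference_columns: 1-based sample indices, e.g. [1, 2, 3] for 001, 002, 003.
    """
    cols = [c - 1 for c in reference_columns]  # convert to 0-based indices
    totals = [sum(row[c] for row in counts) for c in cols]
    if any(t == 0 for t in totals):
        raise ValueError("a reference column contains no reads")
    # Normalise each reference column to proportions, then average per tag.
    return [
        sum(row[c] / t for c, t in zip(cols, totals)) / len(cols)
        for row in counts
    ]


# Hypothetical toy tally: 3 tags, 4 samples; samples 1-3 are the reference state.
toy_counts = [
    [10, 20, 30, 5],
    [30, 20, 10, 5],
    [60, 60, 60, 90],
]
f0 = mean_reference_proportions(toy_counts, [1, 2, 3])
```

Averaging proportions rather than raw counts keeps reference samples with different sequencing depths from dominating the result; whether the MATLAB code weights samples this way is not stated in the README.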
The output is written to the folder "output/", which is created in the same folder as the sequence files. The output folder contains Excel and csv files with the founder population size values (estimateMatrix.xlsx/csv) and the tally of the barcode sequences (bigMatAllData.xlsx/csv). In addition, the output folder contains the file "f0.mat", which holds the mean proportion of subpopulations for all tags in the reference columns.

WARNING: Determining the founder population size values is protocol-specific (see manuscript). This means that the equation for calculating founder population size values from tag frequencies changes depending on the protocol. Specifically, the equation changes depending on the number of experimental bottlenecks in the protocol (see equation 7). This is discussed in section 2.2 in the manuscript and exemplified in the results section for the results in Figure 4 and Figure 5. The user will need to manually define the equation for the founder population size, called BSize, on line 228 in estimateFounderPop.m to correspond to the specific experimental protocol. The default programmed value is for m=1 bottlenecks, i.e. the sequencing bottleneck.

"PS Analysis/"
Contains the scripts for determining division and death rates from plasmid segregation data. It is subdivided into "Emulated Death" and "Birth_Death_LB", corresponding to figure 4 and figure 5 in the manuscript, respectively. Further subdivisions are with respect to trials/repetitions of the experiment. Running doAnalysis.m produces plots of the results; the death rates and division rates are stored in the variables delta and beta, respectively. The calculation of rates for plasmid segregation is described in section 5.4 in the manuscript.

"RESTAMP Analysis/"
Contains the scripts for determining division and death rates from founder population size values and colony forming units.
It is subdivided into "Emulated Death" and "Birth_Death_LB", corresponding to figure 4 and figure 5 in the manuscript, respectively. Further subdivisions are with respect to trials/repetitions of the experiment. The trial folders contain the NB and CFU values, together with the correction factor for the emulated death experiment (see section 2.4), and the script (getRates.m) that implements equations 6ab for calculating rates. The folder "makePlots" contains all rates, saved in multiple variables, which are used by the makePlot_XXX scripts to make the plots in the manuscript.

"Stochastic Simulations/"
Contains the code for executing stochastic birth-death tau-leaping simulations for a population of cells composed of distinguishable subpopulations. The main script is execStochSimulations.m, which requires the program StochKit to function (see the comments in execStochSimulations.m). The output from stochastic simulation runs for two different sets of division/death rates is in the folders delta_xxx_beta_yyy_geometric. To reproduce figures 3 and S2 in the manuscript, the user needs to copy the model_output# files and the input.m file to "Stochastic Simulations/" and run makePlot3_S2.m. See the comments in makePlot3_S2.m for more information. Likewise, makePlot1.m and makePlot6.m reproduce the results illustrated in figures S1 and S6, respectively. The user can run stochastic simulations for the plasmid segregation method using execStochSimulations_PS.m.
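
For readers unfamiliar with the method, the following is a minimal sketch of a birth-death tau-leaping simulation for distinguishable subpopulations. It is written in Python using only the standard library. It does not use StochKit and is not the code in execStochSimulations.m; the function names, step size and toy parameters are hypothetical. Following the naming above, beta is the division rate and delta is the death rate.

```python
import math
import random


def poisson(rng, lam):
    """Knuth's Poisson sampler; adequate only for small means (lam well below ~700)."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def tau_leap_birth_death(initial, beta, delta, tau, steps, seed=0):
    """Simulate independent subpopulations with fixed-step tau-leaping.

    initial: starting cell count of each subpopulation (one entry per tag).
    beta, delta: per-cell division and death rates.
    tau: leap size; steps: number of leaps.
    Returns a list of population snapshots, including the initial state.
    """
    rng = random.Random(seed)
    pops = list(initial)
    history = [list(pops)]
    for _ in range(steps):
        for i, n in enumerate(pops):
            # Number of events in one leap is Poisson with mean rate * n * tau.
            births = poisson(rng, beta * n * tau)
            deaths = poisson(rng, delta * n * tau)
            # Clamp at zero: a leap may otherwise remove more cells than exist.
            pops[i] = max(n + births - deaths, 0)
        history.append(list(pops))
    return history


# Hypothetical toy run: 5 subpopulations of 20 cells each.
trajectory = tau_leap_birth_death([20] * 5, beta=1.0, delta=0.8, tau=0.01, steps=100)
```

The fixed leap size and the clamp at zero are simplifications chosen for brevity; production tau-leaping implementations typically select the leap size adaptively.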