Name                      Modified      Size
Figures                   2021-01-15
Stochastic Simulations    2021-01-15
NB_Software               2020-09-10
RESTAMP_Analysis          2020-09-07
PS_Analysis               2020-09-07
README.txt                2021-01-15    5.4 kB

The purpose of the scripts contained in "Code_Mahmutovic_et_al_CSBJ_2020/" is to reproduce the plots in the
manuscript Mahmutovic et al., "RESTAMP - Rate estimates by sequence tag analysis of microbial
populations", CSBJ, 2020. In addition, the software for analyzing
next-generation sequencing data to produce founder population size values is provided.

All scripts are free software: you can redistribute them and/or modify
them under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

"NB Software/"
Contains the software for analyzing next-generation sequencing data to produce founder
population size values. The main file is estimateFounderPop.m, which needs to be called
from the MATLAB command window to start the program. On startup, the user is asked
to provide information in a sequence of GUI windows, in the following order:
 
1) Select the strain barcode identifier corresponding to the DNA sequence for the strain barcode 
region. The mapping between the strain barcode identifier and the actual sequence can be found
and changed at the top of estimateFounderPop.m. 

2) Select the folder containing the sequence files. A test data set is given under "Test_Dataset". 
Note that the gzipped sequence files need to be named as 001, 002, etc. 

3) Select the reference columns. The founder population size values are calculated relative to the
initial state of the culture at t=0 (see the manuscript for more information). Typically the reference
state is sampled more than once. If these samples correspond to 001, 002 and 003 then the reference columns
are 1,2,3. 

4) Enter the number of expected barcodes. 
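
The file naming in step 2 and the reference columns in step 3 can be checked before
starting the GUI. The following is a minimal Python sketch and is not part of the MATLAB
package. It assumes that the files carry a ".gz" extension (e.g. 001.gz) and that they are
numbered consecutively from 001; the helper name sample_columns is hypothetical.

```python
import re
from pathlib import Path

# Assumption: sequence files are named 001.gz, 002.gz, ... (".gz" suffix assumed).
_NAME = re.compile(r"^(\d{3})\.gz$")


def sample_columns(folder):
    """Map each correctly named sequence file to its 1-based column index."""
    numbers = sorted(
        int(m.group(1))
        for p in Path(folder).iterdir()
        if (m := _NAME.match(p.name))
    )
    # Assumption: files are numbered consecutively, starting from 001.
    expected = list(range(1, len(numbers) + 1))
    if numbers != expected:
        raise ValueError(f"expected files numbered {expected}, found {numbers}")
    # File 001 corresponds to column 1, file 002 to column 2, and so on.
    return {f"{n:03d}.gz": n for n in numbers}
```

If the reference state was sampled in 001, 002 and 003, the returned mapping gives the
reference columns 1, 2 and 3 to enter in step 3.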

The user should try the software on the test data set to check that all the required MATLAB
packages are installed. For the test data set, the strain barcode to use is 2710, the number of unique barcodes is 1000 and the
reference columns are 1,2,3. The output is written to the folder "output/", which is created in the same folder as the sequence files.
The output folder contains Excel and csv files with the founder population size
values (estimateMatrix.xlsx/csv) and the tally of the barcode sequences (bigMatAllData.xlsx/csv). In addition, the output folder
contains the file "f0.mat", which stores the mean proportion of subpopulations for all tags in the reference
columns.
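
To illustrate what "mean proportion of subpopulations for all tags in the reference columns"
means, the following Python sketch (not part of the MATLAB package) converts read counts to
frequencies within each reference column and then averages them per tag. The table layout
(one row per barcode, one column per sample) and the choice to average per-column frequencies
rather than pool the counts are assumptions; the function name is hypothetical.

```python
def mean_reference_frequencies(tally, reference_columns):
    """Return the mean frequency of each tag over the reference columns.

    tally: list of rows, one per barcode, each a list of read counts per sample
           (assumed layout).
    reference_columns: 1-based column indices, e.g. [1, 2, 3].
    """
    f0 = [0.0] * len(tally)
    for col in reference_columns:
        counts = [row[col - 1] for row in tally]
        total = sum(counts)
        if total == 0:
            raise ValueError(f"reference column {col} contains no reads")
        # Convert counts to frequencies within this reference sample.
        for i, count in enumerate(counts):
            f0[i] += count / total
    # Average the per-sample frequencies over all reference samples.
    return [value / len(reference_columns) for value in f0]
```

For two tags with counts [1, 2] and [3, 2] in reference columns 1 and 2, the per-column
frequencies are (0.25, 0.75) and (0.5, 0.5), so the means are 0.375 and 0.625.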

WARNING: Determining the founder population size values is protocol-specific (see manuscript).
The equation for calculating founder population size values from tag frequencies depends
on the number of experimental bottlenecks in the protocol (see equation 7).
This is discussed in section 2.2 of the manuscript and exemplified in the results
section for Figure 4 and Figure 5. The user needs to manually define the equation
for the founder population size, called BSize, on line 228 of estimateFounderPop.m so that it corresponds
to the specific experimental protocol. The default is programmed for m=1 bottleneck, i.e. the sequencing
bottleneck.

"PS Analysis/"
Contains the scripts for determining division and death rates from plasmid segregation
data. It is subdivided into "Emulated Death" and "Birth_Death_LB", corresponding to figure 4 and
figure 5 in the manuscript, respectively. Further subdivisions correspond to trials/repetitions
of the experiment. Running doAnalysis.m produces plots of the results; the death rates and
division rates are stored in the variables delta and beta, respectively. The calculation
of rates for plasmid segregation is described in section 5.4 of the manuscript.

"RESTAMP Analysis/"
Contains the scripts for determining division and death rates from founder population size values
and colony forming units. It is subdivided into "Emulated Death" and "Birth_Death_LB", corresponding to figure 4 and
figure 5 in the manuscript, respectively. Further subdivisions correspond to trials/repetitions
of the experiment. The trial folders contain the NB and CFU values together with the correction factor
for the emulated death experiment (see section 2.4) and the script (getRates.m) that implements equations 6a and 6b
for calculating rates. The folder "makePlots" contains all rates, saved in multiple variables that are used by the
makePlot_XXX scripts to make the plots in the manuscript.

"Stochastic Simulations/"
Contains the code for executing stochastic birth-death tau-leaping simulations for a population of cells
composed of distinguishable subpopulations. The main script is execStochSimulations.m, which requires the
program StochKit to function (see the comments in execStochSimulations.m). The outputs from stochastic
simulation runs for two different sets of division/death rates are in the folders delta_xxx_beta_yyy_geometric.
To reproduce figures 3 and S2 in the manuscript, the user needs to copy the model_output# files and the input.m file to "Stochastic Simulations/" and
run makePlot3_S2.m. See the comments in makePlot3_S2.m for more information. Likewise, makePlot1.m and makePlot6.m reproduce the results illustrated in figures S1 and S6, respectively.
The user can run stochastic simulations for the plasmid segregation method using execStochSimulations_PS.m.
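
For readers unfamiliar with tau-leaping, the following Python sketch shows the general idea
for independent subpopulations: in each time step of length tau, the numbers of divisions
and deaths in a subpopulation are drawn from Poisson distributions. This is a generic
illustration, not the StochKit model used by execStochSimulations.m. The function names, the
fixed step size and the use of beta for the division rate and delta for the death rate are
assumptions made for this example.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson-distributed integer with mean lam (Knuth's method)."""
    if lam <= 0.0:
        return 0
    if lam > 500.0:
        # A Poisson draw equals the sum of two independent draws with half
        # the mean; splitting keeps exp(-lam) away from underflow.
        return poisson(rng, lam / 2.0) + poisson(rng, lam / 2.0)
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def tau_leap(n0, beta, delta, tau, steps, seed=0):
    """Simulate independent birth-death subpopulations with fixed-step tau-leaping.

    n0: initial cell numbers, one entry per subpopulation.
    beta, delta: division and death rate per cell per unit time (assumed naming).
    """
    rng = random.Random(seed)
    n = list(n0)
    for _ in range(steps):
        for i, cells in enumerate(n):
            births = poisson(rng, beta * cells * tau)
            deaths = poisson(rng, delta * cells * tau)
            # Clamp at zero: a leap can draw more deaths than there are cells.
            n[i] = max(cells + births - deaths, 0)
    return n
```

A call such as tau_leap([100, 100, 100], beta=1.0, delta=0.5, tau=0.01, steps=100) returns
the final size of each subpopulation. The step size tau should be small enough that the
expected change per step is small relative to the subpopulation size.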
 

Source: README.txt, updated 2021-01-15