ESTReMo Home

An evolutionary simulator of transcription regulatory networks

Status: Alpha

Brought to you by: ivanerill, pon2, rforder2

Output

This page briefly describes the ESTReMo output format. Each generation, statistics on the fittest organism in the population are written to the output file in the following form:

 Gen, R_c,   R_f,  R_s,  MI,   gGC,  nGC,  sumBG,    sumNCR,   TF#, PB err,   chemPot, beta, 
 127, -0.08, 4.97, 0.46, 8.32, 0.50, 0.50, 1.38e+03, 6.97e+01, 10,  3.91e-05, -3.11,   1.66, 

 Bg,   Fitness   Max. Fit, Org ID
 1000, 6.39e-04, 1.20e-03, 0x000031CA

 Site,     p, E(p),    Exp    Targ,  GC%,  E(1),   E(2), ...
 TGTAACCG, 0, -15.53,  0.62,  0.50,  0.50, -15.53
 TGTTCCCG, 0, -15.90,  0.75,  0.50,  0.62, -15.90
 AAGACCAG, 0, -15.88,  0.74,  0.50,  0.50, -15.88
 TTCACCAT, 0, -15.82,  0.72,  0.50,  0.38, -15.82
 TGCAACAC, 0, -15.27,  0.52,  0.50,  0.50, -15.27
 TGGTTACG, 0, -15.49,  0.60,  0.50,  0.50, -15.49
 ATGACCAG, 0, -15.86,  0.74,  0.50,  0.50, -15.86
 TGTACGAC, 0, -15.11,  0.47,  0.50,  0.50, -15.11
 ATGTAAGC, 0,  -0.85,  0.00,  0.50,  0.38,  -0.85
 TACATCAG, 0, -15.26,  0.52,  0.50,  0.38, -15.26
 GGGAACAT, 0, -15.74,  0.69,  0.50,  0.50, -15.74
 AAGTACAG, 0, -15.48,  0.60,  0.50,  0.38, -15.48
 GACCACAA, 0,  -0.13,  0.00,  0.50,  0.50,  -0.13
 AATACCAT, 0, -15.41,  0.57,  0.50,  0.25, -15.41
 TGGATACT, 0, -15.20,  0.50,  0.50,  0.38, -15.20
 TGGAACAC, 0, -15.84,  0.73,  0.50,  0.50, -15.84
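The per-site rows above can be read back with a few lines of Python. This is a minimal sketch rather than part of ESTReMo: the names `SiteRow` and `parse_site_row` are hypothetical, and the column order (Site, p, E(p), Exp, Targ, GC%, then one energy per position) is assumed from the sample rows. Fields are read by position because the printed header lines are not consistently comma-separated.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SiteRow:
    """One per-site row of the output (hypothetical container, not an ESTReMo type)."""
    site: str              # binding site sequence
    p: int                 # position of the best site in the NCR
    e_p: float             # energy of the best site
    exp: float             # expression level
    targ: float            # target activation level
    gc: float              # GC fraction of the NCR
    energies: List[float]  # E(1), E(2), ... for each position in the NCR


def parse_site_row(line: str) -> SiteRow:
    # Split on commas and drop empty fields so a trailing comma is harmless.
    fields = [f.strip() for f in line.split(",") if f.strip()]
    if len(fields) < 7:
        raise ValueError(f"expected at least 7 fields, got {len(fields)}: {line!r}")
    return SiteRow(
        site=fields[0],
        p=int(fields[1]),
        e_p=float(fields[2]),
        exp=float(fields[3]),
        targ=float(fields[4]),
        gc=float(fields[5]),
        energies=[float(x) for x in fields[6:]],
    )


# Usage with the first row of the sample output.
row = parse_site_row(" TGTAACCG, 0, -15.53,  0.62,  0.50,  0.50, -15.53")
print(row.site, row.e_p, row.energies)
```

Reading by position keeps the parser independent of the header text, at the cost of breaking if the column order ever changes.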

Field	Explanation
Gen	Number of generations that have elapsed (iterations of the genetic algorithm).
R_c	Corrected Rsequence (corrects for small-sample bias). Rsequence is a measure of the column-wise information content in the motif.
R_f	The expected value of Rsequence: a measure of the minimum information content required to identify each of the binding sites.
R_s	Rsequence value prior to correction.
MI	Mutual information. A measure of the dependency between positions in sites.
gGC	Average GC% of all samples taken from the background.
nGC	Average GC% of all samples taken from the first genomic segments.
sumBG	Sum of exponentials of the energy levels that the recognizer assigns to sites in the background (non-binding sites).
sumNCR	Sum of scores assigned to binding sites.
TF#	Number of transcription factor molecules in the organism.
PB err	The absolute value of the difference between TF# and the sum of the probability mass over the genome.
Org ID	A unique hexadecimal number that identifies the organism.
Bg	Number of times the background is sampled (analogous to the size of the genome).
Fitness	How fit the organism is (lower is better, zero is perfect).
Site	The binding site.
p	The position in the NCR of the best site.
E(p)	The energy level of the best site in this NCR.
Exp	Expression level. How "turned on" the gene associated with this binding site is.
Targ	Minimum activation level to be achieved for perfect fitness.
GC%	The GC% for this NCR.
E(n)	Energy level of the n-th position in the NCR.
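Two of the per-site fields can be sanity-checked directly from a row. The sketch below is illustrative only and rests on assumptions inferred from the sample output, not from the ESTReMo source: that GC% is the fraction of G and C bases, that the "best" site is the one with the lowest energy, and that p is zero-based (the sample shows p = 0 alongside a single E(1) column). In the sample each NCR appears to be exactly one site long, so the GC fraction of the printed site should match the GC% column up to rounding. The helper names `gc_fraction` and `best_position` are hypothetical.

```python
from typing import Sequence


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def best_position(energies: Sequence[float]) -> int:
    """Zero-based index of the lowest energy (assumed to be the best site)."""
    if not energies:
        raise ValueError("no energies given")
    return min(range(len(energies)), key=lambda i: energies[i])


# Sites and reported GC% values copied from the sample output.
reported = {"TGTAACCG": 0.50, "TGTTCCCG": 0.62, "TTCACCAT": 0.38, "AATACCAT": 0.25}
for site, gc in reported.items():
    # The output prints two decimals, so allow for rounding.
    assert abs(gc_fraction(site) - gc) < 0.01, site

# With a single energy per row, the best position is 0, matching p in the sample.
assert best_position([-15.53]) == 0
```

If a run uses NCRs longer than one site, the GC% column would be expected to describe the whole NCR rather than the printed site, so the first check would no longer apply as written.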