Read Me
RegnANN, software for Reverse Engineering Gene Networks with ANN using GPGPU
(C) 2010 Marco Grimaldi <marco.grimaldi@gmail.com>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
1.0 COMPILING
To compile the software you must install the CUDA SDK [1] for your platform.
We provide a reference Makefile that should be sufficient for standard (32-bit) installations.
To compile:
$> make
2.0 RUNNING
To run:
$> ./regnann
When called with no parameters, the software prints the help page to standard output.
It takes an input file with extension .gel (gene expression levels, ASCII file)
containing the gene expression matrix (NxM matrix with _SPACE_-separated values), with genes along
the columns and experimental values along the rows.
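As an illustration of the layout described above, the following Python sketch writes a small matrix to a .gel file and reads it back. The helper names (write_gel, read_gel), the file name toy_001.gel, the toy values and the six-decimal number format are assumptions made for this example; they are not part of regnann.

```python
import os
import tempfile


def write_gel(path, matrix):
    """Write one experiment per line, one gene per column, space separated."""
    with open(path, "w") as fh:
        for row in matrix:
            fh.write(" ".join(f"{value:.6f}" for value in row) + "\n")


def read_gel(path):
    """Read a space-separated matrix back into a list of rows of floats."""
    with open(path) as fh:
        return [[float(tok) for tok in line.split()] for line in fh if line.strip()]


# Hypothetical toy data: 4 experiments (rows) x 3 genes (columns),
# already scaled to the [-1, +1] range.
expression = [
    [0.10, -0.50, 0.90],
    [0.20, -0.40, 0.80],
    [-0.30, 0.60, -0.70],
    [1.00, -1.00, 0.00],
]

gel_path = os.path.join(tempfile.mkdtemp(), "toy_001.gel")
write_gel(gel_path, expression)

with open(gel_path) as fh:
    print(fh.read())
```

A file written this way has one line per experiment and one column per gene, which matches the NxM description above.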
regnann prints some information to standard output, followed by the inferred adjacency matrix.
The adjacency matrix is also saved, in absolute value, to the file regnann.out
3.0 TESTING
In the subdirectory './data/' we provide some test expression files (Escherichia coli [2]) and
the ground truth (adjacency matrix) of the network. Expression profiles are NOT normalized, thus
they cannot, and should NOT, be used with 'regnann' as they are (values are expected to be
normalized between -1 and +1).
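One possible way to bring raw profiles into the expected range is to rescale each gene (column) linearly, as in the Python sketch below. This README does not say which normalization regnann expects beyond the [-1, +1] range, so the per-gene min-max scheme, the function name normalize_columns and the toy values are assumptions made for illustration.

```python
def normalize_columns(matrix, lo=-1.0, hi=1.0):
    """Linearly rescale every column (gene) of `matrix` into [lo, hi]."""
    n_cols = len(matrix[0])
    col_min = [min(row[j] for row in matrix) for j in range(n_cols)]
    col_max = [max(row[j] for row in matrix) for j in range(n_cols)]

    scaled = []
    for row in matrix:
        new_row = []
        for j, value in enumerate(row):
            span = col_max[j] - col_min[j]
            if span == 0:
                # Constant gene: no spread to rescale, use the midpoint (assumption).
                new_row.append((lo + hi) / 2.0)
            else:
                new_row.append(lo + (hi - lo) * (value - col_min[j]) / span)
        scaled.append(new_row)
    return scaled


# Hypothetical raw data: 3 experiments (rows) x 2 genes (columns).
raw = [
    [2.0, 10.0],
    [4.0, 10.0],
    [6.0, 10.0],
]
normalized = normalize_columns(raw)
for row in normalized:
    print(" ".join(f"{value:+.2f}" for value in row))
```

Scaling each gene separately keeps genes with very different expression ranges comparable. Scaling the whole matrix by its global minimum and maximum is another option if relative magnitudes between genes should be preserved.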
The software has been tested on OSX 10.5.8, Ubuntu 9.04, Ubuntu 10.04 and Red Hat Enterprise Linux (RHEL) 5.
To avoid memory allocation issues we recommend using CUDA-programmable GPUs with at least
512 MB of VRAM. 1 GB of VRAM is _highly_ recommended.
4.0 EXAMPLES
4.1 No arguments
./regnann
NAME:
regnann
SYNOPSIS:
regnann [OPTIONS] -train TRAIN_FILE -test TEST_FILE
OPTIONS:
-h print this help
-tlpgr perform graph regression using two-layer perceptron, sub-options:
-bias [value] set the bias of the network, usually [-1,0,+1] -- default for regression: 0
-nhidd [value] set the number of hidden neurons, default = sqrt[n_inputs*n_outputs]
-lrate [value] set the learning rate (float > 0.0, default = 0.010)
-epochs [value] set the learning epochs (integer > 0, default = 1000)
-mom [value] set the learning momentum (float > 0.0, default = 0.100)
-nodadm do not perform adjacency matrix discretization
-dthreshold [value] set threshold value for adjacency matrix discretization (float >= 0.0, default = 0.5)
-train [value] set train file [.gel]
-seed set the randomizer seed [default:-1, randomize from timer]
-GPUID set the ID of GPU to use [default:0]
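To illustrate what a discretization threshold such as -dthreshold does conceptually, the Python sketch below turns a matrix of real-valued scores into a 0/1 adjacency matrix. It is not regnann's implementation: the function name discretize, the use of absolute values, the ">=" comparison and the toy scores are all assumptions made for this example.

```python
def discretize(scores, threshold=0.5):
    """Return a 0/1 matrix: 1 where |score| >= threshold, 0 elsewhere."""
    if threshold < 0.0:
        raise ValueError("threshold must be >= 0.0")
    return [[1 if abs(value) >= threshold else 0 for value in row] for row in scores]


# Hypothetical continuous scores for a 3-gene network.
scores = [
    [0.00, -0.12, 0.93],
    [-0.12, 0.00, 0.31],
    [0.93, 0.31, 0.00],
]

adjacency = discretize(scores, threshold=0.5)
for row in adjacency:
    print(" ".join(str(value) for value in row))
```

With the default threshold of 0.5 only the strong link between the first and third gene survives in this toy example. Lowering the threshold keeps weaker links as well.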
4.2 Test file (not provided)
./regnann -tlpgr -train test_001.gel
loading file: test_001.gel
Dataset Info ------------------------------------
Number of Genes: 3
gene 1, instances: 110
gene 2, instances: 110
gene 3, instances: 110
------------------------------------ Dataset Info
Correlation Gene 1
TwoLayerPerceptron:
number of inputs: 1
number of hiddens: 1
number of outputs: 2
bias: 0.000000
Training Parameters:
learning rate: 0.010000
momentum: 0.100000
epochs: 1000
Correlation Gene 2
TwoLayerPerceptron:
number of inputs: 1
number of hiddens: 1
number of outputs: 2
bias: 0.000000
Training Parameters:
learning rate: 0.010000
momentum: 0.100000
epochs: 1000
Correlation Gene 3
TwoLayerPerceptron:
number of inputs: 1
number of hiddens: 1
number of outputs: 2
bias: 0.000000
Training Parameters:
learning rate: 0.010000
momentum: 0.100000
epochs: 1000
Results:
Adjacency Matrix ------------------------------------
+0.00 -0.00 +1.00
-0.00 +0.00 +0.00
+1.00 +0.00 +0.00
------------------------------------ Adjacency Matrix
Execution time: 1.574 secs
5.0 REFERENCES
[1] http://www.nvidia.com/object/cuda_home_new.html
[2] Peregrin-Alvarez, J., Xiong, X., Su, C., and Parkinson, J. (2009). The Modular Organization of Protein Interactions in Escherichia coli. PLOS Comput. Biol., 5(10), e1000523.