Read Me

drop in the bucket neural network
Because what the world needs is one more lightweight neural network library
in C.
FEATURES
- written in straight C (not C++!)
- "lightweight" (~6000 lines of code!)
- piggybacks off the GNU Scientific Library (GSL) for minimization routines
- two types of neural network: node-by-node and term-by-term
INSTALLATION
- Make sure you have the GSL installed. See below for other dependencies.
- Download the library
- Edit the Makefile, changing path names, compilers and options to match
your system
- Type "make", then "make install"
- To test the system, go to the 'test' directory and use the makefile provided
For the image-handling routines, you will need ImageMagick installed and in
your path, as well as access to 'popen' and 'pclose' in the C I/O library.
'popen' is part of the POSIX standard.
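As a rough sketch of the mechanism (not the library's actual image code),
piping an image's pixels through ImageMagick's 'convert' with 'popen' might
look like the following; the command line and the raw 8-bit grey output
format are assumptions:

  #define _POSIX_C_SOURCE 200112L  /* for popen/pclose declarations */
  #include <stdio.h>

  /* Illustrative only: read an image's pixels as 8-bit grey values by
   * piping ImageMagick's 'convert' through popen(). The library's own
   * image routines may build a different command. */
  int read_grey_pixels(const char *fname, unsigned char *buf, size_t n) {
    char cmd[512];
    FILE *fs;
    size_t nread;

    snprintf(cmd, sizeof(cmd), "convert %s -depth 8 gray:-", fname);
    fs = popen(cmd, "r");          /* POSIX: spawn 'convert', read its stdout */
    if (fs == NULL) return -1;
    nread = fread(buf, 1, n, fs);  /* raw grey-scale pixel values */
    pclose(fs);
    return (int) nread;
  }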
USAGE
There are two main programs: 'dib_train' for training a neural network, and
'dib_predict' for making predictions from a pre-trained neural network. To get
the command syntax and a list of options, type the command name with the '-H'
switch. Each data file comprises a simple, binary dump of the data in row-major
order with a four-byte header giving the number of columns. There are several
file conversion routines supplied with the library, but for a more complete set
of routines, please try the 'libmsci' library on Github.
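To make the file format concrete, a routine that writes such a file might
look like the following sketch; the function name is ours, and four-byte
floats are assumed for coordinate data (class labels use four-byte integers):

  #include <stdio.h>
  #include <stdint.h>

  /* Sketch: write an n x m matrix in the data format described above:
   * a four-byte header holding the number of columns, followed by the
   * data in row-major order. */
  int write_dib_matrix(const char *fname, const float *data,
                       int32_t n, int32_t m) {
    FILE *fs = fopen(fname, "wb");
    if (fs == NULL) return -1;
    fwrite(&m, sizeof(int32_t), 1, fs);             /* header: no. of columns */
    fwrite(data, sizeof(float), (size_t) n*m, fs);  /* row-major payload */
    fclose(fs);
    return 0;
  }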
The three most important switches are:

  -c   Cost function. Measures the "distance" between the modelled outputs
       and the training data ordinates. Not used with least-squares methods.
  -m   Minimization algorithm: minimizes the above function.
  -n   Type and topology of the neural network or machine learning model.
       Currently there are two main types: node-by-node ('ff_gen' prefix)
       and term-by-term ('ff_sl' prefix). Only node-by-node supports
       convolutional networks.
Each cost function, minimization method, and machine learning model will
have parameters of its own that must be set.
  Switch  Option(s)                       Associated switches

  -c      lsq                             -C
  -m      anneal                          -f, -i, -N, -s, -t
  -m      gsl:nmsimplex*                  -e, -I, -N, -t
  -m      gsl:trs*                        -D, -L, -M, -N, -R, -S, -t, -T, -U
  -m      gsl:conjugate*                  -l, -N, -t
  -n      ff_gen:perceptron               -b, -p
  -n      ff_gen:1hidden, ff_sl:1hidden   -b, -h, -p, -O

For example, selecting '-m anneal' activates the '-f', '-i', '-N', '-s' and
'-t' switches; use the '-H' switch for the meaning of each.
Statistical Classification
Using the library to perform statistical classification is a bit more
complicated than simply training the NN with the training routine, then making
predictions with the prediction routine. Some pre- and post-processing is
necessary. Assume the class data is stored as a simple binary dump of
four-byte integers, 'classes.dat', and that there are eight (8) classes. First
we need to decide on a multiclass coding, then convert the integer data to
floating point. The 'dib_print_ecc' command outputs one of five available
coding matrices:
> dib_print_ecc -Q 2 8 > coding_matrix.txt
(If five are not enough to choose from, I would once again suggest heading to
the 'libmsci' project.) Next, we use the result to code the multiclass labels
into binary classes:
> cls2vec -M coding_matrix.txt classes.dat classes.vec
Now, assuming we have some matching coordinate data, call it 'coord.vec',
we can train the model:
> dib_train -C 1 -m gsl:trs_lm -n ff_gen:1hidden coord.vec classes.vec model.txt
If we have some new coordinate data, call it 'test.vec', we can start
making predictions:
> dib_predict test.vec model.txt output.vec
Before we can use this, we need to reverse the coding:
> multiclass_solver output.vec coding_matrix.txt prob.vec
But we still aren't done: the result, 'prob.vec', contains only
probabilities, not classes. To convert these probabilities to classes:
> dib_classify prob.vec result
where 'result.cls' contains the winning classes while 'result.con' contains
the winning probabilities normalized to lie between 0 and 1.
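As a quick sanity check, the winning classes can be read back in a few lines
of C. This sketch assumes 'result.cls' follows the same layout as the other
data files, a four-byte header giving the number of columns followed by
four-byte integers; verify this against your own output:

  #include <stdio.h>
  #include <stdint.h>

  /* Sketch: print the winning classes from 'result.cls'. */
  int main(void) {
    FILE *fs = fopen("result.cls", "rb");
    int32_t ncol, cls;

    if (fs == NULL) return 1;
    if (fread(&ncol, sizeof(int32_t), 1, fs) != 1) return 1;  /* header */
    while (fread(&cls, sizeof(int32_t), 1, fs) == 1) printf("%d\n", cls);
    fclose(fs);
    return 0;
  }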
Designing Your Own Neural Networks
While there are few tools specifically for designing custom topologies (for
the beginnings of such an endeavour, see the header file 'dib_ff_tools.h'),
the implementation of the neural networks is fully general and can handle any
desired topology. The networks are stored in simple ASCII files and
'dib_train' can take these files as input using the '-a' option.
There are two implementations: node-by-node (see 'dib_ff.h') and term-by-term
(see 'dib_ff_sl.h'). The node-by-node implementation can handle any type of
feed-forward NN, including convolutional neural networks. The files have the
following format:
- a one line header containing the type number of the NN (1)
- a one line header containing: number of inputs, number of outputs,
number of nodes
- a one line header containing: maximum number of derivatives required for
storage (set this to the total number of nodes), value for constant term
- node-by-node listing with two lines for each node:
1. first line contains the number of terms followed by a colon then the
index of each parent node*
2. second line contains the name of the activation function followed by
the index of each corresponding coefficient
- trained NNs are rounded out by a list of trained coefficients
* Node indexing starts with the input vector. The constant value, if present,
is the last element in the input vector.
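To make the layout concrete, here is a purely hypothetical file for a
single-node network with two inputs plus a constant term (so three parent
indices: 0 and 1 for the inputs, 2 for the constant). The activation name
'logistic', the zero-based indexing and the coefficient values are all
illustrative guesses; check the header files above for the library's actual
conventions:

  1
  2 1 1
  1 1.
  3: 0 1 2
  logistic 0 1 2
  0.5 -0.3 0.1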
The term-by-term implementation can handle any type of feed-forward NN except
convolutional, since there is a one-to-one correspondence between each term and
each coefficient. It is, however, slightly more efficient. Files have the
following format:
- a one line header containing the type number of the NN (4)
- a one line header containing: number of inputs, number of outputs,
number of nodes
- a one line header containing the scaling value for the constant term
- a one line header containing the number of coefficients/terms
- term-by-term listing with one line for each term and three records per line:
1. index of node*
2. index of parent node
3. name of activation function if it is the first term in the node,
otherwise it is empty
- trained NNs are rounded out by a list of trained coefficients
* Node indexing starts with the first node. Input vector has negative indices.
The constant value, if present, is the last element in the input vector.
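Again as a purely hypothetical sketch, the same single-node network in
term-by-term form: with node indexing starting at the first node, the two
inputs might sit at indices -3 and -2 and the constant at -1, with the single
node at index 0. The activation name, the indices and all values are
illustrative guesses:

  4
  2 1 1
  1.
  3
  0 -3 logistic
  0 -2
  0 -1
  0.5 -0.3 0.1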