MathExpression Code

Status: Alpha
Brought to you by: amyznikov
Tree [228d85] master /

History

HTTPS access
File	Date	Author	Commit
examples	2010-03-14	Andrey Myznikov	[228d85] 0.7.3 updates and bug fixes
include	2010-03-14	Andrey Myznikov	[228d85] 0.7.3 updates and bug fixes
msvc	2010-03-14	Andrey Myznikov	[228d85] 0.7.3 updates and bug fixes
Makefile	2009-05-05	A. Myznikov	[c7cd6d] initial commit from hot sources
README	2010-03-14	Andrey Myznikov	[228d85] 0.7.3 updates and bug fixes
make.inc	2009-07-06	Andrey Myznikov	[9faba7] 0.7.1 fixing current state
Read Me

MathExpression-0.7.1 Alpha

Authors     : Andrey Myznikov <andrey.myznikov@gmail.com>
              Sergey Prokopenko <serg.v.prokopenko@gmail.com>
              Elizaveta Evstifeeva <e.evstifeeva@gmail.com>

Last Update : Monday, July 06 2009


MathExpression is the flexible, extensible, fast and easy (I hope:)) to use C++ class
template for creating your own customized math expression parsers for both built-in and
user-defined numerical data types. User-defined functions, functional objects
(functors), operators, named constants, parameter binding, as well as arguments
are supported. The parsed three performs fast and can be used in multi-threaded OpenMP
applications.

This is not an replacement of many other existing calculators which can parse math expressions.
This intended for situations when your project need some additional level of flexibility
for the cost of some acceptable performance penalty. Try this parser if it is your case.

Be aware that this source still under alpha-development stage,
and therefore it can be redesigned frequently. Lookup for
updates at http://sourceforge.net/projects/mathexpression



INSTALL

To install just copy headers ./include/{*.h} into your favorite include
directory where your compiler will able to find them.

The examples were compiled at these platforms:

- SUSE Linux Enterprise Server 10 (ia64) on Itanium2
	ICC 10.0, GCC 4.1.2

- SUSE Linux Enterprise Server 10 (x86_64) on Intel(R) Xeon(R) CPU X5355
	ICC 10.0, GCC 4.1.2

- Red Hat Enterprise Linux Server release 5 (Tikanga) on Intel(R) Xeon(R) CPU 5160
	ICC 10.0, GCC 4.1.1

- Ubuntu 9.04 (jaunty) desktop on both ia32 and x86_64
	ICC 11.0, GCC 4.3.3

To build examples as is, the GNU make and GCC or ICC compiler are required.
Type 'make help' for the hints on configurable parameters.
The structure of examples is very simple, therefore I suggest that you can
easily modify make.inc and makefiles in the ./example/ subdirs as you prefer.

Notes for Windows: Elizaveta Evstifeeva made some modifications to allow compile
these sources using Visual Studio, but we still don't support this system. It may be
that this will compilable and even would work, but this still not supported.

Other note is that the math library implementation can vary between platforms,
therefore if you encounter that some functions referenced in MathExpression.h
are not defined, then just comment them out or provide correct equivalents
for your platform. Note here that the MathExpression.h is just an example,
is used for examples, and is not mandatory part of the library itself.

For GSL (http://www.gnu.org/software/gsl) examples you need to have GSL
library properly installed. On the Ubuntu you can install it typing
'sudo aptitude install libgsl0-dev'.

For the CLN (http://www.ginac.de/CLN) example you need the CLN library installed.
On Ubuntu install 'libcln-dev' package and dependencies.

The GNU levmar example includes the sources of levmar-2.4 obtained from
official web site http://www.ics.forth.gr/~lourakis/levmar. Most likely
you will need to configure the makefiles in the ./examples/gnu_levmar/levmar-2.4
directory to provide correct path to your favorite lapack implementation.
See the gnu levmar documentation for more info.


USE

The MathExpression is not a program and is not a library.
This is just a set of C++ template classes allowing you create your own
programs and libraries. These class templates are declared in MathExpressionBase.h.
The MathExpression.h is not mandatory header, it just contains a few examples of
user-defined parsers, although you can use it as is too.

The basing steps to create your own customized math parser are roughly follows:

1) Select the numerical data type of math expression. This type can be thought
as the 'result type' of math expression, and it will main type used during computation.
The examples here will suppose that you selected built-in 'double', although it can
seems trivial, the potential applications range is wide enough.

2) Derive your own class for MathExpression :

#include "MathExpressionBase.h"

class MathFunction
  : public MathExpressionBase<double>
  {
  typedef MathExpressionBase<double> mybase;
public:
  typedef double value_type;
  MathFunction();

protected:
  bool parse_number( double * value, const char * curpos, char ** endptr );
  };



3) Implement pure virtual routine parse_number() to convert string to the number.
  Obviously this can't be done at base class level, therefore you must made this.

bool MathFunction :: parse_number( double * value, const char * curpos, char ** endptr )
  { * value = strtod(curpos,endptr);
    return *endptr>curpos;
  }


4) Customize your parser providing your own:
  - binary and unary operators
  - functions
  - bound parameters
  - named constants
  - named arguments

You may made this at any time and at any point of code; below is an example of
doing this in the constructor body.

Note, that for most of functions there are 2 ways to add it to the MathExpression function list:

1) Adding the reference (pointer) to your function into the function list, like this:
     add_function(&sin,"sin","the sine function");

   At execution time the function will be called by reference;
   The tree node will store pointer to such function, and at compile time
   the compiler will not know to where this pointer points to.
   Thus, compiler can't made much optimizations in this case.

2) Using template instantiation notation:
     add_function<sin>("sin","the sine function");

   This will create special tree node which calls function directly, avoiding dereferencing.
   Compiler will known the address of the function at compile time. This allows to get notable
   performance improvenment when right compiler (and options) are used. For the intel compiler
   use -ip or even -ipo switches to get better performance.

   Note that in Visual Studio we encountered problems with this syntax. As windows is not supported platform,
   we did not much fixes, but it looks like you can solve most of these problems by providing strict function
   signatures as follows:
     add_function<double(*)(double),sin>("sin","the sine function");

MathFunction :: MathFunction()
    {
    // form the binary operations table
    BinaryOpTable.set_num_levels(3);
    BinaryOpTable[0]->add(std::plus<value_type>(),"+");
    BinaryOpTable[0]->add(std::minus<value_type>(),"-");
    BinaryOpTable[1]->add(std::multiplies<value_type>(),"*");
    BinaryOpTable[1]->add(std::divides<value_type>(),"/");
    BinaryOpTable[2]->add(&pow,"**");

    // add unary + and -
    UnaryOpTable.add(&mybase::operator_unary_minus,"-");
    UnaryOpTable.add(&mybase::operator_unary_plus,"+");

    // add math functions

    // 1. prividing pointers:
    add_function(&fabs,"abs");
    add_function(&sin,"sin");

    // 2. providing function names (prefered way):
    add_function<cos>("cos");
    add_function<tan>("tan");

    // add named constants
    add_named_constant(M_PI,"pi");
    add_named_constant(M_E,"e");
    add_named_constant(DBL_EPSILON,"eps");

    // add bound parameters
    bind(&some_global_variable,"myvar");

    // add arguments passed over the stack,
    // providing the name and zero-based positional index of each
    add_argument(0,"x");
    add_argument(1,"y");
    }

Again, this customization can be done at run-time at any place of code.

5) Having the MathFunction class defined, one can create as many instances as he want:

  MathFunction f;
  if( !f.parse("sin(x)*cos(y)" ) )
    { printf("parse error: %s : %s\n",f.get_error_msg(),f.get_error_pos());
    }
  else
    {
    for( double x=xmin; x<=xmax; x+=dx )
    for( double y=ymin; y<=ymax; y+=dy )
      { double args[] = {x,y};
        double z = f.eval(args);
        printf("%g\t%g\t%g\n",x,y,z);
      }
    }

6) For many cases it can be suitable to add operator () to the MathFunction class.
   This will allow to pass such functor object to anywhere a function is required,
   in particular to call one math expression from another. Note that you need provide
   correct signature for the your functor object as your class could have many different operators ().

  double MathFunction :: operator () (double x, double y)
    { double args[] = {x,y};
      return mybase::eval(args);
    }
  ...
  MathFunction f1;
  f1.parse("if(x==0,1,sin(x)/x)");

  MathFunction f2;
  f2.add_function<double(*)(double,double)>(&f1,"sinc");
  f2.parse("sinc(x)*sinc(y)");

  for( double x=xmin; x<=xmax; x+=dx )
  for( double y=ymin; y<=ymax; y+=dy )
    {
    z = f2(x,y);
    ...
    }

7) Other way to call one MathExpression from another is use of add_mathexpression() functions:

  MathExpression<double> f1;
  f1.add_argument(0,"x");
  f1.parse("exp(-x*x)");

  MathExpression<double> f2;
  f2.add_argument(0,"a");
  f2.add_argument(1,"b");
  f2.add_mathexpression(&f1,"f1");
  f2.parse(" f1( a ) * f1 ( b ) ");

  double args[] = {1,2};
  double z = f2.eval(args);

 Note that you not need to provide signature and number of arguments when passing MathExpression object;
 the number of arguments is queried at parse time.

See the ./examples/user-defined-functions for example of using functors.

8) Besides functions with fixed number of arguments you can create variable-argument-list functions.
Such functions should accept array of arguments and have signature as follows:
	double sum( size_t n, const double args[] )
          { return std:: accumulate(args+1,args+n,args[0]);
          }
You need provide the minimal and maximal number of arguments for parser when use add_function.
	add_function<sum>("sum", "the sum of arguments",1,-1);
The max number of arguments equal to -1 means thant fuction can accept unlimited (INT_MAX) number of arguments.

See the implementation of rand() and noise() functions in the ./include/MathExpression.h example header.


USER-DEFINED NUMERICAL TYPES

The MathExpressionBase<> can be parametrized by any data type, but you will need provide
the functions for the math operators to process them. In particular you can create some kind
of 'variant' data type to support run-time checks and conversions.
The performance was one from my main goals when I created the project, therefore I avoid
all run-time checks. If the performance is not your case, then you can use any suitable
data type and any run-time checks and conversions.


FURTHER DETAILS
I have not prepared an valuable documentation now, therefore look please over the examples
provided, as well as the MathExpressionBase.h and MathExpression.h sources itself for more
details. I hope that it should be clear enough how it is organized.


KNOW PROBLEMS
There are know issues, which can lead to some difficulties using the class:

1) Troubles with mixing different numerical types in the MathExpression<>. Ideally, I would
  like to have possibility to safely mix the functions of real, complex, integer, boolean
  and user-defined numerical arguments, but it is not possible with existing implementation.
  Currently all nodes should return the value of the same data type.
  Of course, this does not mean that user-defined functions should accept and return the same type,
  but the conversions could be non-trivial.

  Suppose for example that I would like to create an ComplexMathExpression class, and provide function
  'polar' returning complex number:
    complex polar( double r, double fi );
  This will not compile because there is no straight conversion from 'complex' to 'double'.
  (If this thesis is not clear, then image that you trying to parse something like "polar( sqrt(-1), sqrt(-1) )" )

  The possible workaround will require intensive run-time checks and throwng of exceptions to work properly.
  As an example of my temporary (unsatisfactory) solution see the std_complex example.

  As an better (but probably slower from performance side) solution it can be use of some
  'variant' data type for such conversions and checks. The performance was one from main goals
  when I created the project, therefore I have avoid all run-time checks.


3) C++ function overloading
   You may encounter problems using overloaded functions in C++.

   Assume for an example that you have defined 2 functions with the same name:
      const cln_N conjugate ( const cl_N& x );
      const cln_R conjugate ( const cl_R& x );

   Then, the try to add the pointer to function will cause compile-time error,
   because of compiler can't select the correct version byself:
      add_function(&conjugate,"conj"); // ERROR HERE

   To workaround the problem you need point compiler to which exactly function you mean:
      add_function((const cln_N (*) ( const cl_N& )) &conjugate, "conj");
   or:
      add_function((const cln_R (*) ( const cl_R& )) &conjugate, "conj");

   As alternative, declare pointer to the function you have in mind, and add it then:
      const cln_N (*conj_func)( const cl_N& ) = conjugate; // That's fine
      add_function(conj_func,"conj");


  Of course you can explicitly provide correct signature also in such way:
    add_fnction<const cln_R (*) ( const cl_R& ),conjugate>("conj");

To reduce the number of such type castings, the MathExpressionBase::add_function() has a lots of overloads,
but if you encounter some case which is not handled by these overloads, then direct compiler manually.




***

Performance for OMP_NUM_THREADS=1

_______________________________________________________________
host     : iceland
machine  : Intel(R) Xeon(R) CPU           X5355  @ 2.66GHz
OS       : SUSE Linux Enterprise Server 10 (x86_64)
compiler : icc (ICC) 10.0 20070613

iceland examples/performance> ./performance-test n=100000000
-------------------------------------------------------------
TEST  1 : f(x) = log(x) / sqrt(x)
PARSED TREE : I1 = 0.3036620374447045	T1 = 5.96 sec
NATIVE CODE : I2 = 0.3036620374447045	T2 = 5.32 sec
RATIO (PARSED/NATVE) : 1.1203
-------------------------------------------------------------
TEST  2 : f(x) = sin(x) / cos(x)
PARSED TREE : I1 = 5.2290357946367072	T1 = 6.65 sec
NATIVE CODE : I2 = 5.2290357946367072	T2 = 4.28 sec
RATIO (PARSED/NATVE) : 1.55374
-------------------------------------------------------------
TEST  3 : f(x) = 2+3 * sin(4*x+5)
PARSED TREE : I1 = 0.6360672174988302	T1 = 5.74 sec
NATIVE CODE : I2 = 0.6360672174988302	T2 = 3.42 sec
RATIO (PARSED/NATVE) : 1.67836
-------------------------------------------------------------
TEST  4 : f(x) = erf(x*x/2)*sin(10*x)
PARSED TREE : I1 = -0.0791109330199248	T1 = 12.74 sec
NATIVE CODE : I2 = -0.0791109330199248	T2 = 10.22 sec
RATIO (PARSED/NATVE) : 1.24658
-------------------------------------------------------------
TEST  5 : f(x) = 2*exp(-x*x/2)+3*exp(-(x-1)*(x-1)/4)
PARSED TREE : I1 = 3.4490132813373204	T1 = 15.94 sec
NATIVE CODE : I2 = 3.4490132813373204	T2 = 5.22 sec
RATIO (PARSED/NATVE) : 3.05364



_______________________________________________________________
host     : iceland
machine  : Intel(R) Xeon(R) CPU           X5355  @ 2.66GHz
OS       : SUSE Linux Enterprise Server 10 (x86_64)
compiler : gcc (GCC) 4.1.2 20070115 (prerelease) (SUSE Linux)

iceland examples/performance> ./performance-test n=100000000
-------------------------------------------------------------
TEST  1 : f(x) = log(x) / sqrt(x)
PARSED TREE : I1 = 0.3036620374447045	T1 = 7.33 sec
NATIVE CODE : I2 = 0.3036620374447045	T2 = 8.76 sec
RATIO (PARSED/NATVE) : 0.836758 (hm... what this could mean? Is the time measured correctly?)
-------------------------------------------------------------
TEST  2 : f(x) = sin(x) / cos(x)
PARSED TREE : I1 = 5.2290357946367072	T1 = 8.84 sec
NATIVE CODE : I2 = 5.2290357946367072	T2 = 8.25 sec
RATIO (PARSED/NATVE) : 1.07152
-------------------------------------------------------------
TEST  3 : f(x) = 2+3 * sin(4*x+5)
PARSED TREE : I1 = 0.6360672174988302	T1 = 9.77 sec
NATIVE CODE : I2 = 0.6360672174988302	T2 = 8.12 sec
RATIO (PARSED/NATVE) : 1.2032
-------------------------------------------------------------
TEST  4 : f(x) = erf(x*x/2)*sin(10*x)
PARSED TREE : I1 = -0.0791109330199248	T1 = 16.22 sec
NATIVE CODE : I2 = -0.0791109330199248	T2 = 13.19 sec
RATIO (PARSED/NATVE) : 1.22972
-------------------------------------------------------------
TEST  5 : f(x) = 2*exp(-x*x/2)+3*exp(-(x-1)*(x-1)/4)
PARSED TREE : I1 = 3.4490132813373204	T1 = 13.59 sec
NATIVE CODE : I2 = 3.4490132813373204	T2 = 7.58 sec
RATIO (PARSED/NATVE) : 1.79288
MathExpression Code

Branches

Tree [228d85] master / Download Snapshot History

Read Me

Tree [228d85] master /

History