Design notes (GNU libmatheval manual)

4.1 Design notes

As usual with a free software project, ultimate reference for anyone willing to hack on it is its source code. Every effort is put to have source code properly commented; having in mind that GNU libmatheval is rather simple project, it is reasonable to expect that this would be enough for anyone interested in project internals to get acquainted with it. Still, this section will briefly explain project design. See Project structure section for description of where each functionality is located in source code.

Mathematical functions are represented as trees in computer memory. There are five different nodes in such a tree: number, constants, variables, functions, unary operations and binary operations. Single data structure is employed for tree nodes, while union is used over what is different among them. Numbers have unique value, unary and binary operations have unique pointer(s) to their operand(s) node(s). To represent constants, variables and functions, a symbol table is employed; thus constants, variables and functions have unique pointers to corresponding symbol table records (functions also have unique pointer to their argument node). All operations related to functions (e.g. evaluation or derivative calculation) are implemented as recursive operations on tree nodes. There exist a node operation that is not visible as external procedure and this is node simplification; this operation is very important regarding overall efficiency of other operations and is employed each time when new tree created.

Symbol table is implemented as hash table, where each bucket has linked list of records stored in it. Records store information of symbol name and type (variable or function), as well as some unique information related to evaluation: variable records store temporary variable value and function records store pointer to procedure used to actually calculate function value during evaluation. Hashing function described in A.V. Aho, R. Sethi, J.D. Ullman, “Compilers - Principle, Techniques, and Tools”, Addison-Wesley, 1986, pp 435-437 is used. Symbol tables are reference counted objects, i.e. could be shared.

Evaluator objects actually consists of function tree and reference to symbol table. Most of operations on evaluator objects are simply delegated to function tree root node.

For parsing strings representing mathematical functions, Lex and Yacc are employed. Scanner is creating symbol table records for variables, while for constants and functions it is only looking up existing symbol table records (before starting scanning, symbol table should be populated with records for constants and functions recognized by scanner). Parser is responsible for building function tree representation.

Couple error reporting procedures, as well as replacements for standard memory allocation routines are also present. These are rather standard for all GNU projects, so deserve no further discussion. Further present in project are couple procedures for mathematical functions not implemented by C standard library, like cotangent, inverse cotangent and some hyperbolic and inverse hyperbolic functions.

Also present in project are stubs for Fortran code calling library. These stubs uses knowledge of GNU Fortran 77 compiler calling conventions, take parameters from Fortran 77 calls, eventually mangle them to satisfy primary C library interface and call library procedures to actually do the work, finally eventually mangling return values to satisfy Fortran 77 calling conventions again.

Most important thing to know before criticizing library design is that it is intentionally left as simple as it could be. Decision is now that eventual library usage should direct its improvements. Some obvious and intended improvements if enough interest for library arise are enumerated in Intended improvements section. If having further suggestions, pleas see Bugs sections for contact information.