Home .NET Parser of C# mathematical expressions – an amateur’s experience

Parser of C# mathematical expressions – an amateur’s experience

by admin

Computer and human – how difficult it is for us to understand each other. Basically, the programming process is explaining to the machine what you want from it in a language it understands.

As an introduction

In my job, and as a hobby, I am associated with the process of writing code related to mathematical calculations. One of the last tasks was to write software in which the user would be able to enter and use in the calculation, data visualization and optimization of some mathematical expressions by himself. And given my natural laziness and unwillingness to constantly supplement the code library of special mathematical functions came up with the idea – why not implement a delusional student idea, and invent a bicycle parser of mathematical expressions.
Of course, before taking up the inventive process (again, in view of universal laziness), was quite a long rape Yandex and Google for existing implementations. And they found, of course, not a few. But unfortunately we could not find what we wanted to achieve in a particular implementation. And the search criteria were the following:

  • The parser must be implemented under .NET no older than 4.0;
  • It must handle all basic mathematical operators (+, -, *, /, ^, etc.), taking into account their priorities and brackets of different kinds;
  • It should recognize basic functions (like sin, cos), be able to add its own functions to the created parser object, specifying the delegates of methods that calculate their value (for any number of input variables);
  • It must be possible to use constants known to the parser, and add them to the list used in parsing the expression;
  • There must be a mechanism for working with parameters and variables. Variables of different types are required: variables that just store a numeric value, or variables that call an event in which external code defines their value at the time they are called;
  • The functionals mechanism must be implemented (minimum – integration, sum of series, and differentiation)
  • The result of parsing a string expression must be represented in the binary tree object model;
  • Most importantly, the binary tree must be able to map to a Linq.Expression tree and then compile it into a delegate that performs the computation at the speed of the .NET platform itself.

The purpose of bicycle building

Actually, the main purpose of the invention of the bicycle was to be able to compile a matrix expression string into a delegate with a high speed of execution of the value calculation process.
Unfortunately my meager search engine skills, lack of time, energy, and laziness did not yield a positive result in the search for a prototype and decided to embark on a journey on the rake.

Model and basic idea

The object model is two classes : a parser class and a mathematical expression class. Three additional classes are used internally: the tree class of the mathematical expression, the abstract node class of the mathematical expression tree, and the logical element class of the mathematical expression string – the term. In addition, the classes of function, variable, and functional and, respectively, collections of functions, variables, and functionals are implemented (they are nested in the matrix expression class).
The idea, which did not pretend to be a discovery or optimal solution, but was an attempt to approach the problem, was to break down the initial sequence of symbols of the mathematical expression string into some logical components. To form a hierarchical sequence of typed logical blocks of a mathematical expression. And on its basis build the tree of the mathematical expression.
The design began with an idea – "what would I want it to look like in its finished form, and that it would be easy and convenient to use?" I wanted to implement the following usage scenario :
A parser object is created, which is fixed somewhere in the view model or in the business logic. It is configured by adding the necessary constants and defining the functions it should work with. Event subscribers are added to handle unknown functions and variables. And the parser waits for the call of the Parce(string) method.
The main Parce() method takes as input a string of matrix expressions, and the result is a matrix expression object that contains the matrix expression tree.
The matrix object must have collections of functions, variables, and constants that are in it. It should be possible to change the values of these objects, and it should be possible to influence the expression tree in order to modify it. The matrix expression object must have a method for calculating a value (by traversing the matrix tree) that takes a set of input variables as parameters and outputs a numeric value as a result.
The mat.expression object must have a method to convert the mat.expression tree to a System.Linq.Expression object. And a method that allows to get a compiled delegate immediately based on using Linq.Expression mechanisms.
Unfortunately, there are no ready-made implementations of something like this, nor any methods describing the creation of such a parser, to this or that extent, anywhere described.

Object model description

Parser Class Parser of C# mathematical expressions - an amateur's experience
Life in an object (after creation) starts with a call to the Parse method.
public MathExpression Parse(string StrExpression)

/// <summary>Parse the string of the mathematical expression</summary>///<param name="StrExpression"> String representation of mathematical expression</param>///<returns> Mathematical expression</returns>[NotNull]public MathExpression Parse([NotNull] string StrExpression){Contract.Requires(!string.IsNullOrWhiteSpace(StrExpression));Contract.Ensures(Contract.Result<MathExpression> () != null);StrPreprocessing(ref StrExpression);OnStringPreprocessing(ref StrExpression);var expression = new MathExpression(StrExpression, this);ProcessVariables(expression);ProcessFunctions(expression);return expression;}

omitting the contracts, the sense of its work is reduced to preprocessing the string, calling the matrix expression constructor and postprocessing this expression to analyze its variables and functions.
Preprocessing involves two steps :
Private method StrPreprocessing, which removes unnecessary characters from the string :
protected void StrPreprocessing(ref string Str)

/// <summary> Preprocessing of input string expression</summary>/// <param name="Str"> Processed string</param>//Removes all characters from the string, from the set of forbidden charactersprotected virtual void StrPreprocessing([NotNull] ref string Str){Contract.Requires(!string.IsNullOrEmpty(Str));Contract.Ensures(!string.IsNullOrEmpty(Contract.ValueAtReturn(out Str)));Str = new string(Str.Where(f_ExcludeCharsSet.NotContains).ToArray());}

and a method to generate a string preprocessing event, so that the parser user can prepare the string for analysis by himself :
public event EventHandler<EventArgs<string> > StringPreprocessing

///<summary> Preprocessing event of the incoming string</summary>public event EventHandler<EventArgs<string> >StringPreprocessing;/// <summary> Generating the event preprocessing the incoming string</summary>/// <param name="args"> Event argument containing the string to be processed</param>protected virtual void OnStringPreprocessing([NotNull] EventArgs<string> args){Contract.Requires(args != null);Contract.Requires(args.Argument != null);Contract.Requires(args.Argument != string.Empty);Contract.Ensures(args.Argument != null);Contract.Ensures(args.Argument != string.Empty);StringPreprocessing?.Invoke(this, args);}/// <summary> Generating the preprocessing event of the incoming string</summary>/// <param name="StrExpression"> Processed string</param>private void OnStringPreprocessing([NotNull] ref string StrExpression){Contract.Requires(!string.IsNullOrEmpty(StrExpression));Contract.Ensures(Contract.ValueAtReturn(out StrExpression) != null);Contract.Ensures(Contract.ValueAtReturn(out StrExpression) != string.Empty);var args = new EventArgs<string> (StrExpression);OnStringPreprocessing(args);StrExpression = args.Argument;}

After the string is cleaned of garbage and prepared for parsing, it is passed to the matrix expression constructor. The parser itself, which is responsible for the processes of defining functions, constants and variables, is also passed to it.
We will return to the Parser class as we mention its members, but for now… The Mathematical Expression class :
Chart Parser of C# mathematical expressions - an amateur's experience
Constructor, where the call from the Parse method is passed to:
internal MathExpression(string StrExpression, ExpressionParser Parser)

///<summary> Initializing a new mathematical expression</summary>///<param name="StrExpression"> String representation of the expression</param>/// <param name="Parser"> Link to the parser</param>internal MathExpression([NotNull] string StrExpression, [NotNull] ExpressionParser Parser): this(){Contract.Requires(!string.IsNullOrEmpty(StrExpression));Contract.Requires(Parser != null);Contract.Ensures(Tree != null);var terms = new BlockTerm(StrExpression); // split the string into elementsvar root = terms.GetSubTree(Parser, this); // select the root of the tree from the first elementf_ExpressionTree = new ExpressionTree(root); // Create an expression tree from the root}

Again, omitting the block of contracts, first a hierarchical object structure of matrix expression terms is created based on the string. Then, from its very first block, the method for getting the root of the matrix tree is called. And on its basis the tree constructor works.
The first step in parsing a matrix expression is to combine the string representation (a sequence of symbols) into a sequence of logical blocks. Some of them can be recursively nested into each other.
Hierarchy of expression term classes Parser of C# mathematical expressions - an amateur's experience
The string is divided into 4 types of terms :

  • BlockTerm – a term containing an array of terms;
  • StringTerm – term containing a string value
  • CharTerm – a term containing a single character that cannot be associated with regular strings;
  • NumberTerm is a term containing an integer number.

Thus, the whole string is initially represented as one block term, inside which lie the constituent elements.
public BlockTerm(string Str)

/// <summary> New mathematical expression block</summary>/// <param name="Str"> String value of the block</param>public BlockTerm(string Str) : this("", Str, "") { }/// <summary> New expression block</summary>/// <param name="OpenBracket"> Opening bracket</param>/// <param name="Str"> String block value</param>/// <param name="CloseBracket"> Closing bracket</param>;public BlockTerm([NotNull] string OpenBracket, [NotNull] string Str, [NotNull] string CloseBracket): base(string.Format("{0}{2}{1}", OpenBracket ? "", CloseBracket ?? "", Str)){Contract.Requires(!string.IsNullOrEmpty(Str));f_OpenBracket = OpenBracket;f_CloseBracket = CloseBracket;f_Terms = GetTerms(Str);}

And the base class of the term :
abstract class Term {…}

/// <summary>Mathematical expression element</summary>abstract class Term{///<summary> String content</summary>protected string f_Value;/// <summary> Constructor element of a mathematical expression</summary>///<param name="Value"> String content</param>protected Term(string Value) { f_Value = Value; }///<summary> Method for extracting a subtree for a given element of a mathematical expression</summary>/// <param name="Parser"> Mathematical expression parser</param>/// <param name="Expression"> Mathematical expression</param>;///<returns> The math expression tree node, which is a subtree for the given element</returns>;[NotNull].public abstract ExpressionTreeNode GetSubTree([NotNull] ExpressionParser Parser, [NotNull] MathExpression Expression);/// <summary> String representation of a math expression element</summary>/// <returns> The string content of the matrix expression element</returns>public override string ToString() => f_Value;}

Substring partitioning into constituent elements is done by GetTerms method:
private static Term[] GetTerms(string Str)

/// <summary> Get list of elements of mathematical expression from string</summary>/// <param name="Str"> String representation of a mathematical expression</param>///<returns> An array of elements of a mathematical expression</returns>[CanBeNull]private static Term[] GetTerms([CanBeNull] string Str){if(Str == null) return null;if(Str.Length == 0) return new Term[0];var pos = 0;var len = Str.Length;var result = new List<Term> ();while(pos < len){var c = Str[pos];if(char.IsLetter(c) || c == '∫'){Term value = new StringTerm(GetNameString(Str, ref pos));if(pos < len)switch(Str[pos]){case '(':{var blokStr = Str.GetBracketText(ref pos);var block = new BlockTerm("(", blokStr, ")");value = new FunctionTerm((StringTerm)value, block);}break;case '[':{var blokStr = Str.GetBracketText(ref pos, "[", "]");var block = new BlockTerm("[", blokStr, "]");value = new FunctionTerm((StringTerm)value, block);}break;case '{':{var blokStr = Str.GetBracketText(ref pos, "{", "}");var block = new BlockTerm("{", blokStr, "}");value = new FunctionTerm((StringTerm)value, block);}break;}if(pos < len Str[pos] == '{')value = new FunctionalTerm((FunctionTerm)value, new BlockTerm("{", Str.GetBracketText(ref pos, "{", "}"), "}"));result.Add(value);}else if(char.IsDigit(c))result.Add(new NumberTerm(GetNumberString(Str, ref pos)));elseswitch(c){case '(':{var blokStr = Str.GetBracketText(ref pos);var block = new BlockTerm("(", blokStr, ")");result.Add(block);}break;case '[':{var blokStr = Str.GetBracketText(ref pos, "[", "]");var block = new BlockTerm("[", blokStr, "]");result.Add(block);}break;case '{':{var blokStr = Str.GetBracketText(ref pos, "{", "}");var block = new BlockTerm("{", blokStr, "}");result.Add(block);}break;default:result.Add(new CharTerm(Str[pos++]));break;}}return result.ToArray();}

The method starts with checks for empty input string and zero length. Then the current position of the character in the string and its length are fixed, and then in the loop until the end of the string is reached, the character in the current position is considered:
– If it is a letter, or an integral character, it tries to grab the name with the GetNameString method.
private static string GetNameString(string Str, ref int pos)

/// <summary> Get the name from the string</summary>///<param name="Str"> Source string</param>/// <param name="pos"> Position in the string</param>///<returns> Name string</returns>private static string GetNameString([NotNull] string Str, ref int pos){Contract.Requires(!string.IsNullOrEmpty(Str));Contract.Ensures(Contract.ValueAtReturn(out pos) > = 0);Contract.Ensures(Contract.ValueAtReturn(out pos) < Str.Length);var result = "";var L = Str.Length;var i = pos;while(i < L (char.IsLetter(Str[i]) || Str[i] == '∫'))result += Str[i++];if(i == L || !char.IsDigit(Str[i])){pos = i;return result;}while(i < L char.IsDigit(Str[i]))result += Str[i++];pos += result.Length;return result;}

After that, the current character is checked to see if it contains an opening parenthesis. If one of the parentheses is found, then a nested block, bounded by an opening parenthesis and its corresponding closing parenthesis, is extracted from the string. The block term created in this way is placed in the constructor of the functional term with the function name set earlier.
A substring bounded by opening and closing parentheses is selected from the string by the expanding method :
public static string GetBracketText(this string Str, ref int Offset, string Open, string Close)

/// <summary> /// Highlight a substring bounded by a start pattern and an end pattern of the string starting at a specified offset/// </summary>/// <param name="Str"> Input string</param>/// <param name="Offset">/// The offset in the input string at the beginning of the search - at the end of the method corresponds to the place where the search ends/// </param>/// <param name="Open"> Pattern of substring start</param>/// <param name="Close"> Substring ending pattern</param>;/// <returns> A substring enclosed between the specified start and end patterns</returns>;/// <exception cref="FormatException">/// If the line termination pattern is not found, or if the number of line end patterns exceeds/// the number of ending patterns in the input string/// </exception>public static string GetBracketText(this string Str, ref int Offset, string Open = "(", string Close = ")"){var Start = Str.IndexOf(Open, Offset, StringComparison.Ordinal);if(Start == -1) return null;var Stop = Str.IndexOf(Close, Start + 1, StringComparison.Ordinal);if(Stop == -1)throw new FormatException();var start = Start;do{start = Str.IndexOf(Open, start + 1, StringComparison.Ordinal);if(start != -1 start < Stop)Stop = Str.IndexOf(Close, Stop + 1, StringComparison.Ordinal);} while(start != -1 start < Stop);if(Stop == -1 || Stop < Start)throw new FormatException();Offset = Stop + Close.Length;Start += Open.Length;return Str.Substring(Start, Stop - Start);}

First, the indexes of the first occurrences of the beginning and end characters are determined. If the start character is not found, we return empty. If the end character is not found, then it is a format error.
The idea of the method is to sequentially search cyclically for patterns of start and end of a string. We try to find the next opening character. If it is found, and it stands before the closing character, we should update the index of the closing character. The cycle continues until there is a closing character, which is not preceded by an opening one.
A substring between the opening and closing characters gets into the result of the method.
If an opening curly bracket is found after the generated functional term, it starts the body of the functional. We select the contents of the curly brackets into a block and create the term functional, specifying to it the term function, which in this context will contain the name of the functional and its parameters, and the body will be the block in curly brackets.
If no parentheses were found, the name found is a literal (future variable… or constant).
– If the next character of the string was a digit, then an integer starts. Select the substring containing only digits.
private static string GetNumberString(string Str, ref int pos)

/// <summary> Get a numeric string</summary>/// <param name="Str"> Line under study</param>/// <param name="pos"> Initial position in the string</param>///<returns> Digital value string</returns>private static string GetNumberString([NotNull] string Str, ref int pos){Contract.Requires(!string.IsNullOrEmpty(Str));Contract.Ensures(Contract.ValueAtReturn(out pos) > = 0);Contract.Ensures(Contract.ValueAtReturn(out pos) < Str.Length);var p = pos;var l = Str.Length;while(p < l !char.IsDigit(Str, p)) p++;if(p > = l) return null;var start = p;while(p < l char.IsDigit(Str, p)) p++;pos = p;return Str.Substring(start, p - start);}

The result of this method, a string with numbers, goes into the constructor of the integer term.
– If the next character of the string is an opening parenthesis, the block starts. We select its substring with the GetBracketText extension method.
– If the next character is not a bracket, it is an undefined character, which turns into a character term.
All created terms are first gathered into a list, and then returned as an array.
The constructors of all other terms are less interesting. They simply store the resulting parameters in internal fields (possibly with type conversion).
The string will then be transformed into a logical hierarchical structure of nested sequences of terms of different kinds. From this sequence the binary tree of the matrix expression is recurrently constructed.
The basis of the tree is the base class of the matrix expression tree node.
Class hierarchy Parser of C# mathematical expressions - an amateur's experience
Each node is a class that stores references to the root node of the left subtree and the root node of the right subtree, as well as a reference to its ancestor. An abstract node class provides an interface to access ancestor/descendant nodes, traversal methods, functional indexers to get enumerations of nodes related to the current one, recursive methods to get variables, functions, etc., known to a given node (as the root of its subtree). Also, the basic node class provides a number of computable properties: attributes – whether a node is a left/right subtree, whether a node is a root, a tree root reference, a symbolic path to the current node from the tree root and an ancestor iterator.
Tree node Parser of C# mathematical expressions - an amateur's experience
The code of this class allows basic manipulation of tree elements to replace, rearrange, bypass and access them. I will give the individual methods as needed. The full text will take a lot of space. It can be viewed/copied from project sources.
All nodes in the tree can either be computable or used in parsing.
The parsing nodes include a string node, a character node, and an interval node. They are needed to augment certain method structures in the tree, like the integration function, where you have to specify an interval.
Computable nodes are the main ones in the resulting tree structure. They represent all elements that can be described in a matrix expression.
These include :

  • ComputedBracketNode – represents a bracket block
  • ValueNode is an abstract class that represents a value node. Its descendants :
    – ConstValueNode – a node of the numeric value
    – VariableValueNode – the node of a variable (which can be a constant like pi, e, …)
  • FunctionNode – a node representing a function object
  • FunctionalNode – a node, representing an object-function
  • OperatorNode is an abstract node-operator class. Its heirs
    – The simplest matrix operators +, -, *, /, ^;
    – Logical operators ==, !, > , <, , ||;
    – Condition operator"<result_work_of_logic> ?" and together with it the selection operator used "<option_1> :<option_2> "
    – Operator that implements access to a function argument and the function argument name access operator used in conjunction with it.

Process of tree building

The term class declares an abstract GetSubTree method, which allows to obtain a subtree from any term. The process of building a tree begins by calling this method on a block term formed from the initial string.
public override ExpressionTreeNode GetSubTree(ExpressionParser Parser, MathExpression Expression)

/// <summary>Get the root of the subtree of expressions</summary>/// <param name="Parser"> Expression parser</param>/// <param name="Expression"> Mathematical expression</param>;///<returns> Root of subtree</returns>public override ExpressionTreeNode GetSubTree(ExpressionParser Parser, MathExpression Expression){Contract.Requires(Parser != null);Contract.Requires(Expression != null);Contract.Ensures(Contract.Result<ExpressionTreeNode> () != null);var separator = Parser.ExpressionSeparator; // fix the expression separator character// divide a sequence of expression elements into groups, separated by the separator character// extract the expressions tree root from each group and put them into an arrayvar roots = Terms.Split(t => t is CharTerm ((CharTerm)t).Value == separator).Select(g => Parser.GetRoot(g, Expression)).ToArray();if(roots.Length == 1) return roots[0]; // if only one root is found, return it// Otherwise, many roots are foundExpressionTreeNode argument = null; // declare a reference to the argumentfor(var i = 0; i < roots.Length; i++) // pass through all the found roots{var root = roots[i];ExpressionTreeNode arg; // If another root of the treeif(root is FunctionArgumentNode) // - function argumentarg = root; // -- leave it unchangedelse if(root is FunctionArgumentNameNode) // - argument name node// -- create a new named argument of the function.arg = new FunctionArgumentNode(root as FunctionArgumentNameNode);else if(root is VariantOperatorNode root.Left is VariableValueNode)arg = new FunctionArgumentNode(((VariableValueNode)root.Left).Name, root.Right);else // - in all other casesarg = new FunctionArgumentNode("", root); // -- create a function argument without a nameif(argument == null) argument = arg; // if no argument was specified, then we save the resulting node as an argumentelse // otherwiseargument = argument.Right = arg; // store the resulting node in the right subtree of the argument}// if argument was not selected, something went wrong - a format errorif(argument == null) throw new FormatException("Function argument not defined");return argument.Root; // Return the root of the argument}

The method extracts the character that separates expressions in the block from the matrix object passed to it. By default, the delimiter ‘;’ is set to semicolon.
Then the entire array of nested terms in a Linq-sequence is split into subarrays by a delimiter – a character term containing an expression delimiter character. The Split method is responsible for this.
public static T[][] Split<T> (this T[] array, Func<T, bool> Splitter)

/// <summary> Divide the input array into subarrays using the specified method</summary>/// <typeparam name="T"> Type of array elements</typeparam>/// <param name="array"> Split array</param>/// <param name="Splitter"> Method that returns true when a new subarray</param> must be started;///<returns>/// An array of subarray elements of the original array, separated by the elements selected by the specified method./// Selected elements are not included in the result./// </returns>[NotNull].public static T[][] Split<T> ([NotNull] this T[] array, [NotNull] Func<T, bool> Splitter){Contract.Requires(array != null);Contract.Requires(Splitter != null);Contract.Ensures(Contract.Result<T[][]> () != null);var result = new List<T[]> (array.Length);var aggregator = new List<T> (array.Length);for(var i = 0; i < array.Length; i++){var value = array[i];if(Splitter(value) aggregator.Count != 0){result.Add(aggregator.ToArray());aggregator.Clear();}elseaggregator.Add(value);}if(aggregator.Count != 0)result.Add(aggregator.ToArray());return result.ToArray();}

For each sub-array of terms, the parser’s GetRoot method is called, which is designed to find the root of the tree for that group of terms. Then all the roots found are combined into an array.
GetRoot method:
internal ExpressionTreeNode GetRoot(Term[] Group, MathExpression MathExpression)

///<summary> Method for extracting a tree root from a sequence of elements of a mathematical expression</summary>/// <param name="Group"> group of elements of mathematical expression</param>/// <param name="MathExpression"> Reference to a mathematical expression</param>///<returns> Root of the math expression tree</returns>internal ExpressionTreeNode GetRoot([NotNull] Term[] Group, [NotNull] MathExpression MathExpression){Contract.Requires(Group != null);Contract.Requires(MathExpression != null);Contract.Ensures(Contract.Result<ExpressionTreeNode> () != null);// Reference to the last processed tree nodeExpressionTreeNode Last = null;for(var i = 0; i < Group.Length; i++) // loop through all group elements{var node = Group[i].GetSubTree(this, MathExpression); // extract the subtree for the current group element// If the next element of the group...if(Group[i] is NumberTerm) // ...the next element is a number, then{//...if there are two more elements ahead and a successful attempt to read the fractional number delimiter and mantissa has been madeif(i + 2 <Group.Length NumberTerm.TryAddFractionPart(ref node, Group[i + 1], DecimalSeparator, Group[i + 2])i += 2; //...then increase the index of the current group by two.}else if(Group[i] is BlockTerm) //...another block element (with parentheses)node = new ComputedBracketNode( // the next tree node is a calculated blocknew Bracket( // bracket view :(((BlockTerm)Group[i]).OpenBracket), // copy the opening bracket view((BlockTerm)Group[i]).CloseBracket), // copy the type of the closing bracketnode); //Within a tree node//Combine the current node with the previous node of the treeCombine(Last, Last = node); //and set the current node as the previous oneIf(Last.IsRoot Last is VariantOperatorNode Last.Left is VariableValueNode)Last = new FunctionArgumentNameNode(((VariableValueNode)Last.Left).Name);OnNewNodeAdded(ref Last);}// If there is no reference to the previous node, there is a format errorIf(Last == null) throw new FormatException();return Last.Root; // return the root of the tree of the current element}

Here we sequentially look through the input array of expression terms. From each term of the expression we extract the root of its tree (recursion occurs here). After that we have to check :
– If the current term is integer and is at least third from the end of the array, then we try to add a fractional part to the current node.
public static bool TryAddFractionPart(ref ExpressionTreeNode node, Term SeparatorTerm, char DecimalSeparator, Term FrationPartTerm)

/// <summary>Attempt to add a fractional number value</summary>/// <param name="node"> Expression node</param>/// <param name="SeparatorTerm"> Separator</param> Block;/// <param name="DecimalSeparator"> Block with integer part of number</param>/// <param name="FrationPartTerm"> Block with fractional part of a number</param>///<returns> True if the action is successful. False, if the following blocks do not contain the necessary information</returns>public static bool TryAddFractionPart(ref ExpressionTreeNode node, Term SeparatorTerm, char DecimalSeparator, Term FrationPartTerm){var value = node as ConstValueNode;if(value == null) throw new ArgumentException("Invalid tree node type");var separator = SeparatorTerm as CharTerm;if(separator == null || separator.Value != DecimalSeparator) return false;var fraction = FrationPartTerm as NumberTerm;if(fraction == null) return false;var v_value = fraction.Value;if(v_value == 0) return true;node = new ConstValueNode(value.Value + v_value / Math.Pow(10, Math.Truncate(Math.Log10(v_value)) + 1));return true;}

The method specifies a delimiter character for the integer and fractional parts of a decimal number, as well as the next two terms following the current one. If the second term is a character term and contains a delimiter character, and the third is a number term, then the node is replaced by a new node constant value
– The second check is that if the current term is blocky, then a node-block_brackets is formed.
When the checks are completed, a method is executed that combines the node created in the previous cycle with the current :
public virtual void Combine(ExpressionTreeNode Last, ExpressionTreeNode Node)

/// <summary> Combination of the previous and current tree nodes</summary>/// <param name="Last"> Previous tree node (already integrated into the tree)</param>///<param name="Node"> The current node to be inserted into the tree</param>;// ReSharper disable once CyclomaticComplexitypublic virtual void Combine([CanBeNull] ExpressionTreeNode Last, [NotNull] ExpressionTreeNode Node){Contract.Requires(Node != null);if(Last == null) return; // If the previous tree node is not specified, returnif(Node is CharNode) // If the current node is a character node, then{Last.LastRightChild = Node; // just assign it as the rightmost childreturn;}var operator_node = Node as OperatorNode; // represent the current node as an operator nodeif(operator_node != null) // if the current node is an operator...{//trying to get the operator of the previous node ://try to bring the previous node to the operator node type//or try to cast the parent node of the previous node to the type of operator nodevar parent_operator = Last as OperatorNode ?? Last.Parent as OperatorNode;if(parent_operator != null) // If the reference to the previous OperatorNode is obtained (and the current one is Operator)... then{// if the left subtree of the previous operator is empty and the parent of the previous operator is also an operator// op <-set the previous operator's parent// |// op// / // null ?if(parent_operator.Left == null parent_operator.Parent is OperatorNode)parent_operator = (OperatorNode)parent_operator.Parent;if(parent_operator.Left == null) // If the left subtree of the previous operator is empty...operator_node.Left = parent_operator; // set the previous operator as the left subtree of the current oneelse if(parent_operator.Right == null) // otherwise if the right subtree is emptyparent_operator.Right = Node; // set the current operator as the right subtree of the previous oneelse // Otherwise, if there is a conflict of priorities{var priority = operator_node.Priority; // retrieve the current node priority// If the current operator's priority is less than or equal to the previous oneif(priority <= parent_operator.Priority){// then it is necessary to climb up under the tree untilparent_operator = (OperatorNode)parent_operator.Parents// operators encountered along the way have a higher priority than the current operator.TakeWhile(n => n is OperatorNode priority <= ((OperatorNode)n).Priority)// take the last of the sequence.LastOrDefault() ? parent_operator; // if an empty reference is returned, then take the previous operator// at this moment, the previous operator has a higher priority than the current oneif(parent_operator.IsRoot) // if the previous operator is the root of the tree// If the priority of the tree root operator is greater than or equal to the priority of the current operatorif(priority <= parent_operator.Priority)// Assign the left subtree of the current operator to the previous oneoperator_node.Left = parent_operator;else // Otherwise, if the previous operator is not the root{var parent = parent_operator.Parent; // save the reference to the parent of the previous operatorparent.Right = Node; // write the current operator as the right subtreeoperator_node.Left = parent_operator;// write the previous operator as the left subtree of the current one}}else //if the current operator's priority is greater than the previous one{//then we have to descend to the right subtree untilparent_operator = (OperatorNode)parent_operator.RightNodes// operators encountered along the path have left subtrees and the operator priority is less than the current one.TakeWhile(n => n is OperatorNode n.Left != null ((OperatorNode)n).Priority < priority)// take the last of the sequence.LastOrDefault() ? parent_operator; // if an empty reference is returned, then take the previous operator// at this moment, the previous operator has a lower priority than the current onevar right = parent_operator.Right; // keep the right subtree of the previous operatorparent_operator.Right = Node; // the current one falls into the right subtree of the previous operatoroperator_node.Left = right; // the saved right subtree of the current operator is written into the left subtree}}}else // If the previous node is not an operator{var parent = Last.Parent;var is_left = Last.IsLeftSubtree;var is_right = Last.IsRightSubtree;operator_node.Left = Last; // write the previous node as the left subtree of the current oneif(is_left)parent.Left = operator_node;else if(is_right)parent.Right = operator_node;}return; //return}// if the current node is not an operatorif(Last is OperatorNode) // if the previous node is operator{Last.Right = Node; // add the current node to the right subtree of the previous onereturn; //return}//If neither the current nor the previous node is an operator//If previous node was a number or previous node was a bracket and current node is a bracketif(Last is ConstValueNode || (Last is ComputedBracketNode Node is ComputedBracketNode)){//Save the reference to the parent of the previous nodevar parent = Last.Parent;if(parent != null) // if there is a parent// write the multiplication operator of the previous node and the current node into the right subtree of the parentparent.Right = new MultiplicationOperatorNode(Last, Node);else // if the previous node was a tree node//create a new MultiplicationOperatorNode between the previous and the current node, which will be the new root of the treenew MultiplicationOperatorNode(Last, Node);return; // return.}Last.Right = Node;}

This is one of the central methods. It, according to the logic of the matrix tree being created, joins new nodes to the existing one, taking into account node-operators (their priorities). Of course, the method requires refactoring due to the size of its code volume. The logic of its work is given in comments in the code.
When the combination of two nodes is done, the last check is done: if the processed node is the root of the tree and the node is an options node, and in its left subtree the variable node is <Variable> :<option_2> , then it should be treated as the argument node of function <argument_name> :<value_argument> . In this case, the name of the argument becomes the variable name.
When the iteration is complete, a NewNodeAdded event is generated in the parser object, which passes the created node for external processing by the user. The node is passed by reference, so it is possible to completely override the tree being created.
After a subtree has been created for a group of terms by the parser and all roots of such subtrees have been combined into an array in the GetSubTree method of the block term, the method checks :

  • if the array contains only one element, that is what is returned as the result (this is the trivial case)
  • if there are many roots, each of them is packed into a FunctionArgumentNode argument node (in its right subtree), and the next argument is clustered in the left subtree.

The structure of the matrix tree

Thus the tree to be formed meets the following rules :

  • Operator nodes contain their operands in the left and right subtrees
  • Value nodes (variables, numeric values) must be tree leaves (but not necessarily)
  • Bracket block nodes store the value in the left subtree
  • The function nodes have an empty left subtree. The right subtree stores the first argument node.
  • The function argument node in the left subtree stores either the root of the argument tree or the argument name node. The right subtree stores a link to the next argument of the function;
  • Argument name node – The left subtree stores a noncomputable string-name node, and the right subtree stores the root of the argument tree.
  • The function node is a leaf. Parameters and expression are stored separately (allowing more flexibility in redefining the logic of working with them).

The functions and functions themselves are separated into separate objects.
The tree has been built. During the construction process, variable and function nodes were created, among others. Each such node was created by its corresponding term by direct call of the constructor of the node of the corresponding type :
class StringTerm : Term {…}

/// <summary> String element of expression</summary>class StringTerm : Term{/// <summary> The name of the string element</summary>[NotNull].public string Name => f_Value;/// <summary> New string element</summary>/// <param name="Name"> String element name</param>public StringTerm([NotNull] string Name) : base(Name) { Contract.Requires(!string.IsNullOrEmpty(Name)); }/// <summary> An element subtree consisting of a node-variable</summary>/// <param name="Parser"> Parser</param>/// <param name="Expression"> Mathematical expression</param>;///<returns> Tree node with a variable derived from Expression.Variable[Name]</returns>public override ExpressionTreeNode GetSubTree(ExpressionParser Parser, MathExpression Expression)=> new VariableValueNode(Expression.Variable[Name])}///<summary> Function definition block</summary>sealed class FunctionalTerm : FunctionTerm{/// <summary> Operator parameters</summary>[NotNull]public BlockTerm Parameters { get; set; }/// <summary> Initialization of the complex operator block</summary>/// <param name="Header"> Block header</param>/// <param name="Body"> Block Body</param>public FunctionalTerm([NotNull] FunctionTerm Header, [NotNull] BlockTerm Body) : base(Header.Name, Body){Contract.Requires(Header != null);Contract.Requires(Body != null);Parameters = Header.Block;}/// <summary> Get the subtree of the complex operator</summary>/// <param name="Parser"> Parser</param>/// <param name="Expression"> Mathematical expression</param>/// <returns> Node of the complex operator</returns>public override ExpressionTreeNode GetSubTree(ExpressionParser Parser, MathExpression Expression)=> new FunctionalNode(this, Parser, Expression);public override string ToString() => $"{Name}{Parameters}{Block}";}

When creating a node, it requires a collection of variables/functions from the expression in order to retrieve the desired object-variable/function by name.
internal FunctionNode(FunctionTerm Term, ExpressionParser Parser, MathExpression Expression)

/// <summary> Initialization of a new functional node</summary>///<param name="Term"> Function expression</param>/// <param name="Parser"> Expression parser</param>/// <param name="Expression"> Mathematical expression</param>internal FunctionNode(FunctionTerm Term, ExpressionParser Parser, MathExpression Expression): this(Term.Name){var arg = Term.Block.GetSubTree(Parser, Expression);if(!(arg is FunctionArgumentNode))if(arg is FunctionArgumentNameNode)arg = new FunctionArgumentNode((FunctionArgumentNameNode)arg);else if(arg is VariableValueNode)arg = new FunctionArgumentNode(null, arg);else if(arg is VariantOperatorNode arg.Left is VariableValueNode)arg = new FunctionArgumentNode(((VariableValueNode)arg.Left).Name, arg.Right);elsearg = new FunctionArgumentNode(null, arg);Right = arg;//Defining a function object from an expressionFunction = Expression.Functions[Name, ArgumentsNames];}

After creating the tree in the matrix object, the collections of variables and functions will contain lists of used objects. But their values will be empty. Some variables have to be classified as constants known to the parser (move them to the corresponding collection, for objects-functions we have to define the delegates that implement them.

Tree initialization

After you create a "raw" matrix tree, you need to fill it with values of variables and functions. Parser’s Parse method calls two methods ProcessVariables and ProcessFunctions one by one, passing into them the created "raw" tree.
Variable handling method :
internal void ProcessVariables(MathExpression Expression)

/// <summary> Variable handling</summary>///<param name="Expression"> Processed mathematical expression</param>internal void ProcessVariables([NotNull] MathExpression Expression){Contract.Requires(Expression != null);var tree_vars = Expression.Tree.Root.GetVariables().ToArray();Expression.Variable.Where(v => !tree_vars.Contains(v)).ToArray().Foreach(v => Expression.Variable.Remove(v));foreach(var variable in Expression.Variable.ToArray()){if(f_Constans.ContainsKey(variable.Name)){Expression.Variable.MoveToConstCollection(variable);variable.Value = f_Constans[variable.Name];}OnVariableProcessing(variable);}}

His job is to traverse the tree, find all the variable nodes, and extract the variable objects used in them. After that, you have to remove everything from the matrix expression variable collection that is not used in the tree.
After that, for each variable in the tree we check if its name is in the collection of known parser constants. If so, it is removed from the collection of matrix variables, put into the collection of matrix constants, initialized with a value known to the parser and it is flagged as a constant.
After that, a new variable detection event is called for it in the parser. When this event is processed, the parser user can redefine the value of this variable, or change the variable object itself.
The second ProcessFunctions method fills functions known to the matrix expression with delegates :
internal void ProcessFunctions(MathExpression Expression)

/// <summary> Function handling</summary>/// <param name="Expression"> Processed mathematical expression</param>[SuppressMessage("ReSharper", "CyclomaticComplexity")]internal void ProcessFunctions([NotNull] MathExpression Expression){Contract.Requires(Expression != null);foreach(var function in Expression.Functions)switch(function.Name){case "Sin":case "SIN":case "sin":if(function.Arguments.Length != 1)goto default;function.Delegate = new Func<double, double> (Math.Sin);break;case "COS":case "Cos":case "cos":if(function.Arguments.Length != 1)goto default;function.Delegate = new Func<double, double> (Math.Cos);break;case "TAN":case "Tan":case "tan":case "tn":if(function.Arguments.Length != 1)goto default;function.Delegate = new Func<double, double> (Math.Tan);break;case "ATAN":case "ATan":case "Atan":case "atan":case "atn":case "Atn":if(function.Arguments.Length == 1)function.Delegate = new Func<double, double> (Math.Atan);else if(function.Arguments.Length == 2)function.Delegate = new Func<double, double, double> (Math.Atan2);else goto default;break;case "Atan2":case "atan2":if(function.Arguments.Length != 2)goto default;function.Delegate = new Func<double, double, double> (Math.Atan2);break;case "CTG":case "Ctg":case "ctg":if(function.Arguments.Length != 1)goto default;function.Delegate = new Func<double, double> (x => 1 / Math.Tan(x));break;case "Sign":case "sign":if(function.Arguments.Length != 1)goto default;function.Delegate = new Func<double, double> (x => Math.Sign(x));break;case "Abs":case "abs":if(function.Arguments.Length != 1)goto default;function.Delegate = new Func<double, double> (Math.Abs);break;case "Exp":case "EXP":case "exp":if(function.Arguments.Length != 1)goto default;function.Delegate = new Func<double, double> (Math.Exp);break;case "Sqrt":case "SQRT":case "√":case "sqrt":if(function.Arguments.Length != 1)goto default;function.Delegate = new Func<double, double> (Math.Sqrt);break;case "log10":case "Log10":case "LOG10":case "lg":case "Lg":case "LG":if(function.Arguments.Length != 1)goto default;function.Delegate = new Func<double, double> (Math.Log10);break;case "loge":case "Loge":case "LOGe":case "ln":case "Ln":case "LN":if(function.Arguments.Length != 1)goto default;function.Delegate = new Func<double, double> (Math.Log);break;case "log":case "Log":case "LOG":if(function.Arguments.Length != 2)goto default;function.Delegate = new Func<double, double, double> (Math.Log);break;default:var f = OnFunctionFind(function.Name, function.Arguments);if(f == null)throw new NotSupportedException($"Processing the function {function.Name} not supported");function.Delegate = f;break;}}

If the name of the function is among the variants of the case statement, then if the required number of arguments of the function matches, a delegate is assigned to it, which will calculate its value. If the function is not defined, the delegate is defined as the result of an unknown function detection event generation. The user can, however, define in the reaction to this event the required delegate by name and number (and names) of arguments.
This completes the generation of the matrix expression.

Using

Suppose we have the problem of calculating the integral of the function A*cos(2*x)/pi+G(x/2) divided by A and + 1, where G(x)=2cos(x). With A, say, equal to 5. And the integral should be taken in 0.05 increments.

var parser = new ExpressionParser();parser .FindFunction += (s, e) =>{if(e.SignatureEqual(name: "G", ArgumentsCount: 1))e.Function = new Func<double, double> (x => 2 * Math.Cos(x));};var expr = parser.Parse(@"Int[x=-10..10;dx=0.05]{A*cos(2x) + G(x/2)}/A + 1");expr.Variable["A"] = 5;var y = expr.Compute(); //y = 0.30928806858920344var f = expr.Compile();var y2 = f(); //y = 0.30928806858920344

Conclusion

Given the resulting length of the article, I put a semicolon here…not a period, but a semicolon. As a result of the above, it has been possible to :

  1. Get a general idea of the method for solving the problem posed;
  2. Generate an object model of the matrix expression parser, the matrix expression itself and its tree;
  3. Create an efficient method for parsing the string of a matrix expression into its logical components;
  4. Create an efficient method for building the tree of a mathematical expression, taking into account the peculiarities of brackets, operator priorities, and special constructions (functionals);
  5. Provide control over different stages of input data processing by the parser based on the event system;
  6. Add the ability to extend functionality.

What this article fails to describe :

  1. The logic of variables (their types and methods of obtaining and then replacing them);
  2. The structure of collections of variables, constants, and functions involved in the operation of a matrix expression;
  3. A method for calculating the value of a matrix expression by traversing its tree;
  4. A method for compiling a matrix expression tree into a delegate.

What has not yet succeeded in implementing the parser itself :

  1. Implement methods for optimizing a matrix expression tree;
  2. Remove crutches from a number of places;
  3. Add checks for input data to match the matrix expression format;
  4. Actually outline the boundaries of this format;
  5. Increase code coverage with unit tests;
  6. Conduct comparative performance studies of both parsing and computation phases.

In general, the work on this code is unfortunately of a background nature and has been going on for a couple of years, but at this stage it already solves the tasks assigned to it. Although in its current form it should not be allowed into production.
Full source codes can be found here

You may also like