
How It Works

Arif Balik edited this page Aug 19, 2019 · 2 revisions

Parsing

First, initialize the euler engine by calling euler_init(). Then call parse_query(), which takes a single argument:

  • euler is the euler_t structure. Load the query string into its ascii member only; the parser fills in the other fields.

Here is an example of how to parse a simple expression.

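char query[4] = "1+1";  /* three characters plus the terminating '\0' */

euler_init();

euler.ascii = query;    /* the caller sets only the ascii member */
parse_query(&euler);    /* the parser fills in the remaining fields */

if(euler.type == FRACTION)
        printf("result : %f\n", euler.resultn.fraction);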

After this call, parsing begins. The lexer immediately starts tokenizing the input and records the position of each matched token in the symbol table. Below is the symbol table for the example above.

Input stream: 1 + 1 \0

token  address  priority  desc
RSV    0x0000   0         0
INT    0x0001   0         0
PLUS   0x0002   0         0
INT    0x0003   0         0
EOQ    0x0003   0         0
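As a rough illustration of how such a table could be produced, here is a minimal sketch of a tokenizer for digits and '+' only. Every name in it (sketch_token_t, sketch_symbol_t, sketch_tokenize) is hypothetical and not part of euler. It assumes that each address is the offset where the token ends, that the RSV row marks offset 0, and that EOQ sits at the offset of the terminating '\0'. That reading fits the table above, but it is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds, named after the rows of the table above. */
typedef enum { TOK_RSV, TOK_INT, TOK_PLUS, TOK_EOQ } sketch_token_t;

/* Hypothetical symbol table row; this is not euler's real layout. */
typedef struct {
    sketch_token_t token;
    size_t address;   /* assumed: offset in the input where the token ends */
    int priority;
    int desc;
} sketch_symbol_t;

/* Tokenizes digits and '+' only (no whitespace handling).
 * Returns the number of rows written, or 0 if the input contains an
 * unsupported character or the table is too small. */
size_t sketch_tokenize(const char *input, sketch_symbol_t *table, size_t capacity)
{
    size_t pos = 0, count = 0;

    assert(input != NULL && table != NULL);
    if (capacity == 0)
        return 0;

    /* Reserved first row marks where the first token starts. */
    table[count++] = (sketch_symbol_t){ TOK_RSV, 0, 0, 0 };

    while (input[pos] != '\0') {
        sketch_token_t kind;

        if (isdigit((unsigned char)input[pos])) {
            /* Consume the whole run of digits as one INT token. */
            while (isdigit((unsigned char)input[pos]))
                pos++;
            kind = TOK_INT;
        } else if (input[pos] == '+') {
            pos++;
            kind = TOK_PLUS;
        } else {
            return 0;
        }
        if (count == capacity)
            return 0;
        table[count++] = (sketch_symbol_t){ kind, pos, 0, 0 };
    }

    if (count == capacity)
        return 0;
    /* End of query is recorded at the terminator's offset. */
    table[count++] = (sketch_symbol_t){ TOK_EOQ, pos, 0, 0 };
    return count;
}
```

Under these assumptions, calling sketch_tokenize("1+1", table, 8) fills five rows whose tokens and addresses match the table above.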

For this simple expression (which has no block priorities etc.), euler immediately calls the grammar and parses the expression.

Euler begins by pushing each token to the grammar. If the token is a number, the lexer converts the token string to a double. In this case, the string for the first number lies between addresses 0x0000 and 0x0001.
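The conversion step might look something like the sketch below, which turns the characters between two addresses into a double using the standard strtod function. The helper name sketch_slice_to_double, the half-open [begin, end) convention, and the 64-byte buffer are assumptions made for illustration; euler's own lexer may do this differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts input[begin..end) to a double.
 * The caller must ensure end does not exceed the length of input.
 * Returns 0 on success, -1 if the slice is empty, too long for the
 * local buffer, or not entirely a number. */
int sketch_slice_to_double(const char *input, size_t begin, size_t end, double *out)
{
    char buf[64];
    char *stop;
    size_t len;

    assert(input != NULL && out != NULL);
    if (end <= begin || end - begin >= sizeof buf)
        return -1;

    /* Copy the slice so strtod cannot read past the token's end. */
    len = end - begin;
    memcpy(buf, input + begin, len);
    buf[len] = '\0';

    *out = strtod(buf, &stop);

    /* Succeed only if strtod consumed the whole slice. */
    return (stop == buf + len) ? 0 : -1;
}
```

For the input "1+1", the slice from 0 to 1 converts to 1.0, while the slice from 1 to 2 (the '+' sign) is rejected.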

Symbolic Expressions

The hard part of parsing begins when there is a symbolic expression in the input stream. Consider the following input:

sum(x^2+1, x, 0, 10)

As you can see, there is a variable denoted x. The parser first needs to recognize that this is a sum function. After parsing it as a sum function, the parser also needs to evaluate the expression in the first argument for x from 0 to 10. After evaluation, the parser needs another grammar check and parsing pass over the evaluated expressions.
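To make the evaluation step concrete, here is a minimal sketch of what evaluating the first argument over the range amounts to. It assumes both bounds are inclusive and that x takes integer values. A plain C function pointer stands in for the parsed expression, so this does not show how euler represents or re-parses expressions. The names sketch_term_fn, sketch_square_plus_one, and sketch_sum are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type for the summand, standing in for a parsed expression. */
typedef double (*sketch_term_fn)(double x);

/* The summand from the example input: x^2 + 1. */
double sketch_square_plus_one(double x)
{
    return x * x + 1.0;
}

/* Evaluates term(x) for every integer x in [from, to] (both bounds
 * assumed inclusive) and adds up the results. An empty range gives 0. */
double sketch_sum(sketch_term_fn term, int from, int to)
{
    double total = 0.0;
    int x;

    assert(term != NULL);
    for (x = from; x <= to; x++)
        total += term((double)x);
    return total;
}
```

With these assumptions, sketch_sum(sketch_square_plus_one, 0, 10) returns 396, since the squares from 0 to 10 add up to 385 and the constant term contributes 11.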

For this recursive parsing operation, euler uses a technique called partial parsing. It applies when there is a symbolic expression inside the input stream. Let's look at the following input:

1 + sum(x+1, x, 0, 10) * 2 - sum(x-2, x, 0, 2)

When this input is fed to the tokenizer, it marks the beginning and the end of each symbolic expression as high priority for the parser, so the parser can handle those expressions first. For the input above, we have the following table:
