
Declarative_parsing_syntax

Oliver Stueker edited this page May 5, 2015 · 2 revisions


General convention: important rules are written **in bold face**.

topTemplate.xml

The file called topTemplate.xml, which is located in the top folder for each file type, is the first one read by JUMBO-Converters when a file is to be parsed; an NWChem logfile in this case.

It contains a list of subtemplates that the parser tries to match against the whole logfile, using the following format:

Parsing the list of templates and general order of precedence

As we will see later, each template either matches a chunk of the logfile or fails to. When it matches, the template **eats up the chunk**, which is then available only to the template that matched it and no longer to the following templates in the list. The process **begins with the first template** in the list, which the parser tries to match **in the whole logfile** a specified number of times.
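The order of precedence can be pictured with a small Python model. This is only a sketch of the idea and not the JUMBO-Converters implementation: chunks are reduced to single lines, and the function `apply_in_order`, the template ids and the log lines are all hypothetical.

```python
import re

def apply_in_order(lines, templates):
    """Try each (template_id, pattern) pair in list order.

    A line captured by one template is removed from the pool,
    so later templates never see it (simplified: one line per chunk).
    """
    remaining = list(lines)
    captured = []
    for template_id, pattern in templates:
        regex = re.compile(pattern)
        still_free = []
        for line in remaining:
            if regex.search(line):
                captured.append((template_id, line))  # eaten by this template
            else:
                still_free.append(line)
        remaining = still_free
    return captured, remaining

# Hypothetical log lines; the first one matches BOTH patterns.
lines = ["Total energy: A", "Total time: B", "unrelated line"]
templates = [("first", r"energy"), ("second", r"Total")]
captured, remaining = apply_in_order(lines, templates)
```

Here the line `Total energy: A` would match either pattern, but it goes to `first` because that template comes earlier in the list; `second` only sees what is left.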

For example, the list in the previous section includes the file environment.xml, together with several other template files related to the environment link.

This subtemplate file contains the following code:

Independently of what this template does with the chunk that it captures, we see that it detects the beginning of the chunk with the regular expression

which matches the lines in the logfile that read:

A **very important point** is that most of the parsing proceeds line by line.

The pattern that defines the end of the chunk is given in the option

which matches the line in the logfile that reads

Everything between these two lines, **excluding the last one**, is eaten up by this template and placed in the module identified by the **id**:

I.e., in the output XML we will see

The option

means that the parser will try to match this chunk any number of times in the logfile and, every time it matches, the chunk will be eaten up into a module as described.
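The behaviour of a start pattern, an end pattern and unlimited repetition can be modelled roughly as below. This is a minimal sketch under stated assumptions, not the actual parser: `eat_chunks` is a hypothetical helper, patterns are applied with `re.search` to one line at a time, the log lines are invented, and a chunk whose end line is never found is simply handed back (the real behaviour in that case is not described here).

```python
import re

def eat_chunks(lines, start_pattern, end_pattern):
    """Collect every chunk that runs from a start line up to,
    but not including, the next end line. Returns (chunks, leftover)."""
    start_re = re.compile(start_pattern)
    end_re = re.compile(end_pattern)
    chunks, leftover = [], []
    current = None  # lines of the chunk being collected, or None
    for line in lines:
        if current is None:
            if start_re.search(line):
                current = [line]          # start line belongs to the chunk
            else:
                leftover.append(line)
        elif end_re.search(line):
            chunks.append(current)        # close the chunk
            current = None
            if start_re.search(line):     # the end line may open the next chunk
                current = [line]
            else:
                leftover.append(line)     # end line itself is NOT eaten
        else:
            current.append(line)
    if current is not None:
        leftover.extend(current)          # assumption: unclosed chunk is not eaten
    return chunks, leftover

# Hypothetical logfile with two matching blocks.
log = ["header", "BEGIN block", "a = 1", "END block",
       "noise", "BEGIN block", "b = 2", "END block"]
chunks, leftover = eat_chunks(log, r"^BEGIN", r"^END")
```

Each entry in `chunks` corresponds to one module in the output, and `leftover` is what the next template in the list would still be able to see.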

When no further match is found, the parser proceeds to the next item in the list, in this case nwchem.job.xml, which goes through the same process but **without having access** to the already captured modules.

The final important point is that this process can be nested, i.e., there can be modules inside modules inside modules, as you can see in the contents of environment.xml, where a deeper list of templates appears. Of course, its items will only be matched against the chunk that was matched in the parent module.
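Nesting can be illustrated with a recursive version of the same idea. Again this is a simplified sketch and not the JUMBO-Converters code: the `Template` class, the `parse` function, the dictionary layout of a module and the log lines are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Template:
    id: str
    start: str                      # regex for the first line of a chunk
    end: str                        # regex for the line that closes it
    children: list = field(default_factory=list)

def parse(lines, templates):
    """Apply templates in order; child templates only see the parent's chunk."""
    modules, remaining = [], list(lines)
    for t in templates:
        free, chunk = [], None
        for line in remaining:
            if chunk is None and re.search(t.start, line):
                chunk = [line]
            elif chunk is not None and re.search(t.end, line):
                sub_modules, _ = parse(chunk, t.children)   # recurse into the chunk
                modules.append({"id": t.id, "lines": chunk, "children": sub_modules})
                chunk = None
                free.append(line)                           # end line is not eaten
            elif chunk is not None:
                chunk.append(line)
            else:
                free.append(line)
        if chunk is not None:
            free.extend(chunk)      # assumption: unclosed chunk is given back
        remaining = free
    return modules, remaining

# Hypothetical logfile: a "path" block nested inside an "env" block.
log = ["job start", "env begin", "host: node1", "path begin",
       "/usr/bin", "path end", "env end", "job end"]
path = Template("path", r"^path begin", r"^path end")
env = Template("environment", r"^env begin", r"^env end", [path])
modules, rest = parse(log, [env])
```

The `path` template is never tried against the whole logfile: it only runs on the lines that the `environment` template captured.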
