Declarative_parsing_syntax
General convention: important rules are written **in bold face**.
The file called topTemplate.xml, which is located in the top folder for each filetype, is the first one read by JUMBO-Converters when a file is to be parsed; an NWChem logfile in this case.
It contains a list of subtemplates that the parser will try to match against the whole logfile, using the following format:
As we will see later, each template either matches a chunk of the logfile or fails. When it matches, the template will **eat up the chunk**, which is then available only to the template that matched it and no longer to the following templates in the list. The process **begins with the first template** in the list, which the parser tries to match **in the whole logfile** a specified number of times.
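The ordering rule can be illustrated with a small Python sketch. This is not JUMBO-Converters code: the function `apply_templates`, the template ids, and the log lines are all hypothetical, and each template is simplified to claim single lines rather than multi-line chunks. It only shows that a template earlier in the list takes its matches away from the templates that follow.

```python
import re

def apply_templates(lines, templates):
    """Try each (template_id, pattern) in list order against all unclaimed lines."""
    claimed = {}  # line index -> id of the template that ate that line
    for template_id, pattern in templates:
        regex = re.compile(pattern)
        for i, line in enumerate(lines):
            # A line that was already eaten is invisible to later templates.
            if i not in claimed and regex.search(line):
                claimed[i] = template_id
    return claimed

# Hypothetical log lines and templates, for illustration only.
log = ["Job information", "hostname = node1", "Memory information", "heap = 100"]
templates = [("first", r"information"), ("second", r"Memory")]
claimed = apply_templates(log, templates)
```

Here the template `first` claims both lines containing "information", so `second` finds nothing left to match even though its pattern would have matched the third line.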
For example, in the case of the list in the previous section, the list includes the file environment.xml, together with several other template files related to the environment link.
This subtemplate file contains the following code:
Independently of what this template does with the chunk that it captures, we see that it detects the beginning of the chunk with the regular expression
which matches the lines in the logfile that read:
A **very important point** is that most of the parsing proceeds line by line.
The ending pattern that defines the end of the chunk is indicated in the option
which matches the line in the logfile that reads
Everything between these two lines, **excluding the last one**, is eaten up by this template and included into the module identified by the **id**:
I.e., in the output XML we will see
The option
means that the parser will try to match this chunk any number of times in the logfile and, every time it matches, the chunk will be eaten up into a module as described.
When no further match is found, the parser will proceed to the next item in the list, in this case nwchem.job.xml, which goes through the same process but **without having access** to the already captured modules.
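As a rough model of the behaviour described above (a start pattern, an end pattern whose line is not eaten, and unlimited repetition), consider the following Python sketch. The function `eat_chunks`, the patterns, and the sample lines are assumptions made for illustration and do not reproduce the real template syntax or the real NWChem log. The sketch also assumes that a chunk with no end line runs to the end of the file.

```python
import re

def eat_chunks(lines, start_pattern, end_pattern):
    """Capture every chunk from a start line up to, but excluding, the end line."""
    start, end = re.compile(start_pattern), re.compile(end_pattern)
    modules, remaining = [], []
    i = 0
    while i < len(lines):
        if start.search(lines[i]):
            j = i + 1
            # Scan line by line until the end pattern matches (or the file ends).
            while j < len(lines) and not end.search(lines[j]):
                j += 1
            modules.append(lines[i:j])  # the end line at index j is not eaten
            i = j
        else:
            remaining.append(lines[i])  # left over for the next template in the list
            i += 1
    return modules, remaining

# Hypothetical log content, for illustration only.
log = ["header", "BEGIN a", "x = 1", "END", "noise",
       "BEGIN b", "y = 2", "END", "footer"]
modules, remaining = eat_chunks(log, r"^BEGIN", r"^END")
```

Both `BEGIN` blocks end up in `modules`, while `remaining` holds everything the next template in the list would still be able to see, including the two `END` lines.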
The final important point is that this process can be nested, i.e., there can be modules inside modules inside modules, as you can see in the contents of environment.xml, where a deeper list of templates appears. Of course, its items will only be matched against the chunk that was matched in the parent module.
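Nesting can be modelled by applying the same procedure recursively to each captured chunk. The Python sketch below is hypothetical: the dictionary-based template description, the function `parse`, and the sample lines are invented for illustration and are not how JUMBO-Converters represents templates internally.

```python
import re

def parse(lines, templates):
    """Return (modules, leftover); each module may hold nested child modules."""
    modules = []
    for t in templates:
        start, end = re.compile(t["start"]), re.compile(t["end"])
        leftover, i = [], 0
        while i < len(lines):
            if start.search(lines[i]):
                j = i + 1
                while j < len(lines) and not end.search(lines[j]):
                    j += 1
                chunk = lines[i:j]  # end line excluded
                # Child templates only ever see the parent's chunk.
                children, unparsed = parse(chunk, t.get("children", []))
                modules.append({"id": t["id"], "children": children, "lines": unparsed})
                i = j
            else:
                leftover.append(lines[i])
                i += 1
        lines = leftover  # later templates see only what is left
    return modules, lines

# Hypothetical templates and log content, for illustration only.
templates = [{
    "id": "environment", "start": r"^ENV start", "end": r"^ENV end",
    "children": [{"id": "memory", "start": r"^MEM start", "end": r"^MEM end"}],
}]
log = ["ENV start", "host = n1", "MEM start", "heap = 1", "MEM end", "ENV end",
       "MEM start", "heap = 2", "MEM end"]
modules, rest = parse(log, templates)
```

The `memory` template is only a child of `environment`, so it captures the block inside the environment chunk and leaves the second `MEM start` block, which lies outside that chunk, untouched in `rest`.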