rusty_lr

A Yacc-like, procedural macro-based parser generator for Rust supporting LR(1), LALR(1), and GLR parsing strategies.

RustyLR enables you to define context-free grammars (CFGs) directly in Rust using macros or build scripts. It constructs optimized finite state automata at compile time, ensuring efficient and reliable parsing.

Features

Automatic Optimization:: Reduces parser table size and improves performance by grouping terminals with identical behavior across parser states.
Multiple Parsing Strategies: Supports LR(1), LALR(1), and GLR parsers.
Procedural Macros: Define grammars using lr1! macro for compile-time parser generation.
Build Script Integration: Generate parsers via build scripts for complex grammars with detailed error messages.
Custom Reduce Actions: Define custom actions during reductions to build ASTs or perform computations.
Grammar Conflict Detection: Automatically detects shift/reduce and reduce/reduce conflicts during parser generation, providing informative diagnostics to help resolve ambiguities.

Installation

Add RustyLR to your Cargo.toml:

[dependencies]
rusty_lr = "..."

To use buildscript tools:

[build-dependencies]
rusty_lr = { version = "...", features = ["build"] }

Or you want to use executable version (optional):

cargo install rustylr

rusty_lr is designed for use with auto-generated code, either through lr1! macro (default), a build script (with build feature), or the rustylr executable. When using a buildscript or executable, you can get beautiful and detailed messages generated from your grammar.

Quick Start

Using Procedural Macros

Define your grammar using the lr1! macro:

// this define `EParser` struct
// where `E` is the start symbol
lr1! {
   %userdata i32;           // userdata type passed to parser
   %tokentype char;         // token type; sequence of `tokentype` is fed to parser
   %start E;                // start symbol; this is the final value of parser
   %eof '\0';               // eof token; this token is used to finish parsing

   // left reduction for '+' and '*'
   %left '+';
   %left '*';
   // operator precedence '*' > '+'

   // ================= Production rules =================
   Digit(char): ['0'-'9'];           // character set '0' to '9'

   Number(i32)                       // production rule `Number` holds `i32` value
       : ' '* Digit+ ' '*            // `Number` is one or more `Digit` surrounded by zero or more spaces
       { Digit.into_iter().collect::<String>().parse().unwrap() }; // this will be the value of `Number` (i32) by this production rule

   E(f32): E '*' e2=E { E * e2 }
         | E '+' e2=E {
             *data += 1;                       // access userdata by `data`
             println!( "{:?} {:?}", E, e2 );   // any Rust code can be written here
             E + e2                            // this will be the value of `E` (f32) by this production rule
         }
         | Number { Number as f32 } // Number is `i32`, so cast to `f32`
         ;
}

This defines a simple arithmetic expression parser.

Using Build Script

For complex grammars, you can use a build script to generate the parser. This will provide more detailed error messages when conflicts occur.

1. Create a grammar file (e.g., src/parser.rs) with the following content:

// Rust code of `use` and type definitions


%% // start of grammar definition

%tokentype u8;
%start E;
%eof b'\0';

E: b'(' E b')' 
 | a;

 ...

2. Setup build.rs:

// build.rs
use rusty_lr::build;

fn main() {
    println!("cargo::rerun-if-changed=src/parser.rs");

    let output = format!("{}/parser.rs", std::env::var("OUT_DIR").unwrap());
    build::Builder::new()
        .file("src/parser.rs") // path to the input file
        .build(&output);       // path to the output file
}

3. Include the generated source code:

include!(concat!(env!("OUT_DIR"), "/parser.rs"));

4. Use the parser in your code:

let mut parser = parser::EParser::new(); // create <StartSymbol>Parser class
let mut context = parser::EContext::new(); // create <StartSymbol>Context class
let mut userdata: i32 = 0;
for b in input.chars() {
    match context.feed(&parser, b, &mut userdata) {
        Ok(_) => {}
        Err(e) => {
            eprintln!("error: {}", e);
            return;
        }
    }
}
println!("{:?}", context);
context.feed(&parser, 0 as char, &mut userdata).unwrap(); // feed EOF

let result:i32 = context.accept(); // get value of start 'E'

The generated code will include several structs and enums:

<Start>Parser: A struct that holds the whole parser table. (docs-LR) (docs-GLR)
<Start>Context: A struct that maintains the current state and the values associated with each symbol. (docs-LR) (docs-GLR)
<Start>State: A type representing a single parser state and its associated table.
<Start>Rule: A type representing a single production rule. (docs)
<Start>NonTerminals: A enum representing all non-terminal symbols in the grammar. (docs)

You can get useful information from <Start>NonTerminals enum.

let non_terminal: <Start>NonTerminals = ...;

non_terminal.is_auto_generated(); // true if this non-terminal is auto-generated

You can also get contextual information from <Start>Context struct.

let mut context = <Start>Context::new();

// ... parsing ...

context.expected();         // get expected terminal symbols
context.expected_nonterm(); // get expected non-terminal symbols
context.can_feed( term );   // check if a terminal symbol can be fed
context.on_parsing();       // get all non-terminal symbols that are currently being parsed

The generated code will also include a feed method that takes a token and a mutable reference to the user data. This method will return an Ok(()) if the token was successfully parsed, or an Err if there was an error.

context.feed( &parser, term, &mut userdata ); // feed a terminal symbol and update the state machine

Note that the actual definitions are bit different if you are building GLR parser.

GLR Parsing

RustyLR offers built-in support for Generalized LR (GLR) parsing, enabling it to handle ambiguous or nondeterministic grammars that traditional LR(1) or LALR(1) parsers cannot process. See GLR.md for details.

Examples

Calculator: A calculator using u8 as token type.
Json Validator: A JSON validator
lua 5.4 syntax parser
Bootstrap: rusty_lr syntax parser is written in rusty_lr itself.

Cargo Features

build: Enable build script tools.
fxhash: Use FXHashMap instead of std::collections::HashMap for parser tables.
tree: Enable automatic syntax tree construction (For debugging purposes).
error: Enable detailed parsing error messages with backtrace (For debugging purposes).

Syntax

RustyLR's grammar syntax is inspired by traditional Yacc/Bison formats. See SYNTAX.md for details of grammar-definition syntax.

Contribution

Any contribution is welcome.
Please feel free to open an issue or pull request.

License (Since 2.8.0)

Either of

MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)

Name		Name	Last commit message	Last commit date
Latest commit History 225 Commits
.github		.github
example		example
images		images
rusty_lr		rusty_lr
rusty_lr_buildscript		rusty_lr_buildscript
rusty_lr_core		rusty_lr_core
rusty_lr_derive		rusty_lr_derive
rusty_lr_executable		rusty_lr_executable
rusty_lr_parser		rusty_lr_parser
.gitignore		.gitignore
Cargo.toml		Cargo.toml
GLR.md		GLR.md
LICENSE-APACHE		LICENSE-APACHE
LICENSE-MIT		LICENSE-MIT
README.md		README.md
SYNTAX.md		SYNTAX.md
rustfmt.toml		rustfmt.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Licenses found

Repository files navigation

rusty_lr

Features

Installation

Quick Start

Using Procedural Macros

Using Build Script

GLR Parsing

Examples

Cargo Features

Syntax

Contribution

License (Since 2.8.0)

About

Licenses found

Releases 28

Sponsor this project

Languages

License

Licenses found

ehwan/RustyLR

Folders and files

Latest commit

History

Repository files navigation

rusty_lr

Features

Installation

Quick Start

Using Procedural Macros

Using Build Script

GLR Parsing

Examples

Cargo Features

Syntax

Contribution

License (Since 2.8.0)

About

Topics

Resources

License

Licenses found

Stars

Watchers

Forks

Releases 28

Sponsor this project

Languages