Declarative Syntax Specification
To print and parse the IR, pliron provides the `Printable` and `Parsable` traits, which can be implemented on `Op`, `Type` and `Attribute` objects, and even on general Rust types.
However, manually implementing these traits can be tedious and cumbersome. MLIR solves this problem via ODS (Operation Definition Specification), which allows specification in a table-driven manner using LLVM's TableGen.
Rather than implementing a separate tool, pliron chooses to take advantage of Rust's powerful procedural macros. This article describes the current capabilities for deriving the format of different IR entities via the `#[format]` macro.
Assuming that every field of a struct (or a tuple struct) already implements `Printable` and `Parsable`, the `#[format]` macro can be used on a Rust struct to automatically derive `Printable` and `Parsable` for it.
#[format]
struct U64Wrapper {
    a: u64
}
For an instance of this with `a: 42`, the derived (generated) printer prints `{a=42}`.
The generated printer and parser can be inspected using cargo-expand. Just this once, I provide the expansions below:
Generated `Printable` and `Parsable` for `U64Wrapper`
impl ::pliron::printable::Printable for U64Wrapper {
    fn fmt(
        &self,
        ctx: &::pliron::context::Context,
        state: &::pliron::printable::State,
        fmt: &mut ::std::fmt::Formatter<'_>,
    ) -> ::std::fmt::Result {
        ::pliron::printable::Printable::fmt(&"{", ctx, state, fmt)?;
        ::pliron::printable::Printable::fmt(&"a", ctx, state, fmt)?;
        ::pliron::printable::Printable::fmt(&"=", ctx, state, fmt)?;
        ::pliron::printable::Printable::fmt(&self.a, ctx, state, fmt)?;
        ::pliron::printable::Printable::fmt(&"}", ctx, state, fmt)?;
        Ok(())
    }
}
impl ::pliron::parsable::Parsable for U64Wrapper {
    type Arg = ();
    type Parsed = Self;
    fn parse<'a>(
        state_stream: &mut ::pliron::parsable::StateStream<'a>,
        arg: Self::Arg,
    ) -> ::pliron::parsable::ParseResult<'a, Self::Parsed> {
        use ::pliron::parsable::IntoParseResult;
        use ::combine::Parser;
        use ::pliron::input_err;
        use ::pliron::location::Located;
        let cur_loc = state_stream.loc();
        ::pliron::irfmt::parsers::spaced(::combine::parser::char::string("{"))
            .parse_stream(state_stream)
            .into_result()?;
        ::pliron::irfmt::parsers::spaced(::combine::parser::char::string("a"))
            .parse_stream(state_stream)
            .into_result()?;
        ::pliron::irfmt::parsers::spaced(::combine::parser::char::string("="))
            .parse_stream(state_stream)
            .into_result()?;
        let a = <u64>::parser(()).parse_stream(state_stream).into_result()?.0;
        ::pliron::irfmt::parsers::spaced(::combine::parser::char::string("}"))
            .parse_stream(state_stream)
            .into_result()?;
        let final_ret_value = U64Wrapper { a };
        Ok(final_ret_value).into_parse_result()
    }
}
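To see the same print/parse sequence without any pliron machinery, here is a minimal sketch using only the Rust standard library. It is not what the macro generates: `print_wrapper`, `expect` and `parse_wrapper` are hypothetical helpers, and the whitespace skipping in `expect` is only an assumed approximation of what `spaced` does.

```rust
/// Stand-in for the article's `U64Wrapper`; no pliron types are used here.
#[derive(Debug, PartialEq)]
struct U64Wrapper {
    a: u64,
}

/// Print in the same order as the derived printer: `{`, `a`, `=`, value, `}`.
fn print_wrapper(w: &U64Wrapper) -> String {
    format!("{{a={}}}", w.a)
}

/// Consume `tok` from the front of `input`, skipping leading whitespace
/// (an assumed, rough analogue of wrapping a literal parser in `spaced`).
fn expect<'a>(input: &'a str, tok: &str) -> Result<&'a str, String> {
    let trimmed = input.trim_start();
    trimmed
        .strip_prefix(tok)
        .ok_or_else(|| format!("expected `{tok}` at `{trimmed}`"))
}

/// Parse the text produced by `print_wrapper`, one token at a time.
fn parse_wrapper(input: &str) -> Result<U64Wrapper, String> {
    let rest = expect(input, "{")?;
    let rest = expect(rest, "a")?;
    let rest = expect(rest, "=")?;
    let rest = rest.trim_start();
    // Take the longest run of ASCII digits as the `u64` field.
    let end = rest.find(|c: char| !c.is_ascii_digit()).unwrap_or(rest.len());
    let a = rest[..end].parse::<u64>().map_err(|e| e.to_string())?;
    expect(&rest[end..], "}")?;
    Ok(U64Wrapper { a })
}
```

Calling `parse_wrapper(&print_wrapper(&w))` gives back a value equal to `w`, which is the round-trip property one would expect from a derived printer and parser pair.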
The `#[format]` macro takes a string argument to specify a custom syntax.
- A named variable `$name` specifies a named struct field.
- An unnamed variable `$i` specifies the i'th field of a tuple struct.
- Literals are enclosed in backticks (`` ` ``).
Example:
#[format("$upper `/` $lower")]
struct IntDiv {
    upper: u64,
    lower: u64,
}
An instance of this with `upper = 42` and `lower = 7` prints `42/7`.
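To make the three kinds of format-string elements concrete, the following is a small standard-library sketch that splits a format string into literals, named variables and unnamed variables. It is an illustration only and not pliron's actual parser: `FmtElem` and `tokenize` are hypothetical names, and directives are deliberately left out.

```rust
/// One element of a format string, following the rules listed earlier.
#[derive(Debug, PartialEq)]
enum FmtElem {
    /// Text between backticks, printed and parsed verbatim.
    Literal(String),
    /// `$name`: a named struct field.
    Named(String),
    /// `$i`: the i'th field of a tuple struct.
    Unnamed(usize),
}

/// Split a format string into its elements. Directives such as `opt(...)`
/// are not handled in this sketch and are reported as errors.
fn tokenize(spec: &str) -> Result<Vec<FmtElem>, String> {
    let mut elems = Vec::new();
    let mut chars = spec.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c == '`' {
            chars.next(); // opening backtick
            let mut lit = String::new();
            loop {
                match chars.next() {
                    Some('`') => break,
                    Some(ch) => lit.push(ch),
                    None => return Err("unterminated literal".to_string()),
                }
            }
            elems.push(FmtElem::Literal(lit));
        } else if c == '$' {
            chars.next(); // the `$` sigil
            let mut name = String::new();
            while let Some(&ch) = chars.peek() {
                if ch.is_alphanumeric() || ch == '_' {
                    name.push(ch);
                    chars.next();
                } else {
                    break;
                }
            }
            if name.is_empty() {
                return Err("`$` must be followed by a name or index".to_string());
            }
            // A name that parses as a number is an unnamed (positional) variable.
            match name.parse::<usize>() {
                Ok(i) => elems.push(FmtElem::Unnamed(i)),
                Err(_) => elems.push(FmtElem::Named(name)),
            }
        } else {
            return Err(format!("unexpected character `{c}`"));
        }
    }
    Ok(elems)
}
```

For the `IntDiv` format string, `tokenize` yields the named variable `upper`, the literal `/`, and the named variable `lower`, in that order.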
Struct (or tuple) fields that are `Option` or `Vec` types, with the inner type already implementing `Printable` and `Parsable`, can have their syntax auto-derived using the `opt` and `vec` directives.
For example:
#[format("`<` opt($a) `;` vec($b, Char(`,`)) `>`")]
struct OptAndVec {
    a: Option<u64>,
    b: Vec<u64>,
}
This prints `<42;1,2,3>` when `a` is `Some(42)` and `b` is `vec![1, 2, 3]`.
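The printing side of this format can be mimicked with the standard library alone, as in the sketch below. It does not use pliron, `print_opt_and_vec` is a hypothetical helper, and printing nothing for a `None` value is an assumption of this sketch rather than documented behaviour.

```rust
/// Stand-in for the article's `OptAndVec`; no pliron types are used here.
struct OptAndVec {
    a: Option<u64>,
    b: Vec<u64>,
}

/// Mimic the format string "`<` opt($a) `;` vec($b, Char(`,`)) `>`".
fn print_opt_and_vec(v: &OptAndVec) -> String {
    // Assumption: an absent optional value prints as nothing.
    let a = v.a.map(|x| x.to_string()).unwrap_or_default();
    // Elements are joined with the `,` separator named in the `vec` directive.
    let b = v
        .b
        .iter()
        .map(|x| x.to_string())
        .collect::<Vec<_>>()
        .join(",");
    format!("<{a};{b}>")
}
```

With `a: Some(42)` and `b: vec![1, 2, 3]` this returns `<42;1,2,3>`, matching the output quoted above.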
The `#[format]` macro can derive `Printable` and `Parsable` for `enum`s as well, as long as all sub-elements have both `Printable` and `Parsable` derived.
For `enum`s, the `#[format]` macro does not take a custom syntax specification argument, although another `#[format]`, with a custom format string, can be specified for its individual variants.
#[format]
enum Enum {
    A(TypePtr<IntegerType>),
    B { one: TypePtr<IntegerType>, two: IntWrapper },
    C,
    #[format("`<` $upper `/` $lower `>`")]
    Op {
        upper: u64,
        lower: u64,
    },
}
The printed values for each variant look as below:
A(builtin.int <si64>)
B{one=builtin.int <si64>,two={inner=builtin.int <si64>}}
C
Op<42/7>
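The pattern visible in this output is that the variant name comes first, followed by the variant's fields in the default or custom format. A minimal sketch of that pattern follows, using a hypothetical `Shape` enum with plain `u64` fields and `std::fmt::Display` in place of pliron's `Printable`; it only reproduces the shape of the output and is not the derived code.

```rust
use std::fmt;

/// Hypothetical enum with plain `u64` fields standing in for the IR types
/// used in the article's `Enum`.
enum Shape {
    A(u64),
    B { one: u64, two: u64 },
    C,
    /// Mirrors the variant carrying a custom "`<` $upper `/` $lower `>`" format.
    Op { upper: u64, lower: u64 },
}

impl fmt::Display for Shape {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Tuple variant: name, then the field in parentheses.
            Shape::A(x) => write!(f, "A({x})"),
            // Struct variant: name, then `field=value` pairs in braces.
            Shape::B { one, two } => write!(f, "B{{one={one},two={two}}}"),
            // Unit variant: just the name.
            Shape::C => write!(f, "C"),
            // Custom format: name, then the variant's own syntax.
            Shape::Op { upper, lower } => write!(f, "Op<{upper}/{lower}>"),
        }
    }
}
```

Because the variant name is always printed first, a parser can read the name and then dispatch to the matching variant's field syntax.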
Since pliron's `Attribute`s and `Type`s are Rust types (`struct`s and `enum`s) implementing their respective traits, specifying a format for these is the same as that for general Rust types described above, except that the `format_attribute` and `format_type` macros are used instead.
Examples: an attribute and a type, with the format specified.
#[def_attribute("test.my_attr")]
#[format_attribute("`<` $ty `>`")]
#[derive(PartialEq, Clone, Debug)]
struct MyAttr {
    ty: Ptr<TypeObj>,
}
impl_verify_succ!(MyAttr);
and
#[def_type("llvm.array")]
#[derive(Hash, PartialEq, Eq, Debug)]
#[format_type("`[` $size `x` $elem `]`")]
pub struct ArrayType {
    elem: Ptr<TypeObj>,
    size: u64,
}
`Op`s are Rust `struct`s with just one field, a `Ptr` to the underlying `Operation`. The semantics of an `Op` are instead based on the underlying `Operation`'s result types, operands, regions and attributes, so the custom syntax rules for `Op`s are different.
Only syntaxes in which the results appear before the opid are supported:
res1, ... = opid ...
The format string specifies what comes after the opid.
- A named variable `$name` specifies a named attribute of the operation.
- An unnamed variable `$i` specifies `operands[i]`, except when inside some directives.
- The "type" directive specifies that a type must be parsed. It takes one argument, which is an unnamed variable `$i` with `i` specifying `result[i]`.
- The "region" directive specifies that a region must be parsed. It takes one argument, which is an unnamed variable `$i` with `i` specifying `region[i]`.
- The "attr" directive can be used to specify an attribute on an `Op` when the attribute's Rust type is fixed at compile time. It takes two arguments: the first is the attribute name and the second is the concrete Rust type of the attribute. This second argument can be a named variable `$Name` (with `Name` being in scope) or a literal string denoting the path to a Rust type (e.g. `::pliron::builtin::attributes::IntegerAttr`). The advantage over specifying the attribute as a named variable is that the attribute-id is not a part of the syntax here, allowing it to be more succinct.
Examples:
#[format_op("`:` type($0)")]
#[def_op("test.one_result_zero_operands")]
#[derive_op_interface_impl(ZeroOpdInterface, OneResultInterface)]
struct OneResultZeroOperandsOp {}
impl_verify_succ!(OneResultZeroOperandsOp);
This looks like `res0 = test.one_result_zero_operands : builtin.int <si64>;`
#[format_op("$0 `:` type($0)")]
#[def_op("test.one_result_one_operand")]
#[derive_op_interface_impl(OneOpdInterface, OneResultInterface)]
struct OneResultOneOperandOp {}
impl_verify_succ!(OneResultOneOperandOp);
This looks like `res1 = test.one_result_one_operand res0 : builtin.int <si64>;`
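How the two format strings above map onto the printed text can be sketched with a toy model. Everything here is hypothetical and standard-library only: `OpView`, `Elem` and `render` are illustrative names, pliron's real `Operation` is not modelled, and the single-space separation is an assumption chosen to match the two printed examples.

```rust
/// Hypothetical, simplified view of an operation: just the strings needed
/// for printing.
struct OpView {
    results: Vec<String>,
    opid: String,
    operands: Vec<String>,
    result_types: Vec<String>,
}

/// The subset of format elements used by the two examples above.
enum Elem {
    Literal(&'static str),
    /// `$i`: operand `i`.
    Operand(usize),
    /// `type($i)`: the type of result `i`.
    ResultType(usize),
}

/// Print `res, ... = opid <elements>;`, separating pieces with single spaces.
/// Panics if an element refers to an operand or result that does not exist.
fn render(op: &OpView, spec: &[Elem]) -> String {
    let mut parts = Vec::new();
    if !op.results.is_empty() {
        parts.push(op.results.join(", "));
        parts.push("=".to_string());
    }
    parts.push(op.opid.clone());
    for elem in spec {
        parts.push(match elem {
            Elem::Literal(s) => s.to_string(),
            Elem::Operand(i) => op.operands[*i].clone(),
            Elem::ResultType(i) => op.result_types[*i].clone(),
        });
    }
    format!("{};", parts.join(" "))
}
```

In this model, the format string "$0 `:` type($0)" corresponds to the slice `[Elem::Operand(0), Elem::Literal(":"), Elem::ResultType(0)]`. The same index `0` refers to an operand in the first position and to a result inside the "type" directive.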