-
Concrete Problem 1: Issue: (convert to issue after this specific problem discussion concludes)
Problem Description
The node flag
Possible Solutions
I see two solutions, in both cases we introduce a node and then throw away the flag:
One is at the child level and informs the parent, while the other is at the parent level and informs child handling. I think I prefer the second option.
-
These are some thoughts about a possible approach to handle AST (Abstract Syntax
This "lowering" concept is key as we're effectively moving from one language to
Idea Outline/Sketch
Briefly, the idea is to have a single AST/IR focusing on the sem, the parser, type
type
  # introduce a "level" concept, representing language levels, or the result of
  # a compiler phase (lowering or raising)
  TAstLevel = enum
    ## the type and level names are illustrative, need improvement
    lvlUntyped # from the parser, a template, macro, etc
    lvlNorm    # untyped but normalized
    lvlMeta    # metaprogramming annotations exist, macros/templates not applied
    lvlGnrc    # generics still exist, aren't instanced away
    lvlInst    # all generics are instanced, no generics exist
    lvlTransf  # for -> while, etc
    # ...
  TNode = object
    level*: TAstLevel # use this to indicate the level of production
Then rework sem procs to be more regular and uses the
The
Simultaneously, we can have consts and/or enums + conversion functions that
It would clarify things for everyone such as pragmas, and why/when/how to deal
Looking across various phases (sem, seminst, transf, lowerings, etc) the
A Brief Examination of Code Impact
I see compiler procs using it something like this, here are the first bits from
# original
proc semProcAux(c: PContext, n: PNode, kind: TSymKind,
                validPragmas: TSpecialWords, flags: TExprFlags = {}): PNode =
  result = semProcAnnotation(c, n, validPragmas)
  if result != nil: return result
  result = n
  checkMinSonsLen(n, bodyPos + 1, c.config)
  # ... rest of code ...

# with levels
proc semProcAux(c: PCtx, n: PNode): PNode =
  result = semProcAnnotation(c, n)        # 1
  if result != nil: return result         # 2
  result = n                              # 3
  case n.kind                             # 4
  of nkProcDef, nkFuncDef:
    checkSonsLen(n, bodyPos + 1, c.config)
  # ... rest of case and code ...
  result.level = lvlGnrc                  # 5
With levels we can drop
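As a rough illustration of the level idea, here is a minimal, self-contained sketch. It does not use the real compiler types: `AstLevel`, `Node`, `newNode`, and `lowerFor` are hypothetical stand-ins, and the node kinds are a toy subset. It only shows the mechanics being proposed, namely that a pass can state the level it expects as input and stamp the level it produces.

```nim
type
  AstLevel = enum            # hypothetical, mirrors the TAstLevel sketch
    lvlUntyped, lvlNorm, lvlMeta, lvlGnrc, lvlInst, lvlTransf
  NodeKind = enum            # toy subset, not the real TNodeKind
    nkIdent, nkForStmt, nkWhileStmt
  Node = ref object
    kind: NodeKind
    level: AstLevel
    sons: seq[Node]

proc newNode(kind: NodeKind, level: AstLevel, sons: varargs[Node]): Node =
  ## Helper for building toy trees.
  Node(kind: kind, level: level, sons: @sons)

proc lowerFor(n: Node): Node =
  ## Toy "transf" pass: accepts only instantiated input (lvlInst or later)
  ## and stamps everything it produces as lvlTransf.
  doAssert n.level >= lvlInst, "transf expects instantiated AST"
  let kind = if n.kind == nkForStmt: nkWhileStmt else: n.kind
  result = Node(kind: kind, level: lvlTransf)
  for s in n.sons:
    result.sons.add lowerFor(s)
```

Because the enum is ordered, "at least this far along the pipeline" becomes a plain comparison, which is what lets a pass check its input without consulting extra flags.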
Possible Extensions
Next Steps
Maybe some discussion here and then try out a cleaner form of the idea in a
Inspiration/Background
These ideas draw upon a lot of things, the most notable recent ones are:
-
Stemming from "how to make sem not hurt as much". Leaving it here in case someone has some insights. I'm pretty sure this might be a baking and conceptualization gap.
Context:
At the AST level, expansion/reduction, inheritance/synthesis, and production are all pretty clear. What gets fuzzy is when AST is hung off the symbol table, like in let/var/proc defs/etc part way through a semantic analysis procedure and we then do things like apply pragmas, further reason about types and the like --
Concerning part:
In both cases that symbol and its AST start diverging from the AST that's passed further down the compilation pipeline, only to be summoned at some later point, typically identifier lookup or dispatch. Additionally, we often re-sem the AST, which is in the past, aka decided, and that may result in mutations.
Questions/Line of Inquiry:
(Outlining my questions/what I'm trying to figure out, hopefully it's enough for someone to inject their own thoughts/questions that help get to ground.)
I think the big fuzzy part is that we don't have clear names and descriptions of the concepts at play wrt symbols and their AST handling.
Under the 'Concerning part' section, the first concern relates to the compiler doing some sort of symbolic evaluation (?), and some type level evaluation as well. The application of the pragmas during the analysis of vars is a must because we need to know if it's a threadvar, which lets us check for an initialization error. But that error generates an nkError, which IIRC doesn't make it back into the AST.
The second concern relates to the compiler potentially mutating what it shouldn't, but we don't have a clear way of differentiating symbol definition analysis vs AST generation.
Ultimately, I do believe the divergence is likely dangerous, but at times entirely necessary. Danger-wise it should only be tolerated within a proc's lifetime as it's accumulating results, but it shouldn't leave the proc in a diverged way (not counting helpers).
Even if I "just start" separating things into:
It clearly illustrates a big blank spot in the conceptual understanding and mental model, as indicated by the poor names.
PS: It's a somewhat similar problem for types and their AST.
-
Summary
Introducing new or changing existing (merge/split) Node kinds (eventually Symbols and Types) to simplify the compiler. Current nkError, VM, and surrounding work are driving towards a design/implementation style that strongly prefers node/type/symbol kinds that describe everything required for initial analysis dispatch. This means no if statements to cover nil checks, no random flags like the ones we have everywhere, etc. Instead, the kind alone dictates most things, and the branches for each kind are independent of each other.
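To make "the kind dictates most things" concrete, here is a small sketch contrasting a flag-based node with one where the same distinction is a kind of its own. All names here (`OldNode`, `ofSpecial`, `nkSpecial`, `upgrade`, and so on) are hypothetical and chosen only for illustration; they do not correspond to actual compiler flags or node kinds.

```nim
type
  # "Before": a single kind plus a flag each consumer must remember to test.
  OldKind = enum
    okPlain
  OldFlag = enum
    ofSpecial
  OldNode = object
    kind: OldKind
    flags: set[OldFlag]

  # "After": the flag becomes a kind of its own.
  NewKind = enum
    nkPlain, nkSpecial
  NewNode = object
    kind: NewKind

proc describeOld(n: OldNode): string =
  ## Dispatch on kind is not enough; the flag check hides inside the branch.
  case n.kind
  of okPlain:
    if ofSpecial in n.flags:
      result = "special"
    else:
      result = "plain"

proc describeNew(n: NewNode): string =
  ## The case statement alone decides, and each branch is independent.
  case n.kind
  of nkPlain: result = "plain"
  of nkSpecial: result = "special"

proc upgrade(n: OldNode): NewNode =
  ## One-time conversion; after this the flag can be thrown away.
  if ofSpecial in n.flags:
    result = NewNode(kind: nkSpecial)
  else:
    result = NewNode(kind: nkPlain)
```

In the second form, forgetting to handle `nkSpecial` is a compile-time error from the exhaustive `case`, whereas a forgotten flag check in the first form fails silently.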
The rest is background and definitions. I'll start posting concrete problems so the style can be formalized through discovery. The benefit of all this is that we end up with a better language and a nicer implementation, and it helps unblock some key pieces needed to move to DOD.
Call to Action
Background:
The work on nkError has surfaced a "case/let" style along with a traditional compiler architecture cue of separate expansion/reduction/production style (descriptions below). Together these simplify the compiler code, improve human and compiler reasoning, and highlight design issues whose solutions improve the language all the way up to the syntax (interface). Overall, I consider this a big win.
The design issues remain and we need a way to solve them. The style does need to be well defined, but that happens by working through it rather than trying to write it all out too early when we have the least amount of information (discovery, not invention). So I'll post the style below as a temporary home and then this discussion thread can talk about concrete issues and solutions.
The particular strategy I'm putting forth here is to formalize the style and catalogue and/or fix the issues, both in a co-evolving, iterative fashion. So instead of trying to solve a grand problem right off the bat, I'm going to post concrete issues under this discussion and, as they're solved, further build out a style guide (which at the moment is pointing at a proc like semBlock).
Definitions
Summary of Principles
Some key principles that these styles drive towards:
case/let style
Used to describe a programming style in the compiler where:
let
opting for comments
A clear example of this at time of writing is semBlock:
nimskull/compiler/sem/semexprs.nim, line 2941 in b4790ec
This style has a number of advantages: it's immediately clear what this proc expects, any additional control flow complexity stands out like a sore thumb, and the exhaustive nature of case statements means better reasoning at time of writing or reading, to name a few.
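For readers without semBlock open, here is a small sketch of what a proc in this style might look like. It is one reading of the style, not a copy of semBlock: the node type, the kinds, and `semExpr` are toy, hypothetical definitions, and the error handling via an `nkError` node is simplified.

```nim
type
  NodeKind = enum            # toy subset, not the real TNodeKind
    nkEmpty, nkIntLit, nkAdd, nkError
  Node = ref object
    case kind: NodeKind
    of nkEmpty: discard
    of nkIntLit: intVal: int
    of nkAdd: lhs, rhs: Node
    of nkError: msg: string

proc semExpr(n: Node): Node =
  ## Hypothetical analysis proc: an exhaustive case over the kind,
  ## with intermediate results bound once via let.
  case n.kind
  of nkIntLit, nkError:
    # nothing to analyse, pass through unchanged
    result = n
  of nkEmpty:
    result = Node(kind: nkError, msg: "expression expected")
  of nkAdd:
    let
      lhs = semExpr(n.lhs)
      rhs = semExpr(n.rhs)
    # an error in a child becomes the production, no exceptions or flags
    if lhs.kind == nkError:
      result = lhs
    elif rhs.kind == nkError:
      result = rhs
    else:
      result = Node(kind: nkAdd, lhs: lhs, rhs: rhs)
```

The `case` has no `else` branch on purpose: adding a new kind forces every such proc to say what it does with it.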
Expansion/Reduction/Production Style
Compiler architecture suggests structuring semantic analysis in the following way:
1. sem1 is called on node A; we are in the expansion phase
2. sem1 receives the node as primary input along with the environment(s) for querying
3. sem1's caller might have derived some facts (attributes) and these are considered inherited
4. sem1 can do some early synthesis of attributes and/or queries, but the children are not yet analysed so this is usually minimal
5. sem1 analyses any of its children via analysis sub-routines, which again inherit attributes
6. sem1 may interleave the completion of child analysis sub-routines with additional analysis, as it moves into the reduction phases
7. sem1 finalizes the production (returned node) with all attribute data, which sem1's caller will inherit
The influence of this is far more subtle today, but the let usage and the way result is used should hint at the influences.
reference: https://www.tutorialspoint.com/compiler_design/compiler_design_semantic_analysis.htm
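The phases above can be sketched in a few lines. This is a toy under stated assumptions, not compiler code: `Node`, `Env`, `TypeKind`, and the `expected` parameter are hypothetical, with `expected` standing in for an inherited attribute and `typ` for a synthesized one.

```nim
import std/tables

type
  TypeKind = enum
    tyUnknown, tyInt, tyBool
  NodeKind = enum
    nkIntLit, nkIdent, nkAdd
  Node = ref object
    kind: NodeKind
    name: string             # used by nkIdent only
    sons: seq[Node]
    typ: TypeKind            # synthesized attribute, set during analysis
  Env = Table[string, TypeKind]

proc sem1(env: Env, n: Node, expected: TypeKind): Node =
  ## expansion: n is the primary input, env is queried, and expected is
  ## an attribute inherited from the caller (tyUnknown = no expectation)
  case n.kind
  of nkIntLit:
    result = Node(kind: nkIntLit, typ: tyInt)
  of nkIdent:
    result = Node(kind: nkIdent, name: n.name,
                  typ: env.getOrDefault(n.name, tyUnknown))
  of nkAdd:
    result = Node(kind: nkAdd)
    for child in n.sons:
      # children are analysed by sub-routines and inherit in turn
      result.sons.add sem1(env, child, tyInt)
    # reduction: synthesize this node's type from the analysed children
    var allInt = true
    for s in result.sons:
      if s.typ != tyInt: allInt = false
    result.typ = if allInt: tyInt else: tyUnknown
  # production: finalize attributes the caller will see
  if expected != tyUnknown and result.typ != expected:
    result.typ = tyUnknown
```

Note that the input node is never mutated; each call returns a fresh production, which keeps "what was decided" separate from "what was passed in".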