Add module loading support #49

Merged: 53 commits merged into main on Dec 18, 2023
@jeaye jeaye commented Dec 16, 2023

jeaye added 30 commits September 9, 2023 11:08
This isn't loading anything yet, but it's traversing the class path to
scan directories and JARs so it can register which files belong to which
classes.

Previously, jank only supported array maps, which had a max size of 8
k/v pairs. Now those get promoted to a hash map. Each of these has a
distinct object type and sequence object type, but they share a good
chunk of the same implementation through base types.
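
As a rough illustration of the promotion described above, here is a minimal sketch. It is not jank's implementation: `promoting_map`, `max_array_size`, and the choice to promote when a ninth distinct key arrives are all assumptions made for the example, and it uses plain `std::string` keys and `int` values rather than jank objects.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch; none of these names come from jank itself.
constexpr std::size_t max_array_size = 8;

struct promoting_map
{
  // Small maps live in a flat array of pairs; larger ones in a hash map.
  std::vector<std::pair<std::string, int>> array;
  std::unordered_map<std::string, int> hash;
  bool promoted = false;

  std::size_t size() const
  { return promoted ? hash.size() : array.size(); }

  void insert_or_assign(std::string const &key, int const value)
  {
    if(promoted)
    {
      hash[key] = value;
      return;
    }

    // Linear scan: cheap for a handful of entries.
    for(auto &kv : array)
    {
      if(kv.first == key)
      {
        kv.second = value;
        return;
      }
    }

    if(array.size() < max_array_size)
    {
      array.emplace_back(key, value);
      return;
    }

    // Full array and a new key: move everything into the hash map.
    hash.insert(array.begin(), array.end());
    hash[key] = value;
    array.clear();
    promoted = true;
  }

  // Returns a pointer to the value, or nullptr if the key is absent.
  int const *find(std::string const &key) const
  {
    if(promoted)
    {
      auto const it = hash.find(key);
      return it == hash.end() ? nullptr : &it->second;
    }
    for(auto const &kv : array)
    {
      if(kv.first == key)
      { return &kv.second; }
    }
    return nullptr;
  }
};
```

Updating an existing key never triggers promotion in this sketch; only a new key arriving at a full array does.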

Since there are multiple types of map now, meta maps had to be
type-erased; we can't know which kind of map it will be.

Also, a nasty bug involving incorrect indices was lingering in the
native_array_map implementation. That was found and fixed.

This is distinct from option<E> in that the value is clearly an error.

This involves compiling namespaces to .cpp files within the class path
and then loading them up again. Namespaces are written to their own
file, with a special `__init` file also written. The `__init` file needs
to be loaded prior to the namespace file being loaded, since it contains
all of the dependencies. Any functions created within the namespace are
nested modules, using the same `foo.bar$spam` syntax as the JVM. They're
all written to individual files, same as with Clojure, and they're
loaded using the `__init` module.
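
To make the naming and load order concrete, here is a small sketch. Every helper name is hypothetical, and both the `foo.bar__init` spelling and the on-disk layout are assumptions borrowed from how Clojure names its `__init` classes, not something taken from jank's source.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helpers; names and the exact file layout are assumptions.

// "foo.bar" + "spam" -> "foo.bar$spam", the nested module for a fn.
std::string nested_module(std::string const &ns, std::string const &fn_name)
{ return ns + "$" + fn_name; }

// Assumed naming for the special init module of a namespace.
std::string init_module(std::string const &ns)
{ return ns + "__init"; }

// "foo.bar$spam" -> "foo/bar$spam.cpp", relative to a class path entry.
std::string module_to_path(std::string module)
{
  std::replace(module.begin(), module.end(), '.', '/');
  return module + ".cpp";
}

// The init file comes first, since it pulls in the dependencies.
std::vector<std::string> load_order(std::string const &ns)
{ return { module_to_path(init_module(ns)), module_to_path(ns) }; }
```

With these assumptions, `load_order("foo.bar")` yields `foo/bar__init.cpp` followed by `foo/bar.cpp`.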

Having this allows us to load `clojure.core` from pre-compiled .cpp
files, rather than from jank source. The startup time of the compiler,
when loading `clojure.core` this way, drops from 8.7s to 3.7s, which
means it's more than twice as fast now. We can drop this further by
compiling the .cpp files to LLVM IR files, or, even further, to object
files or pre-compiled modules (PCMs). The framework for this is present,
but Cling doesn't actually have an interface to load either of those, so
some more intricate work will be needed. I'm going to stick with the
.cpp files for now and flesh out the rest of the module loading. This
bout of work is not primarily focused on performance gains.

The command-line parsing applies to all existing features and some not
yet implemented. The RT profiling will be expanded with some additional
tooling, going forward, and I have some scripts already done which
aren't quite ready to commit. We'll be able to easily build charts by
just adding `--profile` to the jank invocation and then running a script
on the generated file. The goal is for this to apply to both compiler
and RT internals, as well as user code. We'll see, once there's actually
user code.

Part of this CLI parsing is the addition of a very simple REPL, coming
from `jank repl`. There's no readline, history, completion, coloring, or
server support yet, but it allows for some quick jankin' around.

This has per-session history support, searching with ^R, and
modifications with chords like ^W. History is not yet persistent to
disk. It also doesn't yet support multi-line inputs.

There was an issue when loading pre-compiled (.cpp or otherwise)
modules, where we'd try to use analysis information on fns within that
module to determine if we can unbox values. However, since we didn't
analyze the jank code for that module in this session, we don't have
that info. Instead, we should just rely on the meta and the var in that
scenario. When we do have the analysis info, though, we make sure to
verify what we can.

This will help speed up the tests and ensure that jank is ready to use a
pre-compiled clojure.core right after being built. Changing any of the
jank source will trigger a rebuild, as it should.

I'm not doing anything different with it, but I've never liked having a
header-only lib provide my main for me using a custom define. Call me
old fashioned.

This includes the `alias` fn as well as support for resolving aliases
for both keywords and symbols during parsing. Fortunately, this allowed
us to remove some state from keyword objects, too.

This also fixes a pretty awful bug in the highest arity of
`dynamic_call`, which didn't copy all parameters properly.

The Clojure versions are lazy, using a custom seq, but these work for
now.

It's missing a couple of bits, like `:rename` and `:exclude`, and I
haven't actually been able to test it yet, due to a bug with variadic
arg positions. Will come back to this.

When dynamically calling a function, we need to know three things:

1. Is the function variadic?
2. Is there an ambiguous fixed overload?
3. How many fixed arguments are required before the packed args?

We cannot perform the correct call without all of this information. Since function calls
are on the hottest path there is, we pack all of this into a single byte. Questions
1 and 2 each get a bit and question 3 gets 4 bits to store the fixed arg count.

From there, when we use it, we strip out the bit for question 2 and we switch/case on
the rest. This allows us to do an O(1) jump on the combination of whether it's variadic
and the required fixed args. Finally, we only need the question 2 bit to disambiguate
one branch of each switch.

The ambiguity comes in this case:

```
(defn ambiguous
  ([a] 1)
  ([a & args] args))
(ambiguous :a)
```

When we call `ambiguous` with a single arg, we want it to match the fixed unary arity.
However, given just questions 1 and 3, we will see that we've met the required args
and that the function is variadic and we'll instead dispatch to the variadic arity, with
an empty sequence for `args`.
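
The packing and the one ambiguous branch can be sketched as follows. This is only an illustration under assumptions: the bit positions, `arity_flag_t`, `build_arity_flags`, `strip_ambiguous`, and `dispatch_one_arg` are hypothetical names chosen here, and the switch covers only a call that passes exactly one argument, which is the case from the `ambiguous` example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: bit 7 = variadic, bit 6 = ambiguous fixed overload,
// low 4 bits = number of fixed args required before the packed args.
using arity_flag_t = std::uint8_t;

constexpr arity_flag_t variadic_bit = 0x80;
constexpr arity_flag_t ambiguous_bit = 0x40;
constexpr arity_flag_t fixed_mask = 0x0F;

inline arity_flag_t build_arity_flags(std::uint8_t const fixed_args,
                                      bool const is_variadic,
                                      bool const is_ambiguous)
{
  // Only 4 bits are available for the fixed arg count.
  assert(fixed_args <= fixed_mask);
  return static_cast<arity_flag_t>((is_variadic ? variadic_bit : 0)
                                   | (is_ambiguous ? ambiguous_bit : 0)
                                   | fixed_args);
}

inline std::uint8_t fixed_arg_count(arity_flag_t const flags)
{ return static_cast<std::uint8_t>(flags & fixed_mask); }

// Drop the answer to question 2 so the switch only sees questions 1 and 3.
inline arity_flag_t strip_ambiguous(arity_flag_t const flags)
{ return static_cast<arity_flag_t>(flags & ~ambiguous_bit); }

enum class dispatch
{
  fixed_arity,
  variadic_arity
};

// Which arity should handle a call that passes exactly one argument?
inline dispatch dispatch_one_arg(arity_flag_t const flags)
{
  switch(strip_ambiguous(flags))
  {
    // e.g. [& args]: the single arg is packed into the rest seq.
    case variadic_bit | 0:
      return dispatch::variadic_arity;
    // e.g. [a & args]: the only branch that needs the ambiguity bit.
    case variadic_bit | 1:
      return (flags & ambiguous_bit) ? dispatch::fixed_arity
                                     : dispatch::variadic_arity;
    // Not variadic, or more fixed args are required than were given:
    // fall back to the fixed unary arity, if there is one.
    default:
      return dispatch::fixed_arity;
  }
}
```

For the `ambiguous` fn above, the flags would be built with one fixed arg and both bits set, so the one-arg call lands on the fixed unary arity instead of the variadic one.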
jeaye added 23 commits December 2, 2023 11:48
This is important for referring vars within another ns.

This was affecting macros like `defmacro`, which were doing a `cons`
onto some packed args.

Yay, more macros. Things start to feel more like a proper Clojure.

This also implements the `cons` concept for sets and fixes an issue with
their equality comparator. Sets actually do things now.

This also fixes an off-by-one in sequence length counting.

WIP, still have borked things with `complement`.

There's more to do here, for vars and constants, but that's just a
performance/memory usage win. This was a blocker.

jank is starting to feel like a proper Clojure now. :)
@jeaye jeaye merged commit 6119929 into main Dec 18, 2023
2 of 6 checks passed