Add module loading support #49
Merged
This isn't loading anything yet, but it's traversing the class path to scan directories and JARs so it can register which files belong to which classes.
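As a rough illustration of the directory half of that scan, here is a minimal sketch using `std::filesystem`. The names `path_to_module` and `scan_directory`, the `.jank`/`.cpp` extension filter, and the path-to-dotted-name mapping are all assumptions for illustration, not jank's actual code; JAR handling is left out entirely.

```cpp
#include <algorithm>
#include <filesystem>
#include <map>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: map a path relative to a class path entry,
// such as "foo/bar.jank", to a dotted module name, such as "foo.bar".
std::string path_to_module(fs::path const &relative)
{
  fs::path stem{ relative };
  stem.replace_extension(); // drop ".jank" / ".cpp"
  std::string name{ stem.generic_string() }; // always uses '/' separators
  std::replace(name.begin(), name.end(), '/', '.');
  return name;
}

// Hypothetical helper: walk one class path directory and record which
// file backs which module. If two files map to the same module name,
// whichever is visited last wins; iteration order is unspecified.
std::map<std::string, fs::path> scan_directory(fs::path const &root)
{
  std::map<std::string, fs::path> modules;
  if(!fs::is_directory(root))
  { return modules; }

  for(auto const &entry : fs::recursive_directory_iterator{ root })
  {
    if(!entry.is_regular_file())
    { continue; }

    auto const ext(entry.path().extension());
    if(ext == ".jank" || ext == ".cpp")
    { modules[path_to_module(fs::relative(entry.path(), root))] = entry.path(); }
  }
  return modules;
}
```

Calling `scan_directory` once per class path entry and merging the results would give a single registry to consult when a module is later requested.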
Previously, jank only supported array maps, which had a max size of 8 k/v pairs. Now those get promoted to a hash map. Each of these has a distinct object type and sequence object type, but they share a good chunk of the same implementation through base types. Since there are multiple types of map now, meta maps had to be type-erased; we can't know which kind of map it will be. Also, a nasty bug was lingering in the native_array_map implementation, to do with incorrect indices. That was found and fixed.
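The promotion step can be sketched with standard containers. This is only an illustration under assumptions: `array_map`, `hash_map`, `any_map`, and `assoc` here are stand-ins built on `std::vector`, `std::unordered_map`, and `std::variant` (C++17), not jank's own object types, and they use string keys and int values for brevity.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>
#include <vector>

// The array map holds at most 8 k/v pairs, per the description above.
constexpr std::size_t max_array_map_size{ 8 };

// Stand-in types for illustration only.
using array_map = std::vector<std::pair<std::string, int>>;
using hash_map = std::unordered_map<std::string, int>;
using any_map = std::variant<array_map, hash_map>;

// Returns a map with key associated to value. The input is taken by
// value, so the caller's copy is untouched unless it is moved in.
any_map assoc(any_map m, std::string const &key, int const value)
{
  if(auto * const hm = std::get_if<hash_map>(&m))
  {
    (*hm)[key] = value;
    return m;
  }

  auto &am(std::get<array_map>(m));
  for(auto &kv : am)
  {
    if(kv.first == key)
    {
      // Existing key: overwrite in place, no growth.
      kv.second = value;
      return m;
    }
  }

  if(am.size() < max_array_map_size)
  {
    am.emplace_back(key, value);
    return m;
  }

  // A ninth distinct key: promote the array map to a hash map.
  hash_map promoted(am.begin(), am.end());
  promoted[key] = value;
  return promoted;
}
```

Because callers only ever see `any_map`, they cannot assume which alternative they hold, which is the same reason the meta maps had to be type-erased.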
This is distinct from option<E> in that the value is clearly an error.
This involves compiling namespaces to .cpp files within the class path and then loading them up again. Namespaces are written to their own file, with a special `__init` file also written. The `__init` file needs to be loaded prior to the namespace file being loaded, since it contains all of the dependencies. Any functions created within the namespace are nested modules, using the same `foo.bar$spam` syntax as the JVM. They're all written to individual files, same as with Clojure, and they're loaded using the `__init` module. Having this allows us to load `clojure.core` from pre-compiled .cpp files, rather than from jank source. The startup time of the compiler, when loading `clojure.core` this way, drops from 8.7s to 3.7s, which means it's more than twice as fast now. We can drop this further by compiling the .cpp files to LLVM IR files, or, even further, to object files or pre-compiled modules (PCMs). The framework for this is present, but Cling doesn't actually have an interface to load either of those, so some more intricate work will be needed. I'm going to stick with the .cpp files for now and flesh out the rest of the module loading. This bout of work is not primarily focused on performance gains.
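To make the naming and ordering concrete, here is a small sketch of how module names might be taken apart and ordered for loading. The helper names and the exact spelling of the init module (the namespace name with `__init` appended) are assumptions for illustration; the only parts taken from the description above are the `foo.bar$spam` nesting syntax and the rule that `__init` loads before the namespace itself.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: a nested module, such as "foo.bar$spam", carries
// a '$' separating the namespace from the function's name.
bool is_nested_module(std::string_view const module)
{ return module.find('$') != std::string_view::npos; }

// Hypothetical helper: the namespace part of a module name, which is
// everything before the first '$' (or the whole name if there is none).
std::string module_to_ns(std::string_view const module)
{ return std::string{ module.substr(0, module.find('$')) }; }

// Assumed load order for a namespace: its __init module first, since it
// brings in the dependencies, then the namespace module itself. The
// "<ns>__init" spelling is an assumption.
std::vector<std::string> load_order(std::string_view const ns)
{ return { std::string{ ns } + "__init", std::string{ ns } }; }
```

For `clojure.core`, `load_order` would yield the init module followed by `clojure.core`, mirroring the dependency rule described above.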
The command-line parsing applies to all existing features and some not yet implemented. The RT profiling will be expanded with some additional tooling, going forward, and I have some scripts already done which aren't quite ready to commit. We'll be able to easily build charts by just adding `--profile` to the jank invocation and then running a script on the generated file. The goal is for this to apply to both compiler and RT internals, as well as user code. We'll see, once there's actually user code. Part of this CLI parsing is the addition of a very simple REPL, coming from `jank repl`. There's no readline, history, completion, coloring, or server support yet, but it allows for some quick jankin' around.
This has per-session history support, searching with ^R, and modifications with chords like ^W. History is not yet persistent to disk. It also doesn't yet support multi-line inputs.
There was an issue when loading pre-compiled (.cpp or otherwise) modules, where we'd try to use analysis information on fns within that module to determine if we can unbox values. However, since we didn't analyze the jank code for that module in this session, we don't have that info. Instead, we should just rely on the meta and the var in that scenario. When we do have the analysis info, though, we make sure to verify what we can.
This will help speed up the tests and ensure that jank is ready to use a pre-compiled clojure.core right after being built. Changing any of the jank source will trigger a rebuild, as it should.
I'm not doing anything different with it, but I've never liked having a header-only lib provide my main for me using a custom define. Call me old fashioned.
This includes the `alias` fn as well as support for resolving aliases for both keywords and symbols during parsing. Fortunately, this allowed us to remove some state from keyword objects, too.
This also fixes a pretty awful bug in the highest arity of `dynamic_call`, which didn't copy all parameters properly.
The Clojure versions are lazy, using a custom seq, but these work for now.
It's missing a couple of bits, like `:rename` and `:exclude`, and I haven't actually been able to test it yet, due to a bug with variadic arg positions. Will come back to this.
When dynamically calling a function, we need to know three things:

1. Is the function variadic?
2. Is there an ambiguous fixed overload?
3. How many fixed arguments are required before the packed args?

We cannot perform the correct call without all of this information. Since function calls are on the hottest path there is, we pack all of this into a single byte. Questions 1 and 2 each get a bit and question 3 gets 4 bits to store the fixed arg count.

From there, when we use it, we strip out the bit for question 2 and we switch/case on the rest. This allows us to do an O(1) jump on the combination of whether it's variadic and the required fixed args. Finally, we only need the question 2 bit to disambiguate one branch of each switch.

The ambiguity comes in this case:

```
(defn ambiguous
  ([a] 1)
  ([a & args] args))
(ambiguous :a)
```

When we call `ambiguous` with a single arg, we want it to match the fixed unary arity. However, given just questions 1 and 3, we will see that we've met the required args and that the function is variadic and we'll instead dispatch to the variadic arity, with an empty sequence for `args`.
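A minimal sketch of that packing, for the one-argument call case only. The bit positions, the names `pack` and `dispatch_unary`, and the `dispatch` enum are assumptions chosen for illustration; the description above fixes only the sizes (one bit, one bit, four bits), not the layout.

```cpp
#include <cstdint>

using arity_flag_t = std::uint8_t;

// Assumed layout: bit 7 = variadic, bit 6 = ambiguous fixed overload,
// low 4 bits = fixed args required before the packed args.
constexpr arity_flag_t variadic_bit{ 0b1000'0000 };
constexpr arity_flag_t ambiguous_bit{ 0b0100'0000 };
constexpr arity_flag_t fixed_mask{ 0b0000'1111 };

constexpr arity_flag_t pack(bool const variadic, bool const ambiguous, std::uint8_t const fixed)
{
  return static_cast<arity_flag_t>((variadic ? variadic_bit : 0)
                                   | (ambiguous ? ambiguous_bit : 0)
                                   | (fixed & fixed_mask));
}

enum class dispatch
{
  fixed,
  variadic
};

// Decide which arity a call with exactly one argument should hit.
constexpr dispatch dispatch_unary(arity_flag_t const flags)
{
  // Strip the ambiguity bit, then jump on variadic + fixed arg count.
  switch(flags & ~ambiguous_bit)
  {
    // Variadic with no required args: the one arg gets packed.
    case pack(true, false, 0):
      return dispatch::variadic;
    // Variadic with one required arg: the only branch that needs the
    // ambiguity bit. If a fixed unary overload exists, prefer it;
    // otherwise call the variadic arity with an empty rest seq.
    case pack(true, false, 1):
      return (flags & ambiguous_bit) ? dispatch::fixed : dispatch::variadic;
    // Not variadic, or more required args than were supplied: fall
    // back to the fixed arity lookup.
    default:
      return dispatch::fixed;
  }
}
```

Under this layout, the `ambiguous` function above would be described by `pack(true, true, 1)`, and `dispatch_unary` would pick the fixed unary arity for `(ambiguous :a)`. A real implementation would need one such switch per supported call-site argument count.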
This is important for referring vars within another ns.
This was affecting macros like `defmacro`, which were doing a `cons` onto some packed args.
This also implements the `cons` concept for sets and fixes an issue with their equality comparator. Sets actually do things now.
This also fixes an off-by-one in sequence length counting.
WIP, still have borked things with `complement`.
There's more to do here, for vars and constants, but that's just a performance/memory usage win. This was a blocker.
jank is starting to feel like a proper Clojure now. :)
https://jank-lang.org/blog/2023-12-17-module-loading/