-
Notifications
You must be signed in to change notification settings - Fork 634
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Metal optimizations #2062
Metal optimizations #2062
Conversation
- Significant changes to optimize memory usage. - Adjusted formatting of the metal shader code to improve readability. - Normalized naming conventions in shader code. - Abstracted repetitive code for attribute descriptors to a helper function.
@mitchellh I'm happy to test the builds on my Intel macOS 13 potato, but that machine is not set up for Zig builds. Could you enable the build workflows for this PR so I can grab the artefacts for testing? (I'm also open for suggestions what to test specifically to exercise all the codepaths.) |
Apparently I can't build the release from 3rd party forks :( Let me see what I can do. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks amazing, some really tiny comments. I'll try to figure out the Intel builds.
@@ -845,6 +678,41 @@ fn initImagePipeline(device: objc.Object, library: objc.Object) !objc.Object { | |||
return pipeline_state; | |||
} | |||
|
|||
fn autoAttribute(T: type, attrs: objc.Object) void { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well well well aren't we fancy. ❤️
asciinema: bug.cast.zip
|
That's one way, I guess :) Happy to report this build looks fine and even fixes the regression with the earlier build I reported on Discord. |
This PR significantly reworks the Metal shaders for memory efficiency. Additionally I've adjusted the formatting of the shader code for readability, and normalized the naming conventions.
There are two main changes that affect memory efficiency.
cols * rows
totaluchar4
s representing the cell colors, which are drawn with a single fullscreen shader.[2]i32
to[2]i16
, it seems like a sane expectation to me that the bearings will never be more than 32K pixels in a given direction.Beyond this, I have added a couple other small optimizations, such as a fullscreen vertex shader that uses a single triangle, and using pixel coordinates to sample the glyph texture so that conversion to normalized coordinates isn't necessary.
Results
Experimentally I saw a ~20% reduction in total GPU memory use, and a ~80% reduction in max allocated buffer size.
TODO
Caution
There is a very real chance that this PR could have issues under macOS 13 for Intel processors, due to the known bugginess of the macOS 13 Intel Metal drivers. Comprehensive testing on an Intel mac running macOS 13 must be performed before this can be merged.