-
Notifications
You must be signed in to change notification settings - Fork 289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize NEON encoding for high dynamic range blocks #663
Optimize NEON encoding for high dynamic range blocks #663
Conversation
We can improve the encoding performance on Apple silicon configurations with the following enhancements - Replace the switch with a for loop - Invert qlp_factors & remove else conditions as branches not taken are returned out 13% - 17% improvement depending on compiler, configuration and .wav Clang 16 or GCC 13 necessary
What kind of audio did you use to get these numbers? Perhaps somewhat rude, but I am currently not inclined to accept this PR. This change is not covered by fuzzing, provides a modest improvement for a very niche use. So, I feel like the maintenance burden of this addition is not proportional to the gains. |
Hi @ktmf01 The optimization is designed to improve high bit rate audio. The test WAVs are detailed below. It's all stereo, 24bps, music files across many genres. This table was against macOS 14.4 + 16 inch MBP M2.
We'd be happy to review the PR further, including if you'd like additional fuzzing tests. A performance improvement for high res audio seems inline with FLAC use cases, i.e. #669 |
Okay, nevermind, this function is indeed used for 24-bit material. But why have you made this "NEON-exclusive"? It seems to me this code would work just fine for AMD64 as well and provide a performance benefit. Considering fuzzing, the problem is not in the code, but in the hardware. Currently fuzzing is not run on ARMv8/v9 hardware, meaning this code is not fuzzed, because it is exclusive for that hardware. If it were not exclusive, fuzzing would not be a problem |
Appreciate the dialog, thank you for taking the time to review and consider these changes! We made it exclusive because the benefit was much more pronounced on that architecture and flat on others. You are correct that it could be enabled for other architectures and that would also resolve the fuzzing issue. We'd be happy to modify the PR to not constrain it to only NEON. If there are specific WAVs and presets the project would like to see it tested against let us know! We will update with the results using those inputs and the expanded architecture. |
While this was not the intent of this PR, thank you for (indirectly) pointing me to this. I've merged two PRs, #700 and #702, which should make Could you perhaps rebase this PR against the current state of the repository, and re-evaluate this PR? |
Of course, standby. |
We can improve the encoding performance on Apple silicon configurations with the following enhancements
13% - 17% improvement depending on compiler, configuration and .wav
Clang 16 or GCC 13 necessary