-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SWAR Demos - stage of converting eight ASCII bytes to int and string length, including AVX2 implementation. #72
Conversation
…swar benchmarks to share between Google benchmark and Catch2
Co-authored-by: thecppzoo <thecppzoo@gmail.com>
SWAR Demos on ARM
@@ -23,7 +29,7 @@ uint32_t parse_eight_digits_swar(const char *chars) { | |||
|
|||
// Note: eight digits can represent from 0 to (10^9) - 1, the logarithm base 2 | |||
// of 10^9 is slightly less than 30, thus, only 30 bits are needed. | |||
auto lemire_as_zoo_swar(const char *chars) { | |||
uint32_t lemire_as_zoo_swar(const char *chars) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Truly astonishing to see you return a concrete type. :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is an exported symbol, I have to manually set the type in the header, I may as well set it manually in the implementation. Good to see that you noticed
namespace zoo { | ||
|
||
//constexpr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raises the question of why it isn't constexpr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would not be exportable.
Also, all of SIMD support is un-usable in constexpr. I should delete the comment, but that's the reason why we have to dial down constexpr quite a lot in many parts of the library.
// speculative, it might read bytes outside of the actual string. | ||
// It is safe to read within the page where the string occurs, and to | ||
// guarantee that, simply make aligned reads because the size of the SWAR | ||
// base size will always divide the memory page size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this comment go in a SWAR-library level document?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I have not yet created such a document. Because this is introductory code we can keep it like this, right? once we have a body of improvements we can populate other parts of the library such as design documentation.
The most critical piece that only lives as a comment is the one describing associative iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be valuable to set up a Doxygen generation from this and host it somewhere?
Starting to have some documentation infra could motivate more and better documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once in a blue moon I try Doxygen. Sadly, it just does not understand my code. I still use doxygen comments in the code.
// AVX does not offer a practical way to generate a mask of all ones in | ||
// the least significant positions, thus we cant invoke adjustFor_strlen. | ||
// We will do the first iteration as a special case to explicitly take into | ||
// account misalignment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Were any of the comments generated by ChatGPT? I think that could be good to know too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I opened a can of worms by crediting ChatGPT 4.
Yes, some comments are from the tool, the others mine, and I forgot which is who's.
What's your opinion about crediting the AI? I think it is sufficient to indicate that a tool was involved and the result is hybrid.
What I wanted was to let the reader know I did not have to do the heavy lifting of selecting the intrinsics.
The ARM/NEON flavor was a different experience, though, I did not manage to get usable code from the tool, I had to decode the GLIBC assembler to learn the counter-intuitive technique of, to narrow bytes to nibbles, to start by making byte-pairs to shift them by 4 bits (a nibble) so the upper and lower nibbles straddle the byte boundary, then reduce to bytes, which in reality is a pair of nibbles.
If I indicate what comments were mine, sure, but when does it stop? do I also indicate my prompts? what about my user configuration, so that people can replicate exactly? then I'd also have to indicate the day when I had the session, since the tool can change radically one day to the next.
ARM Strlen from Godbolt
No description provided.