-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speedup Trilogy#escape by 3 to 5x #212
Conversation
bad7364
to
9c56566
Compare
inc/trilogy/buffer.h
Outdated
* TRILOGY_OK - The character was appended to the buffer | ||
* TRILOGY_SYSERR - A system error occurred, check errno. | ||
*/ | ||
int trilogy_buffer_puts(trilogy_buffer_t *buffer, const uint8_t *ptr, size_t len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a nitpick, but puts
to me implies a newline (like C's puts or Ruby's puts). Maybe we call this trilogy_buffer_write
?
src/client.c
Outdated
// so might as well pre-expand the buffer. | ||
CHECKED(trilogy_buffer_expand(b, len)); | ||
|
||
uint8_t *cursor = (uint8_t *)str; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's technically valid as-is, but we should be able to preserve str's const
uint8_t *cursor = (uint8_t *)str; | |
const uint8_t *cursor = (const uint8_t *)str; |
src/client.c
Outdated
CHECKED(trilogy_buffer_putc(b, '\'')); | ||
if (conn->server_status & TRILOGY_SERVER_STATUS_NO_BACKSLASH_ESCAPES) { | ||
while (cursor < end) { | ||
uint8_t *next_backslash = memchr(cursor, '\'', (size_t)(end - cursor)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't actually the next backslash, it's the next quote. Maybe we call this next_escape
?
Looks great to me other than minor naming issues! |
Copying characters one by one is very inneficient, especially since each time the buffer capacity has to be checked again and again. Using `memcpy` for chunks of bytes that don't need any escaping, we can save a lot of time. Before: ``` ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23] Warming up -------------------------------------- 2k-plain 23.802k i/100ms 2k-few-escape 22.693k i/100ms Calculating ------------------------------------- 2k-plain 239.746k (± 0.5%) i/s (4.17 μs/i) - 1.214M in 5.063403s 2k-few-escape 227.900k (± 0.6%) i/s (4.39 μs/i) - 1.157M in 5.078454s ``` After: ``` Warming up -------------------------------------- 2k-plain 94.975k i/100ms 2k-few-escape 61.998k i/100ms Calculating ------------------------------------- 2k-plain 1.039M (± 1.0%) i/s - 5.224M in 5.025882s 2k-few-escape 658.756k (± 1.3%) i/s - 3.348M in 5.082975s ``` Bench: ```ruby require "trilogy" require "benchmark/ips" t = Trilogy.new(database: "test") plain = "A" * 2000 few_escapes = (("a" * 19) + "\n") * 100 Benchmark.ips do |x| x.report("2k-plain") { t.escape(plain) } x.report("2k-few-escape") { t.escape(few_escapes) } end ```
Thank you! |
Copying characters one by one is very inneficient, especially since each time the buffer capacity has to be checked again and again.
Using
memcpy
for chunks of bytes that don't need any escaping, we can save a lot of time.Before:
After:
Bench:
@jhawthorn @matthewd