Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speedup Trilogy#escape by 3 to 5x #212

Merged
merged 1 commit into from
Dec 19, 2024

Conversation

byroot
Copy link
Contributor

@byroot byroot commented Dec 3, 2024

Copying characters one by one is very inneficient, especially since each time the buffer capacity has to be checked again and again.

Using memcpy for chunks of bytes that don't need any escaping, we can save a lot of time.

Before:

ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
            2k-plain    23.802k i/100ms
       2k-few-escape    22.693k i/100ms
Calculating -------------------------------------
            2k-plain    239.746k (± 0.5%) i/s    (4.17 μs/i) -      1.214M in   5.063403s
       2k-few-escape    227.900k (± 0.6%) i/s    (4.39 μs/i) -      1.157M in   5.078454s

After:

Warming up --------------------------------------
            2k-plain    94.975k i/100ms
       2k-few-escape    61.998k i/100ms
Calculating -------------------------------------
            2k-plain      1.039M (± 1.0%) i/s -      5.224M in   5.025882s
       2k-few-escape    658.756k (± 1.3%) i/s -      3.348M in   5.082975s

Bench:

require "trilogy"
require "benchmark/ips"
t = Trilogy.new(database: "test")

plain = "A" * 2000
few_escapes = (("a" * 19) + "\n") * 100
Benchmark.ips do |x|
  x.report("2k-plain") { t.escape(plain) }
  x.report("2k-few-escape") { t.escape(few_escapes) }
end

@jhawthorn @matthewd

@byroot byroot force-pushed the opt-escape branch 8 times, most recently from bad7364 to 9c56566 Compare December 3, 2024 16:05
* TRILOGY_OK - The character was appended to the buffer
* TRILOGY_SYSERR - A system error occurred, check errno.
*/
int trilogy_buffer_puts(trilogy_buffer_t *buffer, const uint8_t *ptr, size_t len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a nitpick, but puts to me implies a newline (like C's puts or Ruby's puts). Maybe we call this trilogy_buffer_write?

src/client.c Outdated
// so might as well pre-expand the buffer.
CHECKED(trilogy_buffer_expand(b, len));

uint8_t *cursor = (uint8_t *)str;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's technically valid as-is, but we should be able to preserve str's const

Suggested change
uint8_t *cursor = (uint8_t *)str;
const uint8_t *cursor = (const uint8_t *)str;

src/client.c Outdated
CHECKED(trilogy_buffer_putc(b, '\''));
if (conn->server_status & TRILOGY_SERVER_STATUS_NO_BACKSLASH_ESCAPES) {
while (cursor < end) {
uint8_t *next_backslash = memchr(cursor, '\'', (size_t)(end - cursor));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't actually the next backslash, it's the next quote. Maybe we call this next_escape?

@jhawthorn
Copy link
Member

Looks great to me other than minor naming issues!

Copying characters one by one is very inneficient, especially since each
time the buffer capacity has to be checked again and again.

Using `memcpy` for chunks of bytes that don't need any escaping, we can
save a lot of time.

Before:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
            2k-plain    23.802k i/100ms
       2k-few-escape    22.693k i/100ms
Calculating -------------------------------------
            2k-plain    239.746k (± 0.5%) i/s    (4.17 μs/i) -      1.214M in   5.063403s
       2k-few-escape    227.900k (± 0.6%) i/s    (4.39 μs/i) -      1.157M in   5.078454s
```

After:

```
Warming up --------------------------------------
            2k-plain    94.975k i/100ms
       2k-few-escape    61.998k i/100ms
Calculating -------------------------------------
            2k-plain      1.039M (± 1.0%) i/s -      5.224M in   5.025882s
       2k-few-escape    658.756k (± 1.3%) i/s -      3.348M in   5.082975s
```

Bench:

```ruby

require "trilogy"
require "benchmark/ips"
t = Trilogy.new(database: "test")

plain = "A" * 2000
few_escapes = (("a" * 19) + "\n") * 100
Benchmark.ips do |x|
  x.report("2k-plain") { t.escape(plain) }
  x.report("2k-few-escape") { t.escape(few_escapes) }
end
```
@jhawthorn jhawthorn merged commit b517fcf into trilogy-libraries:main Dec 19, 2024
9 checks passed
@jhawthorn
Copy link
Member

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants