gzip: in_forward: Fix concatenated gzip payloads gzip concatenated #10259
base: gzip-concatenated
39ced5c to 6bb63a5
I fixed the Windows and CentOS 7 build issues and split the change into two commits, based on my reading of https://github.com/fluent/fluent-bit/blob/master/CONTRIBUTING.md#commit-changes.
Commit 6bb63a5 does not have a Signed-off-by line, so we need to add one there.
The implementation of flb_gzip_count is flawed as it relies on looking for valid gzip headers. A gzip payload can be generated that includes a valid gzip header in the gzip body - see test_header_in_gzip_body. Removed flb_gzip_count and associated handling in favor of utilizing mz_inflate to find the boundaries between concatenated gzip payloads during decompression. mz_inflate will stop when it reaches the end of the gzip body and mz_stream.avail_in contains the bytes left in the buffer for processing. Signed-off-by: Brandon Strub <brandon.strub@veritas.com>
Utilize new flb_gzip_uncompress_multi method to support concatenated gzip payloads. Signed-off-by: Brandon Strub <brandon.strub@veritas.com>
f5e9580 to 6ed4a01
@cosmo0920 it seems the DCO is fixed now. Would you please validate the forward/zip compat side?
I commented on some minor style issues.
I confirmed that this PR works against our internal reproduction case.
Thanks for fixing this long-standing issue. 😄
    } else if (status == MZ_STREAM_END) {
        /* Successfully completed */
        buffer_lengths[buffer_index] = FLB_GZIP_BUFFER_SIZE - stream.avail_out;
        break;

    } else {
Basically, it works perfectly.
I found a few minor coding style violations.
We need to add newlines before the else if and else clauses.
The rest of the behavior is great.
Fix concatenated gzip payload handling.
The implementation of flb_gzip_count is flawed as it relies on looking
for valid gzip headers. A gzip payload can be generated that includes
a valid gzip header in the gzip body - see test_header_in_gzip_body.
Removed flb_gzip_count and associated handling in favor of utilizing
mz_inflate to find the boundaries between concatenated gzip payloads
during decompression. mz_inflate will stop when it reaches the end of
the gzip body and mz_stream.avail_in contains the bytes left in the
buffer for processing.
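The difference between the two approaches can be illustrated with a minimal, self-contained sketch. This is not Fluent Bit code and makes several assumptions: count_by_header_scan is a hypothetical stand-in for a header-scanning counter (it is not the real flb_gzip_count), toy_member_length is a toy decoder that only understands stored deflate blocks and stands in for what mz_inflate does on real data, and make_member is a hypothetical helper that leaves the CRC32 field zeroed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for header scanning (NOT the real flb_gzip_count):
 * counts every occurrence of the gzip magic bytes 1f 8b 08. */
static size_t count_by_header_scan(const uint8_t *buf, size_t len)
{
    size_t i, count = 0;

    for (i = 0; i + 2 < len; i++) {
        if (buf[i] == 0x1f && buf[i + 1] == 0x8b && buf[i + 2] == 0x08) {
            count++;
        }
    }
    return count;
}

/* Toy decoder: returns the byte length of one gzip member that has no
 * optional header fields and only stored (BTYPE=00) deflate blocks.
 * Returns 0 on malformed or unsupported input. The CRC is not verified. */
static size_t toy_member_length(const uint8_t *buf, size_t len)
{
    size_t pos = 10; /* fixed-size gzip header */
    uint16_t blen, nlen;
    int last;

    if (len < 10 || buf[0] != 0x1f || buf[1] != 0x8b ||
        buf[2] != 0x08 || buf[3] != 0x00) {
        return 0;
    }
    do {
        if (len - pos < 5 || (buf[pos] & 0x06) != 0) {
            return 0;
        }
        last = buf[pos] & 0x01; /* BFINAL bit */
        blen = (uint16_t) (buf[pos + 1] | (buf[pos + 2] << 8));
        nlen = (uint16_t) (buf[pos + 3] | (buf[pos + 4] << 8));
        pos += 5;
        if (nlen != (uint16_t) ~blen || len - pos < blen) {
            return 0;
        }
        pos += blen; /* skip the raw block data, whatever it contains */
    } while (!last);

    return (len - pos < 8) ? 0 : pos + 8; /* CRC32 + ISIZE trailer */
}

/* Counts members by decoding each one and continuing where it ended. */
static size_t count_by_decoding(const uint8_t *buf, size_t len)
{
    size_t used, count = 0;

    while (len > 0) {
        used = toy_member_length(buf, len);
        if (used == 0) {
            return 0;
        }
        buf += used;
        len -= used;
        count++;
    }
    return count;
}

/* Builds one member with a single stored block. The CRC32 field is left
 * zeroed for brevity, so real gzip tools would reject this data. */
static size_t make_member(uint8_t *out, const uint8_t *data, uint16_t n)
{
    static const uint8_t hdr[10] = {0x1f, 0x8b, 0x08, 0, 0, 0, 0, 0, 0, 0xff};
    uint16_t nn = (uint16_t) ~n;
    uint8_t *p = out;

    memcpy(p, hdr, sizeof(hdr));
    p += sizeof(hdr);
    *p++ = 0x01;                      /* BFINAL=1, BTYPE=00 (stored) */
    *p++ = n & 0xff;                  /* LEN, little endian */
    *p++ = n >> 8;
    *p++ = nn & 0xff;                 /* NLEN = ~LEN */
    *p++ = nn >> 8;
    memcpy(p, data, n);
    p += n;
    memset(p, 0, 4);                  /* CRC32 placeholder */
    p += 4;
    *p++ = n & 0xff;                  /* ISIZE, little endian */
    *p++ = n >> 8;
    *p++ = 0;
    *p++ = 0;
    return (size_t) (p - out);
}
```

For a single member whose body is the four bytes 1f 8b 08 'x', count_by_header_scan reports two members while count_by_decoding reports one, because the decoder skips over the body instead of interpreting it.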
With this change we are no longer able to allocate the exact memory
required by reading the decompressed length out of the gzip footer
(since we don't know where it is). Instead we allocate buffers of size
FLB_GZIP_BUFFER_SIZE (1MB) as an intermediate location to put the
decompressed data and use MZ_SYNC_FLUSH instead of MZ_FINISH
when reading to allow mz_inflate to return back to us when it needs
more buffer space. In the case of requiring more space we allocate
another buffer (up to FLB_GZIP_MAX_BUFFERS (100)) and call
mz_inflate again. Once complete we allocate the final buffer and copy
the data from the buffers in. This means that we use at least twice the
amount of memory as before for the short period where we are
copying the data from the intermediate buffers to the final buffer.
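The buffering scheme described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: CHUNK_SIZE and MAX_CHUNKS are tiny stand-ins for FLB_GZIP_BUFFER_SIZE and FLB_GZIP_MAX_BUFFERS, and produce() is a hypothetical source that plays the role of mz_inflate filling an output buffer and reporting whether the stream has ended.

```c
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 8  /* stand-in for FLB_GZIP_BUFFER_SIZE (1MB) */
#define MAX_CHUNKS 4  /* stand-in for FLB_GZIP_MAX_BUFFERS (100) */

/* Hypothetical data source standing in for the decompressor. */
struct source {
    const char *data;
    size_t len;
    size_t pos;
};

/* Fills `out` with up to `cap` bytes and sets *done once the source is
 * exhausted (comparable to the decompressor reporting end of stream). */
static size_t produce(struct source *s, char *out, size_t cap, int *done)
{
    size_t n = s->len - s->pos;

    if (n > cap) {
        n = cap;
    }
    memcpy(out, s->data + s->pos, n);
    s->pos += n;
    *done = (s->pos == s->len);
    return n;
}

/* Collects output into fixed-size intermediate chunks, then copies them
 * into one exactly sized buffer. Returns 0 on success, -1 on allocation
 * failure or when more than MAX_CHUNKS chunks would be needed. */
static int collect(struct source *s, char **out, size_t *out_len)
{
    char *chunks[MAX_CHUNKS];
    size_t lens[MAX_CHUNKS];
    size_t total = 0, off = 0;
    int count = 0, done = 0, i, ret = -1;
    char *final_buf;

    while (!done) {
        if (count == MAX_CHUNKS) {
            goto cleanup;
        }
        chunks[count] = malloc(CHUNK_SIZE);
        if (chunks[count] == NULL) {
            goto cleanup;
        }
        lens[count] = produce(s, chunks[count], CHUNK_SIZE, &done);
        total += lens[count];
        count++;
    }

    /* Both the chunks and the final buffer are live during this copy. */
    final_buf = malloc(total > 0 ? total : 1);
    if (final_buf == NULL) {
        goto cleanup;
    }
    for (i = 0; i < count; i++) {
        memcpy(final_buf + off, chunks[i], lens[i]);
        off += lens[i];
    }
    *out = final_buf;
    *out_len = total;
    ret = 0;

cleanup:
    for (i = 0; i < count; i++) {
        free(chunks[i]);
    }
    return ret;
}
```

The caller owns the returned buffer and frees it. The copy step is where the temporary doubling of memory mentioned above occurs, since the intermediate chunks are only released after the final buffer has been filled.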
Addresses #9058.
Enter [N/A] in the box, if an item is not applicable to your change.
Testing
Before we can approve your change, please submit the following in a comment:
Forward output plugin side:
Forward input plugin side:
fluentbit_debug_output.txt
run_code_analysis_output.txt
I tried running valgrind in an actual running environment as well, but nearly all TLS communication failed. I can look into this more if required.
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-test label to test for all targets (requires maintainer to do).
Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.