Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Breakpad format #477

Merged
merged 8 commits into from
Jan 18, 2024
Merged

Add support for Breakpad format #477

merged 8 commits into from
Jan 18, 2024

Conversation

danielocfb
Copy link
Collaborator

@danielocfb danielocfb commented Jan 9, 2024

Please see individual commits for descriptions.

Error reporting of the existing Breakpad parsers is pretty insufficient:
it merely can tell that something failed and along with a high level
classification (e.g., whitespace error), but in all but trivial files
that is not enough to root cause a problem.
This change improves the error reporting capabilities of the Breakpad
parser along with the plumbing of translating such errors into those
handling by our crate.

Signed-off-by: Daniel Müller <deso@posteo.net>
This change adds support for inlined function reporting to our Breakpad
logic. Note that for this format inlined functions are currently parsed
(and stored) unconditionally when the entire file is parsed. That is
different to other formats that try to do that in a more lazy fashion.
We could potentially refine the logic later on, but the effort does not
justify the gains at this point. We still honor the user's desire and
only look it up and report it if asked for, of course.

Signed-off-by: Daniel Müller <deso@posteo.net>
Copy link

codecov bot commented Jan 9, 2024

Codecov Report

Attention: 65 lines in your changes are missing coverage. Please review.

Comparison is base (8c2e23f) 92.81% compared to head (d6734f3) 93.20%.

❗ Current head d6734f3 differs from pull request most recent head 5101ad4. Consider uploading reports for the commit 5101ad4 to get more accurate results

Files Patch % Lines
src/breakpad/file.rs 71.28% 29 Missing ⚠️
src/breakpad/parser.rs 97.32% 24 Missing ⚠️
src/breakpad/resolver.rs 91.20% 8 Missing ⚠️
src/breakpad/types.rs 94.73% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #477      +/-   ##
==========================================
+ Coverage   92.81%   93.20%   +0.38%     
==========================================
  Files          41       45       +4     
  Lines        6500     7710    +1210     
==========================================
+ Hits         6033     7186    +1153     
- Misses        467      524      +57     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@d-e-s-o d-e-s-o marked this pull request as ready for review January 11, 2024 18:51
@d-e-s-o d-e-s-o requested a review from anakryiko January 11, 2024 18:51
@d-e-s-o
Copy link
Collaborator

d-e-s-o commented Jan 11, 2024

This is probably ready for review now. @anakryiko can you take a look when you get a chance?

FYI @salvatorebenedetto @simpleton; feel free to try it out.

With the introduction of the dump_syms build dependency we can no longer
run ASAN on C & C++ code. The reason is that symbolic-demangle, a
transitive dependency, fails to link in an ASAN enabled setup when used
from a build script (it works fine when used as a regular dependency).
That seems to have to do with them using clang for compilation (as they
assume LLVM specific headers, because their Swift support is a dump from
LLVM), whereas build scripts appear to be built using gcc. At least that
is what it looks like to me and I couldn't find a knob to change that.
This change removes the CFLAGS & CXXFLAGS settings from the sanitizer
runs, causing us to stop sanitizing C & C++ code. We don't have that in
our crate anyway, so there is no loss in coverage.

Signed-off-by: Daniel Müller <deso@posteo.net>
This change adds support for symbolization of Breakpad files (typically
carrying the .sym extension) to blazecli.

  $ cargo run -p blazecli -- symbolize breakpad --path data/test-stable-addresses.sym 0x100 0x200 0x000
  > 0x00000000000100: factorial @ 0x100+0x0 data/test-stable-addresses.c:8
  > 0x00000000000200: factorial_inline_test @ 0x200+0x0 data/test-stable-addresses.c:31
  > 0x00000000000000: <no-symbol>

Signed-off-by: Daniel Müller <deso@posteo.net>
This change adds a benchmark for symbolization of an address using our
Breakpad resolver, for comparison of performance with other
symbolization sources such as Gsym and DWARF.

> main/symbolize_breakpad
>  time:   [243.71 ms 244.05 ms 244.42 ms]
> main/symbolize_elf
>  time:   [9.3714 ms 9.4070 ms 9.4449 ms]
> main/symbolize_dwarf
>  time:   [78.358 ms 78.750 ms 79.174 ms]
> main/symbolize_gsym
>  time:   [18.901 µs 19.192 µs 19.541 µs]

Signed-off-by: Daniel Müller <deso@posteo.net>
Copy link
Member

@anakryiko anakryiko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit of rubber stamping here. On the cursory high-level reading everything makes sense (I didn't look to much into parser itself, of course). Seems like a natural addition to existing API infrastructure, which is nice.

Thank for including benchmark numbers, it's quite funny that it's few times slower than DWARF (and probably few times larger file as well), but whatever Google wants for Android, I guess :)

Great addition for feature completeness of blazesym, nevertheless.

@d-e-s-o
Copy link
Collaborator

d-e-s-o commented Jan 18, 2024

Thanks for the review!

Thank for including benchmark numbers, it's quite funny that it's few times slower than DWARF (and probably few times larger file as well)

Yeah, I suspect (but haven't checked) that the piece-meal file reading could have some impact and we do two binary searches when conceptually we should only be doing a single one, and we allocate much more. We can see if we can optimize some more if need be in the future. Interestingly, size-wise it's actually not all that bad:

-rw-r--r-- 1 1000 1000 328911328 Jan 17 13:44 vmlinux-5.17.12-100.fc34.x86_64.dwarf
-rw-r--r-- 1 1000 1000  98608176 Jan 17 13:44 vmlinux-5.17.12-100.fc34.x86_64.elf
-rw-r--r-- 1 1000 1000  60360789 Jan 17 13:44 vmlinux-5.17.12-100.fc34.x86_64.sym
-rw-r--r-- 1 1000 1000  18214628 Jan 17 13:44 vmlinux-5.17.12-100.fc34.x86_64.gsym

(and that's with all the CFI stuff)

In the future we would like to support the Breakpad format for use as a
symbolization source. The closest suitable crate providing this
functionality is breakpad-symbols [0]. However, it contains a slew of
other functionality that we have no use for, including a HTTP based symbol
retriever, a stack walker, as well as plumbing & dependencies for async
APIs. On top of that, their parsers error reporting is insufficient at
best.
With this change we pull in a snapshot of the core parser logic and
adjust it as per our needs. Right now that mostly means removing of
async bits, eliminating the range-map and various other dependencies,
and removing unnecessary bits from the parser (e.g., for stack unwinding
information). The snapshot is based on rust-minidump [1] at
commit a0689028d5b13232c82141672edc3f4d70817bd7.
This functionality is still unused. Subsequent changes will add a
resolver that relies on it.

[0] https://crates.io/crates/breakpad-symbols
[1] https://github.com/rust-minidump/rust-minidump

Signed-off-by: Daniel Müller <deso@posteo.net>
In order to test the Breakpad functionality we are going to need .sym
files. The dump_syms tool is a common way for creating such files from
an existing ELF/DWARF input [0]. Alas, it's basically not packaged for
any distribution that is of use to us [1]. Lucky for us, there exist a
Rust crate of the same name that produces this data as well and so with
this change we pull it in to create a .sym file for our
test-stable-addresses.bin test executable.

[0] https://chromium.googlesource.com/breakpad/breakpad/+/master/docs/linux_starter_guide.md#producing-symbols-for-your-application
[1] https://wiki.ubuntu.com/ErrorTracker/BreakpadApplicationSupport

Signed-off-by: Daniel Müller <deso@posteo.net>
This change introduces a resolver for the Breakpad symbol format along
with the necessary infrastructure (e.g., symbolization source).
Currently it does not have support for zero-copy symbolization, as we
have for most other formats. This can be improved in the future, should
the need arise. The functionality is guarded by the 'breakpad' feature,
which is disabled by default.

Signed-off-by: Daniel Müller <deso@posteo.net>
@d-e-s-o d-e-s-o merged commit c7cdcdf into libbpf:main Jan 18, 2024
31 checks passed
@d-e-s-o d-e-s-o deleted the topic/breakpad branch January 18, 2024 17:26
@d-e-s-o d-e-s-o mentioned this pull request Jan 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants