Retsnoop improvements for LBR, stitched stacks, and interim stacks #72

Merged
merged 9 commits into master from retsnoop-inj-probe-lbr on Aug 30, 2024

Conversation

anakryiko (Owner)

No description provided.

Better handle cases when a ksym can't be found (print the address instead
of nothing at all). Also make sure that stack trace symbolization only
considers function symbols, not data symbols.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
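A minimal sketch of the symbolization fallback described above (names and
layout here are hypothetical, not retsnoop's actual code):

  #include <stdio.h>

  /* hypothetical ksym record; 't'/'T' mark text (function) symbols */
  struct ksym {
          unsigned long addr;
          const char *name;
          char kind;
  };

  static void print_frame(const struct ksym *sym, unsigned long addr)
  {
          /* only function symbols are usable; otherwise fall back to the
           * raw address instead of printing nothing at all */
          if (!sym || (sym->kind != 't' && sym->kind != 'T')) {
                  printf("0x%lx\n", addr);
                  return;
          }
          printf("%s+0x%lx\n", sym->name, addr - sym->addr);
  }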
If some deep function errored out, but that error was then ignored, the
entire stack isn't really an error stack, so mark it as a non-error one.

One simple way to produce such a situation is by tracing with retsnoop:

  sudo retsnoop -e bpf_prog_load -a pop_stack -T -E -n simfail

And then triggering a successful BPF program load with simfail:

  sudo simfail bpf-simple-obj

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
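A rough sketch of the rule, assuming each session keeps per-call result
records (names are hypothetical):

  #include <stdbool.h>

  struct call_rec {
          long res;                        /* 0 or negative errno */
  };

  /* The session is an error stack only if the outermost traced call itself
   * failed; a nested error that was ignored further up doesn't count. */
  static bool is_err_stack(const struct call_rec *calls, int cnt)
  {
          return cnt > 0 && calls[0].res != 0;   /* calls[0] = outermost call */
  }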
Retsnoop uses a bunch of special Unicode characters; extract all of them
into named #defines.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
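For instance, a sketch of such named #defines (the macro names and exact
character set here are illustrative, not the actual ones):

  /* UTF-8 escapes for the pseudo-graphics used in trace output */
  #define UARROW_CALL  "\u2192"   /* → function entry */
  #define UARROW_RET   "\u2190"   /* ← function return */
  #define UARROW_BOTH  "\u2194"   /* ↔ entry and return on one line */
  #define UWAVY_LINE   "\u2307"   /* ⌇ stitched stack marker */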
Logging from the BPF side shouldn't use explicit newlines in string
literals, as newlines are appended automatically.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
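With a hypothetical BPF-side log helper (not retsnoop's actual macro), the
change amounts to:

  /* before: the trailing '\n' in the literal produces a blank line, since
   * a newline is appended automatically by the logging helper */
  log("capturing stack at depth %d\n", depth);
  /* after */
  log("capturing stack at depth %d", depth);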
When we stitch together a call stack due to intervening non-failing traced
functions, we should preserve the deepest nested stack trace, as that
gives us the most information.

The difference can easily be seen with the following trigger (using
veristat on one of the failing BPF selftest programs):

  sudo veristat tools/testing/selftests/bpf/test_global_func10.bpf.o -f global_func10

Now, tracing with retsnoop as follows (no intervening functions, so no
stitching required):

  sudo retsnoop -T -e 'bpf_prog_load' -n veristat -E \
      -a do_check_common -a do_check -a check_func_call \
      -a btf_check_subprog_call -a check_stack_access_within_bounds \
      -a check_stack_range_initialized -a check_helper_mem_access

... gives the following useful output:

  20:36:15.718917 -> 20:36:15.718991 TID/PID 3782767/3782767 (veristat/veristat):

  FUNCTION CALLS                                                       RESULT     DURATION
  ------------------------------------------------------------------   ---------  --------
  → bpf_prog_load
      → bpf_check
          → do_check_common
              → do_check
                  ↔ check_stack_access_within_bounds                   [0]         2.701us
                  → check_func_call
                      → btf_check_subprog_call
                          → check_helper_mem_access
                              → check_stack_range_initialized
                                  ↔ check_stack_access_within_bounds   [-EINVAL]   2.035us
                              ← check_stack_range_initialized          [-EINVAL]  13.428us
                          ← check_helper_mem_access                    [-EINVAL]  18.536us
                      ← btf_check_subprog_call                         [-EINVAL]  21.193us
                  ← check_func_call                                    [-EINVAL]  23.204us
              ← do_check                                               [-EINVAL]  32.541us
          ← do_check_common                                            [-EINVAL]  37.686us
      ← bpf_check                                                      [-EINVAL]  59.636us
  ← bpf_prog_load                                                      [-EINVAL]  71.932us

                      entry_SYSCALL_64_after_hwframe+0x6c  (entry_SYSCALL_64 @ arch/x86/entry/entry_64.S:130)
                      do_syscall_64+0x58                   (arch/x86/entry/common.c:83)
                      . do_syscall_x64                     (arch/x86/entry/common.c:52)
                      __x64_sys_bpf+0x18                   (kernel/bpf/syscall.c:5765)
                      . __se_sys_bpf                       (kernel/bpf/syscall.c:5765)
                      . __do_sys_bpf                       (kernel/bpf/syscall.c:5767)
                      __sys_bpf+0x33d                      (kernel/bpf/syscall.c:0)
      71us [-EINVAL]  bpf_prog_load+0x59a                  (kernel/bpf/syscall.c:2895)
      59us [-EINVAL]  bpf_check+0x894                      (kernel/bpf/verifier.c:21307)
                      . do_check_main                      (kernel/bpf/verifier.c:20637)
      37us [-EINVAL]  do_check_common+0x38a                (kernel/bpf/verifier.c:20546)
      32us [-EINVAL]  do_check+0x1b03                      (kernel/bpf/verifier.c:0)
      23us [-EINVAL]  check_func_call+0x96                 (kernel/bpf/verifier.c:0)
      21us [-EINVAL]  btf_check_subprog_call+0x191         (kernel/bpf/verifier.c:9459)
                      . btf_check_func_arg_match           (kernel/bpf/verifier.c:0)
                      check_mem_reg+0x127                  (kernel/bpf/verifier.c:7421)
      18us [-EINVAL]  check_helper_mem_access+0x142        (kernel/bpf/verifier.c:0)
      13us [-EINVAL]  check_stack_range_initialized+0xaf   (kernel/bpf/verifier.c:7122)
  !    2us [-EINVAL]  check_stack_access_within_bounds

Now, if we add `-a verbose`, the verbose calls screw up the actual stack
trace (note: no offsets, no file:line info, etc.):

  sudo retsnoop -T -e 'bpf_prog_load' -n veristat -E \
      -a do_check_common -a do_check -a check_func_call \
      -a btf_check_subprog_call -a check_stack_access_within_bounds \
      -a check_stack_range_initialized -a check_helper_mem_access \
      -a verbose

  20:37:21.703351 -> 20:37:21.703440 TID/PID 3785005/3785005 (veristat/veristat):

  FUNCTION CALLS                                                       RESULT     DURATION
  ------------------------------------------------------------------   ---------  --------
  → bpf_prog_load
      → bpf_check
          → do_check_common
              → do_check
                  ↔ check_stack_access_within_bounds                   [0]         2.617us
                  → check_func_call
                      → btf_check_subprog_call
                          → check_helper_mem_access
                              → check_stack_range_initialized
                                  → check_stack_access_within_bounds
                                      ↔ verbose                        [void]      1.985us
                                  ← check_stack_access_within_bounds   [-EINVAL]   4.568us
                              ← check_stack_range_initialized          [-EINVAL]  13.933us
                          ← check_helper_mem_access                    [-EINVAL]  19.404us
                      ← btf_check_subprog_call                         [-EINVAL]  21.383us
                      ↔ verbose                                        [void]      1.276us
                  ← check_func_call                                    [-EINVAL]  24.489us
              ← do_check                                               [-EINVAL]  33.878us
          ← do_check_common                                            [-EINVAL]  39.250us
          ↔ verbose                                                    [void]      1.196us
          ↔ verbose                                                    [void]      0.903us
          ↔ verbose                                                    [void]      0.872us
          ↔ verbose                                                    [void]      0.875us
          ↔ verbose                                                    [void]      0.735us
          ↔ verbose                                                    [void]      0.794us
          ↔ verbose                                                    [void]      2.842us
      ← bpf_check                                                      [-EINVAL]  72.025us
  ← bpf_prog_load                                                      [-EINVAL]  85.944us

                      entry_SYSCALL_64_after_hwframe+0x6c  (entry_SYSCALL_64 @ arch/x86/entry/entry_64.S:130)
                      do_syscall_64+0x58                   (arch/x86/entry/common.c:83)
                      . do_syscall_x64                     (arch/x86/entry/common.c:52)
                      __x64_sys_bpf+0x18                   (kernel/bpf/syscall.c:5765)
                      . __se_sys_bpf                       (kernel/bpf/syscall.c:5765)
                      . __do_sys_bpf                       (kernel/bpf/syscall.c:5767)
                      __sys_bpf+0x33d                      (kernel/bpf/syscall.c:0)
      85us [-EINVAL]  bpf_prog_load+0x59a                  (kernel/bpf/syscall.c:2895)
  !   72us [-EINVAL]  bpf_check
  !⌇  39us [-EINVAL]  do_check_common
  !⌇  33us [-EINVAL]  do_check
  !⌇  24us [-EINVAL]  check_func_call
  !⌇  21us [-EINVAL]  btf_check_subprog_call
  !⌇  19us [-EINVAL]  check_helper_mem_access
  !⌇  13us [-EINVAL]  check_stack_range_initialized
  !⌇   4us [-EINVAL]  check_stack_access_within_bounds

With the fixes in this patch, we get much more complete and useful output:

  20:39:59.010233 -> 20:39:59.010330 TID/PID 3796075/3796075 (veristat/veristat):

  FUNCTION CALLS                                                       RESULT     DURATION
  ------------------------------------------------------------------   ---------  --------
  → bpf_prog_load
      → bpf_check
          → do_check_common
              → do_check
                  ↔ check_stack_access_within_bounds                   [0]         2.279us
                  → check_func_call
                      → btf_check_subprog_call
                          → check_helper_mem_access
                              → check_stack_range_initialized
                                  → check_stack_access_within_bounds
                                      ↔ verbose                        [void]      2.304us
                                  ← check_stack_access_within_bounds   [-EINVAL]   5.028us
                              ← check_stack_range_initialized          [-EINVAL]  15.827us
                          ← check_helper_mem_access                    [-EINVAL]  20.187us
                      ← btf_check_subprog_call                         [-EINVAL]  22.995us
                      ↔ verbose                                        [void]      1.437us
                  ← check_func_call                                    [-EINVAL]  26.565us
              ← do_check                                               [-EINVAL]  37.165us
          ← do_check_common                                            [-EINVAL]  43.172us
          ↔ verbose                                                    [void]      1.237us
          ↔ verbose                                                    [void]      0.973us
          ↔ verbose                                                    [void]      0.959us
          ↔ verbose                                                    [void]      0.914us
          ↔ verbose                                                    [void]      0.784us
          ↔ verbose                                                    [void]      1.858us
          ↔ verbose                                                    [void]      2.434us
      ← bpf_check                                                      [-EINVAL]  77.849us
  ← bpf_prog_load                                                      [-EINVAL]  93.830us

                      entry_SYSCALL_64_after_hwframe+0x6c  (entry_SYSCALL_64 @ arch/x86/entry/entry_64.S:130)
                      do_syscall_64+0x58                   (arch/x86/entry/common.c:83)
                      . do_syscall_x64                     (arch/x86/entry/common.c:52)
                      __x64_sys_bpf+0x18                   (kernel/bpf/syscall.c:5765)
                      . __se_sys_bpf                       (kernel/bpf/syscall.c:5765)
                      . __do_sys_bpf                       (kernel/bpf/syscall.c:5767)
                      __sys_bpf+0x33d                      (kernel/bpf/syscall.c:0)
      93us [-EINVAL]  bpf_prog_load+0x59a                  (kernel/bpf/syscall.c:2895)
      77us [-EINVAL]  bpf_check+0x894                      (kernel/bpf/verifier.c:21307)
                      . do_check_main                      (kernel/bpf/verifier.c:20637)
   ⌇  43us [-EINVAL]  do_check_common+0x38a                (kernel/bpf/verifier.c:20546)
   ⌇  37us [-EINVAL]  do_check+0x1b03                      (kernel/bpf/verifier.c:0)
   ⌇  26us [-EINVAL]  check_func_call+0x96                 (kernel/bpf/verifier.c:0)
   ⌇  22us [-EINVAL]  btf_check_subprog_call+0x191         (kernel/bpf/verifier.c:9459)
   ⌇                  . btf_check_func_arg_match           (kernel/bpf/verifier.c:0)
                      check_mem_reg+0x127                  (kernel/bpf/verifier.c:7421)
   ⌇  20us [-EINVAL]  check_helper_mem_access+0x142        (kernel/bpf/verifier.c:0)
   ⌇  15us [-EINVAL]  check_stack_range_initialized+0xaf   (kernel/bpf/verifier.c:7122)
  !⌇   5us [-EINVAL]  check_stack_access_within_bounds

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
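A condensed sketch of the stitching rule itself (data layout and names are
hypothetical; the real implementation lives in retsnoop's BPF and
user-space session handling):

  struct stack_snapshot {
          int depth;                       /* how deep the captured kernel stack goes */
          unsigned long addrs[128];
  };

  /* When stitching adjacent spans into one logical stack, keep the deeper
   * of the two snapshots: it carries the offsets and file:line info for
   * the innermost failing frames, which is what gets lost otherwise. */
  static void stitch_stack(struct stack_snapshot *cur, const struct stack_snapshot *prev)
  {
          if (prev->depth > cur->depth)
                  *cur = *prev;
  }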
Similarly to how we preserve the deepest kernel stack trace for stitched
stacks, do the same with LBRs, which should give the user more useful
information about the deepest nested failing function.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
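The same preference, sketched for the LBR side (again with hypothetical
names):

  #include <linux/perf_event.h>            /* struct perf_branch_entry */

  struct lbr_snapshot {
          int depth;                       /* call depth at the time LBR was captured */
          int cnt;
          struct perf_branch_entry entries[32];
  };

  /* Keep the LBR data captured at the deepest nested failing function,
   * since that is where the most relevant branch history is. */
  static void stitch_lbr(struct lbr_snapshot *cur, const struct lbr_snapshot *prev)
  {
          if (prev->depth > cur->depth)
                  *cur = *prev;
  }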
The BPF logic will now concatenate stitched parts to form one logical
stitched stack; the user-space part only needs to know at which point the
stitching happened (for visualization), which is provided through the
stitched_pos field. This also enables sending less data over the BPF
ringbuf.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
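A sketch of what the ringbuf record could look like (everything except
stitched_pos is a hypothetical name):

  /* one logical, already-concatenated call stack, emitted over the ringbuf;
   * user space only needs stitched_pos to know where to draw the stitching
   * marker when rendering the stack */
  struct stack_trace_rec {
          int depth;                       /* total number of captured frames */
          int stitched_pos;                /* frame index where stitching happened, or -1 */
          unsigned long kstack[128];
  };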
retsnoop used to support --intermediate-stacks (-A) before function args
capture was added. It's still a useful feature for some deeper and more
advanced debugging scenarios.

So add it back, taking into account all the session concepts added since.
It's now "interim stacks", controlled by the --interim-stacks (-I) flag.

Retsnoop will attempt to minimize the number of interim stacks reported:
whenever stack stitching preserves all the captured information for
later, we postpone submitting an interim stack. Only when data loss is
occurring (i.e., when we are overwriting some of the captured function
call information) will retsnoop emit interim (incomplete, in-progress)
trace and call stack information (together with its LBR data, of course).

Note, previously we wouldn't emit function trace data for interim
stacks. We change that now and emit the relevant portions of it, as it
significantly helps debugging by augmenting the call stack with extra
information.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
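The emission policy boils down to roughly the following (helper names are
hypothetical; the real session bookkeeping is more involved):

  #include <stdbool.h>

  struct session;                                    /* opaque per-task session state */

  /* stand-ins for the real bookkeeping */
  extern bool stitching_would_lose_data(const struct session *s);
  extern void emit_interim_data(struct session *s);  /* trace + stack + LBR */

  /* With --interim-stacks (-I): only emit an interim record when captured
   * function call data is about to be overwritten; if stitching preserves
   * everything, postpone and let the final record carry it all. */
  static void maybe_emit_interim(struct session *s)
  {
          if (stitching_would_lose_data(s))
                  emit_interim_data(s);
  }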
This is a nice way to get LBR and stack traces deep inside kernel
functions, if necessary.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
anakryiko merged commit 240b0be into master on Aug 30, 2024
2 checks passed
anakryiko deleted the retsnoop-inj-probe-lbr branch on August 30, 2024 at 15:37