Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shutdown deadlock #1218

Closed
MrAlias opened this issue Oct 25, 2024 · 2 comments · Fixed by #1220
Closed

Shutdown deadlock #1218

MrAlias opened this issue Oct 25, 2024 · 2 comments · Fixed by #1220
Assignees
Labels
bug Something isn't working
Milestone

Comments

@MrAlias
Copy link
Contributor

MrAlias commented Oct 25, 2024

Describe the bug

When the Instrumentation is stopped, it will hang "Waiting for probes to stop...".

{"time":"2024-10-25T17:51:10.92856579Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/process.(*Analyzer).Analyze","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/process/analyze.go","line":95},"msg":"found function","function_name":{"Name":"go.opentelemetry.io/auto/sdk.(*tracer).start","Offset":835840,"ReturnOffsets":[835900]}}
{"time":"2024-10-25T17:51:10.928653918Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/process.(*Analyzer).Analyze","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/process/analyze.go","line":95},"msg":"found function","function_name":{"Name":"go.opentelemetry.io/auto/sdk.(*span).ended","Offset":842336,"ReturnOffsets":[842381]}}
{"time":"2024-10-25T17:51:10.928731735Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/process.Allocate","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/process/allocate.go","line":37},"msg":"Requesting memory allocation","size":262144,"page size":4096,"cpu count":8}
{"time":"2024-10-25T17:51:10.928839757Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/process/ptrace.NewTracedProgram","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/process/ptrace/ptrace_linux.go","line":125},"msg":"attach successfully","tid":17}
{"time":"2024-10-25T17:51:10.928872159Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/process/ptrace.NewTracedProgram","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/process/ptrace/ptrace_linux.go","line":125},"msg":"attach successfully","tid":19}
{"time":"2024-10-25T17:51:10.928895469Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/process/ptrace.NewTracedProgram","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/process/ptrace/ptrace_linux.go","line":125},"msg":"attach successfully","tid":20}
{"time":"2024-10-25T17:51:10.928918944Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/process/ptrace.NewTracedProgram","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/process/ptrace/ptrace_linux.go","line":125},"msg":"attach successfully","tid":21}
{"time":"2024-10-25T17:51:10.928951209Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/process/ptrace.NewTracedProgram","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/process/ptrace/ptrace_linux.go","line":125},"msg":"attach successfully","tid":22}
{"time":"2024-10-25T17:51:10.928975996Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/process/ptrace.NewTracedProgram","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/process/ptrace/ptrace_linux.go","line":125},"msg":"attach successfully","tid":23}
{"time":"2024-10-25T17:51:10.928999503Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/process/ptrace.NewTracedProgram","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/process/ptrace/ptrace_linux.go","line":125},"msg":"attach successfully","tid":24}
{"time":"2024-10-25T17:51:10.929040001Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/process.remoteAllocate","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/process/allocate.go","line":80},"msg":"Set memlock on process successfully"}
{"time":"2024-10-25T17:51:10.929202449Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/process/ptrace.(*TracedProgram).Madvise","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/process/ptrace/ptrace_linux.go","line":237},"msg":"Detected linux kernel version","version":"6.11.0"}
{"time":"2024-10-25T17:51:10.929305417Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/process/ptrace.(*TracedProgram).Mlock","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/process/ptrace/ptrace_linux.go","line":249},"msg":"mlock ret","ret":0}
{"time":"2024-10-25T17:51:10.929314121Z","level":"INFO","source":{"function":"go.opentelemetry.io/auto/internal/pkg/process.remoteAllocate.func1","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/process/allocate.go","line":70},"msg":"Detaching from process","pid":17}
{"time":"2024-10-25T17:51:10.929342031Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/process.Allocate","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/process/allocate.go","line":48},"msg":"mmaped remote memory","start_addr":"0x75f1ba500000","end_addr":"0x75f1ba540000"}
{"time":"2024-10-25T17:51:10.929354214Z","level":"INFO","source":{"function":"go.opentelemetry.io/auto.NewInstrumentation","file":"/usr/src/go.opentelemetry.io/auto/instrumentation.go","line":113},"msg":"target process analysis completed","pid":17,"go_version":"1.23.2","dependencies":{"go.opentelemetry.io/auto/sdk":"0.0.0-00010101000000-000000000000","go.opentelemetry.io/otel":"1.31.0","go.opentelemetry.io/otel/trace":"1.31.0","std":"1.23.2"},"total_functions_found":2}
{"time":"2024-10-25T17:51:10.929408942Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation.(*Manager).FilterUnusedProbes","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/manager.go","line":149},"msg":"no functions found for probe, removing","name":{"SpanKind":5,"InstrumentedPkg":"github.com/segmentio/kafka-go"}}
{"time":"2024-10-25T17:51:10.929431357Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation.(*Manager).FilterUnusedProbes","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/manager.go","line":149},"msg":"no functions found for probe, removing","name":{"SpanKind":3,"InstrumentedPkg":"go.opentelemetry.io/otel/internal/global"}}
{"time":"2024-10-25T17:51:10.929468012Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation.(*Manager).FilterUnusedProbes","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/manager.go","line":149},"msg":"no functions found for probe, removing","name":{"SpanKind":3,"InstrumentedPkg":"google.golang.org/grpc"}}
{"time":"2024-10-25T17:51:10.9294835Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation.(*Manager).FilterUnusedProbes","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/manager.go","line":149},"msg":"no functions found for probe, removing","name":{"SpanKind":2,"InstrumentedPkg":"google.golang.org/grpc"}}
{"time":"2024-10-25T17:51:10.929501227Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation.(*Manager).FilterUnusedProbes","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/manager.go","line":149},"msg":"no functions found for probe, removing","name":{"SpanKind":2,"InstrumentedPkg":"net/http"}}
{"time":"2024-10-25T17:51:10.929523081Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation.(*Manager).FilterUnusedProbes","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/manager.go","line":149},"msg":"no functions found for probe, removing","name":{"SpanKind":3,"InstrumentedPkg":"net/http"}}
{"time":"2024-10-25T17:51:10.929538895Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation.(*Manager).FilterUnusedProbes","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/manager.go","line":149},"msg":"no functions found for probe, removing","name":{"SpanKind":4,"InstrumentedPkg":"github.com/segmentio/kafka-go"}}
{"time":"2024-10-25T17:51:10.929549982Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation.(*Manager).FilterUnusedProbes","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/manager.go","line":149},"msg":"no functions found for probe, removing","name":{"SpanKind":3,"InstrumentedPkg":"database/sql"}}
{"time":"2024-10-25T17:51:10.929656218Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation.(*Manager).mount","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/manager.go","line":341},"msg":"Mounting bpffs","allocations_details":{"StartAddr":129681073373184,"EndAddr":129681073635328,"NumCPU":8}}
{"time":"2024-10-25T17:51:10.931431768Z","level":"INFO","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation.(*Manager).load","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/manager.go","line":326},"msg":"loading probe","name":{"SpanKind":3,"InstrumentedPkg":"go.opentelemetry.io/auto"}}
{"time":"2024-10-25T17:51:11.259044502Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation.(*Manager).load","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/manager.go","line":335},"msg":"loaded probes to memory","total_probes":1}
{"time":"2024-10-25T17:51:11.25925563Z","level":"INFO","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation.(*Manager).ConfigLoop","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/manager.go","line":239},"msg":"Configuration provider closed, configuration updates will no longer be received"}
{"time":"2024-10-25T17:51:11.259304659Z","level":"DEBUG","source":{"function":"main.(*App).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/test/runner/main.go","line":111},"msg":"sending signal to target"}
{"time":"2024-10-25T17:51:11.259315699Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":220},"msg":"reading record..."}
{"time":"2024-10-25T17:51:11.25946948Z","level":"DEBUG","source":{"function":"main.(*App).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/test/runner/main.go","line":113},"msg":"sent signal to target"}
{"time":"2024-10-25T17:51:11.259555677Z","level":"DEBUG","source":{"function":"main.(*App).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/test/runner/main.go","line":121},"msg":"waiting for command to complete"}
2024/10/25 17:51:11 got signal
{"time":"2024-10-25T17:51:11.261056598Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":222},"msg":"read record"}
{"time":"2024-10-25T17:51:11.261099443Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":230},"msg":"read bpf record","size":404,"lost":0}
{"time":"2024-10-25T17:51:11.261122631Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":237},"msg":"processing record..."}
{"time":"2024-10-25T17:51:11.26115058Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/bpf/go.opentelemetry.io/auto/sdk.(*converter).decodeEvent","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/bpf/go.opentelemetry.io/auto/sdk/probe.go","line":101},"msg":"decoded size","size":395}
{"time":"2024-10-25T17:51:11.261188538Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/bpf/go.opentelemetry.io/auto/sdk.(*converter).decodeEvent","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/bpf/go.opentelemetry.io/auto/sdk/probe.go","line":109},"msg":"decoded span data","size":395}
{"time":"2024-10-25T17:51:11.261680422Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":242},"msg":"processed record"}
2024/10/25 17:51:11 done
{"time":"2024-10-25T17:51:11.261716792Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":249},"msg":"sending event..."}
{"time":"2024-10-25T17:51:11.261753114Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":251},"msg":"sent event"}
{"time":"2024-10-25T17:51:11.261790031Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":220},"msg":"reading record..."}
{"time":"2024-10-25T17:51:11.261887456Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":222},"msg":"read record"}
{"time":"2024-10-25T17:51:11.261943319Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":230},"msg":"read bpf record","size":676,"lost":0}
{"time":"2024-10-25T17:51:11.261980021Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":237},"msg":"processing record..."}
{"time":"2024-10-25T17:51:11.262010378Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/bpf/go.opentelemetry.io/auto/sdk.(*converter).decodeEvent","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/bpf/go.opentelemetry.io/auto/sdk/probe.go","line":101},"msg":"decoded size","size":672}
{"time":"2024-10-25T17:51:11.262040241Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/bpf/go.opentelemetry.io/auto/sdk.(*converter).decodeEvent","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/bpf/go.opentelemetry.io/auto/sdk/probe.go","line":109},"msg":"decoded span data","size":672}
{"time":"2024-10-25T17:51:11.261941262Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/opentelemetry.(*Controller).Trace","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/opentelemetry/controller.go","line":56},"msg":"got event","kind":"client","pkg":"go.opentelemetry.io/auto","attrs":[],"traceID":"ac8d9700af917a5a8f2634ad0969abaf","spanID":"5f0a15116843ca4b"}
{"time":"2024-10-25T17:51:11.262150963Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/opentelemetry.(*Controller).Trace","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/opentelemetry/controller.go","line":75},"msg":"getting tracer","name":"go.opentelemetry.io/auto/internal/test/e2e/autosdk","version":"v1.23.42","schema":"https://some_schema"}
{"time":"2024-10-25T17:51:11.262797639Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":242},"msg":"processed record"}
{"time":"2024-10-25T17:51:11.262855992Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":249},"msg":"sending event..."}
{"time":"2024-10-25T17:51:11.262916473Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":251},"msg":"sent event"}
{"time":"2024-10-25T17:51:11.262964307Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":220},"msg":"reading record..."}
{"time":"2024-10-25T17:51:11.263004577Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":222},"msg":"read record"}
{"time":"2024-10-25T17:51:11.263039211Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":230},"msg":"read bpf record","size":1068,"lost":0}
{"time":"2024-10-25T17:51:11.263080665Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":237},"msg":"processing record..."}
{"time":"2024-10-25T17:51:11.263204186Z","level":"DEBUG","source":{"function":"main.(*App).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/test/runner/main.go","line":130},"msg":"closing instrumentation"}
{"time":"2024-10-25T17:51:11.263242708Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/bpf/go.opentelemetry.io/auto/sdk.(*converter).decodeEvent","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/bpf/go.opentelemetry.io/auto/sdk/probe.go","line":101},"msg":"decoded size","size":1063}
{"time":"2024-10-25T17:51:11.263304662Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/bpf/go.opentelemetry.io/auto/sdk.(*converter).decodeEvent","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/bpf/go.opentelemetry.io/auto/sdk/probe.go","line":109},"msg":"decoded span data","size":1063}
{"time":"2024-10-25T17:51:11.263096683Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/opentelemetry.(*Controller).Trace","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/opentelemetry/controller.go","line":56},"msg":"got event","kind":"client","pkg":"go.opentelemetry.io/auto","attrs":[{"Key":"user","Value":{"Type":"STRING","Value":"Alice"}},{"Key":"admin","Value":{"Type":"BOOL","Value":true}}],"traceID":"ac8d9700af917a5a8f2634ad0969abaf","spanID":"b33d1418c6661b76"}
{"time":"2024-10-25T17:51:11.263719408Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/opentelemetry.(*Controller).Trace","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/opentelemetry/controller.go","line":75},"msg":"getting tracer","name":"go.opentelemetry.io/auto/internal/test/e2e/autosdk","version":"v1.23.42","schema":"https://some_schema"}
{"time":"2024-10-25T17:51:11.263858557Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation.(*Manager).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/manager.go","line":286},"msg":"Shutting down all probes"}
{"time":"2024-10-25T17:51:11.264593922Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":242},"msg":"processed record"}
{"time":"2024-10-25T17:51:11.264638913Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":249},"msg":"sending event..."}
{"time":"2024-10-25T17:51:11.309678114Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation/probe.(*Base[...]).Close","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/probe/probe.go","line":286},"msg":"Closed","Probe":{"SpanKind":3,"InstrumentedPkg":"go.opentelemetry.io/auto"}}
{"time":"2024-10-25T17:51:11.312166116Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation.(*Manager).cleanup","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/manager.go","line":361},"msg":"Cleaning bpffs"}
{"time":"2024-10-25T17:51:11.312471724Z","level":"DEBUG","source":{"function":"go.opentelemetry.io/auto/internal/pkg/instrumentation.(*Manager).Run","file":"/usr/src/go.opentelemetry.io/auto/internal/pkg/instrumentation/manager.go","line":290},"msg":"Waiting for probes to stop..."}

(additional logging added)

Environment

  • OS: Linux
  • Go Version: 1.23.2
  • Version: 098f594

To Reproduce

Steps to reproduce the behavior:

  1. Run the autosdk example a few times.

Expected behavior

Clean shutdown

Additional context

The probes all send on a chan from the Manager:

When the manager is stopped it stops all the probes, waits for them to return, then closes that chan:

m.probeMu.Lock()
m.logger.Debug("Shutting down all probes")
err := m.cleanup(target)
// Wait for all probes to stop before closing the chan they send on.
m.runningProbesWG.Wait()
close(m.eventCh)
m.state = managerStateStopped
m.probeMu.Unlock()
return errors.Join(err, ctx.Err())

The issue is, the Probe may still be trying to write to the eventCh. Since there is no reader of this chan while the Manager is shutting down, that can result in a full chan and the probe indefinitely trying to make the write. This results in the Manager indefinitely waiting for the probe's to Run function to return and the gouroutine to decrement its WaitGroup:

@MrAlias MrAlias added the bug Something isn't working label Oct 25, 2024
@MrAlias MrAlias added this to the v0.17.0-alpha milestone Oct 25, 2024
@MrAlias MrAlias self-assigned this Oct 25, 2024
@MrAlias
Copy link
Contributor Author

MrAlias commented Oct 25, 2024

Two things:

  1. It would be best to adhere to the idiom: "the writer should close the chan". This would mean that a probe's Run func needs to return the chan it writes to and only closes it when it returns. This would also mean the Manager will need to dynamically multiplex chan selection.
  2. The close functionality needs to be done in a separate goroutine from the chan processing.

MrAlias added a commit to MrAlias/opentelemetry-go-instrumentation that referenced this issue Oct 25, 2024
Fix open-telemetry#1218

Close probes in a different goroutine than the one draining their event
channel.
MrAlias added a commit to MrAlias/opentelemetry-go-instrumentation that referenced this issue Oct 25, 2024
Fix open-telemetry#1218

Close probes in a different goroutine than the one draining their event
channel.
@MrAlias
Copy link
Contributor Author

MrAlias commented Oct 25, 2024

  1. It would be best to adhere to the idiom: "the writer should close the chan". This would mean that a probe's Run func needs to return the chan it writes to and only closes it when it returns. This would also mean the Manager will need to dynamically multiplex chan selection.

This should be addressed in #1219

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant