This is a short demo of two instrumented Rust toy microservices that send traces to Tempo over the OTLP protocol and use the Jaeger propagation format.
Make sure all files in the `conf` directory are world-readable:

```shell
chmod 0644 ./conf/*
```

Then start all docker containers:

```shell
docker compose up --abort-on-container-exit --build --force-recreate [--timestamps]
```
You can start by visiting the un-instrumented microservice 1 at http://uservice1.localhost/ in a browser and watching the `docker compose` logs for the call to `uservice2`. Before forwarding the HTTP request to microservice 1, Traefik adds an `uber-trace-id` header. You can observe this in the `docker compose` logs:

```
grafana-tempo-rust-demo-uservice1-1  | Request: {
grafana-tempo-rust-demo-uservice1-1  |     "host": "uservice1.localhost",
grafana-tempo-rust-demo-uservice1-1  |     ...
grafana-tempo-rust-demo-uservice1-1  |     "uber-trace-id": "5737c004a641f5eb:1f1e2d6167105390:5737c004a641f5eb:1",
grafana-tempo-rust-demo-uservice1-1  |     ...
grafana-tempo-rust-demo-uservice1-1  | }
```

You will notice that the request takes anywhere from ~1100 milliseconds to ~2200 milliseconds (the sum of the mock Redis and database delays described below). Also note that the operator, without examining log timestamps, has no way of knowing which part of the internal flow delayed the response to the caller.
You can continue by visiting the instrumented microservice 3 at http://uservice3.localhost/ in a browser and watching the `docker compose` logs for the call to `uservice4` as well as the `uber-trace-id` headers. You can now visit Grafana -> Explore -> Tempo -> Search and search for the trace that was generated. You can explore the trace's spans in the right Tempo pane. Note that, by looking at the span information, you can deduce how long each of the mock calls to Redis and the database took.
These services comprise the infrastructure part of this demo. We won't go into their detailed configuration in this document, except for a few key points:
- Tempo has OTLP and Jaeger listeners configured in `conf/tempo.yaml`:

  ```yaml
  distributor:
    receivers:
      otlp: # for microservices
        protocols:
          grpc:
      jaeger: # for Traefik
        protocols:
          thrift_http:
  ```
- Traefik is configured to generate spans and forward them to Tempo in `docker-compose.yaml`:

  ```yaml
  services:
    ...
    traefik:
      ...
        - "--tracing.jaeger=true"
        - "--tracing.jaeger.collector.endpoint=http://tempo:14268/api/traces?format=jaeger.thrift"
  ```
- `uservice1`

  `uservice1` waits for an incoming HTTP request. Upon receipt it prints the request details, runs a mock Redis command lasting 100-199 milliseconds and calls `uservice2`. It then prints `uservice2`'s response and responds to the caller with the contents of `uservice1/hello-world.html`.
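  A minimal sketch of this flow, assuming a `hyper`-based handler like the instrumented services use (the `uservice2` address and the file path below are illustrative assumptions, not the project's actual values):

  ```rust
  // Hypothetical sketch of uservice1's un-instrumented handler: print the
  // request, pretend to call Redis, forward to uservice2, then answer with
  // the static hello-world.html page.
  async fn handle(
      req: hyper::Request<hyper::Body>,
  ) -> Result<hyper::Response<hyper::Body>, std::convert::Infallible> {
      use rand::Rng;

      println!("Request: {:#?}", req.headers());

      // mock redis call lasting 100-199 ms (blocking sleep, mirroring the demo)
      std::thread::sleep(std::time::Duration::from_millis(
          rand::thread_rng().gen_range(100..=199),
      ));

      // call uservice2 and print its response (address is an assumption)
      let client = hyper::Client::new();
      let resp = client
          .get("http://uservice2:3000/".parse().unwrap())
          .await
          .unwrap();
      println!("uservice2 responded: {:?}", resp.status());

      // respond with the contents of hello-world.html (path is an assumption)
      let body = std::fs::read_to_string("hello-world.html").unwrap();
      Ok(hyper::Response::new(hyper::Body::from(body)))
  }
  ```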
- `uservice2`

  `uservice2` waits for an incoming HTTP request. Upon receipt it prints the request details, does a mock database query lasting 1000-1999 milliseconds and then responds to the caller with the contents of `uservice2/hello-world.html`.
- `uservice3`

  `uservice3` is the same as microservice 1 with some example instrumentation added.

- `main()`

  ```rust
  opentelemetry::global::set_text_map_propagator(opentelemetry_jaeger::Propagator::default());

  let resources = OsResourceDetector.detect(std::time::Duration::from_secs(0))
      .merge(&ProcessResourceDetector.detect(std::time::Duration::from_secs(0)))
      .merge(&Resource::new(vec![opentelemetry::KeyValue::new("service.name", "uservice3")]));

  opentelemetry_otlp::new_pipeline()
      .tracing()
      .with_exporter(opentelemetry_otlp::new_exporter().tonic())
      .with_trace_config(
          opentelemetry::sdk::trace::config()
              .with_id_generator(opentelemetry::sdk::trace::RandomIdGenerator::default())
              .with_sampler(opentelemetry::sdk::trace::Sampler::AlwaysOn)
              .with_resource(resources),
      )
      .install_simple()
      .unwrap();
  ```
  - `opentelemetry::global::set_text_map_propagator(opentelemetry_jaeger::Propagator::default())` sets up the Jaeger native propagation format. This is used later to extract the `uber-trace-id` HTTP header from incoming requests and initialize a tracing context:

    ```rust
    let parent_cx = opentelemetry::global::get_text_map_propagator(|propagator| {
        propagator.extract(&HeaderExtractor(req.headers()))
    });
    ```

    and also to inject a tracing context via the same HTTP header into outgoing requests:

    ```rust
    opentelemetry::global::get_text_map_propagator(|propagator| {
        propagator.inject_context(&cx, &mut HeaderInjector(req.headers_mut().unwrap()))
    });
    ```

  - `OsResourceDetector.detect(std::time::Duration::from_secs(0))` adds an `os.type=linux` resource to generated spans. Similarly, `ProcessResourceDetector.detect(std::time::Duration::from_secs(0))` adds `process.command_args` and `process.pid` resources. These can be used in the Tempo query to filter traces, as shown in the screenshot above.

  - `Resource::new(vec![opentelemetry::KeyValue::new("service.name", "uservice3")])` sets the microservice name displayed by Tempo.

  - `opentelemetry_otlp::new_pipeline()` sets up OpenTelemetry to forward spans via the OTLP protocol to Tempo. Tempo's address is determined via the `OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://tempo:4317` environment variable, defined in `docker-compose.yaml`. There is also an `opentelemetry::sdk::export::trace::stdout` example, to aid with debugging, that writes spans to stdout instead of sending them to Tempo; a sketch of such a pipeline follows this list.

  - `.with_trace_config()` sets some properties of the generated spans suitable for this demo, which may or may not be useful in a production environment. Specifically, `.with_id_generator(opentelemetry::sdk::trace::RandomIdGenerator::default())` sets up a random ID generator for the spans, and `.with_sampler(opentelemetry::sdk::trace::Sampler::AlwaysOn)` ensures that all generated spans are sent to Tempo (no sampling).

  - Finally, `install_simple()` makes OpenTelemetry ship spans to Tempo one-by-one as they are generated; it should not be used in production environments, where `install_batch()` ought to be more appropriate. Some more options can be found in Kitchen Sink Full Configuration.
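  The repository's stdout variant is not reproduced here; a minimal sketch of such a debugging pipeline, assuming the same `opentelemetry` SDK version as the snippet above, could look like this:

  ```rust
  // Hypothetical debugging setup: print spans to stdout instead of exporting
  // them to Tempo over OTLP; swap this in for the opentelemetry_otlp pipeline.
  let _tracer = opentelemetry::sdk::export::trace::stdout::new_pipeline()
      .with_trace_config(
          opentelemetry::sdk::trace::config()
              .with_sampler(opentelemetry::sdk::trace::Sampler::AlwaysOn),
      )
      .install_simple();
  ```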
- `handle_incoming_call()`

  Incoming requests are handled by `async fn handle_incoming_call(req: Request<Body>) -> Result<Response<Body>, Infallible>`:

  ```rust
  async fn handle_incoming_call(req: Request<Body>) -> Result<Response<Body>, Infallible> {
      let parent_cx = opentelemetry::global::get_text_map_propagator(|propagator| {
          propagator.extract(&HeaderExtractor(req.headers()))
      });
      let tracer = opentelemetry::global::tracer("uservice3");
      let span = tracer
          .span_builder("handle_incoming_call")
          .with_kind(opentelemetry::trace::SpanKind::Server)
          .start_with_context(&tracer, &parent_cx);
      let cx = Context::current_with_span(span);

      println!("Request: {:#?}", req.headers());
      // the events, status and call to call_redis() discussed below also live here
  }
  ```

  The code here extracts a context from the incoming HTTP request. This is needed so that the new span created here (kind: `Server`) is a child span of Traefik's `Client` span (OpenTelemetry - SpanKind explains the different kinds of spans). This is what allows Tempo to determine that both Traefik's span and `uservice3`'s span are part of a single trace and display them nested.

  Looking at the stdout tracer's output mentioned above may help you understand this more clearly:

  ```
  grafana-tempo-rust-demo-uservice3-1  | SpanData {
  grafana-tempo-rust-demo-uservice3-1  |     span_context: SpanContext {
  grafana-tempo-rust-demo-uservice3-1  |         trace_id: 0000000000000000771a728620b4bd35,
  grafana-tempo-rust-demo-uservice3-1  |         span_id: 7f8daf512c6da6da,
  grafana-tempo-rust-demo-uservice3-1  |         ...
  grafana-tempo-rust-demo-uservice3-1  |     },
  grafana-tempo-rust-demo-uservice3-1  |     parent_span_id: 36996754408fb4a8,
  grafana-tempo-rust-demo-uservice3-1  |     span_kind: Server,
  grafana-tempo-rust-demo-uservice3-1  |     ...
  grafana-tempo-rust-demo-uservice3-1  | }
  ```

  ```rust
  cx.span().add_event(
      "Request decoded!",
      vec![opentelemetry::KeyValue::new("happened", true)],
  );
  ```

  `cx.span().add_event()` adds an event within the span.

  ```rust
  cx.span()
      .set_status(opentelemetry::trace::Status::error("no error"));
  ```

  `cx.span().set_status()` sets the span's status.

  ```rust
  call_redis().with_context(cx).await;
  ```

  Finally, `call_redis()` is called and the current context is passed on, so that any new spans will be created as children of the current one. More details about this can be found in OpenTelemetry - async active spans. Upon completion of `handle_incoming_call` the span is `end()`ed.
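  The excerpt above moves `cx` into `call_redis()`; a minimal sketch of ending the handler's span explicitly, assuming the context is cloned for the child call, might be:

  ```rust
  // assumed variant: clone the context for the child call so the handler can
  // still end its own span explicitly once the child work has finished
  call_redis().with_context(cx.clone()).await;
  cx.span().end();
  ```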
- `call_redis()`

  ```rust
  async fn call_redis() {
      let tracer = opentelemetry::global::tracer("uservice3");
      let span = tracer
          .span_builder("redis_call")
          .with_kind(opentelemetry::trace::SpanKind::Internal)
          .start(&tracer);
      let cx = Context::current_with_span(span);

      cx.span()
          .set_attribute(KeyValue::new("test-attribute-1", "test-values-1"));

      // mock redis call lasting 100-199 ms
      std::thread::sleep(std::time::Duration::from_millis(
          rand::thread_rng().gen_range(100..199),
      ));

      call_other_uservice().with_context(cx).await;
  }
  ```

  `call_redis()` creates a child span, sets an example attribute and passes the context on to `call_other_uservice()`.
- `call_other_uservice()`

  `call_other_uservice()` works similarly to the above, with the addition that it injects the current context into the HTTP headers of the call to `uservice4` with this code:

  ```rust
  opentelemetry::global::get_text_map_propagator(|propagator| {
      propagator.inject_context(&cx, &mut HeaderInjector(req.headers_mut().unwrap()))
  });
  ```
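  For context, a minimal sketch of how this injection might sit around the outgoing request (the `hyper` request builder and the `uservice4` address are illustrative assumptions, not the project's actual code):

  ```rust
  // Hypothetical sketch: build the outgoing request to uservice4, inject the
  // current tracing context into its headers, then send it.
  let mut req = hyper::Request::builder()
      .method(hyper::Method::GET)
      .uri("http://uservice4:3000/"); // assumed address

  opentelemetry::global::get_text_map_propagator(|propagator| {
      propagator.inject_context(&cx, &mut HeaderInjector(req.headers_mut().unwrap()))
  });

  let client = hyper::Client::new();
  let resp = client
      .request(req.body(hyper::Body::empty()).unwrap())
      .await
      .unwrap();
  println!("uservice4 responded: {:?}", resp.status());
  ```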
- `uservice4`

  `uservice4` is the same as `uservice2` with some example instrumentation added, similarly to `uservice3`.
- uservice1 - direct
- uservice1 - through Traefik
- uservice2 - direct
- uservice2 - through Traefik
- uservice3 - direct
- uservice3 - through Traefik
- uservice4 - direct
- uservice4 - through Traefik
- The implementation above is a minimal implementation, for demo purposes only, and is not production-ready.
- It looks like Traefik will support sending traces to Tempo with the OTLP protocol from version 3.0 onwards.
- `docker-compose.yaml` is based on the official Tempo docker-compose 'Local Storage' example and the Traefik docker compose example.
- `uservice1`, `uservice2`, `uservice3` and `uservice4` are based on The Rust Programming Language (the book) and
- To create a `README.md` from `README.org`, you can run:

  ```shell
  pandoc --from org --to gfm --standalone --toc README.org --output README.md
  ```