-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Emit metrics via otel instead of custom format #4762
base: master
Are you sure you want to change the base?
Conversation
Co-authored-by: Ramon Smits <ramon.smits@gmail.com>
b986142
to
14c3474
Compare
@lailabougria can you spot anything off in terms of how we emit the telemetry? naming etc |
new DetectSuccessfulRetriesEnricher(), | ||
new SagaRelationshipsEnricher() | ||
}.Concat(auditEnrichers).ToArray(); | ||
var enrichers = new IEnrichImportedAuditMessages[] { new MessageTypeEnricher(), new EnrichWithTrackingIds(), new ProcessingStatisticsEnricher(), new DetectNewEndpointsFromAuditImportsEnricher(endpointInstanceMonitoring), new DetectSuccessfulRetriesEnricher(), new SagaRelationshipsEnricher() }.Concat(auditEnrichers).ToArray(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andreasohlund update your Rider formatting so that this will not be put on a single line
readonly Meter batchSizeMeter; | ||
readonly Meter batchDurationMeter; | ||
readonly Counter receivedMeter; | ||
readonly Histogram<long> auditBatchSize = AuditMetrics.Meter.CreateHistogram<long>($"{AuditMetrics.Prefix}.batch_size_audits"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this called batch_size_audits
when the application is particular.servicecontrol.audit
?
readonly Counter<long> storedAudits = AuditMetrics.Meter.CreateCounter<long>($"{AuditMetrics.Prefix}.stored_audit_messages"); | ||
readonly Counter<long> storedSagas = AuditMetrics.Meter.CreateCounter<long>($"{AuditMetrics.Prefix}.stored_saga_audits"); | ||
readonly Histogram<double> auditBulkInsertDuration = AuditMetrics.Meter.CreateHistogram<double>($"{AuditMetrics.Prefix}.bulk_insert_duration_audits", unit: "ms"); | ||
readonly Histogram<double> sagaAuditBulkInsertDuration = AuditMetrics.Meter.CreateHistogram<double>($"{AuditMetrics.Prefix}.bulk_insert_duration_sagas", unit: "ms"); | ||
readonly Histogram<double> auditCommitDuration = AuditMetrics.Meter.CreateHistogram<double>($"{AuditMetrics.Prefix}.commit_duration_audits", unit: "ms"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are better off managing all these metrics in one class with different methods for signaling a counter.
This way such method could also increase multiple counters or one instance counter and a lenght counters.
This also makes it much clearer what actual metrics exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Especially because here its also not clear what the name will be. I think its better if all names are listed in a single location including there prefix/scope.
{ | ||
if (!Uri.TryCreate(settings.OtelMetricsUrl, UriKind.Absolute, out var otelMetricsUri)) | ||
{ | ||
throw new UriFormatException($"Invalid OtelMetricsUrl: {settings.OtelMetricsUrl}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was returned in a search:
The environment variable OTEL_EXPORTER_OTLP_ENDPOINT exists as a standardized way to specify the OpenTelemetry OTLP endpoint. By default, it points to http://localhost:4317 for gRPC and http://localhost:4318 for HTTP/protobuf.
There are also some related environment variables:OTEL_EXPORTER_OTLP_PROTOCOL for specifying the protocol (grpc, http/protobuf)
OTEL_EXPORTER_OTLP_HEADERS for additional headers
OTEL_EXPORTER_OTLP_TIMEOUT for request timeouts
Maybe instead we use a more verbose name?
OTEL_EXPORTER_OTLP_ENDPOINT
If not maybe immediately we should ensure that environment variable expansion is active so you could do:
<add key="ServiceControl.Audit/OTEL_EXPORTER_OTLP_ENDPOINT" value="%OTEL_EXPORTER_OTLP_ENDPOINT%" />
or for that matter:
SERVICECONTROL_AUDIT_OTEL_EXPORTER_OTLP_ENDPOINT=%OTEL_EXPORTER_OTLP_ENDPOINT%
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, while testing envvar expansion behavior I stumbled upon the settings reader:
ServiceControl/src/ServiceControl.Configuration/EnvironmentVariableSettingsReader.cs
Lines 37 to 52 in ea0cb32
// The 3 app namespaces can be loaded from env vars without the namespace, so settings like TransportType can be | |
// shared between all instances in a container .env file. Settings in all other namespaces must be fully specified. | |
if (CanSetWithoutNamespaceList.Contains(settingsNamespace)) | |
{ | |
var nameOnly = regex.Replace(name, "_"); | |
yield return nameOnly.ToUpperInvariant(); | |
yield return nameOnly; | |
} | |
} | |
static readonly HashSet<SettingsRootNamespace> CanSetWithoutNamespaceList = | |
[ | |
new SettingsRootNamespace("ServiceControl"), | |
new SettingsRootNamespace("ServiceControl.Audit"), | |
new SettingsRootNamespace("Monitoring") | |
]; |
This means we can't use it with a prefix as that logic would still load the value directly from OTEL_EXPORTER_OTLP_ENDPOINT
.
} | ||
builder.Services.AddOpenTelemetry() | ||
.ConfigureResource(b => b.AddService( | ||
serviceName: "Particular.ServiceControl.Audit", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to be added to AuditMetrics
static class AuditMetrics | ||
{ | ||
public const string MeterName = "Particular.ServiceControl"; | ||
public static readonly Meter Meter = new(MeterName, "0.1.0"); | ||
public static readonly string Prefix = "particular.servicecontrol.audit"; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should better define its intend:
- No need for Audit in the name as the namespace already is clear
- Its for OpenTelemetry
- These are constants
static class AuditMetrics | |
{ | |
public const string MeterName = "Particular.ServiceControl"; | |
public static readonly Meter Meter = new(MeterName, "0.1.0"); | |
public static readonly string Prefix = "particular.servicecontrol.audit"; | |
} | |
static class TelemetryGlobals | |
{ | |
public const string Source = "Particular.ServiceControl"; | |
public const string Scope = "particular.servicecontrol.audit"; | |
public static readonly Meter Meter = new(MeterName, "0.1.0"); | |
} |
@@ -109,6 +109,7 @@ public string RootUrl | |||
public int Port { get; set; } | |||
|
|||
public bool PrintMetrics => SettingsReader.Read<bool>(SettingsRootNamespace, "PrintMetrics"); | |||
public string OtelMetricsUrl { get; set; } = SettingsReader.Read<string>(SettingsRootNamespace, nameof(OtelMetricsUrl)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See other comment on OTEL_EXPORTER_OTLP_ENDPOINT
public string OtelMetricsUrl { get; set; } = SettingsReader.Read<string>(SettingsRootNamespace, nameof(OtelMetricsUrl)); | |
const string OTEL_EXPORTER_OTLP_ENDPOINT = nameof(OTEL_EXPORTER_OTLP_ENDPOINT); | |
[] | |
public string OtelExporterOltpEndpointUrl { get; init; } = SettingsReader.Read<string>(SettingsRootNamespace, OTEL_EXPORTER_OTLP_ENDPOINT); |
var auditSw = Stopwatch.StartNew(); | ||
await unitOfWork.RecordProcessedMessage(processedMessage, context.Body); | ||
auditBulkInsertDuration.Record(auditSw.ElapsedMilliseconds); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of things done in this method. Even multiple measurements.
I think we should consider doing the measures in the unitOfWork
methods itself. The measurement is then simply that whole method and isn't making this method even worse then it already is :-)
readonly Histogram<long> auditBatchSize = AuditMetrics.Meter.CreateHistogram<long>($"{AuditMetrics.Prefix}.batch_size_audits"); | ||
readonly Histogram<double> auditBatchDuration = AuditMetrics.Meter.CreateHistogram<double>($"{AuditMetrics.Prefix}.batch_duration_audits", unit: "ms"); | ||
readonly Histogram<double> auditMessageSize = AuditMetrics.Meter.CreateHistogram<double>($"{AuditMetrics.Prefix}.audit_message_size", unit: "kilobytes"); | ||
readonly Counter<long> receivedAudits = AuditMetrics.Meter.CreateCounter<long>($"{AuditMetrics.Prefix}.received_audits"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is something wrong with the naming here. THese shouldn't have a prefix like this:
readonly Counter<long> receivedAudits = AuditMetrics.Meter.CreateCounter<long>($"{AuditMetrics.Prefix}.received_audits"); | |
readonly Counter<long> receivedAudits = AuditMetrics.Meter.CreateCounter<long>( | |
"received_total", | |
description: "Total number of audit messages received" | |
); |
readonly Meter bulkInsertCommitDurationMeter; | ||
readonly IMessageSession messageSession; | ||
readonly Lazy<IMessageDispatcher> messageDispatcher; | ||
readonly Counter<long> storedAudits = AuditMetrics.Meter.CreateCounter<long>($"{AuditMetrics.Prefix}.stored_audit_messages"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, these prefixes shouldn't be used and isn't according to spec. The Meter
should define the prefix/scope AFAIK:
var meter = meterFactory.Create("Particular.ServiceControl.Audit");
stored_messages_total = meter.CreateCounter<int>("stored_messages_total");
or
var meter = meterFactory.Create("Particular.ServiceControl");
stored_messages_total = meter.CreateCounter<int>("audit.stored_messages_total");
but as we have seperate instance my preference would be the 1st.
This replaces the homegrown metrics in the audit instance with OTEL to allow for easier consumption in visualization tools like Prometheus, Azure Monitor etc