-
Notifications
You must be signed in to change notification settings - Fork 790
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[sdk-metrics] Refactor and improve interlocking code for doubles in MetricPoint #5378
[sdk-metrics] Refactor and improve interlocking code for doubles in MetricPoint #5378
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #5378 +/- ##
==========================================
- Coverage 83.38% 83.36% -0.03%
==========================================
Files 297 278 -19
Lines 12531 11936 -595
==========================================
- Hits 10449 9950 -499
+ Misses 2082 1986 -96
Flags with carried forward coverage won't be shown. Click here to find out more.
|
public static double Read(ref double location) | ||
=> Interlocked.CompareExchange(ref location, double.NaN, double.NaN); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need this to be a CompareExchange rather than just Volatile.Read? You need a full fence for some reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code before was...
this.snapshotValue.AsDouble = Interlocked.CompareExchange(ref this.runningValue.AsDouble, 0.0, double.NegativeInfinity);
Now it is calling this helper...
this.snapshotValue.AsDouble = InterlockedHelper.Read(ref this.runningValue.AsDouble);
What that code needs is no stale reads. Does Interlocked.CompareExchange
(full fence) give us 100% guaranteed up-to-date values? My understanding with Volatile.Read
is it could give us back something stale.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What that code needs is no stale reads.
Define stale. Volatile.Read will ensure it's not just using some enregistered value. But beyond that, I don't see how CompareExchange(ref location, sentinel, sentinel) would provide strong guarantees about "up-to-date". The exact moment after you read the value but before you do anything with it it could be updated concurrently, at which point the value isn't "up to date".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How OpenTelemetry SDK is working is many threads are recording values. For this double case it is either of these:
case AggregationType.DoubleSumIncomingDelta:
{
InterlockedHelper.Add(ref this.runningValue.AsDouble, number);
break;
}
case AggregationType.DoubleSumIncomingCumulative:
{
Interlocked.Exchange(ref this.runningValue.AsDouble, number);
break;
}
There is a dedicated thread which does a collection operation periodically. Its job is to grab a snapshot of the current/running state. That is this block:
case AggregationType.DoubleSumIncomingDelta:
case AggregationType.DoubleSumIncomingCumulative:
{
if (outputDelta)
{
double initValue = InterlockedHelper.Read(ref this.runningValue.AsDouble);
this.snapshotValue.AsDouble = initValue - this.deltaLastValue.AsDouble;
this.deltaLastValue.AsDouble = initValue;
this.MetricPointStatus = MetricPointStatus.NoCollectPending;
// Check again if value got updated, if yes reset status.
// This ensures no Updates get Lost.
if (initValue != InterlockedHelper.Read(ref this.runningValue.AsDouble))
{
this.MetricPointStatus = MetricPointStatus.CollectPending;
}
}
else
{
this.snapshotValue.AsDouble = InterlockedHelper.Read(ref this.runningValue.AsDouble);
}
break;
}
The delta case needs to compute the change since the last collect:
double initValue = InterlockedHelper.Read(ref this.runningValue.AsDouble);
this.snapshotValue.AsDouble = initValue - this.deltaLastValue.AsDouble;
this.deltaLastValue.AsDouble = initValue;
It's OK if the value changes right before or right after but we need to make sure whatever we got was true and not some cached or local thing. Does that help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's OK if the value changes right before or right after but we need to make sure whatever we got was true and not some cached or local thing.
I still don't understand. With a CompareExchange (that's not also updating anything atomically, just reading), you might read the value, then the thread might be context switched out, and 10 minutes later it might start running again and you continue with the 10 minute old value. What's the difference between that and "some cached" thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is our version for long
measurements:
case AggregationType.LongSumIncomingDelta:
case AggregationType.LongSumIncomingCumulative:
{
if (outputDelta)
{
long initValue = Interlocked.Read(ref this.runningValue.AsLong);
this.snapshotValue.AsLong = initValue - this.deltaLastValue.AsLong;
this.deltaLastValue.AsLong = initValue;
this.MetricPointStatus = MetricPointStatus.NoCollectPending;
// Check again if value got updated, if yes reset status.
// This ensures no Updates get Lost.
if (initValue != Interlocked.Read(ref this.runningValue.AsLong))
{
this.MetricPointStatus = MetricPointStatus.CollectPending;
}
}
else
{
this.snapshotValue.AsLong = Interlocked.Read(ref this.runningValue.AsLong);
}
break;
}
Also uses Interlocked.Read
, do you think it would be better to use Volatile.Read
there too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is our main motivation behind doing our best to capture as many updates as possible before running collect. Any updates that we miss, would only be reported in the next collect cycle (which would be like a minute later).
Can you share the data you have that demonstrates using CompareExchange is actually improving this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also uses Interlocked.Read, do you think it would be better to use Volatile.Read there too?
Interlocked.Read(ref value)
is just CompareExchange(ref value, 0, 0)
, so it's effectively the same thing you're doing with the doubles. Personally, I'd just use Volatile.Read
. I'm really interested in seeing the data that suggests there's any meaningful benefit to using CompareExchange instead in these scenarios; otherwise, I'd stick with the idiomatic answer that's also cheaper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you share the data you have that demonstrates using CompareExchange is actually improving this?
We haven't tested this, and we most likely would not anytime soon. It would require elaborate testing setups with different kinds of hardware. That might not be the best use of our time. Ideally, we would like to rely on the official docs for this kind of information instead of running experiments ourselves and gathering this kind of data.
In this particular case, our understanding/assumption is in line with what you mentioned in one of your previous comments:
Using an Interlocked.CompareExchange here may give you a value more synchronized with the other uses of those variables, but that's because it's actually synchronizing,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My guess is you'd find it makes essentially zero difference, other than slowing down the writing code.
Changes
double
values inMetricPoint
into a helper and improved them slightly. With help from @stephentoub.