Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sdk-metrics] Refactor exemplar offer duplicated code into helper methods inside MetricPoint #5399

Conversation

CodeBlanch
Copy link
Member

Changes

  • Fix a few more violations of DRY in MetricPoint by introducing helper methods for exemplar offer code.

@CodeBlanch CodeBlanch added pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package metrics Metrics signal related labels Feb 28, 2024
@CodeBlanch CodeBlanch requested a review from a team February 28, 2024 19:44
Copy link

codecov bot commented Feb 28, 2024

Codecov Report

Attention: Patch coverage is 95.55556% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 84.57%. Comparing base (6250307) to head (d59a6af).
Report is 105 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5399      +/-   ##
==========================================
+ Coverage   83.38%   84.57%   +1.19%     
==========================================
  Files         297      284      -13     
  Lines       12531    12086     -445     
==========================================
- Hits        10449    10222     -227     
+ Misses       2082     1864     -218     
Flag Coverage Δ
unittests ?
unittests-Solution-Experimental 84.56% <95.55%> (?)
unittests-Solution-Stable 84.46% <95.55%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
src/OpenTelemetry/Metrics/AggregatorStore.cs 82.65% <100.00%> (+2.26%) ⬆️
src/OpenTelemetry/Metrics/MetricPoint.cs 92.27% <95.34%> (+23.80%) ⬆️

... and 53 files with indirect coverage changes

this.mpComponents.ExemplarReservoir!.Offer(
new ExemplarMeasurement<long>(number, tags));
}
this.OfferExemplarIfSampled(number, tags, isSampled);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should no point in calling UpdateWithExemplar method with isSampled == false.

@CodeBlanch Could you refactor the method signature first to remove isSampled as a parameter? We would also have to update the call sites to only call UpdateWithExemplar when the ExemplarFilter returns true.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed! The JIT is smart enough to do that when it inlines everything. Check it out: SharpLab

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have the extra code though? We never want to call ExemplarReservoir.Offer when isSampled == false.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry I thought you were talking about the call sites in UpdateWithExemplar. I agree with you and I plan to completely remove UpdateWithExemplar by the time I'm done with this effort 😄 That's what is on #5364. But let me get this in and then I'll keep getting there incrementally.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should first update the call site and signature of UpdateWithExemplar method to remove isSampled parameter. Remove unnecessary code first and then make further refactoring changes.

The method introduced in this PR is anyway going to add unnecessary code now which we know for sure needs to be changed in a future PR.

OfferExemplarIfSampled: The offer should only be made when the measurement is sampled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually had to revert it. Things broke: https://github.com/open-telemetry/opentelemetry-dotnet/actions/runs/8088019825/job/22101191749?pr=5399

Doing this...

            if (this.IsExemplarEnabled() && this.exemplarFilter.ShouldSample(value, tags))
            {
                this.metricPoints[index].UpdateWithExemplar(value, tags: default);
            }
            else
            {
                this.metricPoints[index].Update(value);
            }

...seems to cause the unsampled stuff to flow with different locking mechanics. Yikes!

I do want to tackle this, but not here. So I'll give you two options:

  1. We can merge this incremental refactor and I can get back to it.
  2. We can close this. May or may not lead to a bigger PR in the future. The point of these small refactors was to reduce code duplication in order to make future refactoring PRs smaller 😄

Copy link
Contributor

@utpilla utpilla Feb 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...seems to cause the unsampled stuff to flow with different locking mechanics. Yikes!

That's exactly what we want. Why should measurements that are unsampled by the ExemplarFilter take the slower update path (which acquires a lock) instead of the faster update path (which uses Interlocked statements)? The slower path made sense only when we had to call ExemplarReservoir.Offer in the update logic as it had to be synchronized with ExemplarReservoir.Collect.

I tried this refactoring in #5401 and it seems to work.

Copy link
Contributor

@utpilla utpilla Feb 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's exactly what we want.

I take it back. Actually, it's not what we want 😄. That would cause issues when there are two concurrent updates: one which gets sampled and one that does not.

If at all the user has enabled exemplars, we have to take the slow path. We need to keep isSampled parameter for now. Sorry for the confusion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah there was an issue with my refactor. I see it now why it broke things. But I still think it is a bad idea.

Consider the case of like DoubleSumIncomingDelta we'll have some threads running this (when exemplar is not sampled)...

InterlockedHelper.Add(ref this.runningValue.AsDouble, number);

...and some threads running this (when exemplar is sampled)...

                    this.mpComponents!.AcquireLock();

                    unchecked
                    {
                        this.runningValue.AsDouble += number;
                    }

                    this.OfferExemplarIfSampled(number, tags, isSampled);

                    this.mpComponents.ReleaseLock();

That second block is really this...

                    this.mpComponents!.AcquireLock();

                    unchecked
                    {
                        var temp = this.runningValue.AsDouble;
                        this.runningValue.AsDouble = temp + 1;
                    }

                    this.OfferExemplarIfSampled(number, tags, isSampled);

                    this.mpComponents.ReleaseLock();

Imagine this case...

                    this.mpComponents!.AcquireLock();

                    unchecked
                    {
                        var temp = this.runningValue.AsDouble;
                        // Thread here yields for a while. A bunch of InterlockedHelper.Add(ref this.runningValue.AsDouble, number) calls fire in the meantime.
                        this.runningValue.AsDouble = temp + 1; // We have now undone the interlocked calls
                    }

                    this.OfferExemplarIfSampled(number, tags, isSampled);

                    this.mpComponents.ReleaseLock();

See the comment about the yield. If you have some threads updating the value not using the lock, and some threads expecting the lock, we're going to have a race. No?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If at all the user has enabled exemplars, we have to take the slow path. We need to keep isSampled parameter for now. Sorry for the confusion.

No worries! I wish we could do it 😄 My goal is to get us there somehow!

Comment on lines +1348 to +1357
if (typeof(T) == typeof(long))
{
this.mpComponents!.ExemplarReservoir!.Offer(
new ExemplarMeasurement<long>((long)(object)number, tags));
}
else if (typeof(T) == typeof(double))
{
this.mpComponents!.ExemplarReservoir!.Offer(
new ExemplarMeasurement<double>((double)(object)number, tags));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this work?

Suggested change
if (typeof(T) == typeof(long))
{
this.mpComponents!.ExemplarReservoir!.Offer(
new ExemplarMeasurement<long>((long)(object)number, tags));
}
else if (typeof(T) == typeof(double))
{
this.mpComponents!.ExemplarReservoir!.Offer(
new ExemplarMeasurement<double>((double)(object)number, tags));
}
if (typeof(T) == typeof(long) || typeof(T) == typeof(double))
{
this.mpComponents!.ExemplarReservoir!.Offer(
new ExemplarMeasurement<T>(number, tags));
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly, no. ExemplarReservoir.Offer has two overloads:

    public abstract void Offer(in ExemplarMeasurement<long> measurement);
    public abstract void Offer(in ExemplarMeasurement<double> measurement);

The code as you have it there the compiler doesn't know which one you want to call so it (more or less) guesses the first one and spits out:

Error CS1503 Argument 1: cannot convert from 'OpenTelemetry.Metrics.ExemplarMeasurement' to 'in OpenTelemetry.Metrics.ExemplarMeasurement'

@CodeBlanch CodeBlanch merged commit 77ef123 into open-telemetry:main Feb 29, 2024
37 checks passed
@CodeBlanch CodeBlanch deleted the sdk-metricpoint-exemplarofferrefactor branch February 29, 2024 00:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
metrics Metrics signal related pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants