Kinesis/CloudWatch input is noisy when PutMetricData IAM permission missing #17648

Open
damianharouff opened this issue Dec 11, 2023 · 2 comments

damianharouff commented Dec 11, 2023

The following message was noted in a Cloud tenant's logs:

2023-12-11 19:45:19,485 WARN : software.amazon.kinesis.metrics.CloudWatchMetricsPublisher - Could not publish 20 datums to CloudWatch
software.amazon.awssdk.services.cloudwatch.model.CloudWatchException: User: arn:aws:iam::redacted:user/graylog is not authorized to perform: cloudwatch:PutMetricData because no identity-based policy allows the cloudwatch:PutMetricData action (Service: CloudWatch, Status Code: 403, Request ID: redacted)
    at software.amazon.awssdk.services.cloudwatch.model.CloudWatchException$BuilderImpl.build(CloudWatchException.java:104) ~[graylog.jar:?]
    at software.amazon.awssdk.services.cloudwatch.model.CloudWatchException$BuilderImpl.build(CloudWatchException.java:58) ~[graylog.jar:?]
    at software.amazon.awssdk.protocols.query.internal.unmarshall.AwsXmlErrorUnmarshaller.unmarshall(AwsXmlErrorUnmarshaller.java:99) ~[graylog.jar:?]
    at software.amazon.awssdk.protocols.query.unmarshall.AwsXmlErrorProtocolUnmarshaller.handle(AwsXmlErrorProtocolUnmarshaller.java:102) ~[graylog.jar:?]
    at software.amazon.awssdk.protocols.query.unmarshall.AwsXmlErrorProtocolUnmarshaller.handle(AwsXmlErrorProtocolUnmarshaller.java:82) ~[graylog.jar:?]
    at software.amazon.awssdk.core.http.MetricCollectingHttpResponseHandler.lambda$handle$0(MetricCollectingHttpResponseHandler.java:52) ~[graylog.jar:?]
    at software.amazon.awssdk.core.internal.util.MetricUtils.measureDurationUnsafe(MetricUtils.java:67) ~[graylog.jar:?]
    at software.amazon.awssdk.core.http.MetricCollectingHttpResponseHandler.handle(MetricCollectingHttpResponseHandler.java:52) ~[graylog.jar:?]
    at software.amazon.awssdk.core.internal.http.async.AsyncResponseHandler.lambda$prepare$0(AsyncResponseHandler.java:89) ~[graylog.jar:?]
    at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1150) ~[?:?]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
    at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147) ~[?:?]
    at software.amazon.awssdk.core.internal.http.async.AsyncResponseHandler$BaosSubscriber.onComplete(AsyncResponseHandler.java:132) ~[graylog.jar:?]
    at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler$DataCountingPublisher$1.onComplete(ResponseHandler.java:515) ~[graylog.jar:?]
    at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.runAndLogError(ResponseHandler.java:250) ~[graylog.jar:?]
    at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.access$600(ResponseHandler.java:75) ~[graylog.jar:?]
    at software.amazon.awssdk.http.nio.netty.internal.ResponseHandler$PublisherAdapter$1.onComplete(ResponseHandler.java:371) ~[graylog.jar:?]
    at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.publishMessage(HandlerPublisher.java:402) ~[graylog.jar:?]
    at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.flushBuffer(HandlerPublisher.java:338) ~[graylog.jar:?]
    at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.receivedDemand(HandlerPublisher.java:291) ~[graylog.jar:?]
    at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.access$200(HandlerPublisher.java:61) ~[graylog.jar:?]
    at software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher$ChannelSubscription$1.run(HandlerPublisher.java:495) ~[graylog.jar:?]
    at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) ~[graylog.jar:?]
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) ~[graylog.jar:?]
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[graylog.jar:?]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:566) ~[graylog.jar:?]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[graylog.jar:?]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[graylog.jar:?]
    at java.lang.Thread.run(Thread.java:833) [?:?]

Per Graylog2/graylog-plugin-integrations#293 (comment), this is not a blocker and is unlikely to affect message retrieval; indeed, the input appears to be functioning normally.

However, users may not want to grant the cloudwatch:PutMetricData IAM permission, so writing metrics back into CloudWatch should be a toggle on the input. This affects both Cloud and on-prem installations, and the warning can be quite noisy. In the Cloud tenant where it was noted, it creates about 24 log entries per minute, every minute:

image
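The toggle proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions: `MetricsSink`, `NoOpSink`, `RecordingSink`, and `selectSink` are hypothetical names, not Graylog or KCL APIs, and the `publishMetrics` flag stands in for a setting the input does not currently have. KCL 2.x appears to ship a `NullMetricsFactory` in `software.amazon.kinesis.metrics` that could play the no-op role, but how the input would wire it in is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of an input-level metrics toggle; no real Graylog or KCL types are used. */
final class MetricsToggleSketch {

    /** Stand-in for whatever component publishes datums to CloudWatch. */
    interface MetricsSink {
        void publish(String metricName, double value);
    }

    /** Used when the toggle is off: drops every datum and makes no AWS call. */
    static final class NoOpSink implements MetricsSink {
        @Override
        public void publish(String metricName, double value) {
            // Intentionally empty: nothing is sent, so no 403 can be logged.
        }
    }

    /** Stand-in for the real publisher; it records datums instead of calling CloudWatch. */
    static final class RecordingSink implements MetricsSink {
        final List<String> published = new ArrayList<>();

        @Override
        public void publish(String metricName, double value) {
            published.add(metricName + "=" + value);
        }
    }

    /** Picks the sink from the (hypothetical) input setting. */
    static MetricsSink selectSink(boolean publishMetrics, MetricsSink realSink) {
        return publishMetrics ? realSink : new NoOpSink();
    }

    public static void main(String[] args) {
        RecordingSink real = new RecordingSink();

        // Toggle off: the datum is dropped before it reaches the real publisher.
        selectSink(false, real).publish("RecordsProcessed", 20.0);
        System.out.println("published with toggle off: " + real.published.size());

        // Toggle on: the datum reaches the real publisher.
        selectSink(true, real).publish("RecordsProcessed", 20.0);
        System.out.println("published with toggle on: " + real.published.size());
    }
}
```

Selecting the sink once, when the input is configured, keeps the per-datum path free of permission checks and means a disabled input never attempts the PutMetricData call at all.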

Another idea would be to check whether the permission exists when the input starts, and to skip publishing metrics if it is not available.
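A minimal sketch of that start-up check, assuming the check is done by attempting a single probe publish and treating an authorization failure as "metrics off". All names here (`PermissionProbeSketch`, `MetricsSink`, `AccessDeniedException`, `metricsAllowed`) are hypothetical; the exception merely stands in for the 403 `CloudWatchException` seen in the log above.

```java
/** Hypothetical sketch of a one-time permission probe at input start. */
final class PermissionProbeSketch {

    /** Stand-in for the 403 returned when cloudwatch:PutMetricData is not allowed. */
    static final class AccessDeniedException extends RuntimeException {
        AccessDeniedException(String message) {
            super(message);
        }
    }

    /** Stand-in for whatever component publishes datums to CloudWatch. */
    interface MetricsSink {
        void publish(String metricName, double value);
    }

    /**
     * Attempts one probe publish. Returns true if publishing should stay enabled,
     * false if the permission is missing. The denial is logged once, here,
     * instead of on every publish cycle.
     */
    static boolean metricsAllowed(MetricsSink sink) {
        try {
            sink.publish("startup-probe", 0.0);
            return true;
        } catch (AccessDeniedException e) {
            System.err.println("Metrics publishing disabled: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        MetricsSink denied = (name, value) -> {
            throw new AccessDeniedException("not authorized to perform: cloudwatch:PutMetricData");
        };
        MetricsSink permitted = (name, value) -> { };

        System.out.println("denied sink allowed? " + metricsAllowed(denied));
        System.out.println("permitted sink allowed? " + metricsAllowed(permitted));
    }
}
```

One caveat with this design: a probe that really publishes would itself create a metric when the permission is present. Simulating the policy through IAM instead would avoid that, but it requires its own permission (iam:SimulatePrincipalPolicy), so neither variant is free of trade-offs.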

Your Environment

  • Graylog Version: Graylog Cloud 5.2.2 (388)

damianharouff commented Jan 10, 2024

The tenant in question did indeed point out that this permission is not listed in our least-privilege permissions block:

“I don’t see that permission listed in the docs for 5.2 under the least privilege policy section.
We did not apply the permissions for the automatic setup flow as it’s a very privileged policy.
https://go2docs.graylog.org/5-2/getting_in_log_data/aws_kinesis_cloudwatch_input.html#Permissi
I can add that permission for our user, however.”

The alert messages stopped after the customer added the PutMetricData permission.

@der-eismann

In my eyes this is still an issue. We looked at our billing and, because our Graylog node IDs change constantly (we run Graylog as a Kubernetes deployment), we generated almost 100k CloudWatch metrics, costing us ~40 US$/month. That is a small amount for an organization, but it is still wasted money if you don't need the metrics.

Image

We have now removed the permission (it's still not listed as required in the manual flow), but as a result our logs are getting spammed with "is not authorized to perform: cloudwatch:PutMetricData" errors. It would be great if one could simply disable metrics for this input.
