-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HPCC-32734 Add ElasticSearch metric sink configuration #19247
Conversation
Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-32734 Jirabot Action Result: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kenrowland looks good, but I do have some comments for you to ponder. They're all meant to be constructive, let me know if should clarify any points. Thanks.
data types to be stored in the index in their native types. Without dynamic mapping, the ElasticSearch | ||
default mapping does not properly map value to unsigned 64-bit integers. | ||
|
||
To create an index with dynamic mapping, use the following object when creating the index: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this write up seems very useful.
Is the goal still to handle the creation of the indices programmatically in later versions of the sink?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently no plans to create the index from the sink. If that was something we added, there are several considerations that must be taken into account. These are
- If the cluster is using the same index for all metric reporting, creating and configuring the index must be coordinated somehow. If each component creates their own, there still needs to be coordination.
- The sink would also need to add mapping templates in order to map values to the correct type. This could still use wildcard dynamic templates, or could be specific to the metrics registered for each metric framework instance
- If the sink creates the index, we need to worry about life cycle management.
I felt like just telling the sink what index to use allows the policy of the ElasticSearch instance to remain there instead of pushing it down to the sink.
If we decide to push that responsibility to the sink, we can add it as an improvement addressing the concerns from above
helm/examples/metrics/README.md
Outdated
<Environment> | ||
<Software> | ||
<metrics name="mymetricsconfig"> | ||
<sinks name="elastic" type="elastic"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trivial but changing the name to 'myelasticsink' would drive to point further and would follow the convention from the metrics name attribute above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable. I'll change that.
helm/examples/metrics/README.md
Outdated
<Software> | ||
<metrics name="mymetricsconfig"> | ||
<sinks name="elastic" type="elastic"> | ||
<settings period="30" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The challenge of adding the configuration before the logic is higher potential for downstream restructure of configuration which can be very disruptive to end-users.
It's obviously hard to know what we don't know we'll need to support in later versions, but let's scrutinize each element of your config layout more than usual...
for instance, indexName, if we're planning on expanding the index creation, we might need a pattern rather than a name. with that in mind it might be worth supporting a structured element for the index, for now the only active attribute for the index element could be "name", later adding more attributes under the index element
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Index config changed accordingly to allow sub items.
# type - sink type (must be elastic for ElasticSearch support) | ||
# name - name for the sink instance | ||
# settings.elasticHost - url for the ElasticSearch instance | ||
# settings.indexName - ElasticSearch index name where metrics are indexed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to illustrate what I meant above, this could be settings.index.name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reasonable. Change made. It also would allow keeping the index configuration separate in case we decide to push more policy into the sink.
@@ -45,11 +45,26 @@ ElasticMetricSink::ElasticMetricSink(const char *name, const IPropertyTree *pSet | |||
ignoreZeroMetrics = pSettingsTree->getPropBool("@ignoreZeroMetrics", true); | |||
pSettingsTree->getProp("@elasticHost", elasticHost); | |||
pSettingsTree->getProp("@indexName", indexName); | |||
|
|||
// Initialize standard suffixes | |||
countMetricSuffix.append("_count"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there an advantage to use append rather than set? (I'm obviously assuming these suffix strings are empty before line 50)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes they are empty. Use of append is based on code review comments from a previous PR from others on the team.
- type: elastic | ||
name: elasticsink | ||
settings: | ||
elasticHost: http://localhost:9200, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps too picky, but "Host" implies the dns entry only, no protocol, port, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this I wanted to provide the full string to which the specific endpoint is appended. This makes it a bit easier than having to read multiple configuration values just to put them together internally. Any individual part can be changed in the string as easily as it could be in a separate config value.
However, thinking ahead, I can see where something like the port or other portion of the route could be a cluster wide parameter. For that reason, separating them into separate values makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pleas see comments
data types to be stored in the index in their native types. Without dynamic mapping, the ElasticSearch | ||
default mapping does not properly map value to unsigned 64-bit integers. | ||
|
||
To create an index with dynamic mapping, use the following object when creating the index: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently no plans to create the index from the sink. If that was something we added, there are several considerations that must be taken into account. These are
- If the cluster is using the same index for all metric reporting, creating and configuring the index must be coordinated somehow. If each component creates their own, there still needs to be coordination.
- The sink would also need to add mapping templates in order to map values to the correct type. This could still use wildcard dynamic templates, or could be specific to the metrics registered for each metric framework instance
- If the sink creates the index, we need to worry about life cycle management.
I felt like just telling the sink what index to use allows the policy of the ElasticSearch instance to remain there instead of pushing it down to the sink.
If we decide to push that responsibility to the sink, we can add it as an improvement addressing the concerns from above
helm/examples/metrics/README.md
Outdated
<Environment> | ||
<Software> | ||
<metrics name="mymetricsconfig"> | ||
<sinks name="elastic" type="elastic"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable. I'll change that.
- type: elastic | ||
name: elasticsink | ||
settings: | ||
elasticHost: http://localhost:9200, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this I wanted to provide the full string to which the specific endpoint is appended. This makes it a bit easier than having to read multiple configuration values just to put them together internally. Any individual part can be changed in the string as easily as it could be in a separate config value.
However, thinking ahead, I can see where something like the port or other portion of the route could be a cluster wide parameter. For that reason, separating them into separate values makes sense.
@@ -45,11 +45,26 @@ ElasticMetricSink::ElasticMetricSink(const char *name, const IPropertyTree *pSet | |||
ignoreZeroMetrics = pSettingsTree->getPropBool("@ignoreZeroMetrics", true); | |||
pSettingsTree->getProp("@elasticHost", elasticHost); | |||
pSettingsTree->getProp("@indexName", indexName); | |||
|
|||
// Initialize standard suffixes | |||
countMetricSuffix.append("_count"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes they are empty. Use of append is based on code review comments from a previous PR from others on the team.
# type - sink type (must be elastic for ElasticSearch support) | ||
# name - name for the sink instance | ||
# settings.elasticHost - url for the ElasticSearch instance | ||
# settings.indexName - ElasticSearch index name where metrics are indexed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reasonable. Change made. It also would allow keeping the index configuration separate in case we decide to push more policy into the sink.
885790b
to
a3dd106
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kenrowland a few comments/questions
@@ -0,0 +1,36 @@ | |||
# | |||
# Defines a elastic sink for reporting metrics to an ElasticSearch instance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trivial. Should be "Defines an..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
|
||
Owned<IPropertyTree> pHostConfigTree = pSettingsTree->getPropTree("host"); | ||
|
||
pHostConfigTree->getProp("@name", hostName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember the exact iptree behavior, but what happens if pSettingsTree does not contain a "host" branch? If we're not guaranteed accessing pHostConfigTree->getProp is safe, let's protect this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
|
||
if (!pHostConfigTree->getProp("@protocol", hostProtocol)) | ||
{ | ||
hostProtocol.append("https"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a reasonable default, but perhaps not the expected default. If this default is not clearly declared to the admin/user somewhere, let's add something to make sure it is.
} | ||
else | ||
{ | ||
WARNLOG("ElasticMetricSink: Host configuration is invalid"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this commit doesn't show me how the sink will behave if we fall into this case. I see there's a "configurationValid" variable, but what happens when it's set to false? Does the sink simply refuse to report metrics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't quite decided how to handle this case yet. Since this is a periodic sink, it may be appropriate to add a boolean return code to the prepareToStartCollecting method. That would prevent the base class from starting the collection thread. Or, maybe it just returns each time the doCollection method is returned. This PR was only concentrating on configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. I think the question is, do we want to fail the load process and force admins to fix the misconfiguration immediately, or allow the components to load and report the error in logs. There's arguments for both, but if your sink cannot operate w/ the given misconfiguration we might consider error'ing out at load time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am of the belief that we do not fail the load process. We log the appropriate warning and just don't collect metrics. That support should be added when we actually start the collection process.
@ghalliday Please merge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor comments, and one problem with the suffixes.
* indexName - The name of the index to which metrics are reported. (required) | ||
* countMetricSuffix - The suffix used to identify count metrics. (default: count) | ||
* gaugeMetricSuffix - The suffix used to identify gauge metrics. (default: gauge) | ||
* histogramMetricSuffix - The suffix used to identify histogram metrics. (default: histogram) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Does the elastic stack support mtls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. HPCC-32968 was opened to add security configuration to the sink. It provides for TLS and basic auth. As time progresses I suspect that we will add additional authentication methods to the sink.
- type: elastic | ||
name: myelasticsink | ||
settings: | ||
countMetricSuffix: count, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the comma here and next line are include by mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. removed
Owned<IPropertyTree> pHostConfigTree = pSettingsTree->getPropTree("host"); | ||
if (pHostConfigTree) | ||
{ | ||
pHostConfigTree->getProp("@name", hostName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor future point: If the tree is guaranteed to stay active then more common is to use queryProp() and avoid cloning the strings. In this case the code is only executed once, so it doesn't matter.
const char * protocol = pHostConfigTree->queryProp("@protocol");
if (!protocol)
protocol = "https";
histogramMetricSuffix.append("histogram"); | ||
|
||
// See if any are overridden | ||
pSettingsTree->getProp("@countMetricSuffix", countMetricSuffix); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code does not work. getProp() appends to the existing value, it does not replace it. I am not sure why this code uses a different pattern from the code above that gets the hostname.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
"dynamic_templates": [ | ||
{ | ||
"hpcc_metric_count_to_unsigned_long": { | ||
"match": "*_count", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this cause problems if other, non-HPCC, components are publishing values with these suffixes to elastic stack?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly, however only when publishing to the same index with these dynamic mapping settings. The json object is meant to be representative of how you'd add dynamic mapping. I will update the description to mention that _count can be set to what makes sense for the index taking any additional mappings into account.
configurationValid = !elasticHostUrl.isEmpty() && !indexName.isEmpty(); | ||
|
||
// Initialize standard suffixes | ||
countMetricSuffix.append("count"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it likely that anyone is going to want to change these? Under what situation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HPCC-32909, one of the current sub task Jiras, is to read the current dynamic mapping from the index. That will likely change if these values are read from the configuration or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kenrowland please squash
} | ||
|
||
if (indexName.isEmpty()) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: We do not normally use {} if an if statement only has a single statement - here and in several other places in this file. It makes it slightly harder to read.
Added configuration options for the ElasticSearch sink Updated metrics readme with information on adding the sink to the cluster Added example ElasticSearch configuraion yaml file Signed-Off-By: Kenneth Rowland kenneth.rowland@lexisnexisrisk.com
88e7f21
to
2de4b39
Compare
@ghalliday Please merge |
@kenrowland please do not rebase when you squash - unless it is necessary. If you do both at the same time it makes it very hard to check if any other change was introduced as part of the squash. |
Jirabot Action Result: |
Added configuration options for the ElasticSearch sink Updated metrics readme with information on adding the sink to the cluster Added example ElasticSearch configuraion yaml file
Signed-Off-By: Kenneth Rowland kenneth.rowland@lexisnexisrisk.com
Type of change:
Checklist:
Smoketest:
Testing: