ES-10063 Add multi-project support for more stats APIs #127650

PeteGillinElastic · 2025-05-02T16:38:06Z

This affects the following APIs:

GET _nodes/stats:
- For indices, it now prefixes the index name with the project ID (for non-default projects). Previously, it didn't tell you which project an index was in, and it failed if two projects had the same index name.
- For ingest, it now gets the pipeline and processor stats for all projects, and prefixes the pipeline ID with the project ID. Previously, it only got them for the default project.
GET /_cluster/stats:
- For ingest, it now aggregates the pipeline and processor stats for all projects. Previously, it only got them for the default project.
GET /_info:
- For ingest, same as for GET /_nodes/stats.

It does not add multi-project support for the repositories section on _nodes/stats, since repositories aren't MP-ready yet AFAIK.

This is done by making IndicesService.stats() and IngestService.stats() include project IDs in the NodeIndicesStats and IngestStats objects they return, and making those stats objects incorporate the project IDs when converting to XContent.

The transitive callers of these two methods are rather extensive (including all callers to NodeService.stats(), all callers of TransportNodesStatsAction, and so on). To ensure the change is safe, the callers were all checked out, and they fall into the following cases:

The behaviour change is one of the desired enhancements described above.
There is no behaviour change because it was getting node stats but neither indices nor ingest stats were requested.
There is no behaviour change because it was getting indices and/or ingest stats but only using aggregate values.
In MachineLearningUsageTransportAction and TransportGetTrainedModelsStatsAction, the IngestStats returned will return stats from all projects instead of just the default with this change, but they have been changed to filter the non-default project stats out, so this change is a noop there. (These actions are not MP-ready yet.)
MonitoringService will be affected, but this is the legacy monitoring module which is not in use anywhere that MP is going to be enabled. (If anything, the behaviour is probably improved by this change, as it will now include project IDs, rather than producing ambiguous unqualified results and failing in the case of duplicates.)

This affects the following APIs: - `GET _nodes/stats`: - For `indices`, it now prefixes the index name with the project ID (for non-default projects). Previously, it didn't tell you which project an index was in, and it failed if two projects had the same index name. - For `ingest`, it now gets the pipeline and processor stats for all projects, and prefixes the pipeline ID with the project ID. Previously, it only got them for the default project. - `GET /_cluster/stats`: - For `ingest`, it now aggregates the pipeline and processor stats for all projects. Previously, it only got them for the default project. - `GET /_info`: - For `ingest`, same as for `GET /_nodes/stats`. This is done by making `IndicesService.stats()` and `IngestService.stats()` include project IDs in the `NodeIndicesStats` and `IngestStats` objects they return, and making those stats objects incorporate the project IDs when converting to XContent. The transitive callers of these two methods are rather extensive (including all callers to `NodeService.stats()`, all callers of `TransportNodesStatsAction`, and so on). To ensure the change is safe, the callers were all checked out, and they fall into the following cases: - The behaviour change is one of the desired enhancements described above. - There is no behaviour change because it was getting node stats but neither `indices` nor `ingest` stats were requested. - There is no behaviour change because it was getting `indices` and/or `ingest` stats but only using aggregate values. - In `MachineLearningUsageTransportAction` and `TransportGetTrainedModelsStatsAction`, the `IngestStats` returned will return stats from all projects instead of just the default with this change, but they have been changed to filter the non-default project stats out, so this change is a noop there. (These actions are not MP-ready yet.) - `MonitoringService` will be affected, but this is the legacy monitoring module which is not in use anywhere that MP is going to be enabled. (If anything, the behaviour is probably improved by this change, as it will now include project IDs, rather than producing ambiguous unqualified results and failing in the case of duplicates.)

elasticsearchmachine · 2025-05-02T16:41:15Z

Pinging @elastic/es-data-management (Team:Data Management)

nielsbauman

Thanks a lot for working on this, Pete! I did a first round of review by just looking at the production files. I'll have a look at the tests later on. This feels like the right direction; my comments are mostly small and/or thinking out loud.

nielsbauman · 2025-05-04T18:09:06Z

server/src/main/java/org/elasticsearch/action/admin/cluster/stats/ClusterStatsNodes.java

+                    for (Map.Entry<
+                        ProjectId,
+                        Map<String, List<org.elasticsearch.ingest.IngestStats.ProcessorStat>>> processorStatsForProject : nodeStat
+                            .getIngestStats()
+                            .processorStats()
+                            .entrySet()) {


Nit: can we extract some variables here to keep this a bit more readable? The multi-line types inside a for loop declaration is a bit intense. Same goes for the loop below.

Also, is there a reason this class is fully qualified instead of imported? I don't immediately see any name conflicts but maybe I'm missing something. I still think it's better to extract variables - I'm personally not a big fan of multi-line for loop declarations but maybe that's just me - but it would already help to import the class.

nielsbauman · 2025-05-04T18:36:49Z

server/src/main/java/org/elasticsearch/action/admin/cluster/stats/ClusterStatsNodes.java

+                            String,
+                            List<org.elasticsearch.ingest.IngestStats.ProcessorStat>> processorStats : processorStatsForProject.getValue()
+                                .entrySet()) {
+                            pipelineIdsByProject.computeIfAbsent(projectId, k -> new HashSet<>()).add(processorStats.getKey());


I don't think we need to keep a map of all the pipeline IDs, right? Since we only need the count in the end, we can just keep one set when we loop over the processors in the project and then add the size of the set to an int/long variable that's initialized outside all the loops (technically it could cause an integer overflow, but I don't think that's something we really need to protect against here).

nielsbauman · 2025-05-04T18:43:10Z

server/src/main/java/org/elasticsearch/indices/NodeIndicesStats.java

+     * Constructs an instance. If {@code projectsByIndex} argument is non-null, then the index-to-project map will be stored, and the
+     * project IDs will be prepended to the index names when converting this instance to XContent (except when it is the default project).


I think you mean empty instead of non-null. Also, I think the then the index-to-project map will be stored part is not relevant anymore as we're always storing that field? But maybe I'm misinterpreting it.

nielsbauman · 2025-05-04T18:55:44Z

server/src/main/java/org/elasticsearch/indices/NodeIndicesStats.java

+        if (Objects.equals(projectId, Metadata.DEFAULT_PROJECT_ID)) {
+            return index.getName();
+        }


I'm a little bit on the fence about this approach. Thinking out loud: on the one hand, this maintains BwC with little effort. On the other hand, if we ever have indices in the default project this'll be a bit off. It's been mentioned a handful of times that we might want to consider (so, super hypothetically) having a shared cluster-level GeoIP index to avoid downloading the same DB over again. We shouldn't make this decision purely based on the GeoIP index, but the idea of the faint possibility of cluster-level indices is something to think about. At the same time, this approach doesn't "break" anything in that case either, it just makes the output inconsistent. So, we either try to account for that situation proactively now, or we decide to go with this approach and we solve that issue if/when it comes. Any thoughts?

nielsbauman · 2025-05-04T19:02:52Z

server/src/main/java/org/elasticsearch/ingest/IngestStats.java

        static List<PipelineStat> merge(List<PipelineStat> first, List<PipelineStat> second) {
-            var totalsPerPipeline = new HashMap<String, PipelineStat>();
+            record MergeKey(ProjectId projectId, String pipelineId) {}


TIL you could define records inside methods! Do you have any idea how that compiles? I'm mostly curious if the compiler extracts that record definition somewhere or whether it's going to redefine that record on every invocation of this method.

nielsbauman · 2025-05-04T19:32:55Z

test/external-modules/multi-project/build.gradle

@@ -11,7 +11,7 @@ esplugin {

 dependencies {
  testImplementation project(path: ':test:test-clusters')
-  clusterModules project(':test:external-modules:test-multi-project')
+  clusterModules(project(':test:external-modules:test-multi-project'), project(":modules:ingest-common"))


I think the more common syntax is:

Suggested change

clusterModules(project(':test:external-modules:test-multi-project'), project(":modules:ingest-common"))

clusterModules project(':test:external-modules:test-multi-project')

clusterModules project(":modules:ingest-common")

nielsbauman · 2025-05-04T19:34:07Z

.../plugin/core/src/test/java/org/elasticsearch/xpack/core/datatiers/DataTierUsageFixtures.java

@@ -79,7 +79,7 @@ static NodeIndicesStats buildNodeIndicesStats(RoutingNode routingNode, long byte
            IndexShardStats shardStats = new IndexShardStats(shardId, new ShardStats[] { shardStat });
            indexStats.computeIfAbsent(shardId.getIndex(), k -> new ArrayList<>()).add(shardStats);
        }
-        return new NodeIndicesStats(COMMON_STATS, Map.of(), indexStats, true);
+        return new NodeIndicesStats(COMMON_STATS, Map.of(), indexStats, Map.of(), true);


Nit: can we add a (small) comment here why were using Map.of() for the projectsByIndex parameter?

nielsbauman · 2025-05-04T19:42:19Z

x-pack/qa/multi-project/core-rest-tests-with-multiple-projects/build.gradle

Some YAML tests from the ingest-common module are muted in this file because they depend on stats APIs:

'^test/ingest/15_info_ingest/*', // uses cluster info API '^test/ingest/70_bulk/Test bulk request with default pipeline', // uses stats API '^test/ingest/70_bulk/Test bulk request without default pipeline', // uses stats API

However, because we decided to prefix the project ID, we won't be able to unmute those tests... I don't immediately have a good suggestion, but I thought I'd mention it so we're aware. If we decide that that is OK, we should move these lines to the bottom so they're next to the comment explaining why some stats API tests are muted in MP.

nielsbauman · 2025-05-04T19:50:46Z

server/src/main/java/org/elasticsearch/indices/IndicesService.java

+        return indices.values()
+            .stream()
+            .map(AbstractIndexComponent::index)
+            .collect(Collectors.toMap(index -> index, index -> clusterService.state().metadata().projectFor(index).id()));


I'm wondering if it's worth it to build this map manually instead of using Streams, for performance reasons. I believe the stats API is called pretty regularly by monitoring agents (but I could be mistaken), and if we're doing this for all indices in all projects, the difference between Streams and non-streams might be noticeable. What do you think?

nielsbauman · 2025-05-04T19:55:27Z

server/src/main/java/org/elasticsearch/indices/IndicesService.java

+        return indices.values()
+            .stream()
+            .map(AbstractIndexComponent::index)
+            .collect(Collectors.toMap(index -> index, index -> clusterService.state().metadata().projectFor(index).id()));


Thinking out loud: by using Metadata#projectFor, we throw an exception if the index is not found. I'm not familiar enough with the IndicesService to know whether there is a possibility of the service being "behind" the state in ClusterService. For instance, when an index is deleted from the cluster state, is the indices service updated atomically? Or if an index is added, is it first added to the cluster state or the IndicesService? If you don't already have an answer to those questions (which is completely fine), I think it's worth having a look to be sure what kind of behavior we could see here.

samxbr

LGTM overall! I haven't gone over the tests yet, will run another pass for the tests tomorrow.

samxbr · 2025-05-05T01:40:52Z

server/src/main/java/org/elasticsearch/indices/NodeIndicesStats.java

    public NodeIndicesStats(
        CommonStats oldStats,
        Map<Index, CommonStats> statsByIndex,
        Map<Index, List<IndexShardStats>> statsByShard,
+        Map<Index, ProjectId> projectsByIndex,


Nice, I like this approach of passing in the index to project map

PeteGillinElastic added >non-issue :Data Management/Stats Statistics tracking and retrieval APIs v9.1.0 labels May 2, 2025

PeteGillinElastic marked this pull request as ready for review May 2, 2025 16:40

elasticsearchmachine added the Team:Data Management Meta label for data/management team label May 2, 2025

PeteGillinElastic changed the title ~~Add multi-project support for more stats APIs~~ ES-10063 Add multi-project support for more stats APIs May 2, 2025

nielsbauman reviewed May 4, 2025

View reviewed changes

samxbr reviewed May 5, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ES-10063 Add multi-project support for more stats APIs #127650

ES-10063 Add multi-project support for more stats APIs #127650

PeteGillinElastic commented May 2, 2025 •

edited

Loading

elasticsearchmachine commented May 2, 2025

nielsbauman left a comment

nielsbauman May 4, 2025

nielsbauman May 4, 2025

nielsbauman May 4, 2025

nielsbauman May 4, 2025

nielsbauman May 4, 2025

nielsbauman May 4, 2025

nielsbauman May 4, 2025

nielsbauman May 4, 2025

nielsbauman May 4, 2025

nielsbauman May 4, 2025

samxbr left a comment

samxbr May 5, 2025

		* Constructs an instance. If {@code projectsByIndex} argument is non-null, then the index-to-project map will be stored, and the
		* project IDs will be prepended to the index names when converting this instance to XContent (except when it is the default project).

	clusterModules(project(':test:external-modules:test-multi-project'), project(":modules:ingest-common"))
	clusterModules project(':test:external-modules:test-multi-project')
	clusterModules project(":modules:ingest-common")

ES-10063 Add multi-project support for more stats APIs #127650

Are you sure you want to change the base?

ES-10063 Add multi-project support for more stats APIs #127650

Conversation

PeteGillinElastic commented May 2, 2025 • edited Loading

elasticsearchmachine commented May 2, 2025

nielsbauman left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

samxbr left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

PeteGillinElastic commented May 2, 2025 •

edited

Loading