Limit parallel execution of a stage's layer #574

maoueh · 2025-01-28T21:47:49Z

Previously, the engine executed all modules in a stage's layer in parallel. So if you had 20 independent mapper modules, they all ran in parallel.

This hindered performance under high load, where a lot of CPU cycles can be consumed while the machine has a limited number of physical cores available.

We now change that behavior: development mode never executes modules in parallel. In production mode, we now limit execution to 2 parallel executors. A future update will make that value dynamic based on the subscription of the request.
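To illustrate the idea of capping a layer's parallelism, here is a minimal sketch, not the code from this PR. It assumes a layer can be modeled as a slice of plain functions; `runLayer` and its peak-tracking return value are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runLayer (hypothetical helper) runs every task of a layer while keeping at
// most maxParallel tasks in flight. It returns the highest number of tasks
// observed running at the same time, which is handy for checking the limit.
func runLayer(tasks []func(), maxParallel int) int {
	if maxParallel < 1 {
		maxParallel = 1 // assumption: a non-positive limit means sequential execution
	}

	sem := make(chan struct{}, maxParallel) // counting semaphore
	var wg sync.WaitGroup
	var mu sync.Mutex
	running, peak := 0, 0

	for _, task := range tasks {
		sem <- struct{}{} // blocks while maxParallel tasks are already running
		wg.Add(1)
		go func(task func()) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot once the task is finished

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			task()

			mu.Lock()
			running--
			mu.Unlock()
		}(task)
	}

	wg.Wait()
	return peak
}

func main() {
	// 20 independent "modules", as in the example from the description.
	tasks := make([]func(), 20)
	for i := range tasks {
		tasks[i] = func() { time.Sleep(time.Millisecond) }
	}

	peak := runLayer(tasks, 2)
	fmt.Printf("ran %d tasks, peak concurrency %d (limit 2)\n", len(tasks), peak)
}
```

A buffered channel is used as the semaphore here because it needs no dependency outside the standard library; the actual engine may bound its executors differently.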

sduchesneau

just ensure tier2 gets the same treatment passed down from tier1, and the rest is very good!

sduchesneau · 2025-01-30T16:59:37Z

service/tier1.go

+				requestDetails.MaxParallelJobs = count
+			}
+		}
+		if parallelExecutors := auth.Get("X-Sf-Substreams-Stage-Layer-Parallel-Executor-Max-Count"); parallelExecutors != "" {


need to:

implement the same parsing in tier2.go

ensure we pass that header down to tier2 request

Done, please check.
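The two steps requested above (parse the header the same way on tier2, and forward it on the tier1 to tier2 request) could look roughly like the sketch below. Only the header name comes from the diff; the `headers` type, `parseMaxExecutors`, and `forwardMaxExecutors` are hypothetical stand-ins for whatever request metadata carrier the services actually use, and treating `0` as "not set" is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// Header name taken from the diff in service/tier1.go.
const maxExecutorHeader = "X-Sf-Substreams-Stage-Layer-Parallel-Executor-Max-Count"

// headers is a hypothetical stand-in for the real request metadata carrier.
type headers map[string]string

func (h headers) Get(key string) string { return h[key] }
func (h headers) Set(key, value string) { h[key] = value }

// parseMaxExecutors returns the requested maximum and true when the header is
// present and holds a positive integer. Anything else is treated as "not set"
// (an assumption of this sketch).
func parseMaxExecutors(h headers) (uint64, bool) {
	raw := h.Get(maxExecutorHeader)
	if raw == "" {
		return 0, false
	}

	count, err := strconv.ParseUint(raw, 10, 64)
	if err != nil || count == 0 {
		return 0, false
	}
	return count, true
}

// forwardMaxExecutors copies a valid value from the incoming (tier1) headers
// to the outgoing (tier2) headers so both tiers apply the same limit.
func forwardMaxExecutors(incoming, outgoing headers) {
	if count, ok := parseMaxExecutors(incoming); ok {
		outgoing.Set(maxExecutorHeader, strconv.FormatUint(count, 10))
	}
}

func main() {
	incoming := headers{maxExecutorHeader: "4"}
	outgoing := headers{}

	forwardMaxExecutors(incoming, outgoing)
	fmt.Println("forwarded value:", outgoing.Get(maxExecutorHeader))
}
```

Sharing one parsing function between both tiers keeps the validation rules from drifting apart, which is the concern raised in the review.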

sduchesneau

typo in comment, and I suggest moving the IsInDevelopment logic to the code evaluating the max number of workers for proper separation of concerns.

Other than that, LGTM !

sduchesneau · 2025-01-31T14:34:34Z

reqctx/context.go

+
+const safeguardMaxStageLayerParallelExecutorCount = 16
+
+// MaxParallelJobs returns the maximum number of parallel executors (e.g. go routines) that can


typo, should be MaxStageLayerParallelExecutor

sduchesneau · 2025-01-31T14:35:59Z

pipeline/process_block.go

+	logging.Logger(ctx, p.stores.logger).Debug("executing stage's layers", zap.Int("layer_count", len(p.StagedModuleExecutors)), zap.Uint64("max_parallel_executor", maxParallelExecutor))
+
+	for _, layer := range p.StagedModuleExecutors {
+		if isDevelopmentMode || maxParallelExecutor <= 1 || len(layer) <= 1 {


why do we care here about isDevelopmentMode ?

The MaxStageLayerParallelExecutor should check that we are in development mode and return 1. This logic doesn't belong here.

sduchesneau · 2025-01-31T14:36:34Z

reqctx/context.go

+	}
+
+	// If unset, provide default value which is 2 for now
+	if details.MaxStageLayerParallelExecutor == 0 {


as mentioned earlier, add the check for IsInDevelopmentMode here and return 1 ...
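A minimal sketch of what the suggestion amounts to, assuming a simplified request details struct: the resolution function owns the development-mode check, so the call site in `pipeline/process_block.go` only has to compare the returned value against 1. The struct, the `defaultMaxStageLayerParallelExecutor` constant, and the use of the safeguard constant as an upper cap are assumptions for illustration; only the constant name `safeguardMaxStageLayerParallelExecutorCount`, the value 16, and the default of 2 come from the diff.

```go
package main

import "fmt"

const (
	// Default from the diff comment: "If unset, provide default value which is 2 for now".
	defaultMaxStageLayerParallelExecutor = 2
	// Name and value from the diff; using it as an upper cap is an assumption.
	safeguardMaxStageLayerParallelExecutorCount = 16
)

// requestDetails is a simplified, hypothetical version of the real struct.
type requestDetails struct {
	ProductionMode                bool
	MaxStageLayerParallelExecutor uint64
}

// maxStageLayerParallelExecutor resolves how many executors may run in
// parallel within a stage's layer for the given request.
func maxStageLayerParallelExecutor(details requestDetails) uint64 {
	// Development mode never runs modules in parallel.
	if !details.ProductionMode {
		return 1
	}

	// Unset: fall back to the default.
	if details.MaxStageLayerParallelExecutor == 0 {
		return defaultMaxStageLayerParallelExecutor
	}

	// Never exceed the safeguard, whatever the request asked for.
	if details.MaxStageLayerParallelExecutor > safeguardMaxStageLayerParallelExecutorCount {
		return safeguardMaxStageLayerParallelExecutorCount
	}

	return details.MaxStageLayerParallelExecutor
}

func main() {
	dev := requestDetails{ProductionMode: false, MaxStageLayerParallelExecutor: 8}
	prod := requestDetails{ProductionMode: true}

	fmt.Println("development:", maxStageLayerParallelExecutor(dev))
	fmt.Println("production (unset):", maxStageLayerParallelExecutor(prod))
}
```

With this in place, the sequential branch at the call site can be reduced to `maxParallelExecutor <= 1 || len(layer) <= 1`, without any knowledge of the request mode.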


maoueh requested review from sduchesneau and ArnaudBger January 28, 2025 21:47

sduchesneau requested changes Jan 30, 2025


maoueh force-pushed the feature/limit-execution-parallelism branch from e9bbdb1 to 2fcfb47 Compare January 30, 2025 21:43

sduchesneau requested changes Jan 31, 2025


maoueh force-pushed the feature/limit-execution-parallelism branch from 2fcfb47 to 207dae8 Compare January 31, 2025 21:20

maoueh merged commit 6ba42ce into develop Jan 31, 2025
5 checks passed

maoueh deleted the feature/limit-execution-parallelism branch January 31, 2025 21:20
