Limit parallel execution of a stage's layer #574
Conversation
just ensure tier2 gets the same treatment passed down from tier1, and the rest is very good!
service/tier1.go
Outdated
        requestDetails.MaxParallelJobs = count
    }
}
if parallelExecutors := auth.Get("X-Sf-Substreams-Stage-Layer-Parallel-Executor-Max-Count"); parallelExecutors != "" {
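A minimal sketch of the header parsing and safeguard clamping being discussed (the helper name and error handling are assumptions, not the actual tier1/tier2 code; the constant name comes from the diff further down):

```go
package main

import (
	"fmt"
	"strconv"
)

// Constant name taken from the diff in this PR; the helper below is a
// hypothetical sketch, not the merged tier1/tier2 implementation.
const safeguardMaxStageLayerParallelExecutorCount = 16

// parseParallelExecutorHeader parses the raw header value and clamps it
// to the safeguard; a zero return with a nil error means "unset".
func parseParallelExecutorHeader(raw string) (uint64, error) {
	if raw == "" {
		return 0, nil
	}
	count, err := strconv.ParseUint(raw, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid parallel executor count %q: %w", raw, err)
	}
	if count > safeguardMaxStageLayerParallelExecutorCount {
		count = safeguardMaxStageLayerParallelExecutorCount
	}
	return count, nil
}
```

The clamp ensures a client-supplied header can never request more parallelism than the server-side safeguard allows.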
need to:
- implement the same parsing in tier2.go
- ensure we pass that header down to tier2 request
Done, please check.
Force-pushed from e9bbdb1 to 2fcfb47
Typo in the comment, and I suggest moving the IsInDevelopmentMode logic to the code evaluating the max number of workers, for proper separation of concerns.
Other than that, LGTM!
reqctx/context.go
Outdated
const safeguardMaxStageLayerParallelExecutorCount = 16

// MaxParallelJobs returns the maximum number of parallel executors (e.g. go routines) that can
Typo, should be MaxStageLayerParallelExecutor.
pipeline/process_block.go
Outdated
logging.Logger(ctx, p.stores.logger).Debug("executing stage's layers", zap.Int("layer_count", len(p.StagedModuleExecutors)), zap.Uint64("max_parallel_executor", maxParallelExecutor))

for _, layer := range p.StagedModuleExecutors {
    if isDevelopmentMode || maxParallelExecutor <= 1 || len(layer) <= 1 {
Why do we care about isDevelopmentMode here? MaxStageLayerParallelExecutor should check that we are in development mode and return 1. This logic doesn't belong here.
}

// If unset, provide default value which is 2 for now
if details.MaxStageLayerParallelExecutor == 0 {
As mentioned earlier, add the check for IsInDevelopmentMode here and return 1 ...
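The reviewer's suggestion — development mode returns 1, an unset value defaults to 2, everything clamped to the safeguard — could be centralized like this (hypothetical struct and method names following the diff, a sketch rather than the merged code):

```go
package main

// RequestDetails is a hypothetical stand-in for the reqctx request details;
// field names follow the diff, the method body sketches the reviewer's
// suggestion and is not the actual merged implementation.
type RequestDetails struct {
	IsInDevelopmentMode           bool
	MaxStageLayerParallelExecutor uint64
}

const safeguardMaxStageLayerParallelExecutorCount = 16

// MaxParallelExecutor centralizes the policy: development mode never runs
// modules in parallel (returns 1), an unset value defaults to 2, and the
// result is clamped to the safeguard, so callers need no mode checks.
func (d *RequestDetails) MaxParallelExecutor() uint64 {
	if d.IsInDevelopmentMode {
		return 1
	}
	if d.MaxStageLayerParallelExecutor == 0 {
		return 2
	}
	if d.MaxStageLayerParallelExecutor > safeguardMaxStageLayerParallelExecutorCount {
		return safeguardMaxStageLayerParallelExecutorCount
	}
	return d.MaxStageLayerParallelExecutor
}
```

Keeping the mode check inside the accessor gives the separation of concerns the reviewer asked for: the block-processing loop only consumes a number.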
Previously, the engine executed all modules in a stage's layer in parallel. So if you had 20 independent mapper modules, they were all run in parallel. This was hindering performance under high load, where a lot of CPU cycles can be consumed while the machine has limited physical cores available. We now change that behavior: development mode never executes any modules in parallel. For production mode, we now limit to 2 parallel executions. A future update will make that value dynamic based on the subscription of the request.
Force-pushed from 2fcfb47 to 207dae8