PoC implementation of using Celery for task running #7
base: basic-interface
Conversation
```python
from ref_core.env import env
from ref_core.metrics import Configuration, TriggerInfo

config = {
```
The parent process will need to track the jobs that have been declared as workers join and leave the cluster. The workers may declare the list of jobs that they provide, and their dependencies, at run-time.
Perhaps the parent process needs to know which providers to listen to: some way of whitelisting/blacklisting metrics.
@nocollier Here is a PoC for using celery to manage distributed task execution. There are a few other refactoring changes thrown in, so it might not be perfectly clear. You should be able to run … You might also need to run …
I might revive this, as we need an async approach to executing metrics and celery is the easiest way to test async tasks locally without k8s.
Description
A basic PoC of using celery workers to run metrics.
Celery is an asynchronous, distributed task queue. In this particular example we use Redis as both the task broker and the results store. A parent process may queue a number of tasks, which are then executed out-of-process by the appropriate worker.
Each benchmarking package would build a docker image that contains all the required dependencies. This image can be used to spawn one or more docker containers. Each of these containers runs the `ref-celery` command, which starts a new celery worker that listens for any tasks that need to be run. When new calculations are queued up, the appropriate `run_metric` function is called and the results are returned to the parent process.

The stack can be started using `docker-compose up`. This spawns 3 containers.
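As a rough sketch of the compose layout (the service names and image tags here are assumptions; the PR's actual compose file defines the three containers it spawns):

```yaml
# Illustrative compose sketch only; not the PR's actual file.
services:
  redis:
    image: redis:7          # task broker and results store
    ports:
      - "6379:6379"
  worker:
    image: ref-provider:latest   # provider image with dependencies baked in
    command: ref-celery          # starts a celery worker for this provider
    depends_on:
      - redis
```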
Outside of the docker containers, jobs can be queued up (see `scripts/runner.py`). This decouples the queueing of tasks from the execution of tasks.
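The decoupling described above can be illustrated with a stdlib stand-in (plain `queue` and `threading`, not Celery itself): a parent thread queues metric tasks while a separate worker thread executes them and posts results back, mirroring the broker/worker split.

```python
# Stdlib stand-in for the broker/worker split (illustration only, not the
# PR's code): the parent queues tasks; a worker executes them out of band.
import queue
import threading

tasks: "queue.Queue[tuple[str, dict]]" = queue.Queue()
results: "queue.Queue[dict]" = queue.Queue()

def worker() -> None:
    while True:
        metric_name, config = tasks.get()
        if metric_name == "__stop__":   # sentinel to shut the worker down
            break
        # Stand-in for dispatching to the provider's run_metric code.
        results.put({"metric": metric_name, "status": "completed"})

t = threading.Thread(target=worker, daemon=True)
t.start()

# The "parent process": queue work without executing it inline.
tasks.put(("tas", {}))
tasks.put(("pr", {}))
tasks.put(("__stop__", {}))
t.join()

collected = [results.get_nowait() for _ in range(2)]
```

The parent never runs metric code itself; it only enqueues work and later collects results, which is the property the Celery/Redis stack provides across processes and machines.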
Still todo
Checklist
Please confirm that this pull request has done the following:
changelog/