
DAG-based mapping and execution #31

Merged: 57 commits from dag into main, Jul 19, 2023
Conversation

WarmCyan (Collaborator)
Closes #29

The intent of this PR is to have curifactory determine need-to-execute for each stage primarily from the DAG: a stage runs only if its outputs are needed somewhere downstream, based on the cache status of later stages' outputs, overwrite conditions, and so on.

This also allows skipping the loading of an artifact into memory entirely if no executed stage will ever need it.

WarmCyan (Collaborator, Author) commented Apr 5, 2023

Unclear on the best terminology for the "soft inputs" that aggregate stages can now take; currently it's expected_state. Something like inputs would be consistent with a regular stage, but since these aren't handled the same way (they're not directly passed into the function), it seems odd to use the same name. Alternatively, we could make these 'inputs' actually get passed into the function itself, as lists of values or as a list of tuples representing each 'call'.

WarmCyan (Collaborator, Author) commented Apr 6, 2023

The basic concepts have been implemented. The idea is that the full run_experiment code is first run in map_mode as a forward pass: every stage short-circuits before running or loading anything, just collecting info about itself and building up 'pseudo-records'. After this forward pass completes, the manager populates the DAG with a copy of these records and all of the pseudo-artifacts (a modified ArtifactRepresentation that loads metadata if the underlying artifact was found in cache).
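
Roughly, the short-circuit idea looks something like this (a self-contained sketch; PseudoArtifact, PseudoRecord, and map_stage are illustrative names, not curifactory's actual internals):

from dataclasses import dataclass, field

@dataclass
class PseudoArtifact:
    name: str
    cached: bool  # whether a cached copy was found on disk during mapping

@dataclass
class PseudoRecord:
    stages: list[str] = field(default_factory=list)
    state: dict[str, PseudoArtifact] = field(default_factory=dict)

def map_stage(record: PseudoRecord, stage_name: str, outputs: list[str], in_cache) -> PseudoRecord:
    # In map mode a stage only registers itself and its outputs, checking
    # whether each output already exists in the cache; nothing is loaded
    # or computed.
    record.stages.append(stage_name)
    for name in outputs:
        record.state[name] = PseudoArtifact(name, cached=in_cache(stage_name, name))
    return record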

The DAG then analyzes all of the records and artifact metadata to gather execution trees for each leaf stage (a stage whose outputs are not used as inputs by any other stage in the experiment), and walks each tree to determine which stages need to execute based on cache and overwrite status. (Specifying --overwrite-stage will now also automatically apply to impacted later stages.)
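
To illustrate the analysis (hypothetical names and signatures, not the real implementation):

def find_leaf_stages(consumers: dict[str, set[str]]) -> set[str]:
    # consumers maps each stage to the set of stages that use its outputs;
    # a leaf stage is one whose outputs no other stage consumes.
    return {stage for stage, used_by in consumers.items() if not used_by}

def plan_execution(stage: str, deps: dict, is_cached, overwritten: set, plan: set) -> set:
    # Walk up from a leaf: a stage must run if its outputs are uncached or
    # explicitly overwritten, and any uncached upstream dependencies of a
    # running stage must run as well. (The real implementation also
    # propagates overwrites downstream to impacted later stages.)
    if stage in overwritten or not is_cached(stage):
        plan.add(stage)
        for dep in deps.get(stage, ()):
            plan_execution(dep, deps, is_cached, overwritten, plan)
    return plan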

This allows inspecting the "execution plan" of an experiment without actually running it, e.g. with the new --map-only flag on the CLI, which reports which values are cached, which run they were cached from, and the list of stages that will actually be executed.
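
For example, assuming curifactory's usual experiment CLI entry point (only the --map-only flag itself comes from this PR; the exact invocation below is an assumption), mapping the newsgroups example without running it might look like:

experiment newsgroups --map-only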

An example is below: this is the newsgroups example experiment, showing that a couple of the records' parameter sets had been run previously (newsgroups_210 and newsgroups_212), so those values will be reused from cache.

[screenshot: --map-only output listing the cached artifacts for newsgroups_210 and newsgroups_212]

Farther up (this output will be cleaned up at some point), the full execution tree is shown: a rough tree view of each stage with its dependency stages indented beneath it. Following that is the execution list, a list of (record index, stage name) tuples representing the stages that will actually execute during the experiment run.

[screenshot: execution tree and execution list output]

Something this should enable: we might be able to automatically unload values from state (conditioned on not being in interactive mode, and probably gated behind some other flag like no_unload) by checking the DAG's is_output_used_anywhere at the end of a stage and deleting the value from state if it isn't used.
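
A hypothetical sketch of that hook (is_output_used_anywhere and the no_unload idea come from the paragraph above; the signatures are assumptions):

def maybe_unload(record, stage_name: str, outputs: list[str], dag, interactive: bool, no_unload: bool):
    # Never unload in interactive mode or when explicitly disabled.
    if interactive or no_unload:
        return
    for name in outputs:
        # If no later stage ever consumes this output, drop it from
        # in-memory state; the cached copy on disk is untouched.
        if not dag.is_output_used_anywhere(record, stage_name, name):
            del record.state[name]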

An interesting side effect and improvement from adding the explicit expected_state is that stage maps now directly show which outputs go into an aggregate stage, rather than just an arrow from the entire record to the aggregate stage as before.
[screenshot: stage map with per-output edges into the aggregate stage]

There are cases where the same stage might get called multiple times with the same arguments. The DAG only sees that the cached values don't exist at first, and since it doesn't update while the experiment is running, it assumes every subsequent call of that stage still needs to execute. By keeping the cache check even when the DAG thinks a stage should execute, we ensure it can still use values cached earlier in the same run and skip execution.
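
A minimal sketch of that guard (hypothetical names and signatures):

def should_execute(record_index: int, stage_name: str, execution_list: set, outputs_cached) -> bool:
    if (record_index, stage_name) not in execution_list:
        return False  # the DAG determined these outputs are never needed
    # The execution list was computed before the run began, so re-check the
    # cache anyway: an identical stage call (same arguments) earlier in this
    # run may already have cached the outputs.
    return not outputs_cached(record_index, stage_name)
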
Separately, it seems that if the root logger's level is set to INFO and there are no handlers, Python's logging falls back to a default handler anyway. I resolved this by setting the logging level to ERROR when quiet is set.
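
The fix amounts to something like this (a minimal sketch; the actual flag handling lives in the CLI):

import logging

def configure_root_logging(quiet: bool):
    # With no handlers configured, Python's logging still emits records
    # through a default handler, so quiet mode raises the root level past
    # INFO rather than relying on handler configuration.
    logging.getLogger().setLevel(logging.ERROR if quiet else logging.INFO)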

Fixes #33
@WarmCyan WarmCyan marked this pull request as ready for review July 7, 2023 17:48
jasonmhite (Collaborator) left a comment:

Nathan and I discussed briefly at a high level and this approach seems sound to me.

stewartsl (Member) left a comment:

I have no problems with the overall approach.

mbadams5 (Collaborator) left a comment:

Looks good. Just a few minor comments/questions and some code suggestions.

Resolved review threads: curifactory/dag.py (5 threads, 1 outdated), curifactory/staging.py (5 threads, all outdated)
WarmCyan (Collaborator, Author)

After discussion with Mark, we figured that a better way of handling expected_state for @aggregate would be to make it more consistent with @stage: naming it inputs and, where possible, actually passing the associated state into the function itself as arguments.

The idea is to pass the state values in via dictionaries keyed by the associated records from the passed records list. Records that don't have the specified name in their state simply won't appear in that argument's dictionary.

As an example, where a previous aggregate stage might have looked like:

@aggregate(outputs=["final_results"])
def combine_results(record: Record, records: list[Record]):
    final_results = {}
    # use a distinct loop variable to avoid shadowing the `record` parameter
    for r in records:
        if "results" in r.state:
            final_results[r.args.name] = r.state["results"]
    return final_results

could now become:

@aggregate(inputs=["results"], outputs=["final_results"])
def combine_results(record: Record, records: list[Record], results: dict[Record, float]):
    final_results = {}
    # `results` maps each record that produced a "results" artifact to its value
    for r, result in results.items():
        final_results[r.args.name] = result
    return final_results

This eliminates the need to directly access state on the records and removes the boilerplate of checking whether each record's state contains the artifact at all, while retaining the benefits described above of an explicit expected_state/inputs.

WarmCyan merged commit 7b0a320 into main on Jul 19, 2023
2 checks passed
WarmCyan deleted the dag branch on July 19, 2023 16:26