correct directory handling for tasks that are imported as local (non-package) modules #715
+159
−18
Addresses #691
Inspect distinguishes between "local" tasks that are referenced from directories and "package" tasks that are exported by Python packages. The main difference is that local tasks are run within the directory of the task source file (allowing for relative references to datasets, prompt templates, docker config, etc.), whereas package tasks are run from the current working directory. Package tasks therefore tend to use `__file__` to ensure that they can find their resources, whereas local tasks can just make normal relative references.

The issue identified in #691 is that non-package modules (e.g. a local `from tasks.easy import mytask` where you are importing from `tasks/easy.py`) were being treated like packages (since they had a "module" we could identify). This PR uses a more refined means of identifying tasks that are in installed Python packages, enabling us to treat local imports the way we treat other local file references. The original test case (https://github.com/epoch-research/inspect-dockerfile-repro) now works as expected.

Note that it is certainly debatable whether switching to the task file directory for loading/running was a good decision in the first place (it can be surprising, and it enables you to write code that doesn't work correctly when moved into a package). The original idea was to allow for a large suite of tasks in a directory hierarchy, all of which could rely on relative resource resolution (without requiring the user to do the `__file__` thing). I'd say that if I had to make the decision over again I wouldn't have introduced this behavior, but if we change it we will break a lot of tasks in the wild (and we are very committed to not breaking people on updates except in very narrow cases). The best way forward may be simply to encourage people via our documentation to write working-directory-independent code via `__file__`.
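For reference, here is a minimal sketch of the `__file__` pattern mentioned above. It uses only the standard library and is not code from this PR or from Inspect itself. The helper name `resource_path`, the `demo` function, and the `data/samples.jsonl` layout are hypothetical, chosen purely for illustration.

```python
import os
import tempfile
from pathlib import Path


def resource_path(module_file: str, *parts: str) -> Path:
    """Resolve a resource relative to the directory that contains module_file.

    In a task module you would pass __file__ as module_file, so the result
    does not depend on the current working directory.
    """
    return Path(module_file).resolve().parent.joinpath(*parts)


def demo() -> bool:
    """Return True if the resolved path is the same before and after a chdir."""
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as task_dir, \
            tempfile.TemporaryDirectory() as other_dir:
        # Hypothetical task file; this path stands in for __file__.
        fake_module = os.path.join(task_dir, "easy.py")
        before = resource_path(fake_module, "data", "samples.jsonl")
        try:
            os.chdir(other_dir)
            after = resource_path(fake_module, "data", "samples.jsonl")
        finally:
            # Restore the working directory before the temp dirs are removed.
            os.chdir(original_cwd)
        return before == after
```

Inside a task file this would be used as `resource_path(__file__, "data", "samples.jsonl")`, which should resolve to the same location whether the task is loaded as a local module or from an installed package.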