feat: add list of available profiles to run error #105

Merged
merged 4 commits into from
Jul 12, 2024
5 changes: 4 additions & 1 deletion CHANGELOG.md
@@ -12,7 +12,10 @@ Versions are listed in reverse chronological order, with the most recent at
the top. Non pre-release versions sometimes have an associated name.

## [Unreleased]
-- Nothing yet!
+### New
+- The error message you get when running `kerblam run` with no parameters now
+includes a list of available profiles, or tells you that you have specified
+no profiles.

## [v1.0.0-rc.3] - 2024-06-24

43 changes: 42 additions & 1 deletion Cargo.lock

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions Cargo.toml
@@ -77,4 +77,5 @@ chwd = "0.2.0"
git2 = "0.18.2"
paste = "1.0.14"
rusty-fork = "0.3.0"
serial_test = "3.1.1"
similar = "2.4.0"
8 changes: 6 additions & 2 deletions docs/src/manual/pipe_docstrings.md
@@ -2,10 +2,12 @@
If you execute `kerblam run` without specifying a pipe (or you try to run a
pipe that does not exist), you will get a message like this:
```
-Error: no runtime specified. Available runtimes:
+Error: No runtime specified. Available runtimes:
◾◾ process_csv
🐋◾ save_plots
◾◾ generate_metrics

Available profiles: No profiles defined.
```
The whale emoji (🐋) represents pipes that [have an associated Docker container](run_containers.html).

@@ -32,10 +34,12 @@ to only have a single contiguous description block in each file.

The output of `kerblam run` will now read:
```
-Error: no runtime specified. Available runtimes:
+Error: No runtime specified. Available runtimes:
◾📜 process_csv :: Calculate the sums of the input metrics
🐋◾ save_plots
◾◾ generate_metrics

Available profiles: No profiles defined.
```
The scroll (📜) emoji appears when Kerblam! notices a long description.
You can show the full description for such pipes with `kerblam run process_csv --desc`.
19 changes: 18 additions & 1 deletion docs/src/manual/run.md
@@ -31,6 +31,24 @@ In short, `kerblam run` does something similar to this:
This is why workflows are written as if they are executed in the root of the
project, because they are.

### Listing out workflows
If you just want a list of the workflows that Kerblam! can see, run
`kerblam run` with no workflow specified. Kerblam! will reply with something
like this:

```
Error: No runtime specified. Available runtimes:
◾📜 process_csv :: Calculate the sums of the input metrics
🐋◾ save_plots
◾◾ generate_metrics

Available profiles: No profiles defined.
```

Workflows with a 📜 have [an associated description](pipe_docstrings.md), and
those with a 🐋 have [an associated Docker container](run_containers.md).
You also get a list of available data profiles, which are detailed just below.
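
How the profile part of such a message could be assembled is sketched below in Rust. This is a minimal illustration and not Kerblam!'s actual code: the `format_profiles` helper and the one-name-per-line layout for defined profiles are assumptions. Only the wording of the empty case is taken from the sample output above.

```rust
/// Hypothetical helper: render the "Available profiles" part of the message.
/// `names` stands in for whatever profile names the configuration defines.
fn format_profiles(names: &[&str]) -> String {
    if names.is_empty() {
        // Same wording as the sample output shown in the docs.
        return "Available profiles: No profiles defined.".to_string();
    }
    let mut out = String::from("Available profiles:");
    for name in names {
        // Assumed layout: one indented profile name per line.
        out.push_str("\n    ");
        out.push_str(name);
    }
    out
}
```

Calling `format_profiles(&[])` produces the "No profiles defined." line, while a non-empty slice produces a header followed by one line per profile name.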

## Data Profiles - Running the same workflows on different data

You can run your same workflows, *as-is*, on different data thanks to data profiles.
@@ -160,4 +178,3 @@ For example, you can tell `make` to build a different target with this syntax:
kerblam run make_workflow -- other_target
```
As if you had run `make other_target` yourself.

4 changes: 4 additions & 0 deletions docs/src/manual/run_containers.md
@@ -55,6 +55,8 @@ are prepended with a little whale (🐋):
Error: No runtime specified. Available runtimes:
🐋◾ my_workflow :: Generate the output data in a docker container
◾◾ local_workflow :: Run some code locally

Available profiles: No profiles defined.
```

### Default dockerfile
@@ -68,6 +70,8 @@ pipes that use the default container, so you can identify them easily:
Error: No runtime specified. Available runtimes:
🐋◾ my_workflow :: Generate the output data in a docker container
🐟◾ another :: Run in the default container

Available profiles: No profiles defined.
```

### Switching backends
145 changes: 2 additions & 143 deletions src/commands/run.rs
@@ -1,5 +1,6 @@
use crate::options::extract_profile_paths;
use std::collections::HashMap;
-use std::path::{Path, PathBuf};
+use std::path::PathBuf;

use crate::cache::{check_last_profile, delete_last_profile, get_cache};
use crate::execution::{setup_ctrlc_hook, Executor, FileMover};
@@ -9,148 +10,6 @@ use crate::utils::update_timestamps;

use anyhow::{anyhow, bail, Result};

/// Push a bit of a string to the end of this path
///
/// Useful if you want to add an extension to the path.
/// Requires a clone.
fn push_fragment(buffer: impl AsRef<Path>, ext: &str) -> PathBuf {
    let buffer = buffer.as_ref();
    let mut path = buffer.as_os_str().to_owned();
    path.push(ext);
    path.into()
}

fn infer_test_data(paths: Vec<PathBuf>) -> HashMap<PathBuf, PathBuf> {
    let mut matches: HashMap<PathBuf, PathBuf> = HashMap::new();

    for path in paths.clone() {
        let file_name = path.file_name().unwrap().to_string_lossy();
        if file_name.starts_with("test_") {
            let slug = file_name.trim_start_matches("test_");
            let potential_target = path.clone().with_file_name(slug);
            if paths.iter().any(|x| *x == potential_target) {
                matches.insert(potential_target, path);
            }
        }
    }

    matches
}
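
The `test_` pairing done by `infer_test_data` can be illustrated in isolation. The sketch below is a variant and not the project's code: the name `infer_test_pairs` is hypothetical, it borrows the path list instead of cloning it, and it uses `strip_prefix`, which removes a single leading `test_`, where `trim_start_matches` removes every repeated leading match (so `test_test_a.csv` would map to `a.csv`).

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Hypothetical variant of the pairing logic: maps each original file to its
/// `test_`-prefixed sibling when both are present in `paths`.
fn infer_test_pairs(paths: &[PathBuf]) -> HashMap<PathBuf, PathBuf> {
    let mut matches = HashMap::new();
    for path in paths {
        // Skip paths without a final component instead of unwrapping.
        let Some(file_name) = path.file_name() else { continue };
        let file_name = file_name.to_string_lossy();
        // `strip_prefix` removes exactly one leading "test_".
        if let Some(slug) = file_name.strip_prefix("test_") {
            let target = path.with_file_name(slug);
            if paths.contains(&target) {
                matches.insert(target, path.clone());
            }
        }
    }
    matches
}
```

With `input.csv`, `test_input.csv` and `test_orphan.csv` in the same directory, only `input.csv` gets paired, because `orphan.csv` does not exist in the list.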

// TODO: This checks for the existence of profile paths here. This is a bad
// thing. It's best to handle the error when we actually do the move.
// This was done this way because I want a nice error list.
// The 'check_existence' check was added to overcome this, but it's a hack.
fn extract_profile_paths(
    config: &KerblamTomlOptions,
    profile_name: &str,
    check_existance: bool,
) -> Result<Vec<FileMover>> {
    let root_dir = config.input_data_dir();

    // If there are no profiles, an empty hashmap is OK instead:
    // we can add the default "test" profile anyway.
    let mut profiles = {
        let data = config.clone().data;
        match data {
            Some(x) => x.profiles.unwrap_or(HashMap::new()),
            None => HashMap::new(),
        }
    };

    // add the default 'test' profile
    if !profiles.keys().any(|x| x == "test") {
        let input_files = config.input_files();
        let inferred_test = infer_test_data(input_files);
        if !inferred_test.is_empty() {
            log::debug!("Inserted inferred test profile: {inferred_test:?}");
            profiles.insert("test".to_string(), inferred_test);
        }
    }

    let profile = profiles
        .get(profile_name)
        .ok_or(anyhow!("Could not find {} profile", profile_name))?;

    // Check if the sources exist, otherwise we crash now, and not later
    // when we actually move the files.
    let exist_check: Vec<anyhow::Error> = profile
        .iter()
        .flat_map(|(a, b)| [a, b])
        .map(|file| {
            let f = &root_dir.join(file);
            log::debug!("Checking if {f:?} exists...");
            match f.try_exists() {
                Ok(i) => {
                    if i {
                        Ok(())
                    } else {
                        bail!("\t - {file:?} does not exist!")
                    }
                }
                Err(e) => bail!("\t- {file:?} - {e:?}"),
            }
        })
        .filter_map(|x| x.err())
        .collect();

    if !exist_check.is_empty() & check_existance {
        let mut missing: Vec<String> = Vec::with_capacity(exist_check.len());
        for item in exist_check {
            missing.push(item.to_string());
        }
        bail!(
            "Failed to find some profiles files:\n{}",
            missing.join("\n")
        )
    }

    // Also check if the targets do NOT exist, so we don't overwrite anything
    let exist_check: Vec<anyhow::Error> = profile
        .iter()
        .flat_map(|(a, b)| [a, b])
        .map(|file| {
            let f = &root_dir.join(push_fragment(file, ".original"));
            log::debug!("Checking if {f:?} destroys files...");
            if f.exists() {
                bail!("\t- {:?} would be destroyed by {:?}!", f, file)
            };
            Ok(())
        })
        .filter_map(|x| x.err())
        .collect();

    if !exist_check.is_empty() & check_existance {
        let mut missing: Vec<String> = Vec::with_capacity(exist_check.len());
        for item in exist_check {
            missing.push(item.to_string());
        }
        bail!(
            "Some profile temporary files would overwrite real files:\n{}",
            missing.join("\n")
        )
    }

    Ok(profile
        .iter()
        .flat_map(|(original, profile)| {
            // We need two FileMovers. One for the temporary file
            // that holds the original file (e.g. 'to'), and one for the
            // profile-to-original rename.
            // To unwind, we just redo the transaction, but in reverse.
            [
                // This one moves the original to the temporary file
                FileMover::from((
                    &root_dir.join(original),
                    &root_dir.join(push_fragment(original, ".original")),
                )),
                // This one moves the profile one to the original one
                FileMover::from((&root_dir.join(profile), &root_dir.join(original))),
            ]
        })
        .collect())
}
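
The two-step swap described in the comments above (park the original under a `.original` suffix, then move the profile file into its place) can be sketched without touching the filesystem. Everything here is illustrative: `PlannedMove` is a hypothetical stand-in for the project's `FileMover`, `with_suffix` mirrors what `push_fragment` does, and `plan_unwind` reflects one reading of "redo the transaction, but in reverse" (reverse both the order and the direction of each move), which is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for `FileMover`: a rename that is planned, not executed.
#[derive(Debug, PartialEq)]
struct PlannedMove {
    from: PathBuf,
    to: PathBuf,
}

/// Append a suffix to the whole path, as `push_fragment` does.
fn with_suffix(path: &Path, suffix: &str) -> PathBuf {
    let mut raw = path.as_os_str().to_owned();
    raw.push(suffix);
    raw.into()
}

/// Plan the swap for one (original, profile) pair under `root`.
fn plan_swap(root: &Path, original: &Path, profile: &Path) -> [PlannedMove; 2] {
    [
        // First park the original file under a temporary name.
        PlannedMove {
            from: root.join(original),
            to: root.join(with_suffix(original, ".original")),
        },
        // Then move the profile file into the original's place.
        PlannedMove {
            from: root.join(profile),
            to: root.join(original),
        },
    ]
}

/// Assumed unwinding: walk the plan backwards and flip each move.
fn plan_unwind(moves: &[PlannedMove]) -> Vec<PlannedMove> {
    moves
        .iter()
        .rev()
        .map(|m| PlannedMove { from: m.to.clone(), to: m.from.clone() })
        .collect()
}
```

For `input.csv` and `test_input.csv` under `data/in`, the plan parks `input.csv` as `input.csv.original` and then renames `test_input.csv` to `input.csv`. The unwound plan first gives `test_input.csv` its name back and then restores the parked original.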

pub fn kerblam_run_project(
config: KerblamTomlOptions,
pipe: Pipe,