Add sample data script and corresponding .rda files #8

bsweger · 2024-03-27T20:23:30Z

Resolves #4.

This changeset generates sample data files using data from the complex-example-forecast-hub. For this first pass, the samples are generated from local files (i.e., a cloned copy of complex-example-forecast-hub).

Once complex-example-forecast-hub is fully onboarded to the cloud, we'll pull from there instead.

elray1 · 2024-03-27T20:32:22Z

R CMD check is demanding that we document our data sets. I think we could pull from here, with some light adaptations. Should I do that, @bsweger, or would you like to?

bsweger · 2024-03-27T20:49:04Z

> R CMD check is demanding that we document our data sets. I think we could pull from here, with some light adaptations. Should I do that, @bsweger, or would you like to?

I can take a stab at it, but you will likely have some wording revisions. Does that information go at the top of generate_example_forecast_data.R?

elray1 · 2024-03-27T21:01:11Z

It should go in a file called data.R in an R directory in this repository. Here's the description of how to document data sets in the R Packages book: https://r-pkgs.org/data.html#sec-documenting-data
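
As a rough illustration of what such a file might contain, here is a minimal roxygen2 sketch. The object name `forecast_outputs` and the title and description wording are placeholders rather than the package's actual documentation; the three field names are the ones that appear in this PR's diff.

```r
# R/data.R (sketch). The object name "forecast_outputs" is a placeholder.

#' Example forecast output data
#'
#' Illustrative documentation for a sample data object stored as an .rda
#' file under data/.
#'
#' @format A data frame with one row per prediction, including:
#' \describe{
#'   \item{output_type_id}{identifies a specific prediction within an
#'     output_type; not relevant for every output_type}
#'   \item{value}{the model's prediction}
#'   \item{model_id}{the name of the model}
#' }
"forecast_outputs"
```

Running `devtools::document()` afterwards regenerates the `man/` pages, which is what R CMD check looks for when it asks for documented data sets.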

elray1

Adding an actual/formal review to request 2 changes:

1. Add documentation for the data objects. As (partially) discussed in some misc. comments, this will involve adding a data.R file that documents the data objects and also running devtools::document() to add the documentation to the R package.
2. Since we did this, we've decided on two other minor updates to the target values data format. It would be nice to get those updates in this PR as well. Those changes would be made first in the example-complex-forecast-hub repository with the source data; see this issue.

bsweger · 2024-03-29T16:04:11Z

> Add documentation for the data objects. As (partially) discussed in some misc. comments, this will involve adding a data.R file that documents the data objects and also running devtools::document() to add the documentation to the R package.

> Since we did this, we've decided on two other minor updates to the target values data format. It would be nice to get those updates in this PR as well. Those changes would be made first in the example-complex-forecast-hub repository with the source data, see this issue.

@elray1 I just pushed a commit for the first note, and I expect to update that documentation based on your feedback.

For the second issue, in the spirit of higher velocity and/or smaller changes, my vote is to get this PR merged and then tackle the issue you linked as a separate piece of work (which I'm happy to do as a fast follow).

It's okay to change our minds/evolve, but I don't like keeping PRs open for long periods of time when we do so. It likely doesn't matter for hubExample, but if this repo were under active development with many devs, lingering PRs would increase the likelihood of merge conflicts and other annoyances.

bsweger · 2024-03-29T16:22:46Z

R/data.R

+#'        output_type_id is not relevant for every kind of output_type (for example,
+#'        hubs will not expect output_type_id values when the output_type is mean or median)}
+#'   \item{value}{the model’s prediction}
+#'   \item{model_id}{the name of the model}


@elray1 question about model_id: is that a column we'd expect to see in a hub's model output data? I thought we derived it from the filename.

elray1 · 2024-03-29

It is true that when the data are sitting in a hub, the model_id is encoded in the file name. But when we collect the data into a data frame in a working R (or, in the future, Python) session, the model_id is added into the data. The intent of this example object is to represent what a user might get after running collect_hub(). (Maybe we should say that in this documentation.)

bsweger · 2024-03-29

Ah, makes sense. Thank you for that clarification. Just pushed a commit with that note.
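
To make the distinction in this thread concrete, the following base R sketch shows one way a model_id column could be attached while reading files into a data frame. It is not how collect_hub() is implemented. The helper name `collect_with_model_id`, the toy hub layout, and the assumed file name pattern `<YYYY-MM-DD>-<model_id>.csv` are all illustrative assumptions.

```r
# Hypothetical helper (not hubData's implementation): read every CSV under
# <hub_path>/model-output and add a model_id column parsed from the file name.
# Assumes file names look like "<YYYY-MM-DD>-<model_id>.csv".
collect_with_model_id <- function(hub_path) {
  files <- list.files(
    file.path(hub_path, "model-output"),
    pattern = "\\.csv$", recursive = TRUE, full.names = TRUE
  )
  per_file <- lapply(files, function(f) {
    df <- read.csv(f, stringsAsFactors = FALSE)
    stem <- tools::file_path_sans_ext(basename(f))
    # Drop the leading date; what remains is treated as the model_id.
    df$model_id <- sub("^[0-9]{4}-[0-9]{2}-[0-9]{2}-", "", stem)
    df
  })
  do.call(rbind, per_file)
}

# Build a tiny throwaway hub so the sketch is self-contained.
hub <- file.path(tempdir(), "toy-hub")
model_dir <- file.path(hub, "model-output", "team1-modela")
dir.create(model_dir, recursive = TRUE, showWarnings = FALSE)
write.csv(
  data.frame(output_type = "median", output_type_id = NA, value = 12.5),
  file.path(model_dir, "2024-03-30-team1-modela.csv"),
  row.names = FALSE
)

collected <- collect_with_model_id(hub)
print(names(collected))
```

The file on disk has no model_id column; it only appears in the collected data frame, which is the shape the example data objects are meant to mirror.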

elray1 · 2024-03-29T17:02:02Z

I'm approving this PR, with these notes:

1. We may want to make an update related to this thread.
2. R CMD check is failing due to an unrelated issue, which I've filed as "unit tests failing because there aren't any" (#9).
3. I filed a separate issue, "Update target values for example forecast hub data" (#10), for item 2 in my initial review.

@bsweger I'll leave it to you to do something about the first point here or not and then merge this?

codecov · 2024-03-29T18:18:06Z

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

Thanks for integrating Codecov - We've got you covered ☂️

bsweger requested a review from elray1 March 27, 2024 20:23

Add description of the sample datasets

0db0a52

These descriptions are based on similar ones here: https://github.com/Infectious-Disease-Modeling-Hubs/hubEnsembles/blob/main/R/data.R They will likely require some edits, but this should be a reasonable start.

elray1 requested changes Mar 29, 2024

bsweger commented Mar 29, 2024

some updates to documentation

804db1d

elray1 approved these changes Mar 29, 2024

bsweger added 3 commits March 29, 2024 14:01

Add a note about forecast outputs reflecting model outputs from hubData

23f3ce8

Remove tests directory

6185cea

Rightly or wrongly, we've decided that this repo doesn't need tests. However, the lack of them is causing a CI failure. This commit is to see what happens if we just remove the tests directory.

Appease the lintr

086eddd

Remove Netlify preview

21a0147

bsweger merged commit 431fa24 into main Mar 29, 2024
7 checks passed

bsweger deleted the bsweger/create-sample-hub-data branch March 29, 2024 18:33
