C~scape interface does not work for user #842

Open
ConnectedSystems opened this issue Sep 5, 2024 · 22 comments

@ConnectedSystems
Collaborator

ConnectedSystems commented Sep 5, 2024

@VHallerBull reports the C~scape interface has stopped working after PR #838 was merged.

The dataset for this test is the 20 scenario set that @VHallerBull provided earlier.

I am unable to reproduce this issue:

using ADRIA


cscape_data_path = "<path to C~scape dataset>"

rs = ADRIA.load_results(CScapeResultSet, cscape_data_path)
Precompiling ADRIA
  212 dependencies successfully precompiled in 307 seconds. 283 already precompiled.
Loading datasets 100%|██████████████████████████████████████████| Time: 0:00:06
    Name: C-Scape_IPMF_model_outputs:Scenario_140001

    Results stored at: C:/Users/tiwanaga/development/C_Scape_data/resultset1

    RCP(s) represented: 45
    Scenarios run: 20
    Number of sites: 213
    Timesteps: 2007:2099

The directory structure follows the same as documented here:

## Loading C~scape Results
Results from C~scape can be loaded with the `load_results` function.
```julia
# Assumes NetCDFs are contained in result subdirectory (see example directory tree below)
rs = ADRIA.load_results(CScapeResultSet, "<path to data dir>")
# Retrieves NetCDFs from separate directory
rs = ADRIA.load_results(CScapeResultSet, "<path to data dir>", "<path to result directory>")
# Manually pass in a list of files to load as results
rs = ADRIA.load_results(CScapeResultSet, "<path to data dir>", ["netcdf_fn1", "netcdf_fn2", ...])
```
The expected directory structure is:
```bash
data_dir
│   ScenarioID.csv
├───connectivity
│       connectivity.csv
├───site_data
│       geospatial_data.gpkg
├───initial_cover
│       initial_cover.csv
└───results (optional)
        NetCDF_Scn_140001.nc
        NetCDF_Scn_140002.nc
        NetCDF_Scn_140003.nc
        ...
```

Note to @Zapiano @arlowhite: the most recent dev build of the docs doesn't seem to include the above content. Is the doc build issue truly resolved? (https://open-aims.github.io/ADRIA.jl/dev/usage/results/)

For clarity, here is the full directory structure using tree /F (on Windows terminal):

C:\Users\tiwanaga\development\C_Scape_data\resultset1>tree /F
Folder PATH listing for volume Windows
Volume serial number is B61D-54F2
C:.
│   ScenarioID.csv
│
├───connectivity
│       connectivity.csv
│
├───initial_cover
│       initial_cover.csv
│
├───results
│       CounterfactualMatch.csv
│       NetCDF_Scn_140001_NA.nc
│       NetCDF_Scn_140002_NA.nc
│       NetCDF_Scn_140003_NA.nc
│       NetCDF_Scn_140004_NA.nc
│       NetCDF_Scn_140005_NA.nc
│       NetCDF_Scn_140006_NA.nc
│       NetCDF_Scn_140007_NA.nc
│       NetCDF_Scn_140008_NA.nc
│       NetCDF_Scn_140009_NA.nc
│       NetCDF_Scn_140010_NA.nc
│       NetCDF_Scn_140011_NA.nc
│       NetCDF_Scn_140012_NA.nc
│       NetCDF_Scn_140013_NA.nc
│       NetCDF_Scn_140014_NA.nc
│       NetCDF_Scn_140015_NA.nc
│       NetCDF_Scn_140016_NA.nc
│       NetCDF_Scn_140017_NA.nc
│       NetCDF_Scn_140018_NA.nc
│       NetCDF_Scn_140019_NA.nc
│       NetCDF_Scn_140020_NA.nc
│
└───site_data
        geospatial_data.gpkg
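
A quick way to compare a dataset folder against the documented layout is sketched below. This is only an illustration: `EXPECTED_ENTRIES` and `missing_entries` are hypothetical names, not part of ADRIA, and the list of required files is taken from the directory structure quoted above (the `results` folder is skipped because it is marked optional).

```julia
# Hypothetical helper (not part of ADRIA): report which of the documented
# entries are absent from a C~scape data directory.
const EXPECTED_ENTRIES = [
    "ScenarioID.csv",
    joinpath("connectivity", "connectivity.csv"),
    joinpath("site_data", "geospatial_data.gpkg"),
    joinpath("initial_cover", "initial_cover.csv"),
]

function missing_entries(data_dir::AbstractString)
    # Keep only the relative paths that do not exist as files under data_dir
    return [p for p in EXPECTED_ENTRIES if !isfile(joinpath(data_dir, p))]
end
```

Calling `missing_entries("<path to C~scape dataset>")` returns an empty vector when all four documented files are present.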

EDIT:

Neglected to mention the reported error arises at:

env_layer_md::EnvLayer = EnvLayer(

Provided screenshot of stacktrace:

[image: screenshot of the stack trace]

@ConnectedSystems
Collaborator Author

@VHallerBull

Could you please confirm:

  1. You are on the correct branch (cscape-result-interface)
  2. You have pulled the latest changes
  3. The directory structure matches what is outlined above?

Thank you

@VHallerBull

  1. Yes, I am on the correct branch
  2. I have pulled the latest changes
  3. Yes, with one exception: my C~scape results are not saved in the results folder but in an external location. I supply this location when calling the function in line 52.

@ConnectedSystems
Collaborator Author

Yes, with one exception: my C~scape results are not saved in the results folder but in an external location. I supply this location when calling the function in line 52.

Could you show what you mean please? Example code and directory structure would be nice (I don't know where this Line 52 is)

@VHallerBull

I can't link the code because this part is not on GitHub

[image: screenshot]

"Results_dir" is the folder that contains the NetCDF files
[image: screenshot]

This worked with the previous version and is necessary to utilize the large number of C~scape results efficiently

@DanTanAtAims
Collaborator

DanTanAtAims commented Sep 5, 2024

I ran the code with the NetCDFs passed as a separate directory, as mentioned above, using the function at the referenced lines. I was still unable to reproduce the error. I also tried with the NetCDFs contained on one drive and didn't run into any issues.
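
For reference, a minimal sketch of gathering NetCDF files from an external results folder. `netcdf_files` is a hypothetical helper, not an ADRIA function; it relies only on Julia's `readdir` and `endswith`. Whether the file-list form of `load_results` expects full paths or bare file names is not stated in the docs quoted above, so passing full paths is an assumption here.

```julia
# Hypothetical helper (not part of ADRIA): list the NetCDF files in a folder,
# returned as full paths in the sorted order that readdir provides.
function netcdf_files(results_dir::AbstractString)
    # Non-NetCDF files (e.g., CounterfactualMatch.csv) are filtered out
    return filter(endswith(".nc"), readdir(results_dir; join=true))
end

# With ADRIA installed, either documented form could then be tried
# (the second assumes that full paths are accepted):
# rs = ADRIA.load_results(CScapeResultSet, data_dir, results_dir)
# rs = ADRIA.load_results(CScapeResultSet, data_dir, netcdf_files(results_dir))
```

Building the file list explicitly also makes it easy to load a subset of scenarios while debugging.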

@ConnectedSystems
Collaborator Author

@VHallerBull is the clone of the repo sitting on OneDrive by any chance?

I'm going to point the finger of suspected blame at GitHub for Windows / OneDrive.

@VHallerBull

Yes, everything on AIMS computers is supposed to be.

@VHallerBull

I understand that the code is backed up on GitHub, but that doesn't include files in the sandbox. So, if I don't have it on OneDrive, then I risk losing all of it IF there is a computer issue, right? How do you handle that?

@ConnectedSystems
Collaborator Author

ConnectedSystems commented Sep 5, 2024

I understand that the code is backed up on GitHub, but that doesn't include files in the sandbox. So, if I don't have it on OneDrive, then I risk losing all of it IF there is a computer issue, right? How do you handle that?

I try to treat everything in the sandbox as temporary, as the sandbox environment is intended for testing/developing ADRIA functionality.

I separate larger pieces of analysis (for potential papers, etc.) into their own repository, as these should be version controlled and eventually made public with a DOI for transparency/reproducibility (e.g., https://github.com/ConnectedSystems/RME-intervention-efficacy-2024-03).

Aside from that, there's nothing stopping you from syncing the sandbox to a folder on OneDrive (I don't do this, just saying you could: https://superuser.com/questions/1224454/how-do-i-keep-two-folders-in-the-same-computer-synced-with-each-other)

@VHallerBull

Ok, so your recommendation in this case would be to fork the branch and work on that as it is for a paper?
In general, I guess, I will restart the whole process and set up the ADRIA github locally instead of on OneDrive and then go from there and see if the issue persists

@ConnectedSystems
Collaborator Author

ConnectedSystems commented Sep 5, 2024

I can't link the code because this part is not on github

You don't need to link the code in this case as it's not part of ADRIA. You can copy and paste it here, using triple backticks to format the code, like so:

```julia
Some code
```

The purpose is to allow us to copy paste the code and try exactly what you're running. A screenshot doesn't let us do this.

@DanTanAtAims
Collaborator

Ok, so your recommendation in this case would be to fork the branch and work on that as it is for a paper? In general, I guess, I will restart the whole process and set up the ADRIA github locally instead of on OneDrive and then go from there and see if the issue persists

I am currently testing to see if One Drive is the problem or if it is something else.

@ConnectedSystems
Collaborator Author

ConnectedSystems commented Sep 5, 2024

I am currently testing to see if One Drive is the problem or if it is something else.

To clarify, what I suspect is that the sync activity between GitHub for Windows and OneDrive corrupted the git logs or the repo files somehow.

EDIT: I think this is a credible explanation given the issue began after changes were pulled...

@ConnectedSystems
Collaborator Author

ConnectedSystems commented Sep 5, 2024

Ok, so your recommendation in this case would be to fork the branch and work on that as it is for a paper? In general, I guess, I will restart the whole process and set up the ADRIA github locally instead of on OneDrive and then go from there and see if the issue persists

You don't need to fork as such.

  1. Create a new repo as a Julia project
  2. If a specific version/branch is needed when adding packages for use, you can clone a separate copy of the package and `dev` it, specifying the branch or commit you want to use. If no changes are expected, you can `add` it instead (which creates a non-editable install), again specifying the exact branch/commit.
  3. Everything else is as normal

@VHallerBull

OK, I'll give that a shot and see what happens

@ConnectedSystems
Collaborator Author

ConnectedSystems commented Sep 5, 2024

@VHallerBull

In step 2, the relevant commands are:

If a non-editable version is all that is needed:

add https://github.com/open-AIMS/ADRIA.jl#name-of-branch

Or, clone ADRIA somewhere and switch to the desired branch, then:

dev <path to local clone>

This is the same as what you would have done for the sandbox environment.
Be aware that with the second option, if you switch branches in your local repo, the active version of ADRIA for that environment will also change.
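
The same two options can also be written with the `Pkg` API instead of the Pkg REPL, as sketched below. The wrapper names `add_adria_branch` and `dev_local_adria` are hypothetical and exist only for illustration; `Pkg.add(url=..., rev=...)` and `Pkg.develop(path=...)` are the standard `Pkg` calls. Nothing is installed until one of the functions is called.

```julia
using Pkg

# Non-editable install that tracks a given branch
# (the API counterpart of `add <url>#<name-of-branch>`).
function add_adria_branch(branch::AbstractString)
    Pkg.add(url="https://github.com/open-AIMS/ADRIA.jl", rev=branch)
end

# Editable install that follows a local clone
# (the API counterpart of `dev <path to local clone>`).
function dev_local_adria(path::AbstractString)
    Pkg.develop(path=path)
end
```

For example, `add_adria_branch("name-of-branch")` would be run after activating the project environment for the analysis.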

@ConnectedSystems
Collaborator Author

@VHallerBull Are you able to load the C~scape datasets now?

@VHallerBull

I am able to load them locally, but still working on the HPC as it gave me an error last week

@ConnectedSystems
Collaborator Author

ConnectedSystems commented Sep 16, 2024

I am able to load them locally, but still working on the HPC as it gave me an error last week

An error related to loading datasets or something else? If it's something else we can close this issue.

@VHallerBull

A different error but still related to loading the datasets. I am currently running a possible solution, but it could be a few hours before I know if it worked.

@VHallerBull

I can now load the dataset, but it still requires 1-2 hours for around 2000 scenarios

@arlowhite
Collaborator

Note to @Zapiano @arlowhite: the most recent dev build of the docs doesn't seem to include the above content. Is the doc build issue truly resolved? (https://open-aims.github.io/ADRIA.jl/dev/usage/results/)

That content isn't in the main branch.

## Loading C~scape Results
Results from C~scape can be loaded with the `load_results` function.
```julia
# Assumes NetCDFs are contained in result subdirectory (see example directory tree below)
rs = ADRIA.load_results(CScapeResultSet, "<path to data dir>")
# Retrieves NetCDFs from separate directory
rs = ADRIA.load_results(CScapeResultSet, "<path to data dir>", "<path to result directory>")
# Manually pass in a list of files to load as results
rs = ADRIA.load_results(CScapeResultSet, "<path to data dir>", ["netcdf_fn1", "netcdf_fn2", ...])
```
The expected directory structure is:
```bash
data_dir
│   ScenarioID.csv
├───connectivity
│       connectivity.csv
├───site_data
│       geospatial_data.gpkg
├───initial_cover
│       initial_cover.csv
└───results (optional)
        NetCDF_Scn_140001.nc
        NetCDF_Scn_140002.nc
        NetCDF_Scn_140003.nc
        ...
```

This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.

Not sure what happened, someone deleted the branch?
