Add an example with "real world" data showing how to align grids #139

Open
markpayneatwork opened this issue Oct 29, 2024 · 5 comments
Labels
Feature New feature or request


markpayneatwork commented Oct 29, 2024

Is your feature request related to a problem? Please describe.
python-cmethods looks very promising, but I am struggling to get it to work on real-world data. The challenge is that I have data from HadGEM2 (360-day calendar) that I am trying to bias-correct against ERA5 (standard Gregorian calendar), and the two datasets are on different spatial grids. The examples that are presented use "toy data" and therefore don't face these issues.

Describe the solution you'd like
Are you able to extend the examples to include data where there are mismatches in the calendars and grids please?

Additional context
Partly inspired by the conversations that @ShingiNangombe and I have been having with you here: Klimaatlas/KAPy#58

@markpayneatwork markpayneatwork added the Feature New feature or request label Oct 29, 2024
@btschwertfeger (Owner)

Hello @markpayneatwork, unfortunately I can't DM you via ResearchGate. Could you please contact me so that we can work out a suitable way to share the data sets?

https://www.researchgate.net/profile/Benjamin_Schwertfeger

ehultee commented Jan 16, 2025

Chiming in here: I am also struggling to apply Cmethods on real-world data.

I have a secondary variable I computed from a GCM, and the same variable computed from the EN4 ocean reanalysis. The time indices are of different types, but I have been able to correct that myself with a little massaging of the xarray Datasets after read-in. The spatial grids are offset by 0.5 degrees, which is unfortunate but could eventually be managed point-by-point with a dask ufunc application or a reprojection. I have trimmed everything to the same period and spatial bounding box and forced identical coords. And still, the Cmethods adjust function will not run on 1D or 3D data.

The usual error I get is AttributeError: 'Dataset' object has no attribute 'to_dataset' with a traceback to this line. I am attaching an executed Jupyter notebook to illustrate what's going on.

My next plan is to implement some version of the QDM function by myself. It would of course be better if the solution could be within Cmethods. Happy to chat about next steps, if you are still thinking about this.
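For context, a hand-rolled additive QDM on 1D series can be quite short. The sketch below is a minimal NumPy version written for illustration; it is not the cmethods implementation, and the function name `qdm_additive` and its arguments are hypothetical. It assumes three 1D arrays without NaNs: observations (`obs`), the model over the historical period (`simh`), and the model over the period to adjust (`simp`).

```python
import numpy as np


def qdm_additive(obs, simh, simp):
    """Additive quantile delta mapping for 1D arrays (illustrative sketch)."""
    obs = np.asarray(obs, dtype=float)
    simh = np.asarray(simh, dtype=float)
    simp = np.asarray(simp, dtype=float)

    # Empirical non-exceedance probability of each simp value within simp.
    ranks = np.argsort(np.argsort(simp))
    tau = (ranks + 0.5) / simp.size

    # Model change signal at each quantile: simp minus the historical
    # model value at the same quantile.
    delta = simp - np.quantile(simh, tau)

    # Observed value at that quantile plus the preserved change signal.
    return np.quantile(obs, tau) + delta


rng = np.random.default_rng(42)
simh_demo = rng.normal(15.0, 3.0, size=200)  # synthetic historical model
obs_demo = simh_demo + 2.0                   # observations: constant offset
simp_demo = simh_demo + 1.0                  # future model: +1 change signal
adjusted_demo = qdm_additive(obs_demo, simh_demo, simp_demo)
```

With these synthetic inputs the bias (+2) is corrected and the change signal (+1) is kept, so `adjusted_demo` equals `simh_demo + 3` up to floating-point error.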

CMethods-issue.pdf

@btschwertfeger (Owner)

Hello @ehultee, thanks for sharing your impressions!

If you could provide the data via ResearchGate, I'd be happy to take a look.

Regarding bias adjustment, there are various entry points depending on the data. The examples in the documentation are designed to work seamlessly with cmethods, as the goal is to focus on the methods and their application, rather than the data preparation steps. There are many excellent resources and projects that specialize in data preparation and can provide better guidance on that.

As I'm not working in climate research anymore, I'm not in the best position to create or maintain extensive examples and how-to guides. My aim is to ensure the core functionality is robust and well-supported, but I encourage users to explore other resources for more specific data preparation and use cases. I'm happy to help with issues related to the tool itself, and I'd be thrilled if others contributed examples to help grow knowledge here!


By the way, grouping is disabled for distribution-based methods because the results can be incorrect, as outlined in the documentation. The fun issue you're encountering arises because apply_ufunc is designed to be generic, and reimplementing the same logic repeatedly isn't that fun.
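To illustrate the generic `apply_ufunc` pattern mentioned above, here is a small sketch that applies a 1D function along `time` at every grid cell of a 3D array. It uses xarray only; `shift_mean` is a hypothetical toy adjustment (a plain mean shift), not one of the cmethods methods, and the data are synthetic.

```python
import numpy as np
import xarray as xr


def shift_mean(obs, simh, simp):
    """Toy 1D adjustment: remove the historical mean bias from simp."""
    return simp + (np.nanmean(obs) - np.nanmean(simh))


rng = np.random.default_rng(0)
dims = ("time", "lat", "lon")
coords = {"time": np.arange(100), "lat": [50.0, 51.0], "lon": [0.0, 1.0, 2.0]}

obs = xr.DataArray(rng.normal(10.0, 1.0, (100, 2, 3)), dims=dims, coords=coords)
simh = obs + 2.0   # synthetic model with a constant bias of +2
simp = simh + 1.0  # synthetic scenario with an extra +1 signal

adjusted = xr.apply_ufunc(
    shift_mean,
    obs,
    simh,
    simp,
    input_core_dims=[["time"], ["time"], ["time"]],  # 1D slices handed to the function
    output_core_dims=[["time"]],                     # the function returns a time series
    vectorize=True,                                  # loop over the remaining (lat, lon) cells
).transpose(*dims)  # core dims are moved to the end, so restore the order
```

The same call works unchanged for 1D input, because there are simply no extra dimensions to loop over. If `simp` covers a different period than `obs` and `simh`, its time dimension would probably need a distinct name so that xarray does not try to match the lengths.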

ehultee commented Jan 24, 2025

Hey @btschwertfeger, thanks for your response!

Totally understand that you can't be on the hook to maintain examples when you're not working in the field anymore. No pressure.

I have re-implemented a few of your functions in a Jupyter notebook (very messy worked version here). It avoids the behind-the-scenes errors that were coming up when I tried cmethods off-the-shelf, and basically does what I need it to do, so all good there.

A couple of suggestions to help close this issue:

  1. I can clean this notebook and make a PR to your repo with the cleaned example and the data it's using. Then we have a worked example with real-world data, though it doesn't call cmethods in the way you wrote it.
  2. I can send you the data files and you can play with them, with the goal of making a nice real-world example that uses cmethods as you wrote it. I don't have ResearchGate, but I could send you a link via email if you like.

Thoughts? 😊

@btschwertfeger (Owner)

Hello @ehultee,

Thank you for sharing your notebook and for the effort you’ve put into exploring this issue. I had a chance to review it and noticed that it closely aligns with the cmethods implementation, aside from the omission of cmethods.adjust. To streamline the notebook, you might consider importing the relevant functions directly from cmethods.

I also revisited the PDF you provided, which demonstrates your approach to reproducing the error. I noticed you call to_dataset() right before applying the adjust function. When I tried the same with my data, I observed the same behavior. This suggests that cmethods might have worked for you if the to_dataset() step had been skipped before calling adjust. I'll address this by updating the type annotation for that method to prevent such confusion in the future.
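The difference between the two object types can be seen with xarray alone. This is a small sketch with synthetic data; the variable name `tas` is an assumption, and cmethods itself is not called here.

```python
import numpy as np
import xarray as xr

# A named DataArray can be converted to a Dataset ...
da = xr.DataArray(np.arange(6.0).reshape(3, 2), dims=("time", "x"), name="tas")
ds = da.to_dataset()

# ... but a Dataset has no to_dataset method, which matches the
# AttributeError reported earlier in this thread.
dataset_has_method = hasattr(ds, "to_dataset")

# Selecting the variable returns a DataArray again, which is the kind of
# object the suggestion above implies should be passed on.
var = ds["tas"]
```

So if the data are already held in a Dataset, selecting the variable (for example `ds["tas"]`) instead of passing the whole Dataset should avoid this particular error.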

Could you please verify if this resolves the issue for you?

Thank you again for offering to provide a notebook with examples. To maintain clarity, it would be great if these examples directly utilize the cmethods package rather than demonstrating that it doesn’t work as expected. 😅 Also, including potentially licensed data might not be ideal here.

If the proposed fix works for you using cmethods, I’d be delighted if you could release your notebook or research, and I’d happily reference it in the cmethods documentation.

Thank you for your time and collaboration! 😊

3 participants