Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor steps in blending code #453

Open
wants to merge 36 commits into
base: master
Choose a base branch
from

Conversation

sidekock
Copy link
Contributor

See #443

sidekock and others added 30 commits November 18, 2024 11:28
Co-authored-by: mats-knmi <145579783+mats-knmi@users.noreply.github.com>
….velocity_perturbations = [] in __initialize_random_generators
sidekock and others added 6 commits December 19, 2024 17:21
* Use seed for all rng to make a test run completely deterministic

* fix probmatching test and some copy paste oversights

* Add test for vel_pert_method

* Change the test so that it actually runs the lines that need to be covered
Copy link

codecov bot commented Jan 20, 2025

Codecov Report

Attention: Patch coverage is 72.22222% with 10 lines in your changes missing coverage. Please review.

Project coverage is 84.31%. Comparing base (a7dae54) to head (4fb784e).

Files with missing lines Patch % Lines
pysteps/nowcasts/steps.py 72.22% 10 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #453      +/-   ##
==========================================
+ Coverage   84.26%   84.31%   +0.05%     
==========================================
  Files         160      160              
  Lines       13067    13250     +183     
==========================================
+ Hits        11011    11172     +161     
- Misses       2056     2078      +22     
Flag Coverage Δ
unit_tests 84.31% <72.22%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sidekock
Copy link
Contributor Author

Both visual tests and assert statements give the same conclusion: the output is exactly the same!

@sidekock
Copy link
Contributor Author

@dnerini The codecov check is failing but the only things it is actuary failing on in the deepcopys I have done in the code and on the timing which does not seem worth it to write tests for as this is a very basic subtraction. Would it be able to disable it for these things or how does this work? I know you where able to do it for the re-factoring of the steps nowcasting.

Copy link
Contributor

@RubenImhoff RubenImhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @sidekock and @mats-knmi,

Thanks a lot for your work here! I'm sorry I was not around the past days to answer some questions and think along. However, this looks great! With the large number of changes, it is becoming difficult to check everything thoroughly, so at least I'm happy to see that it gives exactly the same results. I think a to do for a new PR could be updating the documentation, but that is probably easier and cleaner once this PR is pushed.

To further reduce memory usage, both this array and the ``velocity_models`` array
can be given as float32. They will then be converted to float64 before computations
to minimize loss in precision.
# TODO: compare old and new version of the code, run a benchmark to compare the two
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this TODO still necessary? I can imagine we'd like to have this done prior to merging this PR.

Copy link
Contributor Author

@sidekock sidekock Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two versions do now provide identical output. I just want to make sure this is the case for all configurations, meaning:

  • Signle core single member
  • Single core multiple member
  • Multiple core single member
  • Multiple core multiple members (less cores than members)
  • Multiple core multiple members (more cores than members)

The reason I want to do this is that I discovered that the pysteps tests do not properly handle this (all tests passed in a previous version of the code but it crashed on issues related to number of cores and number of members).
If you also propose other tests, I would love to hear suggestions!
The tests would exist of running both the master and refactored branch and then doing the following checks:

  • DataArray.identical(other) @mats-knmi do you know if this actually checks the values of the nowcast or only the structure and metadata. Its a bit unclear to me but since it is very fast, I would assume it does not test the data itself.
  • Depending on the previous answer, I could also check some random fields of random times and members.
  • Lastly I can do a visual inspection but if the previous two are fine I do not think this matter but it's a matter of having enough confidence before I actually make this the default code :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Identical should do an element wise comparison (at least that is what I gather from the docs). The docs say that identical is the same as equals but also checks metadata. The docs for equals say that it compares not only the structure but also the contents of the datasets. Doing a very simple test I do seem to be able to confirm this.

>>> xr.Dataset(data_vars={"x": ("x", np.array([1,2,3,4], dtype=np.float64)), "y": ("x", np.array([4,3,2,1], dtype=np.float64))}).identical(xr.Dataset(data_vars={"x": ("x", np.array([1,2,3,4], dtype=np.float64)), "y": ("x", np.array([4,3,2,1], dtype=np.float64))}))
True
>>> xr.Dataset(data_vars={"x": ("x", np.array([1,2,3,4], dtype=np.float64)), "y": ("x", np.array([4,3,2,1], dtype=np.float64))}).identical(xr.Dataset(data_vars={"x": ("x", np.array([1,2,3,4], dtype=np.float64)), "y": ("x", np.array([4,3,2,2], dtype=np.float64))}))
False

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I should have done such a test myself but totally forgot. Ill do the last checks in the next few days and then we can close this 6 -month-work-in-progress :)

@dnerini
Copy link
Member

dnerini commented Jan 21, 2025

@dnerini The codecov check is failing but the only things it is actuary failing on in the deepcopys I have done in the code and on the timing which does not seem worth it to write tests for as this is a very basic subtraction. Would it be able to disable it for these things or how does this work? I know you where able to do it for the re-factoring of the steps nowcasting.

@sidekock we can simply ignore the failing check and merge whenever you feel it's ready

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

4 participants