Refactor steps in blending code #453
base: master
Conversation
…4 for the refactoring
Co-authored-by: mats-knmi <145579783+mats-knmi@users.noreply.github.com>
….velocity_perturbations = [] in __initialize_random_generators
…xed seed assignments
* Use seed for all rng to make a test run completely deterministic
* Fix probmatching test and some copy-paste oversights
* Add test for vel_pert_method
* Change the test so that it actually runs the lines that need to be covered
…workers at the same time
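The seeding commits above aim at per-stream reproducibility across workers. A hedged sketch of one common NumPy pattern for this (the variable names are illustrative, not the actual pysteps internals):

```python
import numpy as np

# Spawn one statistically independent child stream per ensemble member
# from a single base seed, so results do not depend on how members are
# distributed over workers.
base_seed = 42
n_ens_members = 4
child_seeds = np.random.SeedSequence(base_seed).spawn(n_ens_members)
rngs = [np.random.default_rng(s) for s in child_seeds]

# Each member always draws from its own stream, whatever the worker count.
velocity_perturbations = [rng.standard_normal(3) for rng in rngs]
```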
Codecov Report
Attention: Patch coverage is …

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #453      +/-   ##
==========================================
+ Coverage   84.26%   84.31%   +0.05%
==========================================
  Files         160      160
  Lines       13067    13250     +183
==========================================
+ Hits        11011    11172     +161
- Misses       2056     2078      +22

Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
Both visual tests and assert statements give the same conclusion: the output is exactly the same!
@dnerini The codecov check is failing, but the only things it is actually failing on are the deepcopys I added in the code and the timing, which does not seem worth writing tests for as it is a very basic subtraction. Would it be possible to disable the check for these things, or how does this work? I know you were able to do it for the refactoring of the steps nowcasting.
Hi @sidekock and @mats-knmi,
Thanks a lot for your work here! I'm sorry I was not around the past days to answer some questions and think along. This looks great! With the large number of changes it is becoming difficult to check everything thoroughly, so I'm happy to see that it gives exactly the same results. I think a to-do for a new PR could be updating the documentation, but that is probably easier and cleaner once this PR is merged.
To further reduce memory usage, both this array and the ``velocity_models`` array can be given as float32. They will then be converted to float64 before computations to minimize loss in precision.
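A minimal sketch of the behaviour the docstring describes (the array shapes are made up, and the exact cast location inside pysteps is an assumption):

```python
import numpy as np

# The caller keeps the large model inputs as float32 to halve memory use.
precip_models = np.zeros((2, 12, 200, 200), dtype=np.float32)
velocity_models = np.zeros((2, 2, 200, 200), dtype=np.float32)

# Per the docstring, the arrays are upcast to float64 before the actual
# computations to minimize precision loss (sketch, not the real code path).
precip_models = precip_models.astype(np.float64)
velocity_models = velocity_models.astype(np.float64)
```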
# TODO: compare old and new version of the code, run a benchmark to compare the two |
Is this TODO still necessary? I can imagine we'd like to have this done prior to merging this PR.
The two versions now produce identical output. I just want to make sure this is the case for all configurations (a test sketch follows the list below), meaning:
- Single core, single member
- Single core, multiple members
- Multiple cores, single member
- Multiple cores, multiple members (fewer cores than members)
- Multiple cores, multiple members (more cores than members)
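A minimal sketch of how this matrix could be exercised with pytest; `run_blended_nowcast` is a hypothetical stand-in for the actual blended-nowcast call, and the fixed seed mirrors the determinism work in this PR:

```python
import numpy as np
import pytest


def run_blended_nowcast(n_workers, n_ens_members, seed=42):
    """Hypothetical stand-in for the real blended-nowcast call.

    n_workers is unused in this stub; with all RNGs seeded, the real
    forecast should not depend on it either.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_ens_members, 8, 8))


@pytest.mark.parametrize(
    "n_workers, n_ens_members",
    [
        (1, 1),  # single core, single member
        (1, 4),  # single core, multiple members
        (4, 1),  # multiple cores, single member
        (2, 4),  # fewer cores than members
        (8, 4),  # more cores than members
    ],
)
def test_output_independent_of_worker_count(n_workers, n_ens_members):
    result = run_blended_nowcast(n_workers, n_ens_members)
    reference = run_blended_nowcast(1, n_ens_members)
    # With all RNGs seeded, the forecast should match the single-worker run.
    np.testing.assert_allclose(result, reference)
```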
The reason I want to do this is that I discovered the pysteps tests do not properly cover these configurations: all tests passed in a previous version of the code, yet it crashed on issues related to the number of cores and the number of members.
If you have other tests to propose, I would love to hear suggestions!
The tests would consist of running both the master and the refactored branch and then doing the following checks:
- DataArray.identical(other): @mats-knmi, do you know if this actually checks the values of the nowcast or only the structure and metadata? It is a bit unclear to me, but since it is very fast, I would assume it does not test the data itself.
- Depending on the previous answer, I could also check some random fields at random times and members.
- Lastly, I can do a visual inspection, but if the previous two checks pass I do not think this adds much; it's a matter of having enough confidence before I actually make this the default code :)
Identical should do an element-wise comparison (at least that is what I gather from the docs). The docs say that identical is the same as equals but also checks metadata, and the docs for equals say that they compare not only the structure but also the contents of the datasets. A very simple test seems to confirm this:
>>> import numpy as np
>>> import xarray as xr
>>> ds_a = xr.Dataset(
...     data_vars={
...         "x": ("x", np.array([1, 2, 3, 4], dtype=np.float64)),
...         "y": ("x", np.array([4, 3, 2, 1], dtype=np.float64)),
...     }
... )
>>> ds_b = xr.Dataset(  # same structure and same values as ds_a
...     data_vars={
...         "x": ("x", np.array([1, 2, 3, 4], dtype=np.float64)),
...         "y": ("x", np.array([4, 3, 2, 1], dtype=np.float64)),
...     }
... )
>>> ds_a.identical(ds_b)
True
>>> ds_c = xr.Dataset(  # last value of "y" differs (2 instead of 1)
...     data_vars={
...         "x": ("x", np.array([1, 2, 3, 4], dtype=np.float64)),
...         "y": ("x", np.array([4, 3, 2, 2], dtype=np.float64)),
...     }
... )
>>> ds_a.identical(ds_c)
False
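For use in a test suite, xarray also provides assertion helpers in `xarray.testing` that raise an `AssertionError` with a readable diff instead of returning a boolean; a small usage sketch (the dataset here is made up for illustration):

```python
import numpy as np
import xarray as xr

ds_a = xr.Dataset({"y": ("x", np.array([4.0, 3.0, 2.0, 1.0]))})
ds_b = xr.Dataset({"y": ("x", np.array([4.0, 3.0, 2.0, 1.0]))})

# Passes silently here; raises with a diff if values, structure,
# or metadata differ.
xr.testing.assert_identical(ds_a, ds_b)
```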
Thanks, I should have done such a test myself but totally forgot. I'll do the last checks in the next few days and then we can close this 6-month work in progress :)
@sidekock we can simply ignore the failing check and merge whenever you feel it's ready
See #443