Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spherical refactoring #420

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ghiggi
Copy link
Contributor

@ghiggi ghiggi commented Feb 10, 2022

This PR provides a refactoring of spherical.py, introduces new methods and solves long-standing issues.
This PR is required for a future PR addressing the missing methods/option for cropping/slicing of SwathDefinition and AreaDefinition.
It is also likely that pyorbital and pytrollschedule can benefit of this PR.

The refactoring of spherical.py tries to stick to the naming/conventions employed by the shapely library, which is commonly used to perform planar geometrical computation. The PR will introduce backward incompatibilities!

Here below I summarize briefly the content of this very long (sorry!) PR.
It implement 6 main classes:

  • SExtent [NEW]
  • SPoint [previously called SCoordinate]
  • SArc [previously called SArc]
  • SLine [NEW]
  • SPolygon [previously called SphPolygon]
  • SMultiPolygon [NEW]

SPoint

  • Has simply a new method plot to plot with Cartopy.

SArc

  • Has several new methods: midpoint, segmentize, to_line, to_shapely and plot.
  • The method intersections has been renamed _great_circle_intersections.
  • The method intersection has been renamed intersection_point.
  • The method _hash__ has been added to possibly LRU cache intersection_point.
  • The inplace modification in _great_circle_intersections/intersections which was causing Differences in resampling coverage when area crosses 180 deg longitude satpy#383 has been resolved.

SLine

  • Provide methods: intersects, intersection_points, segmentize, buffer, get_parallel_lines, to_shapely and plot.

SPolygon/SMultiPolygon

  • Has several new methods: within, contains, equals, equals_exact, intersects, disjoint, overlap_fraction, overlap_rate, intersection_points, buffer, segmentize, to_line, to_shapely and plot.
  • Compared to SphPolygon, it does not accept a radius argument and performs all computations on a unit sphere.
  • The area can be calculated more accurately with pyproj specifying a custom ellipsoid. A method for computing the perimeter is also provided.
  • The validity of the input geometry can be checked using the is_valid_geometry property.
  • Compared to the previous implementation of intersection, it now does not fail when the intersection with itself, provides all the intersection also in cases of multiple intersections (which returns a SMultiPolygon), and deal correctly with the antimeridian crossing cases.
  • Compared to the previous implementation of union, if the SPolygon(s)/SMultiPolygon(s) are not intersecting it now returns the original geometries in a single SMultiPolygon (instead of None). Cascaded union is performed in case of unioning multiple SPolygons.
  • SMultiPolygons contain multiple SPolygon(s) with the constraint that they must not overlap each other!

SExtent

  • It defines the extent of a SPolygon taking into account if it crosses the antimeridian.
    As an example, the GOES-17 full disk would be represented by a list of 2 extent:
    [[-180.0, -57.76, -79.14, 79.14], [143.76, 180.0, -74.79, 74.79]] so that the extents does not cross the antimeridian and shapely can be employed for the computations.
  • An SExtent can be composed of an unbounded number of extents, under the conditions that the extents do not overlap each other. All the extents can be plotted using plot.
  • The method intersects, disjoint, within, contains, touches and equals employs shapely in the background for fast planar rectangle geometry computation. These methods enable to speed up some of the SPolygon spherical computations and the upcoming PR enabling SwathDefinition.crop,SwathDefinition.get_area_slices and SwathDefinition.get_extent_slices.

I didn't implement yet carefully tests as I wait for your opinion before proceeding.
But I used the code to run intersection/union and get MODIS and GPM swath slices with GEO AreaDefinition over 1 year of data and everything looks fine ;)
As a note of caution, in case of polygons encloses the poles, the SExtent of the SPolygon is currently wrongly calculated as I didn't manage to implement a function that checks if a point (i.e. the pole) is contained in the polygon. I AM LOOKING FOR YOUR HELP WITH THIS.

I am sorry for this long PR landing on your head, I hope you can forgive me @mraspaud @djhoese @gerritholl.

@mraspaud
Copy link
Member

@ghiggi thanks a lot for putting all the effort!
Some preliminary remarks:

  • This PR is way too long, it's going to be very difficult for us to review it because there are so many bits and pieces to understand in one go that we will surely miss something and potentially introduce bugs, and also because I don't know who would be able to review this without setting aside a block of multiple hours to review it.
  • This is changing names of classes that are part of the official API of pyresample, hence breaking compatibility with any software using these. A change of names should be debated with the maintainers first, and if accepted go through deprecation cycle.
  • There are no tests provided for us to understand how the new interface is to be used, and adding complete tests for all this will make the PR at least twice as big, which will make it even harder for us to review.

As such, I strongly suggest you set this PR aside, and start making instead small incremental PRs, adding at most a couple of features/bug fixes at the time, with accompanying tests and documentation. The changes you propose here look very promising, and it would be a pity they don't get through because of having too much to chew at once...

@gerritholl
Copy link
Member

No need to ask for forgiveness for what looks like great work toward solving some long-standing problems, but I second @mraspaud that smaller bites help to avoid choking. Ten small PRs would be easier to handle than one big one. Would it be possible to split it?

@djhoese
Copy link
Member

djhoese commented Feb 10, 2022

If the change to a more shapely-compatible set of names is where we want to go (I'm ok with that) then I'd suggest either:

  1. If possible add new methods to the existing classes while keeping the old ones.
  2. Create a whole new class (ex. SpherePolygon or something) that the user has to opt-in to using and switching.

This way we can put deprecation warnings on all the old stuff and eventually remove it. Heck you could even put it in the future subpackage of pyresample where all the pyresample 2.0 stuff is supposed to go.

Without reading the actual changes, I worry about testing for this. Some of the tests that were added for the existing interfaces were made because of edge cases with the implementation. Verifying that a new interface doesn't suffer from those or similar issues might be difficult.

In addition to agreeing with the many smaller PRs idea, I'd also suggest thinking about making multiple commits. This entire PR is one commit. That's 2000 lines of code where you never had a breaking/pause point. Separate commits make it easier to review your changes and issues you've run into as you develop something so we can work on it. You could even consider Test Driven Development (TDD) where you right tests first and then write the user code that makes that test pass. It may make it easier to make smaller commits and pull requests this way too since you'd be dividing work into user use cases.

@ghiggi
Copy link
Contributor Author

ghiggi commented Feb 10, 2022

I definitely understand your concerns and I could start to do small incremental PRs instead of this huge one. But it would be almost impossible to perform these PRs to spherical.py. I modified spherical.py in this PR with the intention of providing you with an overview of where the main changes happen and discussing your opinions.

I am aware that it exists also the old spherical_geometry.py file, but if you agree with the need for the refactoring I propose, I think it would be nice if we add a new file called spherical_future.py (or let me know the name) where I start adding incremental PRs that you can review progressively. I could proceed in the following order:

  1. Add SPoint
  2. Add SArc
  3. Add SLine
  4. Add SExtent
  5. Add SPolygon and SMultiPolygon

Regarding the integration of such classes defined in spherical_future.py into the current workflow, I would foresee the following:

  • In geometry.py the SphPolygon is currently obtained by creating a Boundary/AreaBoundary/AreaDefBoundary object and then calling the contour_poly method.
  • In my (ready) implementations of SwathDefinition crop/get_area_slices methods, I essentially come up adding a method boundary to AreaDefinition and SwathDefinition classes, which returns an AreaBoundary object. I then added the method polygon to AreaBoundary which enables the retrieval of the boundary SPolygon. It will look like area_def.boundary().polygon() instead of the current AreaDefBoundary(area_def).contour_poly.
  • So once spherical_future.py is ready, we can then progressively think to replace the usage of contours_poly in the code with polygon.
  • spherical.py will remain available for back-compatibility reasons, while the new computations will be performed with the S<geom> class objects defined in spherical_future.py.

When the above will be in place, I will be able to add the PR to enhance get_area_slices also for SwathDefinition. I come up with a quite efficient implementation that can return in about 30 seconds the intersection slices of 1 day of LEO orbits with a GEO disk, with a custom specification of the minimum_overlap_fraction required to define the intersection (see for value 0.9 and 0 in the below pictures) .

min_overlap_0 9
min_overlap_0

Let me know what you think and the path forward !!

@djhoese
Copy link
Member

djhoese commented Feb 10, 2022

Others may disagree with this, but here is my gut reaction in order to:

  1. Not waste your time @ghiggi separating things out in a way that might not help reviewing that much anyway. I know this contradicts some of what I said before, but separating PRs by object type will make it harder to see the overall picture even if it does reduce the individual PR review time. I don't see a way to separate it out any other way unless you see a way to submit individual bug fixes to the existing interfaces and then make a separate PR for the new interfaces (this one would still be large).
  2. Put things in a place that makes future changes easier and require less worrying about backwards compatibility.

I think you should:

  1. Put your new spherical geometry in pyresample/future/spherical/ (https://github.com/pytroll/pyresample/tree/main/pyresample/future). You already have these shapely utils, but are they used by any other module? I'd say put them in pyresample/future/spherical/shapely_utils.py or something like that. You could go Java-style and put each spherical object in its own spherical/ submodule, but maybe there is a more consolidated structure that would work (ex. primitives.py for lines, points, and arcs; polygons.py for extent and polygons).
  2. Put tests in a pyresample/test/test_futures/test_spherical.py (or test_spherical sub-directory).
  3. Any future development/changes to geometry objects (SwathDefinition, AreaDefinition, etc) should be done in the pyresample/future module and be in line with existing Pyresample 2.0 design concepts (Initial discussion on Pyresample 2.0 #181, https://github.com/pytroll/pyresample/milestone/3). Although as I say that I realize none of those issues include the decisions/discussions we had made about geometry objects...oops.

Questions:

  1. Are these individual classes meant to be seen or used by the user? Or are only the highest ones needed by the user?
  2. What time zone are you in?
  3. Can you meet with the pyresample devs to discuss some of these concepts so we are all on the same page about what's going on?
  4. Question for everyone, if we are putting this in future, should we make more drastic changes to the spherical polygon API?

We appreciate all you've done and I think these will all be really good improvements for pyresample. Thanks for helping and hopefully we can get these things figured out sooner rather than later.

@ghiggi
Copy link
Contributor Author

ghiggi commented Feb 11, 2022

I agree with all that you suggest !

  • So, I think that for starting, we can do a copy of current spherical.py into futures/spherical.py. After this first merge, I can start to do incremental PR to this file, so that you can easily track the changes, and you can easily see the point at which it breaks the compatibility.
  • When the new spherical API is implemented and tested, we can think about how eventually to split out stuff in different files.

Answering already your question:

  1. If working from Area/SwathDefinition objects, the import of the classes will not be necessary because SPolygon and SLine object could be derived by area_def.boundary().line() and area_def.boundary().polygon()
    But I think people might be interested in accessing some of the high classes with something like from pyresample.spherical import SPolygon, SLine, SPoint to perform operations that currently shapely does not enable. As an example, with less than 10 lines of code is possible to simulate/generate SwathDefinition objects using pyorbital satellite orbits.
  2. I live in Switzerland, so CET. Not a problem on my side to also meet in the evenings. Next week I would be available unfortunately only Monday (preferably) or Thursday ... when you prefer ...

Let me know @mraspaud @djhoese @gerritholl. Have a nice weekend.

@mraspaud
Copy link
Member

  • So, I think that for starting, we can do a copy of current spherical.py into futures/spherical.py. After this first merge, I can start to do incremental PR to this file, so that you can easily track the changes, and you can easily see the point at which it breaks the compatibility.

No, please do not duplicate code! Make calls to existing functionality from functions with new names, but please do not copy...

@djhoese
Copy link
Member

djhoese commented Feb 11, 2022

If there are bug fixes that can be applied to the existing interfaces 👍. Make those PRs. I see you already made one for the inplace operations in the intersection code.

Otherwise, for these new interfaces, start a PR where you take what you've done here (which seems almost entirely new code, right?) and put it in pyresample/future/spherical/polygons.py. Add imports to pyresample/future/spherical/__init__.py to support the imports you are hoping for (I would have suggested this anyway). Make a commit, push, and start the PR from here. You could even just modify this existing one if you really want. Then add tests with inspiration from the existing tests and any other information you know since you've developed this (what are the edge cases, what new functionality is possible by the user). Make multiple commits during this step and push. Then start refactoring into multiple modules if it makes sense. If you or anyone else doesn't see the benefit of this then we can move it all to pyresample/future/spherical.py.

Given that this seems like a new and separate interface from the existing functionality (correct me if I'm wrong) then starting from the point of the existing interfaces doesn't have a huge benefit in my eyes. So I say just make the new modules and I'll review it when it is ready.

@djhoese
Copy link
Member

djhoese commented Feb 11, 2022

@ghiggi I talked with Martin today in a video call about this (we had the video call about something else, but then talked about this PR). I thought this PR was more a complete rewrite than it was so I have some suggestions. Let me know what you think.

I think you could make a lot of little PRs out of this PR. I'm not entirely sure as I write this where everything should go. Possible PRs (with tests!):

  1. Each bug fix that doesn't require any of the other functionality should be a separate PR (for each bug fix).
  2. Each class marked as (NEW) in your description should be a separate PR (one PR per class...with tests).
  3. All the renames should be done in one single PR. If possible subclassing or the creation of base classes should be used to preserve the old class while the new class can be updated in later PRs with additional functionality. If needed an old class's __init__ could be updated with a DeprecationWarning saying to use one of the new classes.
  4. Additional methods/functionality that are related should be in separate PRs. For example, one for __contains__/within/intersection, one for plotting, etc. If any of this can be done from a base class for all spherical classes that would be great. Like a central def plot method. Or make plot a separate utility function outside of the classes (Martin wanted this, I'm less sure).

@ghiggi
Copy link
Contributor Author

ghiggi commented Feb 11, 2022

Some quick considerations/thoughts:

  • If there are fixes that I can do to the current spherical implementation I will do it ;)
  • SCoordinate and CCoordinate are essentially unvaried except for the plot method and some improved docs.
  • Concerning Arc, I can add the methods to spherical.py without breaking back-compatibility.
  • SPolygon is almost a complete rewrite and I will put it in /future/spherical/polygon.py. Really no sense to add PRs to SphPolygon. I will do PR "per topic".
  • SLine and SExtent are completely new and I will place them into /future/spherical/line.py and /future/spherical/extent.py.
  • The definition of SArc and SPoint will be done in /future/__init__ by importing spherical.Arc and SCoordinate.
  • I will think if it makes sense to develop a separate plot utility function. Instantaneously, I would say that to plot SPoint on Cartopy is necessary to use ax.scatter, because shapely Point/MultiPoint can not be plotted AFAIK. Plotting LineString requires the specification of facecolor='none' otherwise it plots polygons ... we will see ...

In my mind I plan to commit 1/2 day per week on these PRs, to give you time to review.

I will also start parallelly to add incremental small PR for the Boundary retrieval and SwathDefinition methods. I am not sure I see how to add new methods in /future/geometry/swath (monkey patching?). Maybe I will need an example ;)

At a given point, we should discuss also how to implement a function to test if a point is contained in a polygon. This will be required to enable the API to work on all edge cases. I couldn't figure out which maths are required ...

@djhoese
Copy link
Member

djhoese commented Feb 11, 2022

That sounds like it should work. For the future SwathDefinition, we'd have to make a subclass but I'm not sure that is even the right approach yet. As long as it doesn't break backwards compatibility, could you define new methods or change existing methods in the current SwathDefinition that use the future spherical stuff?

@ghiggi
Copy link
Contributor Author

ghiggi commented Sep 19, 2022

@mraspaud @djhoese. It's a long time since the last iterations on this refactoring.
Unfortunately, I never found the time to advance and implement the tests ...
I was thinking ... if I come to the PCW meeting in November in Athens, do you think it could make sense to address (part of) this this refactoring during the week? I think that in-person discussion will facilitate the design and implementation of the required PRs. Let me know ;-)

@mraspaud
Copy link
Member

It will definitely help, yes! Live discussions, but also dedicated time will allow you to work on this efficiently.

@ghiggi
Copy link
Contributor Author

ghiggi commented Sep 19, 2022

In such a case, would you agree to have a meeting already in mid of October to briefly discuss again the structure of the PRs? In this way, I could already do some preparatory PRs to be efficient as possible during the week in Athens.

@mraspaud
Copy link
Member

Works for me!

@ghiggi
Copy link
Contributor Author

ghiggi commented Sep 26, 2022

Hey @mraspaud. Do you have time in the week between the 17 and 21 October? Do @djhoese want to participate too? Have a nice week :)

@djhoese
Copy link
Member

djhoese commented Sep 26, 2022

I'd be OK joining any conversations, but given the time zone differences don't let my schedule stop you from meeting.

Edit: It may be best to have this conversation on slack.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

4 participants