Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add BoundarySides class #568

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

ghiggi
Copy link
Contributor

@ghiggi ghiggi commented Dec 8, 2023

  • Closes #xxxx
  • Tests added
  • Tests passed
  • Passes git diff origin/main **/*py | flake8 --diff
  • Fully documented

This PR add the BoundarySides class to facilitate the manipulation of boundary sides.

Copy link

codecov bot commented Dec 8, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (c240c04) 94.09% compared to head (47e2004) 94.12%.
Report is 10 commits behind head on main.

❗ Current head 47e2004 differs from pull request most recent head a2e9fb1. Consider uploading reports for the commit a2e9fb1 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #568      +/-   ##
==========================================
+ Coverage   94.09%   94.12%   +0.03%     
==========================================
  Files          85       87       +2     
  Lines       13235    13312      +77     
==========================================
+ Hits        12453    12530      +77     
  Misses        782      782              
Flag Coverage Δ
unittests 94.12% <100.00%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ghiggi
Copy link
Contributor Author

ghiggi commented Dec 8, 2023

@djhoese I think this is also ready to be merged. I will restart working on the other PR on Monday ;)
Have a nice week-end

@coveralls
Copy link

Coverage Status

coverage: 93.711% (+0.04%) from 93.675%
when pulling 47e2004 on ghiggi:add-boundary-sides-class
into c240c04 on pytroll:main.

pyresample/boundary/boundary_sides.py Outdated Show resolved Hide resolved

def __iter__(self):
"""Return an iterator over the sides."""
return iter(self._sides)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the iter() wrapper? I think you can just return self._sides.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djhoese I think iter is right for the purpose here, we are supposed to return an iterator, with self._sides isn't.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self._sides is a tuple. A tuple is an Iterator isn't it? Ah ok, I guess it is an Iterable, but technically needs to support next(my_iter) to be an Iterator:

https://docs.python.org/3/library/collections.abc.html#collections-abstract-base-classes

Comment on lines +84 to +85
if not isinstance(index, int) or not 0 <= index < 4:
raise IndexError("Index must be an integer from 0 to 3.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think provides much benefit over the error message that self._sides[index] gives by itself, right? You could also do a try/except IndexError if you really wanted to raise this. That way the index checks aren't happening multiple times (inside _sides[] and here).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree here, I think this function should just be return self._sides[index] and let python handle the rest.

Comment on lines +27 to +38
class BoundarySides:
"""A class to represent the sides of an area boundary.

The sides are stored as a tuple of 4 numpy arrays, each representing the
coordinate (geographic or projected) of the vertices of the boundary side.
The sides must be stored in the order (top, right, left, bottom),
which refers to the side position with respect to the coordinate array.
The first row of the coordinate array correspond to the top side, the last row to the bottom side,
the first column to the left side and the last column to the right side.
Please note that the last vertex of each side must be equal to the first vertex of the next side.
"""
__slots__ = ['_sides']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned this in the large PR and I'm curious what @mraspaud thinks, but I think this entire class can be replaced with a namedtuple. A namedtuple would provide all the same functionality except the .vertices property but I think you can subclass the namedtuple to add that if you need. You could also then do your own docstrings as you have them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A dataclass would be an alternative

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that profiling done by some Python core devs have proven that data classes are not faster than any of the alternatives and in some cases are slower. I don't remember those cases and differences.

Co-authored-by: David Hoese <david.hoese@ssec.wisc.edu>
Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, however, this is not used anywhere? Is there a lot of code to change to use this instead of the current sides in the rest of pyresample?

@@ -17,6 +17,7 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>.
"""The Boundary classes."""

from pyresample.boundary.boundary_sides import BoundarySides # noqa
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to import this here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove it.

Comment on lines +27 to +38
class BoundarySides:
"""A class to represent the sides of an area boundary.

The sides are stored as a tuple of 4 numpy arrays, each representing the
coordinate (geographic or projected) of the vertices of the boundary side.
The sides must be stored in the order (top, right, left, bottom),
which refers to the side position with respect to the coordinate array.
The first row of the coordinate array correspond to the top side, the last row to the bottom side,
the first column to the left side and the last column to the right side.
Please note that the last vertex of each side must be equal to the first vertex of the next side.
"""
__slots__ = ['_sides']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A dataclass would be an alternative

Comment on lines +40 to +43
def __init__(self, sides: tuple[ArrayLike, ArrayLike, ArrayLike, ArrayLike]):
"""Initialize the BoundarySides object."""
if len(sides) != 4 or not all(isinstance(side, np.ndarray) and side.ndim == 1 for side in sides):
raise ValueError("Sides must be a list of four numpy arrays.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of controlling the size of the argument, why not pass four arguments instead? this would make it also clear which one is top, bottom, etc...

Copy link
Contributor Author

@ghiggi ghiggi Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I defined like this because in geometry.py we always retrieve directly a list with the 4 elements.
Do you think it make sense to unpack @mraspaud?
Also consider that for some projections (the left and right) sides are just "dummy" and of size 2 (because we have a circle ...) ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is such an obvious solution it makes me angry that I didn't think of it in the giant pull request. I think the fact that left and right are sometimes "dummy" sides doesn't change anything unless there is some optimization by knowing that information, but that doesn't seem to be used here, right? ...I still think a namedtuple serves the same purpose.

I haven't used it yet, but I'm 90% sure that you can specify shape or dimensionality and dtype with the ArrayLike type annotation. That might add some more type-annotation-level information for the user and IDEs to make sure the right thing is passed.


def __iter__(self):
"""Return an iterator over the sides."""
return iter(self._sides)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@djhoese I think iter is right for the purpose here, we are supposed to return an iterator, with self._sides isn't.

Comment on lines +84 to +85
if not isinstance(index, int) or not 0 <= index < 4:
raise IndexError("Index must be an integer from 0 to 3.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree here, I think this function should just be return self._sides[index] and let python handle the rest.

the first column to the left side and the last column to the right side.
Please note that the last vertex of each side must be equal to the first vertex of the next side.
"""
__slots__ = ['_sides']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is __slots__ needed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember @djhoese suggesting me to use __slots__ whatever possible to speed up stuffs ...
Do we want to use a named_tuple instead @mraspaud? Or just a class without __slots__?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought on __slots__ is that low-level immutable classes should use it to reduce memory footprint and speedup execution. My opinion is that it should be used anywhere there isn't a strong reason not to (subclassing gets harder, etc). I still prefer a namedtuple here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants