Add geo-traits crate #1157

Merged
merged 40 commits into from
Oct 26, 2024

Conversation

kylebarron
Member

@kylebarron kylebarron commented Feb 26, 2024

  • I agree to follow the project's code of conduct.
  • I added an entry to CHANGES.md if knowledge of this change could be valuable to users.

Updated version of #1019 that's clean against the latest main. Aside from having no extra diff from main, it also has:

geo-traits/Cargo.toml (outdated, resolved)
Member

@frewsxcv frewsxcv left a comment

Thanks for driving this initiative @kylebarron! I'm excited to use these traits in my own projects.

This pull request looks mergeable to me! With the understanding it is not yet ready to be published. I want to hear from @michaelkirk and @urschrei before merging though.

@kylebarron
Member Author

kylebarron commented Feb 27, 2024

In the medium term (in 2024) I would love to add 3d support to geoarrow, so I want to keep thinking about xyz geometry traits, but I'm pretty happy with these for 2d geometries. They've worked well so far in geoarrow.

@michaelkirk
Member

Thanks for keeping this up to date with main, @kylebarron!

I said in #1011 (and still believe) that having some examples of algorithms that use these trait definitions would be key to evaluating them.

Are there examples anywhere that utilize these traits? @frewsxcv - have you tried, or would you be willing to see how this code works in your project?

@frewsxcv
Member

Thanks for keeping this up to date with main, @kylebarron!

I said in #1011 (and still believe) that having some examples of algorithms that use these trait definitions would be key to evaluating them.

Are there examples anywhere that utilize these traits? @frewsxcv - have you tried, or would you be willing to see how this code works in your project?

Are the usages in geoarrow-rs sufficient? Or are you looking for something else? Also would love to hear why you find this a blocker to merging these in to the repository.

@frewsxcv
Member

In rgis I would like to be able to read geospatial data and perform operations on the underlying data without having to convert through intermediary representations like geo-types. geo-traits with geozero would make this a lot simpler.

@kylebarron
Member Author

kylebarron commented Feb 27, 2024

Examples of how they're used today in geoarrow-rs:

@JosiahParry has also implemented these traits in serde_esri, and I believe he's tested against geoarrow.

@michaelkirk
Member

Thanks for the examples @kylebarron! I'll take a look.

And relevant to @frewsxcv's question:

Also would love to hear why you find this a blocker to merging these in to the repository.

I continue to think it's reasonable that code in main is intended to be useful, and that seeing code actually used in pursuit of the problem it purports to solve seems like a reasonable bar. I look forward to looking at @kylebarron's examples to that end.

geo-traits with geozero would make this a lot simpler.

I understand that's the theoretical goal, and a worthy one. I'm asking to see if this actually works towards that goal. Care to give it a shot? Is there something that keeps you from building against this branch, for example?

@michaelkirk
Member

@JosiahParry has also implemented these traits in serde_esri, and I believe he's tested against geoarrow.

@JosiahParry - can you elaborate on what these traits enable you to do? Presumably it's useful to you or you wouldn't have done it 😉

@frewsxcv
Member

Nope! There is nothing blocking me from building on top of a branch.

The point of merging the code into main is to make it easier to iterate on, as I find working with remote git branches to be unnecessarily tedious and hard to reason about when I'm using other geo dependencies in my projects. Another point is to prevent staleness that happened often with the previous separate branch approach, which is a concern I had from the start.

@frewsxcv
Member

@michaelkirk Correct me if this is wrong, but it sounds like you are skeptical of the usefulness of these traits without actually saying that. Am I mind reading correctly? I feel like there's an unusual and surprising amount of resistance to merging this in.

@michaelkirk
Member

Correct me if this is wrong, but it sounds like you are skeptical of the usefulness of these traits without actually saying that. Am I mind reading correctly? I feel like there's an unusual and surprising amount of resistance to merging this in.

I feel a bit attacked! But it's true - I am skeptical of the traits code. Truly though, I am skeptical of all code. I am admittedly a bit extra skeptical of the geo-traits code in that it's not just a single isolated algorithm, it's a cross-cutting concern. I feel like I've said similar things before, but maybe not clearly enough. I appreciate @kylebarron's patience with me.

An implementation of BoundingRect in terms of traits.

This implementation looks good to me! You're using the trait accessors to implement a useful algorithm and it doesn't seem overly ceremonious compared to the implementation on geo-types.

FrechetDistanceLineString / Area

Ugh, these ones on the other hand, seem a bit unfortunate. Is there a way to make these look more like the BoundingRect one? Or is it reasonable to assume that most algorithms implemented on geo-traits will need to have a separate trait definition (DistanceLineString, DistancePolygon, etc.) for each geometry type?

I haven't yet grokked why these implementations can't look more like the BoundingRect one - something about having a dedicated struct for the algorithm (i.e. struct BoundingRect) maybe? Can we do this with our other algorithms?

Also, am I reading it correctly that the "trait based" implementation for FrechetDistance is simply converting to a geo-types and then calling FrechetDistance on that? Like we're not actually implementing the algorithm using the trait here in any meaningful way - or am I misunderstanding? Is there something preventing a geo-types free implementation?


All the conversion APIs seem reasonable at first blush, but can you connect the dots for me a little bit?

let line_string_1_wkb: &[u8] = "..."
let line_string_bbox = ???

let line_string_2_wkb: &[u8] = "..."
let frechet_distance = ???

@kylebarron
Member Author

Or is it reasonable to assume that most algorithms implemented on geo-traits will need to have a separate trait definition (DistanceLineString, DistancePolygon, etc.) for each geometry type?

All algorithm traits from geo need a separate implementation in geoarrow because geo's traits always return scalar objects while geoarrow's traits need to return arrays or chunked arrays.

I would love to have only one algorithm trait, e.g. just Distance but as mentioned in #1113 I don't think it's possible to implement blanket traits for multiple traits without specialization. I think impl<G: LineStringTrait> Distance for G {} and impl<G: PolygonTrait> Distance for G {} will always fail without specialization? So my first thought was to separate the traits to be able to implement impl<G: PolygonTrait> DistancePolygon for G {}. But I'd love for there to be a better way 🙂.
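
To make the coherence problem concrete, here is a minimal sketch. The geometry traits are simplified stand-ins, not the real geo-traits definitions, and `LengthLineString`, `PerimeterPolygon`, `MyLine`, and `MySquare` are hypothetical names chosen for illustration. A length computation stands in for `Distance`. The commented-out part shows the pattern that the compiler rejects; the rest shows the per-geometry workaround.

```rust
// Simplified stand-ins for illustration; not the real geo-traits definitions.
trait LineStringTrait {
    fn coords(&self) -> Vec<(f64, f64)>;
}

trait PolygonTrait {
    /// Closed exterior ring (first coordinate repeated at the end).
    fn exterior(&self) -> Vec<(f64, f64)>;
}

// One algorithm trait with two blanket impls does not compile, because a
// single type could implement both geometry traits (error E0119):
//
//     trait Length { fn length(&self) -> f64; }
//     impl<G: LineStringTrait> Length for G { /* ... */ }
//     impl<G: PolygonTrait> Length for G { /* ... */ } // conflicting impl
//
// Workaround: one algorithm trait per geometry trait.
trait LengthLineString {
    fn length(&self) -> f64;
}

trait PerimeterPolygon {
    fn perimeter(&self) -> f64;
}

/// Sum of Euclidean segment lengths along a coordinate sequence.
fn path_length(coords: &[(f64, f64)]) -> f64 {
    coords
        .windows(2)
        .map(|w| {
            let (dx, dy) = (w[1].0 - w[0].0, w[1].1 - w[0].1);
            (dx * dx + dy * dy).sqrt()
        })
        .sum()
}

impl<G: LineStringTrait> LengthLineString for G {
    fn length(&self) -> f64 {
        path_length(&self.coords())
    }
}

impl<G: PolygonTrait> PerimeterPolygon for G {
    fn perimeter(&self) -> f64 {
        path_length(&self.exterior())
    }
}

// Hypothetical concrete types that opt in to one geometry trait each.
struct MyLine(Vec<(f64, f64)>);

struct MySquare {
    side: f64,
}

impl LineStringTrait for MyLine {
    fn coords(&self) -> Vec<(f64, f64)> {
        self.0.clone()
    }
}

impl PolygonTrait for MySquare {
    fn exterior(&self) -> Vec<(f64, f64)> {
        let s = self.side;
        vec![(0.0, 0.0), (s, 0.0), (s, s), (0.0, s), (0.0, 0.0)]
    }
}
```

With these definitions, `MyLine(...).length()` and `MySquare { side: 2.0 }.perimeter()` both resolve through the blanket impls, at the cost of one algorithm trait name per geometry kind.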

Also, am I reading it correctly that the "trait based" implementation for FrechetDistance is simply converting to a geo-types and then calling FrechetDistance on that? Like we're not actually implementing the algorithm using the trait here in any meaningful way - or am I misunderstanding? Is there something preventing a geo-types free implementation?

In BoundingRect I'm reimplementing the core algorithm in a native way on the traits. I do this for BoundingRect because getting bboxes is common enough that removing overhead is good and it's quite easy to implement. In FrechetDistanceLineString I don't want to touch the core algorithm implementation. In general, geoarrow's role is to provide the glue between high-level APIs in JS and Python to core algorithms implemented in other crates, such as geo.

The copy to a geo object is short-term glue that allows the trait to be called either on geoarrow scalars (e.g. a positional reference in an array) or geo scalar objects. In the long run, the underlying FrechetDistance trait in geo could be implemented in terms of LineStringTrait and the geoarrow glue wouldn't need to copy the input into geo::LineString.
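
As a rough illustration of what "reimplementing the core algorithm natively on the traits" can look like, here is a sketch of a bounding-box accumulator written only against trait accessors. The trait is a simplified, hypothetical stand-in (the real geo-traits accessors differ), and `FlatLineString` is an invented type that reads coordinates straight out of an interleaved buffer without copying into an intermediate geometry.

```rust
// Hypothetical, simplified accessor trait; not the real geo-traits definition.
trait LineStringTrait {
    fn num_coords(&self) -> usize;
    fn coord(&self, i: usize) -> Option<(f64, f64)>;
}

/// Accumulates the extent of every geometry added to it.
struct BoundingRect {
    minx: f64,
    miny: f64,
    maxx: f64,
    maxy: f64,
}

impl BoundingRect {
    fn new() -> Self {
        BoundingRect {
            minx: f64::INFINITY,
            miny: f64::INFINITY,
            maxx: f64::NEG_INFINITY,
            maxy: f64::NEG_INFINITY,
        }
    }

    /// Works for any type exposing the accessors, with no conversion step.
    fn add_line_string(&mut self, line_string: &impl LineStringTrait) {
        for i in 0..line_string.num_coords() {
            if let Some((x, y)) = line_string.coord(i) {
                self.minx = self.minx.min(x);
                self.miny = self.miny.min(y);
                self.maxx = self.maxx.max(x);
                self.maxy = self.maxy.max(y);
            }
        }
    }
}

/// Invented example type: a view onto an interleaved [x0, y0, x1, y1, ...] buffer.
struct FlatLineString<'a>(&'a [f64]);

impl LineStringTrait for FlatLineString<'_> {
    fn num_coords(&self) -> usize {
        self.0.len() / 2
    }

    fn coord(&self, i: usize) -> Option<(f64, f64)> {
        let x = *self.0.get(2 * i)?;
        let y = *self.0.get(2 * i + 1)?;
        Some((x, y))
    }
}

/// A plain owned representation implements the same trait.
impl LineStringTrait for Vec<(f64, f64)> {
    fn num_coords(&self) -> usize {
        self.len()
    }

    fn coord(&self, i: usize) -> Option<(f64, f64)> {
        self.get(i).copied()
    }
}
```

The same `add_line_string` call accepts both the buffer view and the owned `Vec`, which is the property the thread is after for algorithms that are cheap to express directly on accessors.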


All the conversion APIs seem reasonable at first blush, but can you connect the dots for me a little bit?

let line_string_1_wkb: &[u8] = "..."
let line_string_bbox = ???

let line_string_2_wkb: &[u8] = "..."
let frechet_distance = ???

I added a test for the first one in this example PR: geoarrow/geoarrow-rs#542

// A builder for columnar WKB arrays
let mut wkb_builder = WKBBuilder::<i32>::new();
// Add a geo polygon to the WKB array
// This uses geo-traits to serialize to WKB and adds the binary to the array
wkb_builder.push_polygon(Some(&p0()));

// Finish the builder, creating an array of logical length 1.
let wkb_arr = wkb_builder.finish();

// Access the WKB scalar at position 0
// This is a reference onto the array. At this point the WKB is just a "blob" with no other
// information.
let wkb_scalar = wkb_arr.value(0);

// This is a "parsed" WKB object. The [WKBGeometry] type is an enum over each geometry
// type. WKBGeometry itself implements GeometryTrait but we need to unpack this to a
// WKBPolygon to access the object that has the PolygonTrait impl
let wkb_object = wkb_scalar.to_wkb_object();

// This is a WKBPolygon. It's already been scanned to know where each ring starts and ends,
// so it's O(1) from this point to access any single coordinate.
let wkb_polygon = match wkb_object {
    WKBGeometry::Polygon(wkb_polygon) => wkb_polygon,
    _ => unreachable!(),
};

// Add this wkb object directly into the BoundingRect
let mut bounding_rect = BoundingRect::new();
bounding_rect.add_polygon(&wkb_polygon);

assert_eq!(bounding_rect.minx, -111.);
assert_eq!(bounding_rect.miny, 41.);
assert_eq!(bounding_rect.maxx, -104.);
assert_eq!(bounding_rect.maxy, 45.);

Run with

cargo test --lib -- test2::test_wkb_to_bbox --nocapture

@kylebarron
Member Author

I haven't yet grokked why these implementations can't look more like the BoundingRect one - something about having a dedicated struct for the algorithm (i.e. struct BoundingRect) maybe? Can we do this with our other algorithms?

I think part of why these implementations look the way they do is because I want to have compile-time safety across geometry types. E.g. I don't want to have fn frechet_distance(array: GeometryArray) -> Float64Array because frechet distance is not implemented for every geometry type. In this way, I think I've adopted the same use of traits as geo itself has done. The trait impl describes what operations are allowed on different combinations of objects. But this idea, that the trait describes what is allowed, ends up significantly more complex when you want to handle broadcasting across scalars, arrays, and chunked arrays.

(The chunked array abstraction, where you might have 10 million geometries in 100 separate chunks where each of those chunks is a contiguous buffer holding 100,000 geometries, is quite useful and is currently my unit of "automatic parallelism". I.e. a rayon par_map is called over each chunk.)
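
The per-chunk parallelism described in the parenthetical can be sketched with the standard library alone. This is not geoarrow-rs code: `ChunkedArray` and `map_chunks` are hypothetical, the values are plain `f64` rather than geometries, and `std::thread::scope` is swapped in for the rayon parallel map mentioned above so the example has no dependencies.

```rust
use std::thread;

/// Hypothetical chunked array: each inner Vec is one contiguous chunk.
struct ChunkedArray {
    chunks: Vec<Vec<f64>>,
}

impl ChunkedArray {
    /// Applies `f` to every value, using one scoped thread per chunk.
    /// Chunk order and element order are preserved in the result.
    fn map_chunks<F>(&self, f: F) -> ChunkedArray
    where
        F: Fn(f64) -> f64 + Sync,
    {
        let f = &f;
        let chunks: Vec<Vec<f64>> = thread::scope(|scope| {
            // Spawn all workers first so the chunks are processed concurrently.
            let handles: Vec<_> = self
                .chunks
                .iter()
                .map(|chunk| {
                    scope.spawn(move || chunk.iter().map(|&v| f(v)).collect::<Vec<f64>>())
                })
                .collect();

            // Then join them in order to rebuild the chunked layout.
            handles
                .into_iter()
                .map(|handle| handle.join().expect("worker thread panicked"))
                .collect()
        });
        ChunkedArray { chunks }
    }
}
```

One thread per chunk is only sensible when chunks are large and few; a work-stealing pool such as rayon's avoids that constraint, which is presumably why it is used in practice.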

But the largest area of complexity in geoarrow-rs is handling these combinations. I think it's useful for UX to be able to call an operation like Distance against either a geoarrow scalar or a geo scalar, and thus it's necessary either to copy my impls twice, which is a lot of duplicated code, or implement them against a shared blanket trait.

@JosiahParry
Contributor

JosiahParry commented Feb 29, 2024

I would quite like this! I think it would be really neat if the geo-type traits also had a new() method that would allow you to write algorithms entirely with trait methods.

Take this super derived example (I adjusted @kylebarron's geo-traits/src/line_string.rs file): https://gist.github.com/JosiahParry/f3469f8df4706a4d36f8a47db9ae3e09

trait StartEnd where Self: LineStringTrait {
    fn start_end(&self) -> Self::Owned {
        let start = self.coord(0).unwrap();
        let end = self.coord(self.num_coords() - 1).unwrap();
        Self::new(vec![start, end])
    }
}

impl<T: CoordNum> StartEnd for LineString<T> {}

#[test]
fn test_start_end() {
    let coords = vec![
        Coord { x: 1.0, y: 2.0 },
        Coord { x: 3.0, y: 4.0 },
        Coord { x: 5.0, y: 6.0 },
    ];

    let line_string = LineString::new(coords);
    let start_end = line_string.start_end();

    println!("{:?}", start_end);
    assert_eq!(start_end.num_coords(), 2);
}

running 1 test
LineString([Coord { x: 1.0, y: 2.0 }, Coord { x: 5.0, y: 6.0 }])
test line_string::tests::test_start_end ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 2 filtered out; finished in 0.00s

@kylebarron
Member Author

kylebarron commented Feb 29, 2024

I think that trait definition has the same downsides as mentioned elsewhere, where you'll need a different trait definition for each geometry type. (You'd need a StartEndPolygon that's implemented on PolygonTrait).

I also think that it's just extra unnecessary complexity, because I argue that the implementation of the algorithm should choose the most appropriate return type, as long as the geometry trait is also implemented on that return type. That means, for any geo-implemented algorithms, the internal, temporary data objects will always be geo objects, and so the lowest-cost return object will always be a geo object. Instead of always incurring the cost to convert back to the original trait representation, that conversion should be deferred to the caller, for when it's optimal for them. For many libraries, that means calling convert_to_my_repr(line_string.start_end()), where convert_to_my_repr is defined as

fn convert_to_my_repr(geom: &impl LineStringTrait) -> MyLineStringStruct { todo!() }

But for other libraries, especially in the context of libraries like geoarrow where you want to optimize the vectorized execution and not the scalar execution, it's much better to defer the data transformation until you have an array of geometries.

@busstoptaktik

busstoptaktik commented Mar 19, 2024

I need something like this for Geodesy, but obviously only need the CoordTrait there, so my comments may be a bit narrow-sighted.

First, given your comments, @kylebarron, about wanting to extend towards 3D, I find the name of the x_y method unfortunate, and would prefer xy.

Both because for my TeX-eyes, x_y looks like x with subscript y, and because with 3D, 4D and/or Measure extensions, it will be too much of an eyecatcher to spot x_y_z_t_m (and its truncated siblings) in my source code. xy, xyz, xyzt, etc. would suffice.

Second, I find it reasonable to extract the CoordTrait in a separate crate, since coordinates and features (in the OGC Simple Features sense) are two very different things. Then the geo-traits crate could start by implementing CoordTrait for the Point feature, and build the world from there.

Also, if I understand correctly, most of the reservations expressed above would not apply to CoordTrait in itself, so it could probably be merged separately. Which (full disclosure) would be extremely useful for me, especially if somewhat extended.

Third, as a general remark on the Georust naming conventions: x and y are rather unfortunate names for something that will most likely be longitudes or eastings in the former case, and latitudes or northings in the latter.

Geodetically speaking, coordinates are ordered, and can be referred to as 1st, 2nd, 3rd and 4th, and as you never know what is the CRS specific convention regarding (N,E) vs (E,N), any kind of implied (mis)understanding is unfortunate. I suggest to let the trait implementer provide x(), y(), z(), t(), and let the trait autoprovide first(), second(), third(), fourth() (or preferably the other way round), and then just postulate that the naming is an internal convention, not to be confused with any external conventions.

Evidently, trying to extend this to the entire georust ecosystem would be neither valuable nor reasonable - wherever the culture descends from an "earth-is-flat-as-my-monitor" computer graphics world, x and y are obviously the more useful convention. But in a CoordTrait it would make much sense to support a more geodetic world view - it wouldn't harm, and it would not make implementation harder or more verbose: It's just a few autoprovided aliases in the trait definition.

EDIT 2: On second thoughts, I actually think the Third item can be ignored: It doesn't matter how the "this is the internal convention" is communicated, and first()...fourth() is probably an unnecessary complication that impedes communication.

Fourth, I have implemented an extended version of CoordTrait over at Geodesy, confined to the Coordinate module (with impls here), and the Geodesics functionality of the Ellipsoid implementation.

As you can see, I took the liberty to extend to 4D+M, but probably not in a very elegant way. Also, I need the xy_as_f64 etc. functionality (cf. geodesic_fwd and geodesic_inv), but have probably implemented it in an idiotic, rather than idiomatic, way, so any comments and/or improvements would be much appreciated.

I would find it awesome to be able to implement a tighter integration between Geodesy and the Georust universe. And a somewhat geodesy-aware CoordTrait would be a straightforward and sensible way of doing it.

I know, however, that I am stomping on your grass, but please consider my suggestions - and I would be very happy for any comments regarding how I am wrong, and how I could work further towards a Georust<->Geodesy bridge.

@kylebarron
Member Author

I'm interested in multi-dimensional geometry support, and think that handling it in traits could be a good way to incrementally add support. But I think it would probably be ideal if the dimension could be a generic, so that you'd statically know what type of coordinate you have, and which type of operations could be done on it.

I'm happy to switch x_y to xy.

Then the geo-traits crate could start by implementing CoordTrait for the Point feature, and build the world from there.

geo has had some discussion about whether to merge Point and Coord, and so it's not 100% clear that we do want both PointTrait and CoordTrait (in this PR I implemented PointTrait on Coord and CoordTrait on Point for simplicity). In any case CoordTrait/PointTrait is intricately linked to the rest of the traits here. A LineStringTrait will yield objects that implement CoordTrait, a PolygonTrait will yield objects that implement LineStringTrait and so on. So it's not clear to me that stabilizing only CoordTrait is the best, if we have to modify that later to support geometries.
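
The linkage described here (a line string yields coords, a polygon yields line strings) can be sketched with associated types. The traits below are deliberately simplified and hypothetical, not the definitions in this PR; `RingsPolygon` and `exterior_x_sum` are invented for the example.

```rust
// Hypothetical, simplified traits showing how the geometry traits nest.
trait CoordTrait {
    fn x(&self) -> f64;
    fn y(&self) -> f64;
}

trait LineStringTrait {
    /// Whatever a line string yields must itself implement CoordTrait.
    type CoordType: CoordTrait;
    fn num_coords(&self) -> usize;
    fn coord(&self, i: usize) -> Option<Self::CoordType>;
}

trait PolygonTrait {
    /// Whatever a polygon yields must itself implement LineStringTrait.
    type RingType: LineStringTrait;
    fn exterior(&self) -> Option<Self::RingType>;
}

impl CoordTrait for (f64, f64) {
    fn x(&self) -> f64 {
        self.0
    }
    fn y(&self) -> f64 {
        self.1
    }
}

impl<'a> LineStringTrait for &'a [(f64, f64)] {
    type CoordType = (f64, f64);

    fn num_coords(&self) -> usize {
        self.len()
    }

    fn coord(&self, i: usize) -> Option<(f64, f64)> {
        self.get(i).copied()
    }
}

/// Invented polygon type: ring 0 is the exterior.
struct RingsPolygon {
    rings: Vec<Vec<(f64, f64)>>,
}

impl<'a> PolygonTrait for &'a RingsPolygon {
    type RingType = &'a [(f64, f64)];

    fn exterior(&self) -> Option<&'a [(f64, f64)]> {
        let polygon: &'a RingsPolygon = *self;
        polygon.rings.first().map(Vec::as_slice)
    }
}

/// An algorithm written against PolygonTrait reaches CoordTrait through
/// the chain of associated types, so a change to CoordTrait ripples upward.
fn exterior_x_sum<P: PolygonTrait>(polygon: P) -> f64 {
    match polygon.exterior() {
        Some(ring) => (0..ring.num_coords())
            .filter_map(|i| ring.coord(i))
            .map(|c| c.x())
            .sum(),
        None => 0.0,
    }
}
```

Because `exterior_x_sum` only compiles if every level of the chain is in place, stabilizing the innermost trait on its own would constrain all the outer ones.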

@busstoptaktik

But I think it would probably be ideal if the dimension could be a generic, so that you'd statically know what type of coordinate you have, and which type of operations could be done on it.

That would require a good deal more genericity than just the dimension.

In PROJ, we check the pipelines upon construction, such that the output type of step n fits with the input type of step n+1. But those are static features of each operator, not of the coordinate (the coordinate types are all just untagged unions), so the coordinate is supposed to fit the pipeline it's fed into, while the pipeline is checked for internal consistency. And the internal consistency is still just at a very general level (any kind of projected coordinate will be accepted between a step producing kind A and a step expecting kind B).

That's why I entirely dropped the "different coordinate types" in Rust Geodesy, and only support different dimensionalities: It just doesn't make sense - coordinates are interpreted by operators which expect a specific input type, and the gamut of possible types is infinite.

No reason to try to make the type system reject me feeding a UTM zone 32 coordinate into a pipeline expecting some other type of projected coordinate (or even geographical or geocentric cartesians): The output becomes garbage, but it's consistent garbage :-) and there are just too many kinds of parametrizations, each of which would lead to principally different gamuts for valid operations.

So it's a tough job!

@busstoptaktik

I'm interested in multi-dimensional geometry support, and think that handling it in traits could be a good way to incrementally add support. But I think it would probably be ideal if the dimension could be a generic

@kylebarron My current, much improved version takes off from a much less modified version of your original CoordTrait, but with support for up to 4 dimensions and an M(easure). Is something like this ↓ what you intend? That would fit very well with my plans.

pub trait CoordTrait {
    type T: CoordNum;
    const DIMENSION: usize;
    const MEASURE: bool;

    /// Accessors for the coordinate tuple components
    fn x(&self) -> Self::T;
    fn y(&self) -> Self::T;
    fn z(&self) -> Self::T;
    fn t(&self) -> Self::T;
    fn m(&self) -> Self::T;

    /// Returns a tuple that contains the two first components of the coord.
    fn x_y(&self) -> (Self::T, Self::T) {
        (self.x(), self.y())
    }
}

@kylebarron
Member Author

I had been thinking more along the lines of the proposal in this PR, where the struct is generic over a CoordNum that can be NoValue.

In particular, when DIMENSION is 2, what should z and t return? T::default()? I was thinking it would be lovely if the CoordTrait could somehow be parameterized so that z and t didn't exist when DIMENSION was 2.

Or maybe the better way of looking at this is that z() would return NoValue. That I suppose would be the best way of linking your proposal and that PR.

@busstoptaktik

when DIMENSION is 2, what should z and t return?

Essentially, I believe it is up to the person implementing the trait for a concrete data type to decide: It's your trait, but their data!

In Geodesy, I use NaN for t and 0 for z, because that fits well with geodetical reasoning: If you have no height information, your coordinate is probably assumed to be placed directly on the ellipsoid, whereas if you have no time coordinate, but accidentally call a time dependent operation on the coordinate, you actually want the NaN value to spill out all over your coordinate, to indicate its invalidity ("stomping on it with the x-large NaN boots").

In a sense, NaN is the IEEE 754 equivalent of None, and we want to be reminded when doing something stupid.

@kylebarron
Member Author

when DIMENSION is 2, what should z and t return?

Essentially, I believe it is up to the person implementing the trait for a concrete data type to decide: It's your trait, but their data!

In general, I disagree because I think it's important to have a well-specified data contract for these traits. If geo or geodesy or any other crate implements algorithms based on these traits, they need to know what they'll receive.

In Geodesy, I use NaN for t and 0 for z, because that fits well with geodetical reasoning

In this case I think it would be better to have z, t, and m return Option<Self::T> and then in geodesy you could use coord.t().unwrap_or(T::NaN) and coord.z().unwrap_or(0). That way the default data values are informed by the consumer and not the producer.
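
A small sketch of this consumer-chooses-the-default idea follows. It is not the merged geo-traits API: the trait is reduced to `f64`, and `Xy`, `Xyz`, and `geodetic_defaults` are hypothetical names. The optional dimensions default to `None` in the trait, and the geodesy-style consumer decides what a missing value means.

```rust
// Hypothetical sketch of the Option-based proposal, fixed to f64 for brevity.
trait CoordTrait {
    fn x(&self) -> f64;
    fn y(&self) -> f64;

    // Optional dimensions: producers override only what they actually store.
    fn z(&self) -> Option<f64> {
        None
    }
    fn t(&self) -> Option<f64> {
        None
    }
    fn m(&self) -> Option<f64> {
        None
    }
}

struct Xy(f64, f64);
struct Xyz(f64, f64, f64);

impl CoordTrait for Xy {
    fn x(&self) -> f64 {
        self.0
    }
    fn y(&self) -> f64 {
        self.1
    }
}

impl CoordTrait for Xyz {
    fn x(&self) -> f64 {
        self.0
    }
    fn y(&self) -> f64 {
        self.1
    }
    fn z(&self) -> Option<f64> {
        Some(self.2)
    }
}

/// Consumer-side defaults in the spirit of the geodesy convention above:
/// a missing height means "on the ellipsoid" (0), a missing time is NaN.
fn geodetic_defaults(coord: &impl CoordTrait) -> (f64, f64, f64, f64) {
    (
        coord.x(),
        coord.y(),
        coord.z().unwrap_or(0.0),
        coord.t().unwrap_or(f64::NAN),
    )
}
```

A different consumer could treat `None` as an error instead, without any change to the producers.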

@frewsxcv
Member

I appreciate the detailed discussion and the unique perspectives each of you brings to the table—it’s clear that everyone’s expertise has been invaluable in shaping this proposal. I understand that it’s frustrating not getting everything perfect the first time, but I’m concerned we might be overthinking this for a 0.1 release. The current implementation is strong and would provide a lot of value, even if it doesn’t cover every use case perfectly. I suggest we proceed with merging now and open a follow-up ticket to continue discussing and refining this further.

What do you all think? I’m happy to assist with any follow-up work after the initial release.

@kylebarron
Member Author

I'm ok with merging but I'd also be happy to update with this latest proposal now while we're talking about it. I'd like to get to it later this afternoon

@michaelkirk
Member

Thanks for trying to move things forward @frewsxcv. I appreciate that the perfect can be the enemy of the good and that we can iterate. A long time has passed since this PR opened, but really not that much time has passed since the major design decision to let points be empty was proposed, so I think we're making progress, not just moving laterally.

I'd be interested (excited even?) to implement some geo algorithms in terms of the proposed CoordTrait because it is semantically a drop in replacement for things we already do, and plus it nicely solves some of the "should this be a geo::Point or geo::Coord" problem we currently have.

On the other hand, if we were to merge geo-traits "as is", I'd probably just not use it until it did. Almost all algorithms using PointTrait would now need to inspect their input for PointTrait::is_empty and potentially fail. It seems like it'd be a step backwards for our users if the type system can no longer automatically delineate that failure case.

Maybe I have my blinders on, but I think that we're really close to having something useful for geo here.

@kylebarron
Member Author

Did we decide whether MultiPoint should yield Point or Coord?

@michaelkirk
Member

michaelkirk commented Oct 24, 2024

Litmus test: MULTIPOINT(POINT EMPTY)

Since that's valid, I think that implies MULTIPOINT is a collection of POINT.

edit: Wait, is that valid? I don't actually know. Let me look at the spec...

@kylebarron
Member Author

Updated georust/wkt#123 for commit 4764343 (#1157)

@michaelkirk
Member

Litmus test: MULTIPOINT(POINT EMPTY)

Since that's valid, I think that implies MULTIPOINT is a collection of POINT.

I think my litmus test was valid, but I didn't actually read the test results. 😆

From what I can tell a MULTIPOINT cannot contain an empty point?

@kylebarron
Member Author

I'm working on updating geoarrow-rs to this latest commit in geoarrow/geoarrow-rs#839. So far I really like the clarity of knowing a CoordTrait won't be empty, though it'll take a little bit of time to refactor each of the readers and writers to support the latest changes

@michaelkirk
Member

regarding: MultiPoint(Vec<Coord>) vs. MultiPoint(Vec<Point>)

FWIW shapely explodes when you try to put an empty point into a MultiPoint:

>>> from shapely.wkt import loads
>>> p1 = loads("POINT(1 2)")
>>> p2 = loads("POINT EMPTY")
>>> mp = shapely.MultiPoint([p1, p2])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mkirk/.pyenv/versions/3.11.8/lib/python3.11/site-packages/shapely/geometry/multipoint.py", line 56, in __new__
    raise EmptyPartError("Can't create MultiPoint with empty component")
shapely.errors.EmptyPartError: Can't create MultiPoint with empty component

So that's just one (albeit important) implementation, but if it's true in general that a MultiPoint cannot contain a POINT EMPTY then it seems like encoding that into the type system with MultiPoint(Vec<CoordinateTrait>) would be the way to go.
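
To illustrate what "encoding that into the type system" could mean, here is a sketch with concrete structs rather than traits. All names are hypothetical (`EmptyPartError` merely echoes the shapely error above). A `Point` may be empty, but a `MultiPoint` stores coords, so an empty member cannot be represented and the failure moves to the conversion boundary.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Coord {
    x: f64,
    y: f64,
}

/// A point may be empty (POINT EMPTY), modeled here as None.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point(Option<Coord>);

/// A multipoint holds coords, so an empty member is unrepresentable.
/// An empty Vec still models MULTIPOINT EMPTY.
#[derive(Debug, Clone, PartialEq)]
struct MultiPoint(Vec<Coord>);

#[derive(Debug, PartialEq)]
struct EmptyPartError;

impl MultiPoint {
    /// Fails on the first empty point instead of storing it.
    fn try_from_points(points: &[Point]) -> Result<Self, EmptyPartError> {
        points
            .iter()
            .map(|point| point.0.ok_or(EmptyPartError))
            .collect::<Result<Vec<_>, _>>()
            .map(MultiPoint)
    }
}
```

Algorithms that take a `MultiPoint` built this way never need an emptiness check per member; only code that constructs one from points has to handle the error.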

I don't really have time to dig into the implications of this further, but I am OK with either approach.

Unlike the "should we have a CoordTrait" discussion, I think there is less at stake with this implementation detail of MultiPoint.

As an example:

whereas POINT(1, 2) and POINT EMPTY behave quite differently...

I think that:

MULTIPOINT (1 2, POINT EMPTY) vs. MULTIPOINT (1 2)
MULTIPOINT (POINT EMPTY) vs. MULTIPOINT EMPTY

behave not entirely identically, but pretty similarly.

@kylebarron
Member Author

Unlike the "should we have a CoordTrait" discussion, I think there is less at stake with this implementation detail of MultiPoint.

I agree, it feels like there's a lot less at stake, and feels a lot easier to change down the road.

@JosiahParry
Contributor

JosiahParry commented Oct 24, 2024

FWIW the same limitation exists in the R ecosystem’s {sf} package.

pnt <- st_point()
st_multipoint(list(pnt))

this returns an error complaining that the point is not numeric

@kylebarron
Member Author

Should we update this MultiPoint to yield CoordTrait instead of PointTrait?

@michaelkirk
Member

Should we update this MultiPoint to yield CoordTrait instead of PointTrait?

Dealer's choice - maybe someone else has an opinion.

/// Access the n'th (0-based) element of the CoordinateTuple.
/// May panic if n >= DIMENSION.
/// See also [`nth()`](Self::nth).
fn nth_unchecked(&self, n: usize) -> Self::T;
Member

@frewsxcv frewsxcv Oct 24, 2024

When I first read this, I thought this would behave similarly to Vec#get_unchecked, which is an unsafe method that does not do a bounds check, hence it being marked as unsafe. In fact, it looks like all functions named *_unchecked in Rust's standard library are marked as unsafe, which to me implies they skip the bounds check.

My first thought was, what if we had a naming convention like this:

fn nth_checked(n) -> Option<T> {}
fn nth(n) -> T {} // Panics
unsafe fn nth_unchecked(n) -> T

But that may psychologically push people to use the panic'ing version.

I don't know what the right answer is, just wanted to call out that I was surprised when implementing CoordTrait that nth_unchecked was not marked with unsafe.

Member

Another option 🤷🏻

fn nth(n) -> Option<T> {}
fn nth_or_panic(n) -> T {} // Panics
unsafe fn nth_unchecked(n) -> T

Member

But we also definitely don't need to have an unsafe method, in fact probably better to leave it out for this initial implementation

Member

@michaelkirk michaelkirk Oct 24, 2024

fn nth(n) -> Option<T> {}
fn nth_or_panic(n) -> T {} // Panics

That's nice. 👏

- coord.nth_unchecked(3);
+ coord.nth(3).unwrap();

We could also just leave out the panic flavor.
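
For reference, a minimal sketch of the safe pair of accessors discussed in this thread. The trait is a hypothetical reduction to `f64` and `Coord3` is an invented type; the unsafe variant is left out, as suggested above.

```rust
// Hypothetical reduction of the naming proposal; not the merged trait.
trait CoordTrait {
    /// Checked access: None when n is out of bounds for this coordinate.
    fn nth(&self, n: usize) -> Option<f64>;

    /// Convenience wrapper that panics when n is out of bounds.
    fn nth_or_panic(&self, n: usize) -> f64 {
        self.nth(n).expect("coordinate index out of bounds")
    }
}

/// Invented three-dimensional coordinate backed by an array.
struct Coord3([f64; 3]);

impl CoordTrait for Coord3 {
    fn nth(&self, n: usize) -> Option<f64> {
        self.0.get(n).copied()
    }
}
```

Since `nth_or_panic` is only `nth(n)` plus an `expect`, dropping it from the trait costs callers very little, which supports leaving the panicking flavor out.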

Member

Although I didn't find it in the stdlib, apparently there's quite a bit of precedent for _or_panic in 3rd party crates
https://github.com/search?q=or_panic+language%3ARust&type=code

Member Author

Open to suggestions here. I suppose it's worthwhile having a variant that is actually unsafe (e.g. doesn't even do the bounds check), for situations where you know you're in bounds (like the iterator)

geo-traits/src/coord.rs (outdated, resolved)
Co-authored-by: Corey Farwell <coreyf@rwell.org>
@michaelkirk
Member

I'm happy with all the big structural pieces. I'm not too worried about the naming, since I feel like it should be straightforward to revisit, and easy to do proper deprecation with at any point.

@frewsxcv
Member

The geojson and rgis integrations went well! I'm excited to use these traits for my projects, and I'm curious to see what the community thinks.

Any other blockers before merging? If not, I can press the button tomorrow!

@frewsxcv
Member

Here we go!

@frewsxcv frewsxcv added this pull request to the merge queue Oct 26, 2024
Merged via the queue into georust:main with commit b1a0142 Oct 26, 2024
18 checks passed
@kylebarron kylebarron deleted the kyle/geo-traits-crate branch October 29, 2024 15:56
10 participants