Skip to content

ST_HausdorffDistance function#790

Open
sapienza88 wants to merge 2 commits into
apache:mainfrom
sapienza88:st_hausdorffdistance
Open

ST_HausdorffDistance function#790
sapienza88 wants to merge 2 commits into
apache:mainfrom
sapienza88:st_hausdorffdistance

Conversation

@sapienza88
Copy link
Copy Markdown

No description provided.

@github-actions github-actions Bot requested a review from zhangfengcdt April 26, 2026 18:08
@zhangfengcdt
Copy link
Copy Markdown
Member

Thanks for the contribution! Could you please clarify the use cases you want to support by adding this st function? Also,

  1. The UDF needs to be registered in rust/sedona-functions/src/register.rs inside register_scalar_udfs — otherwise it's not callable from SQL.
  2. GEOS-backed kernels in this project live in the c/sedona-geos crate (see c/sedona-geos/src/distance.rs) and use GeosExecutor. Could you move the implementation there and drop the new geos dep from rust/sedona-functions/Cargo.toml?

Also, please add the Apache license header to any new files.

@sapienza88
Copy link
Copy Markdown
Author

@zhangfengcdt thanks for the comment, I want to first ask if g1.hausdorff_distance(&g2) would work?, is hausdorff_distance already defined?

@zhangfengcdt
Copy link
Copy Markdown
Member

@zhangfengcdt thanks for the comment, I want to first ask if g1.hausdorff_distance(&g2) would work?, is hausdorff_distance already defined?

Yes, geos already support it - https://raw.githubusercontent.com/georust/geos/master/src/geometry.rs

You might want to check how this one works and follow the same pattern:
c/sedona-geos/src/distance.rs

@sapienza88
Copy link
Copy Markdown
Author

sapienza88 commented Apr 29, 2026

@zhangfengcdt yet I cannot see the defining function, could you point out to the exact logic defining the hausdorff_distance?

@sapienza88
Copy link
Copy Markdown
Author

@zhangfengcdt why the tests are in python while the implementation in rust?

@sapienza88
Copy link
Copy Markdown
Author

@paleolimbot this function should implemented in c/sedona-geos/src or rust/sedona-functions/src?

Copy link
Copy Markdown
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this!

// under the License.
---
title: ST_HausdorffDistance
description: Returns the Hausdorff distance between two geometries. This is a measure of how similar two geometries are.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: Returns the Hausdorff distance between two geometries. This is a measure of how similar two geometries are.
description: Returns the Hausdorff distance between two geometries.

Comment on lines +21 to +29
- returns: double
args:
- geom1: geometry
- geom2: geometry
- returns: double
args:
- geom1: geometry
- geom2: geometry
- density_frac: double
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have funny hand-rolled syntax for these, but this should work:

Suggested change
- returns: double
args:
- geom1: geometry
- geom2: geometry
- returns: double
args:
- geom1: geometry
- geom2: geometry
- density_frac: double
- returns: double
args:
- geometry
- geometry
- returns: double
args:
- geometry
- geometry
- name: density_frac
type: float64
description: Densification to apply before calculation

Comment on lines +26 to +27
(None, None, None, None),
("POINT (0 0)", None, None, None),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add (None, "POINT (0 1)", None, ...), ("POINT (0 1)", "POINT (0 1)", None), and checks for EMTPY behaviour (empty linestring + non empty linestring and reversed)

query = f"SELECT ST_HausdorffDistance({geom_or_null(geom1)}, {geom_or_null(geom2)})"
else:
query = f"SELECT ST_HausdorffDistance({geom_or_null(geom1)}, {geom_or_null(geom2)}, {density_frac})"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you separate the tests for the version with and without density_frac (i.e., def test_st_hasdorff_distance_density_frac()? This will let us separate the "null" from "not specified" case.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be added next to wherever st_distance is tested (I'm not sure it needs its own file)

})?;

executor.finish(Arc::new(builder.finish()))
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add Rust tests (you can use the tests for st_distance as a template). The rust tests check for basic functionality and ensure that scalar and array arguments work everywhere.

@sapienza88 sapienza88 changed the title add st_hausdorffdistance add ST_HausdorffDistance May 21, 2026
@sapienza88 sapienza88 changed the title add ST_HausdorffDistance ST_HausdorffDistance function May 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants