Our test suite for joins is quite poor at the moment. Now that I look at it, it's generally just testing that the join is hooked up and not whether the exact results...
sedona/python/tests/geopandas/test_sjoin.py
Lines 171 to 183 in 1eb5769
```python
def test_sjoin_dwithin_distance(self):
    """Test dwithin predicate with distance parameter"""
    # Test with a distance that should capture nearby points
    joined = sjoin(self.gdf1, self.nearby_points, predicate="dwithin", distance=0.5)
    assert joined is not None
    assert type(joined) is GeoDataFrame
    # Test with a very small distance that should capture fewer points
    joined_small = sjoin(
        self.gdf1, self.nearby_points, predicate="dwithin", distance=0.05
    )
    assert joined_small is not None
    assert type(joined_small) is GeoDataFrame
```
test_match_geopandas_series.py has proved to be super helpful for finding bugs in Sedona, and we should replicate that approach for joins. In this recent PR fix to ST_Distance, there's some suspicion that the distance join is buggy on an edge case. Comparing results with geopandas would be a great way to test this.
Afterwards, we should also improve our tests for the rest of sjoins and for sindexes.
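To make the idea of checking exact results concrete, here is a minimal sketch of the comparison step for a distance join. It is not the proposed test itself: it swaps in a brute-force pure-Python reference for point inputs in place of geopandas so that it runs without dependencies. The helper names `brute_force_dwithin` and `diff_pairs` are hypothetical, and the inclusive boundary (`<=`) is an assumption about how the dwithin predicate should behave.

```python
from math import hypot


def brute_force_dwithin(left, right, distance):
    """Reference join for points given as (x, y) tuples.

    Returns the sorted (left_index, right_index) pairs whose Euclidean
    distance is at most `distance`. The inclusive boundary is an
    assumption, not something confirmed by the issue.
    """
    return sorted(
        (i, j)
        for i, (lx, ly) in enumerate(left)
        for j, (rx, ry) in enumerate(right)
        if hypot(lx - rx, ly - ry) <= distance
    )


def diff_pairs(actual, expected):
    """Compare two collections of index pairs, ignoring row order.

    Returns (missing, unexpected): pairs the reference has but the join
    under test lacks, and pairs the join under test produced in excess.
    """
    actual_set, expected_set = set(actual), set(expected)
    missing = sorted(expected_set - actual_set)
    unexpected = sorted(actual_set - expected_set)
    return missing, unexpected


# Illustrative data only: (0, 0) and (3, 4) are exactly 5.0 apart, so this
# pair sits on the boundary of a distance=5.0 join.
left_points = [(0.0, 0.0), (10.0, 10.0)]
right_points = [(3.0, 4.0), (0.0, 0.1), (10.0, 10.0)]

expected = brute_force_dwithin(left_points, right_points, 5.0)
# Stand-in for a join result that wrongly drops the boundary pair.
actual = [pair for pair in expected if pair != (0, 0)]

missing, unexpected = diff_pairs(actual, expected)
print("missing:", missing, "unexpected:", unexpected)
```

In a real test, `expected` would more likely come from geopandas' `sjoin` with `predicate="dwithin"` and a `distance` argument, and `actual` from the Sedona join. Extracting index pairs from each result depends on the output layout, so that step is left out here. Comparing sets of pairs rather than whole frames keeps the check independent of row ordering.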