Problem
=======
The current implementation of Geohash decoding may yield the wrong
result, due to the rounding algorithm being too aggressive.
To my knowledge, there is no official Geohash specification. However, the
Wikipedia page on Geohash is written by the Geohash author, and is thus used
as the specification. The page states the following:
    Final rounding should be done carefully in a way that
        min <= round(value) <= max
This is violated by the current implementation (fun fact: it is also violated by
the reference implementation at http://geohash.org). To illustrate this, a
decoding example is given:
Decode "xkcd" into a latitude value: We begin by translating "xkcd" into its
binary representation:
x: 11101
k: 10010
c: 01011
d: 01100
Full binary representation: 11101100100101101100. To decode the latitude value,
we only use the odd bits (given that the first bit is bit number zero). Removing
all the even bits, we end up with the binary value 1010011010. A bit value of
'1' indicates that we should discard the lower half of the latitude range. A
value of '0' indicates that we should discard the upper half of the range. This
gives us the following decoding steps:
  Start  [ -90.0        90.0        ]
  1      [   0.0        90.0        ]
  0      [   0.0        45.0        ]
  1      [  22.5        45.0        ]
  0      [  22.5        33.75       ]
  0      [  22.5        28.125      ]
  1      [  25.3125     28.125      ]
  1      [  26.71875    28.125      ]
  0      [  26.71875    27.421875   ]
  1      [  27.0703125  27.421875   ]
  0      [  27.0703125  27.24609375 ]
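The decoding steps above can be reproduced with a short Python sketch. This is
an illustration only, not the MySQL code; the names BASE32, geohash_bits and
latitude_range are made up for this example.

```python
# Geohash base-32 alphabet (the letters "a", "i", "l" and "o" are not used).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_bits(geohash):
    """Return the geohash as a flat list of bits, five per character."""
    bits = []
    for char in geohash:
        value = BASE32.index(char)
        for shift in range(4, -1, -1):
            bits.append((value >> shift) & 1)
    return bits


def latitude_range(geohash):
    """Halve [-90, 90] once per odd-numbered bit (the first bit is bit zero)."""
    lower, upper = -90.0, 90.0
    for bit in geohash_bits(geohash)[1::2]:
        middle = (lower + upper) / 2
        if bit:
            lower = middle  # '1': discard the lower half of the range
        else:
            upper = middle  # '0': discard the upper half of the range
    return lower, upper


print(latitude_range("xkcd"))  # (27.0703125, 27.24609375)
```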
From the specification, we see that the final value should be in the range
[27.0703125, 27.24609375]. However, the current implementation ends up
rounding the value to "27.0". Both "27.1" and "27.2" would be considered a
correct result.
Fix
===
Add a while-loop to Item_func_latlongfromgeohash::round_latlongitude which
ensures that the returned result is within the valid range. We also add some
asserts to ensure that the returned result satisfies the Geohash rounding
condition mentioned in the specification.
Note that the results returned from the MySQL geohash functions may differ from
the results returned from http://geohash.org, due to the fact that the
"reference implementation" has the same bug/flaw.
Also, the bug report mentions the following case:
For some positions, you can get results that differ wildly:
SELECT ST_GeoHash(ST_PointFromGeoHash('ebrb', 0), 4);
+-----------------------------------------------+
| ST_GeoHash(ST_PointFromGeoHash('ebrb', 0), 4) |
+-----------------------------------------------+
| s00j                                          |
+-----------------------------------------------+
With this bugfix, the result is now the following:
SELECT ST_GeoHash(ST_PointFromGeoHash('ebrb', 0), 4);
+-----------------------------------------------+
| ST_GeoHash(ST_PointFromGeoHash('ebrb', 0), 4) |
+-----------------------------------------------+
| s020                                          |
+-----------------------------------------------+
This result is nevertheless valid, as explained below:
'ebrb' represents the bounding box with latitude [1.40625, 1.58203125]
and longitude [-0.3515625, 0.0]. ST_PointFromGeoHash('ebrb') results in
POINT(0.0 1.5) (which is valid).
's020' represents the bounding box with latitude [1.40625, 1.58203125]
and longitude [0.0, 0.3515625]. This is also a valid representation of
POINT(0.0 1.5).
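The two bounding boxes can be checked with the following Python sketch. It is
an illustration only and independent of the MySQL code; the helper names
bounding_box and contains are made up for this example.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def bounding_box(geohash):
    """Return ((lat_min, lat_max), (lon_min, lon_max)) for a geohash."""
    lat = [-90.0, 90.0]
    lon = [-180.0, 180.0]
    is_longitude = True  # bit zero refines longitude, then the axes alternate
    for char in geohash:
        value = BASE32.index(char)
        for shift in range(4, -1, -1):
            interval = lon if is_longitude else lat
            middle = (interval[0] + interval[1]) / 2
            if (value >> shift) & 1:
                interval[0] = middle  # '1': discard the lower half
            else:
                interval[1] = middle  # '0': discard the upper half
            is_longitude = not is_longitude
    return tuple(lat), tuple(lon)


def contains(geohash, latitude, longitude):
    """True if the point lies inside (or on the edge of) the bounding box."""
    (lat_min, lat_max), (lon_min, lon_max) = bounding_box(geohash)
    return lat_min <= latitude <= lat_max and lon_min <= longitude <= lon_max


for geohash in ("ebrb", "s020"):
    print(geohash, bounding_box(geohash), contains(geohash, 1.5, 0.0))
```

Both geohashes report that they contain latitude 1.5, longitude 0.0, because
the point lies on the edge shared by the two boxes.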