Geohashing
Geohashes, created by Gustavo Niemeyer in 2008 and placed in the public domain, are an elegant and succinct geographic encoding. Geohashes work by reducing a two-dimensional longitude, latitude pair into a single alphanumeric string where each additional character adds precision to the location.
Procedure
The algorithm for creating a geohash repeatedly splits the world into two halves, alternating between a longitudinal and a latitudinal split. Each split contributes one bit, so the resulting binary string identifies a smaller and smaller area on the world map the longer it gets (as illustrated in the image below).
The resulting string is encoded in base 32. As an example, iGent Tower is located at:
Geohash | Lat-lng |
---|---|
u14dhqs55g | 51.012817, 3.708028 |
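
The splitting procedure described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the commonly used geohash conventions (the first split is on longitude, and the alphabet is the 32-character set without a, i, l and o); the function name `encode` and its parameters are chosen here for illustration and are not part of any particular library.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def encode(lat, lng, precision=10):
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lng_range = [-180.0, 180.0]
    chars = []
    bits = 0        # number of bits collected for the current character
    value = 0       # 5-bit value of the current character
    use_lng = True  # assumption: the first split is longitudinal, then they alternate

    while len(chars) < precision:
        rng, coord = (lng_range, lng) if use_lng else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1  # upper half: emit a 1 bit
            rng[0] = mid
        else:
            value = value << 1        # lower half: emit a 0 bit
            rng[1] = mid
        use_lng = not use_lng
        bits += 1
        if bits == 5:                 # every 5 bits form one base 32 character
            chars.append(BASE32[value])
            bits = 0
            value = 0
    return "".join(chars)


# Coordinates of iGent Tower, taken from the table above.
print(encode(51.012817, 3.708028, precision=6))
```

With a precision of 6, the output should match the first six characters of the table entry; asking for more characters only narrows the cell further.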
Interesting properties
Geohashes have several interesting properties that we can exploit in favor of scalability and performance:
- URL-friendly mechanism
    - /locations/u155k7/{typeId}/events vs /locations/51.2308141,4.3706197,range=1km/{typeId}/events
- Limit false queries
    - Geohashes with the wrong prefix are outside the search area (e.g. the prefix u155k is Antwerp).
- Zooming and neighbour search
    - The more characters, the more precise the location; drop the least significant characters and the area zooms out.
    - Neighbouring geohashes are easy to discover.
- Cacheable
    - Locations are deterministic and absolute, which is excellent for caching behaviour.
    - Limiting the precision of the geohashes to 6-9 characters increases the chance of overlapping request patterns and reuse.
- Performance gains
    - Geospatial results can be searched with a simple prefix text search.
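
The prefix and zooming properties can be illustrated with plain string operations. The sketch below is only an illustration: the helper names `zoom_out` and `find_by_prefix` are hypothetical, and the sample data simply reuses the two geohashes that appear on this page under made-up labels.

```python
def zoom_out(geohash, levels=1):
    """Drop the least significant characters to get the larger, enclosing cell.

    At least one character is always kept.
    """
    return geohash[:max(1, len(geohash) - levels)]


def find_by_prefix(locations, prefix):
    """Return the names of all locations whose geohash starts with `prefix`."""
    return sorted(name for name, gh in locations.items() if gh.startswith(prefix))


# Hypothetical sample data, reusing geohashes mentioned on this page.
locations = {
    "iGent Tower": "u14dhqs55g",
    "Antwerp sample": "u155k7",
}

print(zoom_out("u155k7"))                   # one level up: the u155k cell
print(find_by_prefix(locations, "u155k"))   # only entries inside the u155k cell
print(find_by_prefix(locations, "u1"))      # a shorter prefix covers a larger area
```

The same truncation could also serve as a cache key: requests whose geohashes share the truncated prefix fall into the same cell, which is what makes overlapping request patterns more likely.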
LatLong vs geohash
The above reasons already go a long way towards advocating the use of geohashes.
On reflection, we also realised that LatLng, although more widely known as a geolocation mechanism, is not that convenient to use either:
- You cannot just guess a LatLng; you have to look it up in a tool like Google Maps.
- A LatLng consists of two doubles, which are more cumbersome in URLs.
- A LatLng only defines a point; you need a range to describe an area.
- LatLng + radius is a circular area, which in a lot of cases is not ideal.
    - In the cases where a circular area makes sense, there isn't really a reason not to use a square area.
    - If really needed, libraries are available that return a set of geohashes covering a given circular area.
- LatLng + radius is a hassle if you want to monitor something like a road (e.g. the Meir, Antwerp).
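
To make the point-versus-area distinction concrete: a geohash decodes to a rectangular cell rather than to a single point. The following is a minimal sketch of that decoding, assuming the same conventions as the usual geohash scheme (longitude bit first, 32-character alphabet without a, i, l and o); `bounding_box` is a hypothetical helper name, not a library function.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def bounding_box(geohash):
    """Return (south, west, north, east) of the cell described by `geohash`."""
    lat_range = [-90.0, 90.0]
    lng_range = [-180.0, 180.0]
    use_lng = True  # assumption: bits alternate, starting with longitude

    for char in geohash:
        value = BASE32.index(char)  # raises ValueError for invalid characters
        for shift in range(4, -1, -1):  # most significant of the 5 bits first
            bit = (value >> shift) & 1
            rng = lng_range if use_lng else lat_range
            mid = (rng[0] + rng[1]) / 2
            if bit:
                rng[0] = mid  # keep the upper half
            else:
                rng[1] = mid  # keep the lower half
            use_lng = not use_lng
    return (lat_range[0], lng_range[0], lat_range[1], lng_range[1])


south, west, north, east = bounding_box("u14dhq")
print(f"lat {south}..{north}, lng {west}..{east}")
```

Each additional character shrinks the box, so a single short string describes an area without needing a separate radius parameter.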
Tools
The Obelisk Explorer includes tooling to draw areas on a map, which then yields a list of geohashes covering each area. This can be a useful feature for query building!
iGent Site
Antwerp Central Station
Antwerp: Het eilandje