Foursquare needs high-quality place data to power its geocoding engine and deliver the best recommendations. When someone searches for the best coffee in Brooklyn, matching venues to a place with a simple point or bounding-box search can let venues in Manhattan and Jersey City overwhelm the results for Brooklyn.
To improve recommendations, we have created an authoritative source of polygons around a curated list of places. This gazetteer of non-overlapping polygons provides more relevant results than simple point geometries.
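To illustrate why polygons help, here is a minimal sketch in plain Python, not the geocoder's actual code. It compares a bounding-box test with a point-in-polygon test (standard ray casting) for the same place outline. The L-shaped `PLACE` ring and the two venue coordinates are made-up values chosen so that one venue falls inside the bounding box but outside the polygon.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def in_bbox(pt: Point, ring: List[Point]) -> bool:
    """True if pt lies inside the axis-aligned bounding box of the ring."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs) <= pt[0] <= max(xs) and min(ys) <= pt[1] <= max(ys)


def in_polygon(pt: Point, ring: List[Point]) -> bool:
    """Ray casting: count how many edges a ray heading +x from pt crosses."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the ray's latitude can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical L-shaped place outline (illustrative coordinates only).
PLACE = [(0.0, 0.0), (4.0, 0.0), (4.0, 1.0), (1.0, 1.0), (1.0, 4.0), (0.0, 4.0)]

venue_inside = (0.5, 3.0)   # inside the L
venue_outside = (3.0, 3.0)  # in the notch: inside the bbox, outside the polygon

for name, venue in [("inside", venue_inside), ("outside", venue_outside)]:
    print(name, "bbox:", in_bbox(venue, PLACE), "polygon:", in_polygon(venue, PLACE))
```

The bounding box accepts both venues, while the polygon test rejects the one in the notch. The same effect pulls neighboring cities' venues into a search when places are matched by box alone.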
This work is based on foursquare check-ins, geotagged photos from Flickr, an extended version of Natural Earth, and open government data. The gazetteer provides concordance between quattroshapes, geonames.org, and Yahoo! GeoPlanet unique IDs.
The quattroshapes technique calculates the dominant place ID for a given area based on heterogeneous inputs. This work is an extension of alphashapes and betashapes (thanks Aaron and Schuyler!) and is used to backfill countries without complete open data.
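As a rough illustration of "dominant place ID for a given area", the sketch below bins observations into a fixed grid and picks the most frequent place ID per cell. This is an assumption-laden simplification, not the quattroshapes implementation: the fixed grid, the `cell_size` parameter, the function name, and the sample place IDs are all hypothetical, and the real technique may partition space and weight its inputs differently.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Observation = Tuple[float, float, str]  # (longitude, latitude, place_id)
Cell = Tuple[int, int]


def dominant_place_by_cell(
    observations: Iterable[Observation], cell_size: float = 1.0
) -> Dict[Cell, str]:
    """Assign each grid cell the place ID seen most often inside it.

    Ties go to whichever place ID was encountered first in the cell.
    """
    votes: Dict[Cell, Counter] = defaultdict(Counter)
    for lon, lat, place_id in observations:
        cell = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        votes[cell][place_id] += 1
    return {cell: counts.most_common(1)[0][0] for cell, counts in votes.items()}


# Made-up observations standing in for check-ins and geotagged photos.
sample = [
    (0.2, 0.3, "place-A"),
    (0.7, 0.1, "place-A"),
    (0.5, 0.9, "place-B"),
    (1.5, 0.5, "place-B"),
]

dominant = dominant_place_by_cell(sample, cell_size=1.0)
print(dominant)
```

Adjacent cells that share a dominant ID could then be merged into one polygon per place, which is one way to end up with non-overlapping shapes.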
Geocoding can be the hardest part of going open source, and reverse geocoding is even harder. Reverse geocoding reports the gazetteer place for a latitude and longitude map location or address string and is useful when source data needs to be normalized. This new polygon gazetteer data is used in TwoFishes, the coarse splitting geocoder (and reverse geocoder) written in Scala by David Blackman at foursquare.
The quattroshapes code and the resulting 30 gb of data are licensed under CC-BY, but the data includes material licensed from many governments around the world. Check the License file for full details and limitations.
Shapefiles use WGS84 geographic (unprojected) coordinates and UTF-8 character encoding.
Note that these do not currently have geonames concordances baked in. We'll be posting matched files soon. In the meantime, you can use shputils to do the matching, following the instructions here: https://github.com/foursquare/twofishes#reverse-geocoding-and-polygons
- quattroshapes admin 0 - 106 mb
- quattroshapes admin 1 regions - 17 mb
- quattroshapes admin 1 - 106 mb
- quattroshapes admin 2 regions - 1.5 mb
- quattroshapes admin 2 - 304 mb
- quattroshapes local admin - 467 mb
- quattroshapes localities - 420 mb
- quattroshapes localities (with geonames concordance) - 420 mb
- quattroshapes neighborhoods - 32 mb
- prefer GeoNames.org lat-lngs - 154 mb
- prefer GeoPlanet + Flickr lat-lngs - 154 mb
- localities only, Geonames.org lat-lngs - 46 mb
- places with population, checkins, or flickr photos, Geonames.org lat-lngs - 34 mb
- Natural Earth admin-1 - 16 mb, version 3.0.0
- US State Department HIU admin-0 - 79 mb, re-coded like Natural Earth
- (coming soon) National Mapping Agency Open Data - 7 gb
- Customized Europe localities - 135 mb, mashed-up EuroGeoGraphics urban data and European Environment Agency UMZ data.
- GeoPlanet Voronoi diagrams - 198 mb, includes adm0, adm1, adm2, localadmin, and localities.
- Flickr alpha shapes broken up by country, with geonames concordance - geojson, 28 mb
- Flickr alpha shapes broken up by country, duplicate features dissolved, with geonames concordance - shp, 28 mb