Open Government Building Data

This is an ongoing project: the inventory is currently in beta, as we are working on cleaning it and adding new datasets. A preliminary snapshot is provided here to raise awareness of the project, solicit more datasets, and help us detect errors.

Mapping and analysing the availability of authoritative datasets on buildings worldwide

About the project

Open government data on buildings is becoming increasingly available and accessible globally, and it is useful for a variety of use cases, e.g. urban morphology and energy simulations. While OpenStreetMap (OSM) covers a large number of buildings and has made impressive progress in the past few years, datasets released by governments often have authoritative status thanks to their complete coverage of their jurisdictions, homogeneous data collection, and sometimes an extended set of attributes not available elsewhere. However, the availability of these datasets remains limited.

We have created a global inventory of open government data on buildings. The principal results are available as a map (above) and as a list (below), with an ongoing analysis. The index spans datasets containing more than 100 million building footprints from dozens of locations around the world.

The results of this project serve several purposes. They draw the attention of practitioners and scientists to these datasets, and they help governments understand how their data fares in comparison to others. For governments that have not released their data yet, they provide insights into the common practices of their counterparts around the world. Further, we seek to understand the intertwined relationships with other sources of data, such as volunteered geoinformation and commercial entities, and the role of authoritative data amid the growing presence of other actors in the same geographies.

This research is funded by the National University of Singapore (NUS) and it is carried out by the NUS Urban Analytics Lab in collaboration with others.

Please note that this is ongoing work, and we will be adding new datasets as we check their content.

Inclusion criteria

The criteria for inclusion in the list are as follows. The dataset should:

  • be released as open data, i.e. it can be freely used, modified, and shared by anyone for any purpose. For example, a viewer that displays the data but does not allow downloading it is not considered to be a case of open data. In addition, the dataset should be relatively easy to download, not requiring expert knowledge or esoteric workflows.
  • be created and released by a governmental authority, such as a national mapping/cadastral agency, regional government, or city administration.
  • contain 2D spatial data on buildings (i.e. footprints). For example, non-spatial datasets (e.g. spreadsheets or aggregated statistics) and point-based datasets (e.g. geocoded addresses) are not considered for this project due to their limited usefulness in geospatial workflows and urban studies.

While we also consider the semantic content of the data (i.e. attributes such as the type of building, number of storeys, and year of construction), attributes are not a requirement for inclusion. Thus, purely spatial datasets with no attributes are included in this study. In fact, about 47% of the datasets we have identified contain no semantic content whatsoever. Where attributes are available, we have analysed them.
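The criteria above can be read as a simple screening rule. The following is a minimal sketch in Python, assuming a hypothetical record structure of our own invention (the class `DatasetRecord` and all of its field names are illustrative and are not part of the project's actual workflow):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetRecord:
    """Hypothetical description of a candidate dataset (illustrative only)."""
    name: str
    open_licence: bool             # can be freely used, modified, and shared
    downloadable: bool             # data can be downloaded, not only viewed
    publisher_is_government: bool  # e.g. mapping agency, region, or city
    geometry_type: str             # "polygon", "point", or "none"
    attributes: List[str] = field(default_factory=list)


def meets_inclusion_criteria(d: DatasetRecord) -> bool:
    """Return True if the record satisfies the three stated criteria.

    Attributes are deliberately not checked: datasets with footprints
    but no semantic content still qualify.
    """
    return (
        d.open_licence
        and d.downloadable
        and d.publisher_is_government
        and d.geometry_type == "polygon"
    )


# Made-up candidates for illustration.
footprints_only = DatasetRecord("City footprints", True, True, True, "polygon")
viewer_only = DatasetRecord("Cadastre viewer", True, False, True, "polygon")
addresses = DatasetRecord("Geocoded addresses", True, True, True, "point")

for candidate in (footprints_only, viewer_only, addresses):
    print(candidate.name, meets_inclusion_criteria(candidate))
```

In this sketch, a footprint dataset without any attributes passes, whereas a view-only service and a point-based address dataset do not.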

The datasets have been identified through an exploration of data portals, crowdsourcing (through social media), and examining research papers.

Do note that we are not covering datasets that are not of an official nature. For example, commercial releases and volunteered geoinformation are not the focus of this research (however, they are the subject of our other research activities).

List of datasets

The key result of the project is an inventory, which is given in the tables below, by level of jurisdiction. You may click on the links to visit the website linking to the data and often describing it with metadata.

Country-wide datasets

Country | Website
Czechia | Link
Denmark | Link
Estonia | Link
France | Link
Japan | Link
Lithuania | Link
Luxembourg | Link
Malta | Link
Netherlands | Link
New Zealand | Link
Norway | Link
Poland | Link
Switzerland | Link

City and regional datasets

Coverage | Country | Website
Buenos Aires | Argentina | Link
Christmas Island | Australia | Link
Cocos Island | Australia | Link
Geelong | Australia | Link
Greater Shepparton | Australia | Link
Hobart | Australia | Link
Launceston | Australia | Link
Manningham | Australia | Link
Tirol | Austria | Link
Wallonia | Belgium | Link
Blainville | Canada | Link
Burlington | Canada | Link
Caledon | Canada | Link
Edmonton | Canada | Link
Fleuve St Laurent | Canada | Link
Grande Prairie County | Canada | Link
Grasslands | Canada | Link
Halifax | Canada | Link
Hamilton | Canada | Link
Jackhead | Canada | Link
Lake Manitoba | Canada | Link
Lower Mainland | Canada | Link
Mauricie Sud | Canada | Link
Montreal | Canada | Link
Niagara Falls | Canada | Link
Ottawa | Canada | Link
Prince George | Canada | Link
Quebec | Canada | Link
Regina | Canada | Link
Repentigny | Canada | Link
Rimouski | Canada | Link
Riviere Outaouais | Canada | Link
Roseau River Watershed | Canada | Link
Rouyn-Noranda | Canada | Link
Saanich | Canada | Link
Shawinigan | Canada | Link
Sherbrooke | Canada | Link
St Catharines | Canada | Link
The Pas | Canada | Link
Ville De Quebec | Canada | Link
Welland | Canada | Link
Windsor | Canada | Link
Winnipeg | Canada | Link
Bogota | Colombia | Link
Medellin City | Colombia | Link
Tampere | Finland | Link
Finistere | France | Link
Garges-les-Gonesse | France | Link
Blagnac | France | Link
Grand Poitiers | France | Link
Isere | France | Link
La Rochelle | France | Link
Le Havre Seine | France | Link
Nice | France | Link
Rennes | France | Link
Val d’Ille-Aubigne | France | Link
Brandenburg | Germany | Link
Berlin | Germany | Link
Wuppertal | Germany | Link
Reggio Emilia | Italy | Link
Rovereto | Italy | Link
Umbria | Italy | Link
Gisborne | New Zealand | Link
Seoul | South Korea | Link
Navarre | Spain | Link
Alabama | US | Link
Ann Arbor | US | Link
Atlanta | US | Link
Champaign | US | Link
Allegheny | US | Link
Austin | US | Link
Baltimore | US | Link
Bayfield | US | Link
Bend | US | Link
Bloomington | US | Link
Boston | US | Link
Brown | US | Link
Buffalo | US | Link
Buncombe | US | Link
Calumet | US | Link
Centre | US | Link
Chicago | US | Link
Cibolo | US | Link
Cincinnati | US | Link
Clark | US | Link
Cook | US | Link
Dakota | US | Link
Dane | US | Link
Dauphin | US | Link
Denver | US | Link
Douglas | US | Link
Flagstaff | US | Link
Fond du Lac | US | Link
Fort Collins | US | Link
Indianapolis | US | Link
Jefferson | US | Link
Johns Creek | US | Link
Kentucky | US | Link
Kerrville | US | Link
Kodiak Island | US | Link
Los Angeles | US | Link
Manitowoc | US | Link
New Orleans | US | Link
New York City | US | Link
Newport News | US | Link
Orange | US | Link
Philadelphia | US | Link
Pierce | US | Link
Portage | US | Link
Ramsey | US | Link
Redlands | US | Link
Rhode Island | US | Link
San Francisco | US | Link
Sarpy | US | Link
Sauk | US | Link
Somerville | US | Link
St Augustine | US | Link
Summit | US | Link
Tempe | US | Link
Vilas | US | Link
Washburn | US | Link
Washington DC | US | Link
Waukesha | US | Link
Yavapai | US | Link
Pulyny | Ukraine | Link

Datasets with partial coverage

During our exploration, we have identified several datasets that do not include all the buildings in their administrative extent. For example, there are datasets that have only commercial buildings mapped, or buildings with a footprint larger than a threshold of considerable size, not being representative of buildings in the area. Further, some datasets have partial coverage as they have an indicative purpose, e.g. serving as a sample dataset. These datasets may still be found useful for some spatial analyses. We list such datasets in this table.

Coverage | Website
Canada | Link
Cape Town (South Africa) | Link
Germany | Link
Gold Coast (Australia) | Link
Greene (US) | Link
Jasper Park (Canada) | Link
Parks Canada (Canada) | Link
Queensland (Australia) | Link
Singapore | Link
Surprise (US) | Link
Wyndham (Australia) | Link

Analysis

Compiling the inventory of publicly available government building datasets is just the first part of this project. For each dataset, we have analysed its metadata and checked the data to understand its content, geometric validity, etc. The full results will be published in a paper. In the meantime, we include two figures as a sneak peek into the ongoing work.

Frequency of the most common attributes pertaining to buildings, which we identified in the 100+ datasets we analysed. The most commonly available information is the type of the building. The level of semantic richness varies widely. On the one hand, a fifth of the datasets have 4 or more attributes stored for each building. On the other hand, nearly half of the datasets do not contain a single attribute, describing only the geometry of the building footprint.
In many geographies, the built environment is dynamic. Therefore, it is beneficial to keep datasets on buildings fresh. Our analysis reveals that the most recently available version of most datasets is not more than a year old, which, depending on the use case, may be considered sufficiently up to date. A few datasets are updated on a weekly basis. At the other end of the spectrum, there are datasets that have not been updated in a decade.

Webmap

The map at the top of this page is intended to show the locations of the datasets and the global distribution of the availability of authoritative open data on buildings. The locations of the datasets are represented in two ways: by the approximate centroid of the entire dataset and by its approximate coverage (visible when zooming in). The coverage has been generated in one of two ways: for some datasets, we have (i) computed the convex hull; for others, we have (ii) used the polygon of the administrative unit that the dataset claims to represent. The administrative polygons have been sourced from GADM, the Database of Global Administrative Areas.
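To illustrate option (i) and the approximate centroid, here is a minimal sketch in plain Python. It is not the project's actual code: it assumes that the footprint vertices have already been extracted as planar (x, y) pairs, it uses Andrew's monotone chain algorithm as one common way to build a convex hull, and it approximates the centroid as the mean of the input points. The function names are our own.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Convex hull via Andrew's monotone chain, in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop the last point while it does not make a left turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def approximate_centroid(points: List[Point]) -> Point:
    """Mean of the input points (an assumption; not an area-weighted centroid)."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


# Made-up vertices standing in for the footprints of one dataset.
vertices = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 1.0)]
hull = convex_hull(vertices)
centre = approximate_centroid(vertices)
print(hull)
print(centre)
```

A convex hull can overstate the coverage of datasets with irregular or disjoint extents, which is one plausible reason to prefer the administrative polygon where it is available.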

Adding a New Dataset

More open government building datasets are being added to this index as we expand our search. If you’d like to contribute new entries to enlarge our inventory, you are welcome to do so by filling in the following form. Before doing so, please read the inclusion criteria above.

Errors

The registry is not free of errors, especially in jurisdictions we are less familiar with. The links were checked during Q2 2021. However, some of them may since have broken, as datasets are sometimes moved to another address or removed. If you spot an error, please report it through this form.

Paper

A paper is coming out soon. Stay tuned!

People

Research Assistant

Lawrence Chew Zheng Xiong

Collaborators

Nikola Milojević-Dupont and Felix Creutzig (Mercator Research Institute on Global Commons and Climate Change and Technical University of Berlin)

Principal Investigator

Filip Biljecki

Research group

Urban Analytics Lab, National University of Singapore (NUS)

Disclaimer

We remain neutral with respect to jurisdictional claims in the datasets.

Acknowledgements

We thank all contributors who have pointed out authoritative building datasets, which we included in our list.

This research is part of the project Large-scale 3D Geospatial Data for Urban Analytics, which is supported by the National University of Singapore under the Start-Up Grant R-295-000-171-133.

The location and the source of the illustration of the building footprints at the top of this webpage: Middlesex County, Massachusetts, United States (Bureau of Geographic Information – MassGIS, Commonwealth of Massachusetts, Executive Office of Technology and Security Services, 2021.)