Open Government Building Data
Mapping and analysing the availability of authoritative datasets on buildings worldwide
About the project
Open government data on buildings is becoming increasingly available and accessible globally, and it is useful in a variety of use cases, e.g. urban morphology and energy simulations. While OpenStreetMap (OSM) covers a large number of buildings and has made impressive progress in the past few years, datasets released by governments often have authoritative status thanks to their completeness within their jurisdictions, homogeneous data collection, and sometimes an extended set of attributes not available elsewhere. However, the availability of such datasets remains limited.
We have created a global inventory of open government data on buildings. The principal results are available as a map (above) and as a list (below), with an ongoing analysis. The index spans datasets containing more than 100 million building footprints from dozens of locations around the world.
The results of this project serve several purposes. They draw the attention of practitioners and scientists to these datasets, and they help governments understand how their data compares with that of others. For governments that have not yet released their data, the results provide insights into the common practices of their counterparts around the world. Further, we seek to understand how authoritative data relates to other sources, such as volunteered geoinformation and data from commercial entities, and what role it plays as other actors become increasingly active in the same geographies.
Please note that this is ongoing work, and we will be adding new datasets as we check their content.
The criteria for inclusion in the list are as follows. The dataset should:
- be released as open data, i.e. it can be freely used, modified, and shared by anyone for any purpose. For example, a web viewer that displays the data but does not allow downloading it is not considered open data. In addition, the dataset should be relatively easy to download, without requiring expert knowledge or esoteric workflows.
- be created and released by a governmental authority, such as a national mapping/cadastral agency, regional government, or city administration.
- contain 2D spatial data on buildings (i.e. footprints). Non-spatial datasets (e.g. spreadsheets or aggregated statistics) and point-based datasets (e.g. geocoded addresses) are not considered for this project due to their limited usefulness in geospatial workflows and urban studies.
We also consider the semantic content of the data (i.e. attributes), such as the type of building, number of storeys, and year of construction. However, attributes are not a requirement for inclusion, so purely spatial datasets with no attributes are part of this study. In fact, about 47% of the datasets we have identified contain no semantic content whatsoever. Where attributes are available, we have analysed them.
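The inclusion criteria above can be read as a simple filter. The following is a minimal Python sketch, not the project's actual workflow: the `DatasetRecord` class, its field names, and the sample entries are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetRecord:
    """Hypothetical description of a candidate dataset (not the project's schema)."""
    name: str
    open_licence: bool       # can be freely used, modified, and shared
    downloadable: bool       # data can be downloaded, not only viewed online
    government_source: bool  # created and released by a governmental authority
    geometry_type: str       # e.g. "polygon", "point", or "none"
    attributes: List[str] = field(default_factory=list)  # optional semantic content


def meets_criteria(d: DatasetRecord) -> bool:
    """Apply the three inclusion criteria; attributes are deliberately not required."""
    return (
        d.open_licence
        and d.downloadable
        and d.government_source
        and d.geometry_type == "polygon"  # 2D footprints only
    )


# Made-up candidates illustrating each reason for exclusion.
candidates = [
    DatasetRecord("Footprints with attributes", True, True, True, "polygon", ["storeys"]),
    DatasetRecord("Footprints without attributes", True, True, True, "polygon"),
    DatasetRecord("View-only web map", True, False, True, "polygon"),
    DatasetRecord("Geocoded addresses", True, True, True, "point"),
    DatasetRecord("Commercial footprints", True, True, False, "polygon"),
]

included = [d.name for d in candidates if meets_criteria(d)]
print(included)
```

Note that the footprint dataset without attributes passes the filter, which mirrors the decision to include purely spatial datasets.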
The datasets have been identified through an exploration of data portals, crowdsourcing (through social media), and examining research papers.
Note that we do not cover datasets that are not official in nature. For example, commercial releases and volunteered geoinformation are not the focus of this research (however, they are the subject of our other research activities).
List of datasets
The key result of the project is an inventory, which is given in the tables below, by level of jurisdiction. You may click on the links to visit the website linking to the data and often describing it with metadata.
City and regional datasets
|Location|Country|Link|
|Fleuve St Laurent|Canada|Link|
|Grande Prairie County|Canada|Link|
|Roseau River Watershed|Canada|Link|
|Ville De Quebec|Canada|Link|
|Le Havre Sine|France|Link|
|Fond du Lac|US|Link|
|New York City|US|Link|
Datasets with partial coverage
During our exploration, we have identified several datasets that do not include all the buildings in their administrative extent. For example, some datasets map only commercial buildings, or only buildings with a footprint above a considerable size threshold, and are therefore not representative of the buildings in the area. Further, some datasets have partial coverage because they serve an indicative purpose, e.g. as a sample dataset. These datasets may still be useful for some spatial analyses. We list them in this table.
|Location (country)|Link|
|Cape Town (South Africa)|Link|
|Gold Coast (Australia)|Link|
|Jasper Park (Canada)|Link|
|Park Canada (Canada)|Link|
Compiling the inventory of building datasets made publicly available by governments is just the first part of this project. For each dataset, we have analysed the metadata and checked the data to understand its content, geometric validity, etc. The full results will be published in a paper. In the meantime, we include two figures as a sneak peek into the ongoing work.
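To give a sense of what a geometric validity check on a footprint may involve, here is a minimal pure-Python sketch. It is an assumption-laden illustration, not the procedure used in this project: the function names are hypothetical, only the outer ring of a polygon is inspected, and only a subset of the usual validity rules is covered (closure, minimum number of coordinates, non-zero area, and properly crossing edges). A real analysis would more likely rely on an established geometry library.

```python
def signed_area(ring):
    """Shoelace formula for a closed ring given as a list of (x, y) tuples."""
    return 0.5 * sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))


def _orient(a, b, c):
    """Sign of the turn a -> b -> c: 1 (left), -1 (right), 0 (collinear)."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def segments_cross(p1, p2, p3, p4):
    """True if segments p1p2 and p3p4 cross at a point interior to both."""
    return (
        _orient(p1, p2, p3) * _orient(p1, p2, p4) < 0
        and _orient(p3, p4, p1) * _orient(p3, p4, p2) < 0
    )


def ring_problems(ring):
    """Return a list of problems found in a footprint's outer ring (empty if none)."""
    problems = []
    if len(ring) < 4:
        return ["fewer than 4 coordinates"]
    if ring[0] != ring[-1]:
        problems.append("ring not closed")
        ring = ring + [ring[0]]  # close it so the remaining checks can run
    if signed_area(ring) == 0:
        problems.append("zero area")
    edges = list(zip(ring, ring[1:]))
    # Brute-force pairwise test; edges sharing an endpoint never count as crossing.
    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            if segments_cross(*edges[i], *edges[j]):
                problems.append("self-intersection")
                return problems
    return problems


square = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
bowtie = [(0, 0), (2, 2), (2, 0), (0, 2), (0, 0)]  # two edges cross at (1, 1)
unclosed = [(0, 0), (2, 0), (2, 2), (0, 2)]

for name, ring in [("square", square), ("bowtie", bowtie), ("unclosed", unclosed)]:
    print(name, ring_problems(ring))
```

The pairwise edge test is quadratic in the number of vertices, which is acceptable for individual footprints but would need a faster approach at the scale of millions of buildings.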
The map at the top of this page shows the locations of the datasets and the global distribution of authoritative open data on buildings. The location of each dataset is represented in two ways: by the approximate centroid of the entire dataset and by its approximate coverage (visible when zooming in). The coverage has been generated in one of two ways: for some datasets, we have (i) computed the convex hull; for others, we have (ii) used the polygon of the administrative unit that the dataset is stated to represent. The administrative polygons have been sourced from GADM, the Database of Global Administrative Areas.
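The two representations can be illustrated with a short pure-Python sketch. It makes several assumptions that are not stated on this page: coordinates are treated as planar, the approximate centroid is taken as the mean of the input points, and the convex hull is computed with Andrew's monotone chain algorithm. The function names and sample points are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull via Andrew's monotone chain, returned counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def approximate_centroid(points):
    """Mean of the input coordinates (an assumed definition of 'approximate')."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


# Made-up representative points, e.g. one per building footprint.
building_points = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2)]

hull = convex_hull(building_points)
centroid = approximate_centroid(building_points)
print(hull)
print(centroid)
```

A convex hull can overstate the coverage of a dataset whose jurisdiction has a concave or fragmented shape, which may be one reason to prefer the administrative polygon in some cases.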
Adding a New Dataset
More open government building datasets are being added to this index as we expand our search. If you would like to contribute new entries to our inventory, you are welcome to do so by filling in the following form. Before doing so, please read the inclusion criteria above.
The registry is not free of errors, especially in jurisdictions we are less familiar with. The links were checked in Q2 2021. However, some of them may be broken, as datasets may have been moved to another address or removed. If you spot an error, please report it through this form.
A paper is coming out soon. Stay tuned!
We remain neutral with respect to jurisdictional claims in the datasets.
We thank all contributors who have pointed out authoritative building datasets, which we included in our list.
This research is part of the project Large-scale 3D Geospatial Data for Urban Analytics, which is supported by the National University of Singapore under the Start-Up Grant R-295-000-171-133.
The illustration of building footprints at the top of this webpage shows Middlesex County, Massachusetts, United States (source: Bureau of Geographic Information – MassGIS, Commonwealth of Massachusetts, Executive Office of Technology and Security Services, 2021).