🌐 Brand new granular socio-economic datasets for the world
PLUS: detecting maritime crop circles and GPS spoofing, improving granular population mapping, and more.
Hey guys, here’s this week’s edition of Spatial Edge — a weekly round-up of geospatial news. The goal is to help make you a better geospatial data scientist in less than 5 minutes a week. If you’ve ever considered naming your next kid ‘EPSG:4326’, then we have one thing to say to you—welcome home.
In today’s newsletter:
Global socio-economic data: Dataset covers 7 billion people.
Enhanced population mapping: Bayesian model improves estimates.
Maritime 'crop circles' detection: Topological data spots anomalies.
Solar panels' urban impact: Panels can both warm and cool cities.
Research you should know about
1. Granular socio-economic dataset covering over 7 billion people
A new study published in Scientific Data introduces GLOPOP-S, the first global socio-economic dataset covering over 7 billion individuals and ~2 billion households for 2015 at the admin-1 level.
It features sub-national data on:
age
gender
education
income/wealth
household size
household type, and
settlement type (urban/rural)
The team generated GLOPOP-S by combining microdata from the Luxembourg Income Study (LIS) and Demographic and Health Surveys (DHS). They applied synthetic reconstruction methods, specifically the Iterative Proportional Updating algorithm, to align national survey data with regional statistics at the admin-1 level. This accounts for spatial differences within and across countries. For places without microdata, they developed methods to generate synthetic data based on demographically similar countries.
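If you haven't come across Iterative Proportional Updating before, here's a rough sketch of the general idea: household weights get nudged, one constraint at a time, until the weighted sample matches both household-level and person-level totals for a region. This is a toy version under my own assumptions and not the GLOPOP-S replication code. The function name, the four sample households, and the regional targets are all made up for illustration.

```python
import numpy as np

def ipu_weights(incidence, targets, n_iter=500, tol=1e-9):
    """Toy Iterative Proportional Updating.

    incidence[h, j] = how many units of constraint j household h contributes
    (1 for its household type, number of members for a person category).
    targets[j]      = regional total for constraint j.
    """
    incidence = np.asarray(incidence, dtype=float)
    targets = np.asarray(targets, dtype=float)
    weights = np.ones(incidence.shape[0])
    for _ in range(n_iter):
        for j in range(incidence.shape[1]):
            column = incidence[:, j]
            fitted = weights @ column
            if fitted > 0:
                # Rescale only the households that contribute to constraint j.
                weights[column > 0] *= targets[j] / fitted
        rel_error = np.abs(incidence.T @ weights - targets) / targets
        if rel_error.max() < tol:
            break
    return weights

# Hypothetical survey sample. Columns: [hh type 1, hh type 2, young persons, old persons]
sample = np.array([
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 2, 0],
    [0, 1, 1, 1],
])
# Hypothetical regional (admin-1) totals for the same four constraints.
region_targets = np.array([30, 20, 35, 35])

weights = ipu_weights(sample, region_targets)
fitted_totals = sample.T @ weights
print(np.round(weights, 2))
print(np.round(fitted_totals, 2))
```

The resulting weights say how many times each sampled household should be 'cloned' to build a synthetic population for that region. Convergence isn't guaranteed in general, so a real pipeline would need to handle inconsistent or infeasible targets.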
I’m really looking forward to playing around with this dataset, and have my fingers crossed that someone can try and extend the data to a longer time series.
P.S. The data is located here, and the replication code is here.
2. Improving granular population mapping with Bayesian models
The WorldPop team have introduced a new Bayesian Additive Regression Tree (BART) model for creating granular population maps from census data. This is particularly useful for understanding population concentrations when ground-truth data is limited or unreliable.
The team used Ghana's 2021 census data at the district level and incorporated a bunch of geospatial factors like land use, climate data, infrastructure, and settlement patterns. They compared the traditional Random Forest (RF) model with the BART model, which is better at estimating uncertainty in predictions. By conducting some simulations and applying the models to actual census data, they could see which approach was better.
I know you just want the cliff notes, so here you go:
the BART model outperformed the RF model in predicting population distributions,
it achieved an 81% correlation with simulated true populations at the pixel level, compared to 66% for the RF model, and
the BART model provided uncertainty estimates around predictions, which is insanely useful when we’re using this data in research.
And if you’re interested in other approaches to granular population mapping, check out this overview of POPCORN.
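For anyone wondering how a model's output turns into a gridded population map, here's a minimal sketch of a generic top-down disaggregation step: each district's census total is spread across its pixels in proportion to a per-pixel weight. I'm assuming the weights come from some fitted model (RF, BART, or anything else). The function name and the numbers are hypothetical, and this isn't the WorldPop team's code.

```python
import numpy as np

def disaggregate(district_totals, pixel_district, pixel_weights):
    """Spread each district's census total across its pixels by weight."""
    totals = np.asarray(district_totals, dtype=float)
    district = np.asarray(pixel_district)
    weights = np.asarray(pixel_weights, dtype=float)
    # Sum of the model weights inside each district.
    weight_sums = np.bincount(district, weights=weights, minlength=len(totals))
    # Each pixel receives its share of the district total.
    return totals[district] * weights / weight_sums[district]

# Hypothetical inputs: two districts, five pixels.
district_totals = [100, 50]        # census counts per district
pixel_district = [0, 0, 0, 1, 1]   # district index of each pixel
pixel_weights = [1, 3, 6, 2, 2]    # model-predicted relative density per pixel

pixel_population = disaggregate(district_totals, pixel_district, pixel_weights)
print(pixel_population)
```

The nice property here is that pixel values always sum back to the census total for each district, whatever model produced the weights.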
3. Using topological data to address maritime ‘crop circles’ and GPS spoofing
A new study from Systems and Technology Research introduces a new way to detect ‘crop circles’ in the sea, which are unusual circular patterns in Automatic Identification System (AIS) data from ships.
As I’ve written about previously, AIS is a system that ships use to broadcast their location, speed, and other information. Sometimes, anomalies like ‘crop circles’ appear in AIS data, where a ship's reported positions form loops. These can be caused by GPS errors, intentional signal interference, or GPS spoofing (e.g. in the case of ‘dark shipping’).
The researchers applied ‘persistent homology’, a topological data analysis tool that examines the shape of data. They first embedded the spatiotemporal AIS data into a 3D space. Then, by analysing loops in ships’ trajectories over time, they could detect these crop circles. They tested their method on AIS data near San Francisco and identified some previously unknown anomalies.
Their key finding is that normal ship movements don't create loops when plotted over time in this way. The presence of loops indicates unusual behaviour that may signal GPS spoofing or navigation errors. This method is robust to small changes in the data and doesn't require the loops to be perfect circles.
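To make 'loops' a bit more concrete, here's a small brute-force sketch of one-dimensional persistent homology on a Vietoris-Rips filtration. It only works for tiny point sets and isn't the paper's pipeline. In particular, I don't reproduce how the authors build their 3D embedding, so the third coordinate below is a hypothetical scaled-time value and both tracks are synthetic. A long-lived (birth, death) pair means the points trace out a loop.

```python
import itertools
import numpy as np

def h1_persistence(points):
    """Return (birth, death) pairs of loops, most persistent first."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    # Every vertex, edge and triangle, with the scale at which it appears.
    simplices = [((i,), 0.0) for i in range(n)]
    simplices += [((i, j), dist[i, j])
                  for i, j in itertools.combinations(range(n), 2)]
    simplices += [(t, max(dist[a, b] for a, b in itertools.combinations(t, 2)))
                  for t in itertools.combinations(range(n), 3)]
    # Faces must come before the simplices they bound.
    simplices.sort(key=lambda s: (s[1], len(s[0]), s[0]))
    index = {s: k for k, (s, _) in enumerate(simplices)}
    reduced = {}  # lowest boundary index -> reduced column
    pairs = []
    for simplex, value in simplices:
        if len(simplex) == 1:
            continue
        column = {index[f]
                  for f in itertools.combinations(simplex, len(simplex) - 1)}
        # Standard column reduction with mod-2 arithmetic.
        while column and max(column) in reduced:
            column ^= reduced[max(column)]
        if column:
            low = max(column)
            reduced[low] = column
            if len(simplex) == 3:
                birth = simplices[low][1]  # edge that created the loop
                if value > birth:          # triangle that fills it in
                    pairs.append((birth, value))
    return sorted(pairs, key=lambda p: p[1] - p[0], reverse=True)

def top_persistence(pairs):
    return max((death - birth for birth, death in pairs), default=0.0)

steps = np.arange(12)
angles = 2 * np.pi * steps / 12
# Synthetic tracks: columns are x, y and a hypothetical scaled-time coordinate.
looping = np.column_stack([np.cos(angles), np.sin(angles), 0.01 * steps])
straight = np.column_stack([0.5 * steps, np.zeros(12), 0.01 * steps])

print(top_persistence(h1_persistence(looping)))
print(top_persistence(h1_persistence(straight)))
```

The circling track produces one long-lived loop, while the straight track produces none. That's the kind of contrast the method relies on. For real AIS volumes you'd want a dedicated persistent homology library instead of enumerating every triangle like this.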
4. How rooftop solar panels can both warm up and cool down cities
A new study in Nature Cities looks at how rooftop photovoltaic solar panels affect urban temperatures. This is a topic that we’ve covered previously in the Spatial Edge. However, here the researchers investigated whether large-scale installation of these panels can lead to unintended heating or cooling effects.
Using advanced weather modelling with the Weather Research and Forecasting (WRF) model integrated with multilayer urban schemes (BEP+BEM), the team simulated the impact of widespread solar panel deployment on urban temperatures in Kolkata, India. They incorporated new parameters for solar panels, including how they interact thermally with underlying roofs and the surrounding air. The simulations assessed various scenarios, from 25% to 100% rooftop coverage, and considered factors like surface energy budgets, meteorological fields, boundary layer dynamics, and sea breeze circulations.
The tl;dr:
city-wide installation of rooftop solar panels can raise daytime temperatures by up to 1.5 °C (due to more heat absorption and release from the panels).
however, they can also lower nighttime temperatures by up to 0.6 °C (due to changes in heat retention and radiation).
solar panels can also alter urban surface energy balances and atmospheric conditions. These results were supported by a comparative analysis of cities like Sydney, Austin, Athens, and Brussels.
P.S. You can find the source code here and the Python version here.
Geospatial datasets
1. Salmonid biomass dataset
If the lack of geospatial data on salmonid fish keeps you up at night, then fear not. But for those of you who’ve never heard the word ‘salmonid’ before (like me), you might want to skip to the next dataset… In any case, this study from Scientific Data presents a comprehensive dataset on salmonid biomass (including fish like trout and salmon) across over 1,000 rivers in 27 countries. Researchers pulled data from 240 studies spanning 84 years to estimate biomass and production for 194 rivers. It includes info on species, sampling methods, and river features.
You can access the data tables here.
2. Precinct boundary data
Another study in Scientific Data put together the U.S. Precinct Boundary Database to tackle a big challenge in U.S. elections: messy precinct boundary data. To fix this, the dataset covers over 170,000 precincts from the 2016, 2018, and 2020 elections.
You can access the data archive here.
3. Physical distancing index
The Physical Distancing Index (PDI) was developed to identify areas at high risk of infectious disease spread in Africa, using data from the Demographic and Health Surveys (DHS) and other public health sources. Researchers examined household access to essential services like clean water, sanitation, and transportation, along with factors like population density. By analysing these indicators, the PDI reflects how infrastructure impacts people's ability to practice physical distancing.
4. Road surface global dataset
This global road surface dataset maps paved and unpaved roads across 3 million kilometres, or about 36% of the world’s road network. The researchers built it using over 105 million images from Mapillary, a crowdsourced street-view platform owned by Meta, combined with OpenStreetMap data.
Other useful bits
Satellites have captured images of the recent devastating floods in Spain.
Over 100 researchers are calling on the FCC to pause Starlink satellite launches until a full environmental review is done. There is concern about issues like space debris and atmospheric pollution from satellite re-entries.
Apple is investing $1.5 billion in Globalstar to boost iPhone satellite services. This partnership will support new satellites and infrastructure, dedicating most of Globalstar's network to Apple, and enhancing connectivity in areas without cellular coverage.
Researchers in Mexico are using geospatial tech, like drones and hyperspectral imaging, to locate hidden graves linked to disappearances. These tools detect changes in vegetation and soil, helping search teams identify burial sites more efficiently and support justice efforts.
The Taylor Geospatial Institute and AWS have launched the Generative AI for Geospatial Challenge, offering $1 million in AWS credits to innovators. This challenge invites participants to use AI and geospatial data to tackle big issues like environmental monitoring, urban planning, and disaster response.
Jobs
AECOM is looking for a GIS Specialist to work on analytical data management, database programming, or data science.
The University of Adelaide is looking for a Grant-Funded Researcher in Remote Sensing and Vegetation Ecology.
Barcelona Supercomputing Center is looking for a Researcher on EO products.
The University of Liverpool is looking for a University Teacher in Geographic Data Science as a maternity cover.
The European Space Agency is looking for an Intern in the Industrial Policy and Space Economy Division.
The International Atomic Energy Agency is looking for a Senior Safeguards Analyst (Satellite Imagery) under the Department of Safeguards.
Just for fun
Did you know Superman played a role in the Hubble Space Telescope’s launch? Like, actually? In 1972 a Superman comic was written to raise public support for funding the Hubble Space project. The comic (Action Comics No. 419) was then distributed to members of Congress to demonstrate how enthusiastic the general public was about the project. Ultimately, funding for the project was secured in 1977.
That’s it for this week.
I’m always keen to hear from you, so please let me know if you have:
new geospatial datasets
newly published papers
geospatial job opportunities
and I’ll do my best to showcase them here.
Yohan