OII | Estimating local commuting patterns from geolocated Twitter data

Published on
25 Oct 2017

Written by
Jonathan Bright

Over the last decade or so there has been an explosion of research interest in measuring (and forecasting) traffic and commuting patterns. Part of this is driven by ever-increasing human mobility: in 2016 alone, people in the UK travelled a collective 800 billion kilometres, more than 60% of which was by car, and congestion on transport networks costs billions of pounds a year. But also driving the research agenda is the emergence of a wide variety of new forms of data (which build on and supplement more traditional magnetic loop technologies), such as data repurposed from mobile phone records, collected through IoT-enabled smart sensors, or emerging from traces freely contributed to social media platforms. These data sources offer huge potential to improve on existing methods of data collection, such as the hated transport census (see picture).

As part of a research project entitled NEXUS: Real Time Data Fusion and Network Analysis for Urban Systems (funded by InnovateUK), a team of researchers at the OII and I have been looking into some of these possibilities. Our first paper on the subject, entitled “Estimating Local Commuting Patterns from Geolocated Twitter Data”, has just been published in EPJ Data Science. The paper addresses the extent to which we can make use of geolocated Twitter data to estimate commuting flows between local authorities (you can have a play with some of the underlying data using the map below, which shows census commuting figures and Twitter-based estimates for local authorities around Britain).

We draw two main conclusions from the paper. First, we show that, by using heuristics to map individuals who post geolocated tweets to home and work areas, we can use Twitter to produce accurate representations of the overall structure of commuting in mainland Great Britain; these estimates improve considerably on other ‘low information’ methods of estimating commuting flows (we compared them in particular to the popular radiation model). Second, and probably most importantly, we show that these results are not particularly sensitive to demographic characteristics. When looking at commuting flows broken down by gender, age group and social class, we found that Twitter still offered reasonable estimates for all of these sub-categories. We think this is important because a key concern about using social media data for this type of proxy estimation is the extent to which the ‘demographic bias’ in social media users (who are often younger, better educated and wealthier than the population average) might also result in biased predictions (for example, better prediction of the travel patterns of younger people). We show that, at least in our context, this is not the case.
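To give a flavour of what a home/work heuristic of this kind might look like, here is a minimal Python sketch. It is an illustration only and not the exact procedure used in the paper: the weekday 09:00 to 16:59 "work time" window, the minimum tweet count, and the function names are all assumptions made for the example. Each user's tweets are split into work-time and other-time tweets, the most frequent area in each set is taken as the work and home area respectively, and the resulting (home, work) pairs are counted to give a flow estimate between areas.

```python
from collections import Counter
from datetime import datetime

# Hypothetical thresholds, chosen for illustration only (not from the paper).
WORK_HOURS = range(9, 17)  # weekday hours 09:00-16:59 count as "work time"
MIN_TWEETS = 2             # minimum tweets required in each period


def is_work_time(ts):
    """True if the timestamp falls on Monday-Friday within WORK_HOURS."""
    return ts.weekday() < 5 and ts.hour in WORK_HOURS


def home_and_work(tweets):
    """Guess one user's (home_area, work_area) from (timestamp, area) pairs.

    Returns None if the user has too few tweets in either period.
    """
    home_counts, work_counts = Counter(), Counter()
    for ts, area in tweets:
        if is_work_time(ts):
            work_counts[area] += 1
        else:
            home_counts[area] += 1
    if (sum(home_counts.values()) < MIN_TWEETS
            or sum(work_counts.values()) < MIN_TWEETS):
        return None
    # The most frequently observed area in each period wins.
    return home_counts.most_common(1)[0][0], work_counts.most_common(1)[0][0]


def commuting_flows(tweets_by_user):
    """Count users per (home_area, work_area) pair across all users."""
    flows = Counter()
    for tweets in tweets_by_user.values():
        pair = home_and_work(tweets)
        if pair is not None:
            flows[pair] += 1
    return flows
```

In practice the area labels would be local authority codes obtained by matching each tweet's coordinates to a boundary file, and the resulting counts would then be compared against census commuting figures.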

What’s next? There is plenty more to explore in this research area: looking at whether predictions can be made more granular, or perhaps whether sentiment from social media can be worked in, or whether other platforms can also contribute. We will also start to work on some other data sources, making use of some of the exciting datasets being made available by places like the ADRN and CDRC.

Graham McNeill, Jonathan Bright and Scott A Hale (2017) Estimating local commuting patterns from geolocated Twitter data, EPJ Data Science 6:24. https://doi.org/10.1140/epjds/s13688-017-0120-x

Author

Dr Jonathan Bright

Research Associate

In 2022 Jonathan Bright became the Head of AI for Public Services at the Turing Institute, having previously been a faculty member of the OII. A political scientist, he specialises in computational and ‘big data’ approaches to the social sciences.


Related People

Dr Graham McNeill

Former

Graham's research focuses on network science and visualization. He is particularly interested in geospatial and temporal networks.


Dr Scott A. Hale

Associate Professor, Senior Research Fellow

Dr Scott A. Hale is an Associate Professor, Senior Research Fellow, and Turing Fellow. He develops and applies computer science techniques to the social sciences focusing on increasing equitable access to quality information.


Related Projects

NEXUS: Real Time Data Fusion and Network Analysis for Urban Systems

Mining human mobility and migration patterns from social media and industry data sources as well as visualizing geo-temporal network data interactively with HTML5.


Data Science in Local Government

Data science in local government uses novel techniques to make government more efficient in targeting resources. This project aims to explain the spread of data science methods in the local government context and to understand their impact.

