Tourism needs to become sustainable: How data science can help

Published on
20 Jul 2022
Written by
Felix J. Hoffmann, Fabian Braesemann and Timm Teubner
In their latest blog, data experts Felix J. Hoffmann, Fabian Braesemann and Timm Teubner explain how data science can help tourism become more sustainable.

In short: As pandemic-induced travel restrictions are falling, tourism is recovering faster than expected – and so are related emissions. To meet the sector’s climate targets despite growing demand, policymakers need to design effective interventions. These measures in turn rely on timely and accurate data, some of which could be acquired in unconventional ways. In their latest blog, Felix J. Hoffmann, Fabian Braesemann, and Timm Teubner explain more.

The return of tourism and its environmental consequences

As COVID-19-related travel restrictions are falling, tourism is returning with full force. For the first quarter of 2022, international tourism saw a year-on-year increase in arrivals of almost 200%. The European market is leading this rebound with close to four times the number of international arrivals compared to the previous year [1]. While global tourism is still below its pre-pandemic size, this quick growth has surprised industry experts and increased recovery expectations. As of June 2022, roughly one in two experts anticipates a return to pre-pandemic levels by 2023.

This isn’t just good news for Airbnb landlords and travel bloggers – tourism is also a main driver of economic growth, particularly in less developed economies. The importance of tourism has been recognized in the UN Sustainable Development Goals, where the sector finds direct mention in three of the 17 goals [2].

In many cases, however, this economic progress comes at the cost of serious environmental consequences. Before the pandemic, tourism and related activities accounted for an estimated 8% of global greenhouse gas emissions. And even as the sector itself will be heavily affected by the change in global climate, these emissions are expected to grow in the coming decade [3].

It is this trade-off between economic benefits and environmental costs that well-thought-out policies and guidelines will need to balance. Developing nations in particular should aim to harness the economic potential of tourism while mitigating adverse environmental effects. Such policies require accurate and timely data, both to inform their design and to assess the effects of the implemented measures.

How unconventional data sources can inform development statistics

Procuring accurate and timely data poses a challenge for policymakers across all areas – and tourism is no exception. Take the European Tourism Indicators System (ETIS), a management and monitoring tool created by the European Commission to allow destinations to measure their sustainability performance. Some of the data used in the system is readily available from national statistics offices, while other data is collected through surveys.

The European Commission suggests relying on three-year cycles for the collection of some indicators because of the time and cost intensity of these surveys. The need for cost savings thus decreases the temporal accuracy and availability of the sustainability indicators. And even under these guidelines, researchers have found more than half of the indicator data to be missing after the first implementation of the system in some regions [4].

Tapping into alternative forms of data might help to improve indicator frameworks. In other areas of economic and social development statistics, online platform and mobile network data have been explored as alternative information sources, with promising results. For example, a 2018 study showed that Facebook advertising data can give relatively accurate estimates of female literacy [5]. Similarly, mobile phone records were shown to predict poverty in a study covering both urban and rural regions of Bangladesh [6].

The findings in these studies are especially relevant as the estimated models can be used at high frequency and geographical accuracy while having low total cost. Granted, these unconventional data sources will not replace the gold standard of large-scale, statistician-led surveys for a long time, maybe ever. But they are able to offer additional information, increase coverage and fill gaps with good estimates.

Looking towards platform data for tourism statistics

So how do these ideas apply to the measurement of sustainable tourism?

In our recent study Measuring Sustainable Tourism with Online Platform Data – published in the journal EPJ Data Science – we approach this question and find tourism platform data to be a valuable source of information for understanding the degree of sustainable tourism in different countries. In the study, we focus on tourism in Europe, the world’s largest market.

Using a web-scraped data set of over 65,000 listings from TripAdvisor and applying a range of statistical learning techniques, we find accommodations’ representation on the travel platform to be a good predictor of their sustainability practices.

For example, the data shows that accommodations that were awarded a sustainability badge on the platform show higher rates of user engagement and more quality features (see Figure 1).


Figure 1: Differences between sustainable (GreenLeader: blue) and other accommodations (red) in the online platform data.
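
As a rough illustration of the kind of comparison behind Figure 1, the sketch below contrasts average feature values for badge holders and all other listings. The field names (`green_badge`, `n_reviews`, `rating`) and every value are made up for illustration; they are not taken from the study's data set, and the study itself may have compared the groups differently.

```python
from statistics import mean

# Toy listings. Field names and values are hypothetical, not the study's data.
listings = [
    {"green_badge": True,  "n_reviews": 820, "rating": 4.5},
    {"green_badge": True,  "n_reviews": 640, "rating": 4.0},
    {"green_badge": False, "n_reviews": 310, "rating": 4.0},
    {"green_badge": False, "n_reviews": 150, "rating": 3.5},
    {"green_badge": False, "n_reviews": 440, "rating": 3.0},
]


def group_means(rows, feature):
    """Return (mean for badge holders, mean for all other listings)."""
    badge = [r[feature] for r in rows if r["green_badge"]]
    other = [r[feature] for r in rows if not r["green_badge"]]
    return mean(badge), mean(other)


for feature in ("n_reviews", "rating"):
    badge_mean, other_mean = group_means(listings, feature)
    print(f"{feature}: badge={badge_mean:.2f}, other={other_mean:.2f}, "
          f"difference={badge_mean - other_mean:+.2f}")
```

A positive difference for a feature means badge holders score higher on it in this toy sample. With real data, such gaps would also need a check that they are not driven by a few outliers.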

Based on these features, we trained a range of machine learning models, which can successfully predict the sustainability level of tourist accommodations. Because the data source and the models are readily available, we are then able to predict the proportion of sustainable accommodations for additional countries at no additional cost (see Figure 2). In other words, the online platform data in combination with the machine learning models allows us to estimate the proportion of sustainable tourist accommodations in every country where TripAdvisor is present.

Figure 2: Predicted shares of sustainable tourism accommodations in Europe.
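
The two-step idea described above (train on listings with a known sustainability label, then predict for listings elsewhere and aggregate by country) can be sketched in a few lines. This is a minimal sketch and not the study's pipeline: a hand-written logistic regression stands in for the "range of machine learning models", the two features are assumed to be standardised engagement and quality scores, the data is synthetic, and the country codes "A" and "B" are placeholders.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression weights (bias first) by batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    """Return True if the listing is predicted to be sustainable."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) >= 0.5


random.seed(0)


def make_listing(sustainable):
    """Synthetic listing: [engagement score, quality score] (assumed features)."""
    shift = 1.0 if sustainable else -1.0
    return [random.gauss(shift, 1.0), random.gauss(shift, 1.0)]


# Step 1: train on listings whose sustainability label is known.
train_y = [i % 2 for i in range(200)]
train_X = [make_listing(label) for label in train_y]
weights = train_logistic(train_X, train_y)
train_acc = sum(predict(weights, x) == bool(t)
                for x, t in zip(train_X, train_y)) / len(train_y)

# Step 2: predict for unlabelled listings and aggregate per country.
unlabelled = {
    "A": [make_listing(i < 30) for i in range(40)],  # mostly sustainable-like
    "B": [make_listing(i < 10) for i in range(40)],  # mostly not
}
shares = {country: sum(predict(weights, x) for x in rows) / len(rows)
          for country, rows in unlabelled.items()}

print(f"training accuracy: {train_acc:.2f}")
for country, share in shares.items():
    print(f"country {country}: predicted sustainable share = {share:.2f}")
```

Once a model is trained, the second step needs only platform features for each listing, which is why additional countries add no survey cost. The predicted shares inherit the classifier's errors, so they are estimates rather than counts.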

The results of the study as well as the data and code to reproduce these results are freely available here.

The Take-Away

Whether in 2023 or later, tourism will fully return to pre-pandemic levels, and with it the emissions it causes. Unconventional sources of information, such as data from the online platform TripAdvisor, combined with machine learning approaches can help to fill data gaps and to track and manage sustainable tourism across the globe. Such information can aid the design of better policies to keep tourism within planetary boundaries.

Read the full paper, ‘Measuring sustainable tourism with online platform data’ by Felix J. Hoffmann, Fabian Braesemann and Timm Teubner.


UNWTO – Tourism recovery gains momentum as restrictions ease and confidence returns [accessed 19.06.2022]


United Nations. (2015). Transforming our world: The 2030 Agenda for Sustainable Development.


UNWTO – Transforming tourism for climate action [accessed 19.06.2022]


Modica, P., Capocchi, A., Foroni, I., & Zenga, M. (2018). An assessment of the implementation of the European tourism indicator system for sustainable destinations in Italy. Sustainability, 10(9), 3160.


Fatehkia, M., Kashyap, R., & Weber, I. (2018). Using Facebook ad data to track the global digital gender gap. World Development, 107, 189-209.


Steele, J. E., Sundsøy, P. R., Pezzulo, C., Alegana, V. A., Bird, T. J., Blumenstock, J., Bjelland, J., Engø-Monsen, K., de Montjoye, Y.-A., Iqbal, A. M., Hadiuzzaman, K. N., Lu, X., Wetter, E., Tatem, A. J., & Bengtsson, L. (2017). Mapping poverty using mobile phone and satellite data. Journal of The Royal Society Interface, 14(127), 20160690.


World Tourism Organization (UNWTO) & International Transport Forum (Eds.). (2019). Transport-related CO2 Emissions of the Tourism Sector – Modelling Results. World Tourism Organization (UNWTO).
