Dr Scott A. Hale is a Senior Research Fellow, Co-Director of the Social Data Science MSc Programme, a Fellow of the Alan Turing Institute, and a Senior Member of St Antony’s College. He develops and applies techniques from computer science to research questions in the social sciences. His research investigates the spread of information between speakers of different languages online, the roles of bilingual Internet users, collective action and mobilization, and misinformation.

Scott graduated with degrees in Computer Science, Mathematics, and Spanish from Eckerd College, FL, USA. During his time at Eckerd, he published computer science research in the area of image processing while working on a larger research project, Darwin, to uniquely identify dolphins from digital photographs. After graduating, he worked in Okinawa, Japan, at the Okinawa Prefectural Education Centre with public school teachers to develop English immersion curricula and with IT professionals to deliver continuing education training through the Internet to staff members and students on outlying islands. He came to the OII as a master’s candidate in October 2009 and completed his DPhil (PhD) at the department in 2015. His DPhil research concentrated on how the design of social media platforms affects the amount of information shared across language divides.

Selected Publications

Wang, Z., Hale, S.A., Ifeoluwa Adelani, D., Grabowicz, P., Hartman, T., Flöck, F., and Jurgens, D. (2019) Demographic Inference and Representative Population Estimates from Multilingual Social Media Data. In Proceedings of the Web Conference 2019, WWW 2019, ACM.
Open-access | Blog | Python package | Web demo

Hale, S. and Eleta, I. (2017) Foreign-language Reviews: Help or Hindrance?, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems.

Margetts, H., John, P., Hale, S., and Yasseri, T. (2015) Political Turbulence: How Social Media Shape Collective Action. Princeton University Press.

Hale, S.A. (2014) Global Connectivity and Multilinguals in the Twitter Network. In Proceedings of the 32nd International Conference on Human Factors in Computing Systems, CHI’14, ACM.

Hale, S.A. (2014) Multilinguals and Wikipedia Editing. In Proceedings of the 2014 ACM conference on Web science (WebSci ’14). ACM.

Hale, S. A. (2012) Net Increase? Cross-lingual Linking in the Blogosphere. Journal of Computer Mediated Communication.

Research interests

Human-Computer Interaction (HCI), bilingualism, applications of machine learning, natural language processing (NLP), social network analysis (SNA), experiments, visualization, complex systems, mobilization and collective action, human mobility

Positions held at the OII

  • Senior Research Fellow, September 2018 –
  • Social Data Science MSc Programme Co-Director, January 2019 –
  • Social Data Science Programme Director, February 2017 – December 2018
  • Senior Data Scientist and Research Fellow, June 2016 – August 2018
  • Data Scientist, January 2015 – May 2016
  • DPhil Student, October 2010 – January 2015
  • Research Assistant, May 2010 – December 2014
  • MSc Student, October 2009 – August 2010



Current projects

  • Current Affairs 2.0: Agenda setting in the European Union

    Participants: Dr Scott Hale, Fabian Flöck, Przemyslaw Grabowicz, David Jurgens, Chico Camargo

    This project seeks to measure and explain what societal issues are given the highest priorities by media organizations, policy makers, and the general public in different nations and languages of the European Union.

Past projects

  • Political Volatility

    Participants: Professor Helen Margetts, Dr Scott Hale, Dr Chico Camargo, Dr Myrto Pantazi, Professor Peter John

    This project seeks to quantify trends and changes in the volatility of public opinion before and after widespread use of social media, and to study how social information can drive public opinion.

  • TRANSNET: Forecasting and understanding transport network resilience and anomalies

    Participants: Dr Scott A. Hale, Dr Jonathan Bright, Dr Graham McNeill, Chico Camargo

    This project seeks to utilise newly available data to help urban policy makers improve transport infrastructure to cope with growing and increasingly mobile populations.

  • Wikipedia’s Networks and Geographies: Representation and Power in Peer-Produced Content

    Participants: Dr Han-Teng Liao, Dr Bernie Hogan, Professor Mark Graham, Dr Scott A. Hale, Dr Heather Ford

    This project brings together OII research fellows and doctoral students to shed light on the incorporation of new users and information into the Wikipedia community.

  • NEXUS: Real Time Data Fusion and Network Analysis for Urban Systems

    Participants: Dr Scott A. Hale, Dr Jonathan Bright, Dr Graham McNeill

    Mining human mobility and migration patterns from social media and industry data sources as well as visualizing geo-temporal network data interactively with HTML5.

  • Human Translation of User-Generated Content

    Participants: Dr Scott A. Hale

    Understanding what translation rating/voting systems work best for human, crowd-sourced translation and the optimal display of translated content.

  • Elections and the Internet

    Participants: Professor Helen Margetts, Dr Scott A. Hale, Dr Jonathan Bright

    This site collects elections research at the OII. We are interested in exploring the extent to which data from the social web can be used to predict interesting social and political phenomena, especially elections.

  • The Internet, Political Science and Public Policy: Re-examining Collective Action, Governance and Citizen-Government Interactions in the Digital Era

    Participants: Professor Helen Margetts, Dr Scott A. Hale, Tom Nicholls, Dr Taha Yasseri

    This research programme aims to assess where political science understanding, knowledge and theory should be re-examined and developed in light of widespread use of the Internet, and to develop methodologies to study online behaviour.

  • Big Data: Demonstrating the Value of the UK Web Domain Dataset for Social Science Research

    Participants: Professor Helen Margetts, Professor Eric T. Meyer, Dr Sandra Gonzalez-Bailon, Dr Scott A. Hale, Tom Nicholls, Dr Taha Yasseri, Dr Jonathan Bright

    This project aims to enhance JISC's UK Web Domain archive, a 30 TB archive of the .uk country-code top level domain collected from 1996 to 2010. It will extract link graphs from the data and disseminate social science research using the collection.

  • Government on the Web

    Participants: Professor Helen Margetts, Dr Tobias Escher, Dr Scott A. Hale, Simon Bastow, Professor Patrick Dunleavy, Oliver Pearce, Jane Tinkler

    Research dedicated to improving knowledge and understanding of e-government and the impact of web-based technologies on government.

  • Using Twitter to Map and Measure Online Cultural Diffusion

    Participants: Professor Mark Graham, Dr Scott A. Hale, Devin Gaffney, Dr Ning Wang

    This project is using Twitter data to comprehensively uncover where Internet content is being created; whether the amount of content created in different places is changing over time; and how content moves across time and space in the Social Web.

  • Interactive Visualizations for Teaching, Research, and Dissemination

    Participants: Professor Helen Margetts, Professor Mark Graham, Dr Scott A. Hale, Dr Monica Bulger, Joshua Melville

    "InteractiveVis" aims to support easy creation of interactive visualisations for geospatial and network data by researchers: it will survey existing solutions, build currently missing features, and smooth over incompatibilities between existing libraries.

  • OXLab: Oxford eXperimental Laboratory

    Participants: Professor Helen Margetts, Dr Tobias Escher, Dr Nir Vulkan, Dr Scott A. Hale, Ingrid Boxall, Professor Peter John, Lucy Bartlett

    Oxford eXperimental Laboratory is undertaking laboratory-based experiments (e.g. information-seeking tasks) on networked computers in two disciplines: Economics (interactive decision making) and Political Science (evaluating government information online).



  • Margetts, H., Hale, S. and John, P. (2019) "How Social Media Shapes Political Participation and the Democratic Landscape" In: Society and the Internet: How Networks of Information and Communication are Changing Our Lives (Second Edition) Graham, M. and Dutton, W. (eds.). Oxford University Press.
  • Stephens, M., Tong, L., Hale, S. and Graham, M. (2018) "Misogyny, Twitter, and the rural voter" In: Atlas of the 2016 Elections Watrel, R.H., Weichelt, R., Davidson, F.M., Heppen, J., Fouberg, E.H., Archer, J.C., Morrill, R.L., Shelley, F.M. and Martis, K.C. (eds.). Rowman & Littlefield.
  • Hale, S., Blank, G. and Alexander, V. (2017) "Live versus archive: Comparing a web archive to a population of web pages" In: The Web as History Brügger, N. and Schroeder, R. (eds.). London: UCL Press. 45-61.
  • Meyer, E., Yasseri, T., Hale, S., Cowls, J., Schroeder, R. and Margetts, H. (2017) "Analysing the UK web domain and exploring 15 years of UK universities on the web" In: The Web as History Brügger, N. and Schroeder, R. (eds.). London: UCL Press. 23-44.

Conference papers

  • Camargo, C.Q., Bright, J., McNeill, G., Raman, S. and Hale, S.A. (2020) "Estimating Traffic Disruption Patterns with Volunteered Geographic Information", Scientific Reports. England. Springer Science and Business Media LLC. 10 (1) 252.
  • McNeill, G. and Hale, S. (2019) "Viz-Blocks: Building Visualizations and Documents in the Browser", EuroVis 2019 - Short Papers. EuroVis. Eurographics Association. 97-101.
  • Vidgen, B., Harris, A., Nguyen, D., Tromble, R., Hale, S. and Margetts, H. (2019) "Challenges and frontiers in abusive content detection", Proceedings of the Third Workshop on Abusive Language Online. Association for Computational Linguistics. 80–93.
  • Aragón, P., Sáez-Trumper, D., Redi, M., Hale, S.A., Gómez, V. and Kaltenbrunner, A. (2018) "Online petitioning through data exploration and what we found there: A dataset of petitions from", 12th International AAAI Conference on Web and Social Media, ICWSM 2018. 474-480.
  • Hale, S., McNeill, G. and Bright, J. (2017) "Where’d it go? How geographic and force-directed layouts affect network task performance", EuroVis Workshop on Reproducibility, Verification, and Validation in Visualization (EuroRV3), Barcelona, Spain. Eurographics Association.
  • McNeill, G. and Hale, S. (2017) "Generating Tile Maps", Computer Graphics Forum. Eurographics Conference on Visualization (EuroVis), Barcelona, Spain, 12 – 16 June 2017. Wiley. 36 (3) 435-445.
  • Hale, S. and Eleta, I. (2017) "Foreign-language Reviews: Help or Hindrance?", Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '17: CHI Conference on Human Factors in Computing Systems. ACM. 2017-May 4430-4442.
  • Hale, S.A. (2016) "User Reviews and Language: How Language Influences Ratings", Extended Abstracts on Human Factors in Computing Systems, CHI ’16 EA. the 2016 CHI Conference Extended Abstracts, 7 – 12 May 2016. New York: ACM. 07-12-May-2016 1208-1214.
  • Hale, S.A. (2015) "Cross-language Wikipedia editing of Okinawa, Japan", Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’15. the 33rd Annual ACM Conference, 18 – 23 April 2015. ACM. 183-192.
  • De Sabbata, S., Coltekin, A., Eccles, K., Hale, S. and Straumann, R. (2015) "Collaborative Visualizations for Wikipedia Critique and Activism", Proceedings of ICWSM. AAAI. Association for the Advancement of Artificial Intelligence. WS-15-19 11-16.
  • Liao, H.T., Fu, K.W. and Hale, S.A. (2015) "How much is said in a microblog? A multilingual inquiry based on Weibo and Twitter", Proceedings of the 2015 ACM Web Science Conference, WebSci 2015. the ACM Web Science Conference, 28 June – 1 July 2015. ACM.
  • Hale, S.A. (2014) "Multilinguals and Wikipedia editing", Proceedings of the 2014 ACM Web Science Conference, WebSci 2014. the 2014 ACM conference, 23 – 26 June 2014. ACM Press. 99-108.
  • Hale, S., Yasseri, T., Cowls, J., Meyer, E.T., Schroeder, R. and Margetts, H.Z. (2014) "Mapping the UK Webspace: Fifteen Years of British Universities on the Web", CoRR. the 2014 ACM conference, 23 – 26 June 2014. ACM Press. abs/1405.2856 62-70. (Source info: Proceedings of WebSci, 2014)
  • Hale, S.A. (2014) "Global connectivity and multilinguals in the Twitter network", Proceedings of the 32nd International Conference on Human Factors in Computing Systems, CHI ’14. the 32nd annual ACM conference, 26 April – 1 May 2014. ACM Press. 833-842.
  • Hale, S.A. (2014) "Okinawa in Japanese and English wikipedia", Proceedings of the extended abstracts of the 32nd annual ACM conference on Human factors in computing systems - CHI EA '14. the extended abstracts of the 32nd annual ACM conference, 26 April – 1 May 2014. ACM Press. 927-932.
  • Hale, S.A., Margetts, H. and Yasseri, T. (2013) "Petition growth and success rates on the UK No. 10 Downing Street website", Proceedings of the 5th Annual ACM Web Science Conference, WebSci'13. the 5th Annual ACM Web Science Conference, 2 – 4 May 2013. ACM Press. volume (10) 132-138.
  • Hale, S. (2012) "Impact of platform design on cross-language information exchange", Conference on Human Factors in Computing Systems - Proceedings. the 2012 ACM annual conference extended abstracts, 5 – 10 May 2012. ACM Press. 1363-1368.
  • Stewman, J., Debure, K., Hale, S. and Russell, A. (2006) "Iterative 3-D Pose Correction and Content-Based Image Retrieval for Dorsal Fin Recognition", Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Springer Berlin Heidelberg. 4141 LNCS 648-660.

Journal articles


  • Bright, J., Margetts, H.Z., Wang, N. and Hale, S.A. (2015) Explaining Usage Patterns in Open Government Data: The Case of Data.Gov.UK.
  • Park, S., Kim, S., Hale, S.A., Kim, S., Byun, J. and Oh, A. (2015) Multilingual Wikipedia: Editors of primary language contribute to more complex articles.
  • Bright, J.M., Margetts, H., Hale, S. and Yasseri, T. (2014) "The use of social media for research and analysis: a feasibility study" In: Report of research carried out by the Oxford Internet Institute on behalf of the Department for Work and Pensions. Department for Work and Pensions.
  • Margetts, H.Z., John, P., Reissfelder, S. and Hale, S. (2012) Social influence and collective action: an experiment investigating the effects of visibility and social information moderated by personality.
  • Graham, M., Hale, S. and Stephens, M. (2011) "Geographies of the World's Knowledge" In: Oxford Internet Institute. London: Oxford Internet Institute.
  • Oostveen, A., Meyer, E., Cobo, C., Hjorth, I., Reisdorf, B., Papoutsi, C., Power, L., Abdel-Sattar, N., Hale, S. and Waldburger, M. (2011) First year report on scientific workshop, SESERV Deliverable D1.2, Socio-Economic Services for European Research Projects FP7-2010-ICT-258138-CSA.
  • Hale, S. (2008) A New Approach to Unsupervised Thresholding for Automatic Extraction of Dolphin Dorsal Fin Outlines from Digital Photographs in DARWIN. The Eckerd Scholar 2008. Eckerd College.


  • Accessing Research Data from the Social Web

    This course teaches the essentials of programming in Python, the language of choice in the growing field of computational social science.

  • Data Analytics at Scale

    The course will teach computational complexity and how to profile and increase the computational efficiency of Python code. It will also cover parallel and distributed computing approaches, and discuss data storage and retrieval techniques.

  • Foundations of Visualisation

    Discussion of the two-way interaction between visualisation and the social sciences.


  • Countering the COVID-19 Misinfodemic with Text Similarity and Social Data Science

    Recorded: 24 June 2020

    Duration: 01:02:23

    Dr Scott A. Hale discusses how text similarity algorithms are being used to help fact-checkers locate misinformation, cluster similar misinformation, and identify existing fact-checks on platforms with end-to-end encryption. Moderated by Dr Chico Camargo.

  • Global Fact 7: Fakes, misinformation and fact-checking

    Recorded: 24 June 2020

    Duration: 01:02:28

    In this session, part of the International Fact-Checking Network's Global Fact 7 Virtual, panelists present ongoing research on misinformation and fact-checking from around the world.

  • Monitoring health misinformation: An early-detection methodology for fact-checkers

    Recorded: 17 June 2020

    Duration: 00:09:18

    Monitoring the health conversation online. Detecting notable shifts through word embeddings and semantic change. Presentation by Jenna Sherman, Nat Gyenes, Ashkan Kazemi, and Scott Hale.

  • Transnet: Understanding traffic with open data and visualization

    Recorded: 26 July 2018

    Duration: 00:42:30

    This presentation, hosted by the Alan Turing Institute, focuses on using crowd-sourced data, such as OpenStreetMap and Waze, to improve traffic models and better understand the factors contributing to traffic jams and other traffic issues.

  • Foreign-language Reviews: Help or Hindrance? (CHI2017)

    Recorded: 2 February 2017

    Duration: 00:05:14

    Dr Scott A. Hale tests the impact of foreign-language reviews on the perceived helpfulness of all reviews using an experiment, and finds that the use of translation buttons clearly separates individuals with positive and negative attitudes.

  • The ATI Fellow Short Talks: Dr Scott Hale

    Recorded: 2 December 2016

    Duration: 00:25:21

    In the Alan Turing Institute's Fellow Talks, Dr Scott A. Hale talks about social data science and its application to bilingualism.

  • Okinawa in Japanese and English Wikipedia

    Recorded: 26 April 2016

    Duration: 00:03:17

    This is a video summary of an extended abstract and poster presented at the 2014 ACM Annual Conference on Human Factors in Computing Systems, ACM (Montreal, Canada).

  • How Much Interaction Is There Between Wikipedia’s Language Editions?

    Recorded: 15 January 2016

    Duration: 00:15:01

    Presentation on Scott Hale's Wikipedia research, on the occasion of Wikipedia's 15th Birthday.

  • Modelling the Rise in Internet-based Petitions

    Recorded: 8 November 2013

    Duration: 00:17:39

    The launch of online government petition platforms allows for the passive study of all petitions to government. Scott Hale gives a brief overview of OII findings related to three government petition platforms and the future directions being pursued.

  • Quantitative Methods in Social Media Research: Data Visualization

    Recorded: 26 September 2012

    Duration: 00:10:55

    Scott Hale discusses visualisation during a seminar on quantitative methods in social media research held at the OII on 26 September 2012. How can we visualise data collected by social media? How does visualisation relate to statistical analysis?





Integrity Statement

In the past five years my work has been financially supported by UK Research and Innovation, the Alan Turing Institute, Lloyd’s Register Foundation, and the Volkswagen Foundation. My research has included collaborative grants with Oxfordshire County Council, Google, BT, Meedan, and a youth-support charity. I currently hold an external funded position at Meedan.
I have served in an unpaid advisory capacity to Meedan, the UK Government, and Town Square News in the past five years.