Paul Röttger is a DPhil student in Social Data Science at the Oxford Internet Institute.

Paul’s research focuses on adapting natural language processing methods to linguistic challenges, such as language change and context dependence, that arise in online hate speech detection and other applications. His wider research interests include political expression on the internet and applied machine learning.
Prior to joining the OII in 2019, Paul completed his MPhil in Finance and Economics at the University of Cambridge, where he received a distinction for his thesis using machine learning methods to predict local voting behaviour from online political participation. He also holds a BSc in Economics from Ludwig-Maximilians-University in Munich, Germany.
Positions held at the OII
- DPhil student: October 2019 – present
- Teaching Assistant, Digital Social Research: Statistics Core: October 2019 – December 2019
- Teaching Assistant, Advanced Statistics for Internet Research I and II: January 2020 – March 2020
News & Press

13 August 2021, Sky News
Harmful posts can end up being missed altogether, while acceptable posts are mislabelled as offensive, according to the Oxford Internet Institute.
18 July 2021, BBC News
Online hate has been in the headlines again recently due to an avalanche of racist posts directed at three players who missed penalties in England's defeat to Italy in the Euro 2020 final.
7 June 2021, Wall Street Journal
One of the most comprehensive studies to date on the effectiveness of online hate-speech filters underscores a shortcoming many social-media users know firsthand: They don’t always catch hateful speech.
4 June 2021, MIT Technology Review
But scientists are getting better at measuring where each system fails.
6 January 2021, VentureBeat
Detecting hate speech is a task even state-of-the-art machine learning models struggle with. That’s because harmful speech comes in many different forms, and models must learn to differentiate each one from innocuous turns of phrase.