Experts from the Oxford Internet Institute (OII) are calling for better access to platform data for academic and civil society research, in order to understand the impact of platform algorithms. In their report submitted to the EU Commission last week, they set out practical recommendations to guide the development of a new access scheme for vetted researchers. This scheme, being implemented as part of the new EU Digital Services Act, could offer independent researchers unprecedented insights into the platforms that now mediate access to services, goods, and news, amongst other critical elements of contemporary life.
The report was submitted by Lujain Ibrahim, DPhil student; Dr Luc Rocher, Lecturer and Programme Director of the DPhil in Social Data Science; and Dr Ana Valdivia, Lecturer in AI, Governance & Policy. The report suggests best practices for implementing new data access programmes, and calls for new data to be released in order to study platforms' algorithms and algorithm-user interactions.
Conditions of data access programs
Based on existing research, they call for caution in the deployment of privacy-enhancing technologies for vetted data access. Privacy-enhancing technologies (PETs) are increasingly used by public bodies and companies to make sensitive human data "anonymously" available to researchers. However, researchers have expressed strong concerns that privacy-preserving techniques may compromise the validity of research findings, distorting statistical inferences and increasing disparities in outcomes for racial minorities.
Recommendation: advocate for the use of trusted research environments, which are widely used in biomedical research, by building independent, secure infrastructure where researchers can analyse platform data.
They also address data access issues outside the EU. The vast majority of users of social media platforms are outside Europe, with some of the most represented countries being in the so-called Global South (e.g., India, Indonesia, and Brazil). Additionally, some of the largest platform failures, notably on the content moderation front, have unfolded in the Global South (e.g., facilitating the Myanmar genocide, and inciting Islamophobic violence in India). Yet research on the functions and failures of platforms in those regions remains underrepresented and underexplored.
Recommendation: expand data access in a responsible manner to researchers outside of the EU through secure infrastructure in the EU that will benefit independent researchers and civil society organisations.
Categories of data required to study platform algorithms
The increased reliance on algorithms for content curation and moderation on online platforms has outpaced the availability and accessibility of data needed to study these complex systems. Current data-sharing practices are insufficient for a comprehensive understanding of algorithmic functions, their impact, and the efficacy of user controls. This shortfall hinders the ability to discern the causes and responsibility for unwanted or harmful outcomes.
Recommendation: enhance the breadth and depth of data access for vetted researchers. This should include detailed information about the nature, use, and effectiveness of algorithm-user interactions, particularly for recommender and moderation systems. Necessary data encompasses documentation of user controls, their usage and outcomes, as well as data and metadata for user-flagged content and subsequent automated decisions.
Download the Oxford response in full: Written evidence prepared by Lujain Ibrahim, Dr Luc Rocher, and Dr Ana Valdivia.