This project examines the research potential of online communication to gauge public opinion by reviewing different methods to draw public opinion indicators from online communication, focusing on what the public thinks and how they think about it.

Online communication and digital technologies make it possible to draw real-time indicators of what the public thinks. These indicators offer great research potential because they rely not on a preconception of the issues that are salient in the mind of the public (as surveys usually do) but on what the public considers important enough to talk about voluntarily.

Online communication offers two research advantages over polls and surveys:

  • It is more reactive to current affairs than surveys conducted years apart, which are good at capturing the long-term dynamics of opinion change but not the short-term shifts that often shape policy discussions.
  • It is more informative than opinion barometers like approval ratings, which are measured more frequently than other opinion surveys but reveal little about the reasons behind general support for policies or representatives.

Despite these advantages, the research potential of online communication is still largely underexploited in the social sciences because of many unresolved measurement and validity issues: How representative is online communication of general public opinion? How reliable are online indicators when measured using different sources and methods? How comparable are they to long-standing measures drawn from polls and surveys? And how much power do they have, in the aggregate, to explain offline political behaviour?

This project aims to review and compare different methods to draw public opinion indicators from online communication, with a focus not only on what the public thinks (which gives a good proxy for the most salient issues when forming political opinions) but also on how they think about it (which gives more information on the framing of those issues). The analyses will serve as a proof of concept of the research potential of online communication to gauge public opinion and of its comparability to more standard measures. Ultimately, the project aims to pave the way for the collection of new empirical evidence that can help test and develop theories of agenda-setting and opinion formation in the new communication environment created by online technologies.

Research Objectives and Outcomes

This project has three main objectives. The first objective is to provide a state-of-the-art review of opinion mining algorithms, which have long been used in computational linguistics and machine learning but are virtually absent from public opinion studies, a field historically centred on surveys. This review will identify methods for the classification and extraction of opinions from written communication. These methods involve converting a corpus of text into a vector of terms and subjective evaluations, usually by means of sentiment analysis techniques. Different methods exist, but no side-by-side comparison has been done in the realm of policy-relevant public opinion. This first stage of the project will identify the approaches that will be submitted to such a comparison.
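As a rough illustration of what "converting a corpus of text into a vector of terms and subjective evaluations" can look like, the sketch below counts terms across a handful of messages and scores each message against a small sentiment lexicon. It is a minimal example under stated assumptions, not one of the methods the project will review: the lexicon, the sample messages, and the function names (`tokenize`, `term_vector`, `sentiment_score`) are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (an assumption for illustration only);
# a real study would rely on a validated sentiment lexicon or a trained classifier.
LEXICON = {"good": 1, "great": 1, "support": 1,
           "bad": -1, "unfair": -1, "oppose": -1}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_vector(texts):
    """Count how often each term appears across all texts (bag of words)."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def sentiment_score(text):
    """Sum the lexicon values of the tokens; terms not in the lexicon count as 0."""
    return sum(LEXICON.get(token, 0) for token in tokenize(text))


# Made-up messages, not real data.
messages = [
    "The new tax plan is unfair and bad for families",
    "Great news, I support the new tax plan",
]

vector = term_vector(messages)                     # what the public talks about
scores = [sentiment_score(m) for m in messages]    # how they talk about it
```

Term counts give a crude view of issue salience, while the per-message scores give a crude view of evaluation; the methods to be compared differ mainly in how each of these two steps is carried out.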

The second objective is to assess the performance of alternative opinion-mining methods when applied to different sources of data, for instance the short text messages used in micro-blogging sites like Twitter or the longer discussions held in the comments sections of newspapers. Do 140-character tweets contain, when aggregated, as much information as the comments left on news articles? Are some online sources (e.g. blogs, social networks, news discussion sections) more informative than others when building opinion indicators? Are some methods better suited to extracting opinions from certain online platforms? Most of the validation exercise will take place in this second stage of the project.

The third objective is to assess the explanatory potential of these opinion indicators. The project will analyse correlation patterns with two offline measures of public opinion: (a) issue salience and framing in traditional media (measured using text mining techniques applied to archives of press news) and (b) a range of approval ratings (obtained from available datasets like those managed by the Roper Center for Public Opinion Research). The correlation patterns between issue salience in traditional media and online political communication will make it possible to test claims about agenda-setting dynamics and the effects that priming and framing have on opinion formation.
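The simplest form of the correlation analysis described above is a Pearson correlation between an online indicator and an offline measure observed over the same periods. The sketch below is only an assumption about how such a check might be set up: the two series are made-up numbers, the variable names are hypothetical, and the project does not specify which correlation measure it will use.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up illustrative series, one value per period (not real data).
online_sentiment = [0.10, 0.25, 0.05, -0.10, 0.30]   # hypothetical online indicator
approval_rating = [42.0, 44.0, 43.0, 39.0, 46.0]     # hypothetical approval ratings (%)

r = pearson(online_sentiment, approval_rating)
```

A coefficient close to 1 or -1 would indicate that the two series move together; it would not by itself show which one leads, so testing agenda-setting claims would also require comparing the series at different time lags.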


This project is supported by Oxford University Press’s Fell Fund.

John Fell OUP Research Fund