Online Learning in the Crowd: Examining “Content Overload” in MOOC Forums

Project Contents


MOOCs facilitate learning through large-scale, semi-synchronous exchanges between a global body of lifelong learners. Yet, few researchers have explored how these learners communicate and collaborate – arguably, the most significant differentiator of MOOCs from prior large-scale online learning initiatives (such as MIT OpenCourseWare).

Through a combination of in-depth qualitative study, complex network analysis, and machine learning, our prior research revealed the structural properties of communication in MOOCs, including information diffusion, and identified “communication communities” that can be inferred from discussion forum content. It also raised questions of ethics and inequality in education. A particularly compelling finding from this research was the relatively low proportion of MOOC registrants who post in the discussion forums. A major reason for this lack of participation is a feeling of helplessness: learners see the vast amount of content being generated in the discussion forums and do not know where or how to participate in order to engage with their peers.

A key question, then, is the extent to which social learning can be supported in these large online crowds. This project aims to examine this question by using MOOC data to:

  • Identify or develop, and then apply, topic modelling methods to infer latent topics in discussion forum content. Topic modelling is a well-studied problem across domains, so we aim to build on the existing state of the art and adapt it to an online learning context.
  • Validate the strength and accuracy of these models using standard machine learning benchmarks and techniques, including measuring how well they predict a given learner’s likelihood of viewing or participating in a particular discussion thread, based on that learner’s other behaviours and the behaviours of their peers.
  • Run experiments to explore the extent to which MOOC participants actually take up opportunities to view discussion threads selected on the basis of their inferred interests, as a way of addressing “content overload”.
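
To make the first and third aims more concrete, the sketch below shows one standard way latent topics could be inferred from forum posts and then used to order unread threads for a learner. It is an illustration only: this page does not say which topic model the project uses, so latent Dirichlet allocation (LDA) with collapsed Gibbs sampling is swapped in as a familiar textbook technique. The function names `fit_lda` and `rank_threads`, the hyperparameter values, and the toy posts are all hypothetical.

```python
import math
import random
from collections import Counter


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative, not the project's model).

    docs: list of tokenised posts/threads, each a list of word strings.
    Returns (theta, topic_word): theta[d][k] is the estimated share of
    topic k in document d; topic_word[k] counts words assigned to topic k.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every word occurrence.
    assignments = []
    for d, doc in enumerate(docs):
        zs = [rng.randrange(num_topics) for _ in doc]
        for w, z in zip(doc, zs):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample the topic given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    return theta, topic_word


def rank_threads(theta, viewed):
    """Order unviewed threads by cosine similarity to a learner's profile.

    The profile is the mean topic mixture of the threads the learner has
    already viewed (a hypothetical proxy for "inferred interests").
    `viewed` must contain at least one thread index.
    """
    k = len(theta[0])
    profile = [sum(theta[d][j] for d in viewed) / len(viewed) for j in range(k)]

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

    seen = set(viewed)
    candidates = [d for d in range(len(theta)) if d not in seen]
    return sorted(candidates, key=lambda d: cosine(profile, theta[d]), reverse=True)


if __name__ == "__main__":
    # Toy, made-up posts; real forum text would need proper tokenisation.
    posts = [
        "deadline quiz grade deadline submit".split(),
        "quiz grade submit late deadline".split(),
        "python loop error python function".split(),
        "function error loop debug python".split(),
    ]
    theta, _ = fit_lda(posts, num_topics=2)
    print(rank_threads(theta, viewed=[0]))
```

In an experiment of the kind described in the third aim, a ranking like the one returned by `rank_threads` could be what is shown to learners, with uptake measured against a comparison condition. How threads would actually be selected and presented is not specified here.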

Alongside these computational social science techniques, we aim to explore the wider social, educational and ethical implications of this kind of approach to researching and facilitating learning at scale.


This research is supported by Google.

Key Information

  • Funder: Google
  • Project dates: January 2015 - December 2016