DTC Workshop: Capturing Online Data

With Dr Bernie Hogan and Professor Vili Lehdonvirta
Date & Time:
09:45 - 16:00, Monday 24 November 2014


In this intensive one-day course, we will use the IPython web interface to collect data from Reddit and Twitter. We will discuss data formats, APIs, authentication, and strategies for the non-coder.

We recommend a basic familiarity with Python for this course. That said, a working familiarity is not necessary, since all of the code required to capture Reddit posts and tweets will be presented in class using simple step-by-step instructions.

This course will be especially helpful to students with limited familiarity with programming who believe that accessing data from the web will be important to their dissertation or postdoctoral research. While we may not be able to meet each student’s specific data needs, exposure to this sort of workflow will help students plan and execute their Internet-based research designs.

Students taking this one-day course may also wish to consider taking our second one-day course, “Analyzing data with crowds”, in which they will learn how to code the data they have acquired in a scalable manner using crowdsourcing.

Part 1 Digital Social Research as a concept. (1 hour)
Part 2 IPython as a working environment. (1 hour)
[Break for lunch]  
Part 3 APIs and data access. (1 hour)
Part 4 Downloading and viewing data. (2 hours)

Attendees should note that they will need to bring their own laptop to the session. If you aren’t able to supply your own laptop, please email us, as a limited number are available to borrow within the department.