OII | Human Translation of User-Generated Content

Overview

Major user-generated content platforms such as Facebook, Twitter, and Wikipedia are used by people across the world who contribute content in many different languages. One of the largest barriers to sharing content among all users of these platforms is language. In general, users of each language contribute unique content that is not shared beyond that language. For example, about half of the articles in the German edition of Wikipedia, the second-largest edition, have no equivalent in the English edition, the largest.

Many platforms have added the ability to view machine translations of other-language content so that content can spread more widely across languages. However, the short and informal nature of much user-generated content results in poor-quality translations. Furthermore, machine translation is not available for speakers of many smaller languages who are coming online for the first time and who arguably need translation most, since there is generally little content available online in their languages.

This project investigates the mechanics of successful human, crowd-sourced translation of user-generated content. The project conducts online experiments to understand which translation rating and voting systems work best and how translated content is best displayed. The project is being carried out in cooperation with industry partners including Meedan, who are building tools to help facilitate the translation of social media content.

Key Information

Funder:
John Fell OUP Research Fund

Project dates:
July 2015 - August 2017

Contact:
Scott Hale

All Publications

Articles

Hale, S. and Eleta, I. (2017) “Foreign-language Reviews: Help or Hindrance?”, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2017.
Kim, S., Park, S., Hale, S.A., Kim, S., Byun, J. and Oh, A.H. (2016) “Understanding Editing Behaviors in Multilingual Wikipedia”, PLoS ONE. 11 (5) e0155305.
Hale, S. (2016) “User Reviews and Language: How Language Influences Ratings”, Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems. New York: ACM.
Hale, S.A. (2014) Global Connectivity and Multilinguals in the Twitter Network. In Proceedings of the 32nd International Conference on Human Factors in Computing Systems, CHI’14, ACM.
Hale, S.A. (2014) Multilinguals and Wikipedia Editing. In Proceedings of the 2014 ACM conference on Web science (WebSci ’14). ACM.
Hale, S.A. (2015) Cross-language Wikipedia Editing of Okinawa, Japan. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’15, ACM.
Liao, H.T., Fu, K.W., and Hale, S.A. (2015) How much is said in a microblog? A multilingual inquiry based on Weibo and Twitter. In Proceedings of the 2015 ACM Conference on Web Science (WebSci ’15). ACM.

Participant

Dr Scott A. Hale

Associate Professor, Senior Research Fellow

Dr Scott A. Hale is an Associate Professor, Senior Research Fellow, and Turing Fellow. He develops and applies computer science techniques to the social sciences, focusing on increasing equitable access to quality information.

Project News

Twitter trials 280 characters, but its success in Japan is more than a character difference

1 October 2017

Project Press Coverage

A Third Rely on Translation to Make E-commerce Decisions

Slator, 3 February 2017

Dr Scott A. Hale's research into how online consumers react to foreign-language reviews reveals nearly one-third rely on machine translation to make their choice.
