
Translating Twitter

Published on
29 Mar 2011
Written by
Scott A. Hale

I had the great opportunity to meet George Weyman, a project director at meedan, yesterday at an OII event. meedan has been doing great work in connecting English and Arabic speakers online through translation of news for many years.

My research has not included Arabic, unfortunately, but has found consistently that the English-language web is very insular. Other languages translate information from English and link to English sources, but English pages are significantly less likely to link to other languages. With the recent revolutions in the Arabic-speaking world, some English speakers have realized this insularity.

meedan has been using crowdsourced translation and machine translation to help bridge this gap. In addition to news coverage, meedan has also helped organize volunteers to translate tweets on Twitter. This presents a challenge, as Twitter has no easy way to link translations to the original source content. (Indeed, this is true of many social networking sites.) If tweets are not linked together, a conversation fractures with every translation, work may be duplicated as multiple people translate the same tweet, and users may not easily find or even know about translations.

As a great temporary fix, meedan is using Curated.by to organize the translations. This service was initially designed to organize and comment on tweets, but it can to some extent be adapted to curating tweets and their translations (each translation is posted as a comment on the original tweet).

While this is a great fix for now, it points to a longer-term need to think about the design of platforms with users in multiple languages. One option is to add a lot of structure in advance, creating separate bins for each language with links between them, as Wikipedia has done with its multiple language versions. However, I think Twitter is useful in many contexts because of its free-form, commons approach. All content, regardless of language, is posted into one commons. This could allow for wider conversations to develop. A few simple additions, such as allowing two tweets to be linked and marked as translations of one another (with one identified as the original), could add great connective power to the platform. Linking hashtags in different languages together could equally allow for wider conversations and better organization of tweets. Machine transliteration (converting non-Latin scripts to Latin characters) and machine translation might also be investigated.
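To make the linking idea concrete, here is a minimal sketch in Python of how a platform might record that one tweet is a translation of another while keeping all tweets in a single commons. Everything in it is hypothetical: the `Tweet` and `TranslationIndex` names, the fields, and the placeholder IDs are illustrative assumptions, not part of Twitter, Curated.by, or meedan's tools.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    """Hypothetical minimal tweet record (not Twitter's data model)."""
    tweet_id: str
    lang: str
    text: str


class TranslationIndex:
    """Hypothetical store linking translations back to one original tweet."""

    def __init__(self):
        self._tweets = {}                       # tweet_id -> Tweet
        self._original_of = {}                  # translation id -> original id
        self._translations = defaultdict(list)  # original id -> translation ids

    def add(self, tweet):
        self._tweets[tweet.tweet_id] = tweet

    def link(self, original_id, translation_id):
        """Record that translation_id is a translation of original_id."""
        if original_id not in self._tweets or translation_id not in self._tweets:
            raise KeyError("both tweets must be added before linking")
        # A translation of a translation is attached to the same original,
        # so the conversation keeps a single root.
        root = self._original_of.get(original_id, original_id)
        self._original_of[translation_id] = root
        self._translations[root].append(translation_id)

    def original(self, tweet_id):
        """Return the original tweet for any tweet in a linked group."""
        return self._tweets[self._original_of.get(tweet_id, tweet_id)]

    def translations(self, tweet_id, lang=None):
        """Return known translations, optionally filtered by language."""
        root = self._original_of.get(tweet_id, tweet_id)
        found = [self._tweets[t] for t in self._translations.get(root, [])]
        return [t for t in found if lang is None or t.lang == lang]


index = TranslationIndex()
index.add(Tweet("t1", "ar", "مرحبا"))
index.add(Tweet("t2", "en", "Hello"))
index.link("t1", "t2")
```

With a structure like this, a volunteer could check `index.translations("t1", lang="en")` before starting work, which addresses the duplicated-effort problem, and a reader who sees only the English tweet could follow `index.original("t2")` back to the source. The same kind of mapping could in principle be kept for hashtags in different languages.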

I think the connective power of language tools other than machine translation is too often forgotten. In particular, for language pairs where machine translation performs poorly, tools that allow crowd translation, linking of content across languages, and transliteration can greatly help connect users.

Of course, the greatest connecting power is the will of the users who want to be connected. Recent events have drawn many Arabic- and English-speaking users together on Twitter, as the network diagram of Twitter following relationships below shows. I hope that these connections persist and that English speakers will reach out beyond their language for information.


A larger version of the image and an explanation are available at Kovas Boguta’s website. Twitter users who post only in English are in blue, and users who post only in Arabic are in red. Connections are following relationships.
