Crowdsourcing translation during crisis situations: are ‘real voices’ being excluded from the decisions and policies it supports?
International NGOs and government actors have embraced crowdsourcing to manage the flood of information produced during crises. However, when crowdsourced material crosses the language barrier into English, it often becomes inaccessible to the original contributors. Gwyneth Sutherlin is a doctoral student at the University of Bradford, where she writes about the intersection of foreign policy, language and technology. Her paper, “A Voice in the Crowd: Broader Implications for Crowdsourcing Translation during Crisis,” which looks at the policy implications of recent crisis mapping efforts, is published in the Journal of Information Science.
As revolution spread across North Africa and the Middle East in 2011, participants in and observers of the events were keen to engage via social media. However, saturation by Arabic-language content demanded a new translation strategy, both for those outside the region to follow the information flows and for those inside to reach beyond their domestic audience. Crowdsourcing was seen as the most efficient strategy in terms of cost and time to meet the demand, and translation applications that harnessed volunteers across the internet were integrated with nearly every type of ICT project. For example, as Steve Stottlemyre has already mentioned on this blog, translation played a part in tools like the Libya Crisis Map, and was essential for harnessing tweets from the region’s ‘voices on the ground.’
If you have ever worried about media bias then you should really worry about the impact of translation. Before the revolutions, translation software for Egyptian Arabic was almost non-existent. Few translation projects could handle the different Arabic dialects or muster the coding labor and capital to build something that could contend with internet blackouts. Google’s Speak to Tweet became the dominant application used in the Egyptian uprisings, delivering one homogenized source of information that fed the other sources. In 2011, this collaboration between Google and Twitter helped circumvent the problem of internet connectivity in Egypt by allowing cellphone users to phone in their tweets to a voicemail service, where they were transcribed and translated. A crowd of volunteers working for Twitter enhanced the translation of Egyptian Arabic after the tweets were first transcribed by a Mechanical Turk application trained on an initial 10 hours of speech.
The unintended consequence of these crowdsourcing applications was that when the material crossed the language barrier into English, it often became inaccessible to the original contributors. Individuals on the ground essentially ceded authorship to crowds of untrained volunteer translators who stripped the information of context, and then plotted it in categories and on maps without feedback from original sources. Controlling the application meant controlling the information flow, the lens through which the revolutions were conveyed to the outside world.
This flawed system prevented the original sources (e.g. in Libya) from interacting with the information that directly related to their own life-threatening situation, while the information became an unsound basis for decision-making by international actors. As Stottlemyre describes, ceding authorship was sometimes an intentional strategy, but also one imposed by the nature of the language/power imbalance and the failure of the translation applications and the associated projects to incorporate feedback loops or more two-way communication.
The after-action report for the Libya Crisis Map project, commissioned by UN OCHA, offers some insight into how sources were disenfranchised from the decision-making process once they had provided information for the end product: the crisis map. In the final ‘best practices’ section reviewing the outcomes, the Standby Task Force, which created the map, described decision-makers and sources, but did not consider or mention the sources’ access to decision-making, the map, or a mechanism by which they could feed back to the decision-making chain. In essence, Libyans were not seen as part of the user group of the product they helped create.
How exactly do translation and crowdsourcing shape our understanding of complex developing crises, or influence subsequent policy decisions? The SMS polling initiative launched by Al Jazeera English in collaboration with Ushahidi, a prominent crowdsourcing platform, illustrates the most common process of visualizing crisis information: translation, categorization, and mapping. In December 2011, Al Jazeera launched Somalia Speaks, with the aim of giving a voice to the people of Somalia and sharing a picture of how violence was impacting everyday lives. The two have since repeated this project in Mali, to share opinions about the military intervention in the north. While Al Jazeera is a news organization, not a research institute or a government actor, it plays an important role in informing electorates who can put political pressure on governments involved in the conflict. Furthermore, this same type of technology is being used on the ground to gather information in crisis situations at the governmental and UN levels.
A call for translators in the diaspora, particularly Somali student groups, was issued online, and phones were distributed on the ground throughout Somalia so multiple users could participate. The volunteers translated the SMSs and categorized the content as either political, social, or economic. The results were color-coded and aggregated on a map.
The stated goal of the project was to give a voice to the Somali people, but the Somalis who participated had no say in how their voices were categorized or depicted on the map. The SMS poll asked an open question:
How has the Somalia conflict affected your life?
In one response example:
The Bosaso Market fire has affected me. It happened on Saturday.
The response was categorized as ‘social.’ But why didn’t the fact that violence happened in a market, an economic centre, denote ‘economic’ categorization? There was no guidance for maintaining consistency among the translators, nor any indication of how the information would be used later. It was these categories chosen by the translators, represented as bright colorful circles on the map, which were speaking to the world, not the Somalis — whose voices had been lost through a crowdsourcing application that was designed with a language barrier. The primary sources could not suggest another category that better suited the intentions of their responses, nor did they understand the role categories would play in representing and visualizing their responses to the English language audience.
An 8 December 2011 comment on the Ushahidi blog described in compelling terms how language and control over information flow impact the power balance during a conflict:
A—-, My friend received the message from you on his phone. The question says “tell us how is conflict affecting your life” and “include your name of location”. You did not tell him that his name will be told to the world. People in Somalia understand that sms is between just two people. Many people do not even understand the internet. The warlords have money and many contacts. They understand the internet. They will look at this and they will look at who is complaining. Can you protect them? I think this project is not for the people of Somalia. It is for the media like Al Jazeera and Ushahidi. You are not from here. You are not helping. It is better that you stay out.
Ushahidi director Patrick Meier responded to the comment:
Patrick: Dear A—-, I completely share your concern and already mentioned this exact issue to Al Jazeera a few hours ago. I’m sure they’ll fix the issue as soon as they get my message. Note that the question that was sent out does *not* request people to share their names, only the name of their general location. Al Jazeera is careful to map the general location and *not* the exact location. Finally, Al Jazeera has full editorial control over this project, not Ushahidi.
As of 14 January 2012, there were still names featured on the Al Jazeera English website.
The danger is that these categories — economic, political, social — become the framework for aid donations and policy endeavors; the application frames the discussion rather than the words of the Somalis. The simplistic categories become the entry point for policy-makers and citizens alike to understand and become involved with translated material. But decisions and policies developed from the translated information are less connected to ‘real voices’ than we would like to believe.
The goal should be to develop technologies so that Somalis or Libyans (or any group sharing information via translation) are themselves directing the information flow about the future of their country, rather than being perpetually simplified into the client/victim who is waiting to be given a voice.
Note: This post was originally published on the Policy & Internet blog. It might have been updated since then in its original location. The post gives the views of the author(s), and not necessarily the position of the Oxford Internet Institute.