11 Aug 2014
I gave this talk at Wikimania in London yesterday.
In the first years of Wikipedia’s existence, many of us said that, as an example of citizen journalism and journalism by the people, Wikipedia would be able to avoid the gatekeeping problems faced by traditional media. The theory was that because we didn’t have the burden of shareholders and the practices that favoured elite viewpoints, we could produce a media that was about ‘all of us’ and not just ‘some of us’.
Dan Gillmor (2004) wrote that Wikipedia was an example of a wave of citizen journalism projects initiated at the turn of the century in which ‘news was being produced by regular people who had something to say and show, and not solely by the “official” news organizations that had traditionally decided how the first draft of history would look’ (Gillmor, 2004: x).
Yochai Benkler (2006) wrote that projects like Wikipedia enable ‘many more individuals to communicate their observations and their viewpoints to many others, and to do so in a way that cannot be controlled by media owners and is not as easily corruptible by money as were the mass media.’ (Benkler, 2006: 11)
I think that at that time we were all really buoyed by the idea that Wikipedia and peer production could produce information products that were much more representative of “everyone’s” experience. But the idea that Wikipedia could avoid bias completely, I now believe, is fundamentally wrong. Wikipedia presents a particular view of the world while rejecting others. Its bias arises both from its dependence on sources which are themselves biased and from its own policies and practices, which favour particular viewpoints. Although Wikipedia is as close to a truly global media product as we have probably ever come*, like every media product it is a representation of the world and is the result of a series of editorial, technical and social decisions made to prioritise certain narratives over others.
Mark Graham (2009) has shown how Wikipedia’s representation of place is skewed towards the developed North; researchers such as Brendan Luyt (2011) have shown that Wikipedia’s coverage of history suffers from an over-reliance on foreign government sources, and others like Tony Lam, Anuradha Uduwage, Zhenhua Dong, Shilad Sen, Dave Musicant, Loren Terveen and John Riedl (Lam et al., 2011) have shown how there are significant gender-associated imbalances in its topic coverage.
But there is, as yet, little research on how such imbalances might manifest themselves in articles about breaking news topics. At a stage when there is often no single conclusive narrative about what happened and why, we see the effects of these choices most starkly – both in decisions about whether a particular idea, event, object or person is important enough to warrant a standalone article, and in decisions about which statements (aka ‘facts’) to include, what those statements will be, and what shape the narrative arc will take. Wikipedia, then, presents a particular view of the world in the face of a variety of alternatives.
It is not necessarily problematic that we choose to present one article about an event rather than 20 different takes on it. But it becomes problematic when Wikipedia is presented as a neutral source; a source that represents the views of “everyone”. It’s problematic because it means that users don’t often recognise that Wikipedia is constructed (and mirrors in many ways the biases of the sources it uses to support it), but it is also problematic because it means that Wikipedians don’t always recognise that we need to change the way that we work in order to be more inclusive of perspectives. Such perspectives will remain unreflected if we continue to adhere to policies developed to favour a particular perspective of the world.
What is Wikipedia news?
The 6th biggest website in the world, Wikipedia enjoys 18 billion page views and nearly 500 million unique visitors a month, according to comScore. Since 2003, the top 25 Wikipedia articles with the most contributors every month consist nearly exclusively of articles related to current events (Keegan, 2013). Last week, for example, the article entitled ‘Ebola virus disease’ was the most viewed article on English Wikipedia at about 1.8 million views, and the Israeli-Palestinian conflict and related articles made up 5 of the top 25 most popular articles and accounted for about 1.5 million views (see the Top 25 Report on English Wikipedia).
Wikipedia didn’t always ‘do news’. Brian Keegan (2013) writes that the policy around breaking news emerged after the September 11 attacks in 2001 when there was an attempt to write articles about every person who died in the Twin Towers. There was a subsequent decision to separate these out in what is called the ‘Memorial Wiki’. It was also around this time that editors defined what would constitute a notable event and signaled the start of an ‘in the news’ section on the website that would list the most important news of the day linking to good quality Wikipedia articles about those topics. Editors can now propose topics for the ‘in the news’ section on the home page and discuss whether the articles are good enough quality to be featured and whether the news is appropriate for that day.
Although Wikipedia and traditional media both produce news, probably the most fundamental difference between Wikipedia and journalism practice is in our handling of sources. Journalists pride themselves on their ability to do original research by finding the right people to answer the right kinds of questions and by distilling the important elements from those conversations into an article. For journalists, the people they interview and interrogate are their sources, and the process is a dialogic one: through conversation, questions, answers, follow-up questions and clarifications, they produce their article.
Wikipedians, on the other hand, are forbidden from doing ‘original research’ and must write what we can about the world on the basis of what we find in ‘reliable secondary sources’. For Wikipedians, sources are the ‘documents’ that we find to back up what we write. This is both a limiting and empowering feature of Wikipedia – limiting because we rely heavily on what documents say (and documents can be contradictory and false without an opportunity to follow up with their authors), but empowering (at least in theory) because it enables readers to follow up on the sources that have been provided to back up different arguments and check or verify whether they are accurate. This is called the ‘verifiability’ principle and is one of Wikipedia’s core policies.
Wikipedia’s ‘no original research’ policy is summarised as follows:
Wikipedia does not publish original thought: all material in Wikipedia must be attributable to a reliable, published source. Articles may not contain any new analysis or synthesis of published material that serves to reach or imply a conclusion not clearly stated by the sources themselves.
The problem is that when Wikipedia says it doesn’t allow ‘original research’, this doesn’t mean that Wikipedians aren’t constantly making decisions about what to write and what content to include that are to a lesser or greater extent subjective decisions. This is true for the construction of articles which require Wikipedians to construct a narrative from a host of distinct reliable and unreliable sources, but it is especially true when Wikipedians must decide whether something that happened is important enough to warrant a standalone article.
Notability, according to Wikipedia, is defined as follows:
The topic of an article should be notable, or “worthy of notice”; that is, “significant, interesting, or unusual enough to deserve attention or to be recorded”. Notable in the sense of being “famous”, or “popular”—although not irrelevant—is secondary.
Decisions about notability, then, can only be original research: the conclusion that something is important enough (according to Wikipedia’s criteria) to warrant an article must be made according to editors’ subjective viewpoints. The way in which Wikipedians summarise issues and pay attention to particular points about an issue is also subjective: there is no single reference that is an exact replica of what is represented in an article, and decisions about what to include and what to leave out are happening all the time.
Such decisions are shaped not just by information reflected in reliable sources but by a host of other informational sources, by the experiences of the individual editor, and by the social interactions that develop as an article progresses. We don’t only learn about the world through ‘reliable sources’; we learn about the world through a host of informational cues – through street corner conversations, through gossip, through signage and posters and abandoned newspapers in restaurants, in train carriages and through social media and email and text messages and a whole host of what would be regarded, according to Wikipedia’s definition, as totally ‘unreliable sources’.
Let’s take the example of the first version of the 2011 Egyptian Revolution article (then called ‘protests’ rather than ‘revolution’). The article was started at 4:26pm local time on the first day of the Egyptian protests that led to the unseating of then-President Hosni Mubarak. (The first protests began around 2pm on that day). Let’s first look at the AFP article used as a citation in the article:
Egypt braces for nationwide protests
By Jailan Zayan (AFP) – Jan 25, 2011
CAIRO — Egypt braced for a day of nationwide anti-government protests on Tuesday, with organisers counting on the Tunisian uprising to inspire crowds to mobilise for political and economic reforms.
And then the first version of the article:
The 2011 Egyptian protests are a continuing series of street demonstrations taking place throughout Egypt from January 2010 onwards with organisers counting on the Tunisian uprising to inspire the crowds to mobilize.
I interviewed some of the (frankly amazing) Wikipedians who worked on what became the 2011 Egyptian Revolution article. I knew that there had been a series of protests in Egypt in the run-up to the January 25 protest, but none of these had articles on Wikipedia, so I wondered why this article was started (backed up by such weak evidence at the time) and how it was able to survive.
In a Skype interview, the editor who started the article and oversaw much of its development, TheEgyptianLiberal, said that he knew
‘the thing was going to be big… before the revolution became a revolution’
(He had actually prepared the article the day before the protests even began.) When I asked him how he knew that it was going to be significant, he replied,
‘The frustration in the street. And what happened in Tunisia.’
The Egyptian Liberal had access to a wealth of information on the streets of Cairo that showed him what was really happening, and what was happening, he (rightly) believed, was definitely “a thing” – “a thing” worth taking notice of. In Wikipedia’s terms, this was something “significant, interesting, or unusual enough to deserve attention or to be recorded”, despite the fact that it was impossible to tell at this early stage.
Another article wasn’t as successful in its early stages. When working for Ushahidi in 2011/12, I took a trip to Kenya to visit the HQ. Ushahidi let me do a side project, which for me was to try to understand the development of Swahili Wikipedia. When I arrived in Nairobi, the first thing I did was to buy every newspaper available from the local supermarket. I wanted to immerse myself in the media environment that Kenyans were being enveloped by. I also sat in the B&B and watched a lot of local television. Most of the headlines were about the looming war in southern Somalia, as the Kenyan army moved in to try to root out the militant group Al Shabaab, which was alleged to have kidnapped several foreign tourists and aid workers inside Kenya.
This was the first time that the Kenyan army had been engaged in a military campaign since independence, and so it was a big deal because the government wanted to be seen to be acting to root out the elements that were believed to be behind a series of kidnappings and murders of both locals and foreigners near the border. After two bombings in central Nairobi while I was visiting, people were trying to stay at home and avoid crowded areas.
During this time, one of the Wikipedians whom I interviewed, AbbasJnr, pointed me to a deletion discussion under way on the English Wikipedia about whether the Kenyan military intervention warranted its own article. The article had been nominated for deletion on the grounds that it did not meet Wikipedia’s notability standards. The nominator wrote that the event was not being reported in reliable news outlets as an ‘invasion’ but rather an ‘incursion’ and since it was ‘routine’ for troops from neighboring countries to cross the border for military operations, this event did not warrant its own article.
The Wikipedians whom I spoke to were very sure that what was happening in their country should be considered notable. I was sure too, having spent at that stage only 24 hours in the country. But the media in the West weren’t presenting this story as the important, notable event that the people living in Kenya understood it to be. Wikipedians (the majority of whom are based in the West) were making the decision to put the article up for deletion because they didn’t have much to go by – they only had the ‘reliable sources’, a few international publications, very few of the Kenyan media publications (since few are online and updated regularly) and fewer still of the informal and unreliable communications that filled the air in Kenya at that time.
Both of these examples show that there is important local contextual information required to make decisions about whether something is a notable event worth documenting on Wikipedia, and that usually we don’t notice this because the majority of editors of English Wikipedia share a similar media sphere and world view. They occupy similar informal and formal media spheres. When there is disagreement, the disagreement is usually about how to cover an issue rather than whether the issue is actually important.
There are plenty of disagreements that result from these isolated media spheres. We see these disagreements when the very different and highly isolated media spheres operating inside Israel and in Gaza are exposed; we see them when we find Russian government employees as well as ordinary citizens attempting to edit articles about the Crimean crisis, or to change how high a Ukrainian military jet can fly in a Russian Wikipedia article, in support of alternative narratives being promulgated inside Russia about what happened to MH17. We see glimpses of what Eli Pariser calls ‘the filter bubble’ when Gilad Lotan invites us to do a Facebook search and see what our friends are saying about a particular event, or when he shows us how there are distinct, isolated Twitter groupings in accounts following news about one of the recent UNRWA school bombings in Gaza.
When the issue or the event happens outside both the formal and informal media spheres that the majority of Wikipedians inhabit, where the BBC or NYTimes is not covering the issue and when there aren’t many Wikipedians in place to account for its importance, Wikipedia has nothing to go on. Our ‘no original research’ and notability policies do not help us.
Not only does Wikipedia inherit biases from the traditional media but we also have our own biases brought about by the local media spheres that we inhabit. What makes our bias more problematic is that Wikipedia is taken as a neutral source on the world. A large degree of Wikipedia’s authoritativeness comes from the authority implied by the encyclopaedic form.
Like a dictionary, an encyclopedia gives a very good impression of being comprehensive because comprehensiveness is its goal, its history and narrative. Having a Wikipedia article brings with it an air of importance. Not every event has a Wikipedia article so there is an assumption that a group of people have reached consensus on the importance of the topic, but there is also an implicit authority from the format of an encyclopedic entry. The journalistic account is written as a single-author account of an event, with the author gathering evidence from their reliable sources and/or from being there. The encyclopedic account, on the other hand, is written without explicit author credits (which adds to the authority) and with evidence of alternative points of view, which gives an added appearance of neutrality. When we as readers come to a Wikipedia or newspaper article, we come to it with a vast background of understanding and assumptions about what encyclopedias or newspapers are, and that influences our understanding.
The context of use also points to Wikipedia’s assumed authority. Whereas people generally go to newspapers to find out what is happening in the world, people go to Wikipedia to get the authoritative take. Wikipedia is used to settle arguments in bars about how many people there are in Britain or when the London Underground was built. People go to Wikipedia to find ‘facts’ about the world. The encyclopedia gains its authority because it is an encyclopedia, and this is a very different authority from the one the New York Times holds for its readers, for example. (You won’t find someone saying that they’re going to go to the New York Times to find the authoritative answer to how many people were killed in World War 1.)
Wikipedia, then, is not just powerful because it is widely consulted; it is powerful because it is seen as unbiased, as neutral and as a reflection of all of us, instead of the ‘some of us’ that it actually represents.
Wikipedians need to recognise our power as newsmakers: newsmakers who are making decisions to prioritise one narrative at the expense of others. We need to make sure we wield that power with care. We need to take a much closer look at our policies and practices and make sure that we are building a conducive environment for future articles about a big part of the world that is as yet unrepresented. It means re-looking at efforts to expand our definitions of ‘original research’ such as those currently being discussed by Peter Gallert and others to accommodate oral citations in the developing world. It means recognising that, although we are doing a great job in broadening the scope of the events we cover, we have not begun to represent them in any way that can be considered truly global. This requires a whole lot more work in bringing editors from other countries into the Wikipedia fold, and in being more flexible about how we define what constitutes reliable knowledge.
Finally, I think that understanding Wikipedia bias enables us to develop an understanding of how media bias itself is changing – because it’s not so much about how the media industry is failing to give us an accurate, balanced picture of the world as it is about us getting out of our filter bubbles and recognising the role of “unreliable sources” in our understandings about the world.
* But there are others, like Global Voices, that are, in many ways, more globally representative than Wikipedia.