There are obvious gaps in access to the Internet, particularly the participation gap between those who have their say and those whose voices are pushed to the sidelines. Despite the rapid increase in Internet access, there are indications that people in the Middle East and North Africa (MENA) region remain largely absent from websites and services that represent the region to the larger world.
We explore this phenomenon through one of the MENA region's most visible and most accessed sources of content: Wikipedia. It currently contains over 9 million articles in 272 languages, far surpassing any other publicly available information repository. It is widely considered the first point of contact for most general topics, which makes it an effective site for framing any subsequent representations. Content from Wikipedia has also begun to form a central part of services offered elsewhere on the Internet.
Wikipedia is therefore an important platform from which we can learn whether the Internet facilitates increased open participation across cultures, or reinforces existing global hierarchies and entrenched power dynamics. Because the underlying political, geographic and social structures of Wikipedia are hidden from users, and because there have not been any large-scale studies of the geography of these structures and their relationship to online participation, groups of people may be marginalized without their knowledge.
This relative lack of MENA voice and representation means that the tone and content of this globally useful resource, where it represents MENA, are in many cases being determined by outsiders who may misunderstand the significance of local events, sites of interest and historical figures. Furthermore, in an area that has seen substantial social conflict, participation from local actors helps to ensure balance in content about contentious issues. Unfortunately, most research on MENA's Internet presence has been drawn from anecdotal evidence, and no comprehensive studies currently exist.
This project will therefore employ a range of (primarily quantitative) methods to assess the connection between access and representation, using MENA as the first step in an assessment of the inequalities in the global system.
Our key academic objective is to discern the visibility of the MENA region, and residents of the MENA region, in the production of online knowledge. To do this, we outline a number of more specific objectives:
To examine whether there are disproportionately fewer articles on the MENA region than on the rest of the world and, among those articles, whether authors from MENA make up a disproportionately small share of the contributors.
To determine if the centralized political structure of Wikipedia undervalues new contributors from MENA. In particular, we explore whether authors from MENA have their contributions undermined because of: competitive practices such as content deletion; indifference to content created by authors from MENA; and marginalization through bullying or dismissal.
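The first objective above can be made concrete with two simple measures. The following is a minimal sketch, not the project's actual method: the names `RegionCounts`, `representation_ratio` and `local_author_share` are hypothetical, and the choice of population share as the baseline for "proportionate" coverage is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RegionCounts:
    """Hypothetical container for the counts needed per region."""
    name: str
    articles: int    # articles about places or topics in the region
    population: int  # residents of the region


def representation_ratio(region: RegionCounts, world: RegionCounts) -> float:
    """Share of the world's articles divided by share of the world's population.

    A value of 1.0 means coverage is proportional to population;
    values below 1.0 indicate under-representation.
    """
    if world.articles <= 0 or world.population <= 0 or region.population <= 0:
        raise ValueError("article and population totals must be positive")
    article_share = region.articles / world.articles
    population_share = region.population / world.population
    return article_share / population_share


def local_author_share(edits_by_origin: dict[str, int], local_key: str) -> float:
    """Fraction of edits to a set of articles made by authors from the region.

    `edits_by_origin` maps an origin label to an edit count; how origin is
    inferred (for example, from IP geolocation) is left open here.
    """
    total = sum(edits_by_origin.values())
    if total == 0:
        return 0.0
    return edits_by_origin.get(local_key, 0) / total
```

As an arithmetic illustration only, a region that holds 5% of all articles and 20% of the world's population would have a representation ratio of 0.25.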
Our key practical objective is to find the appropriate social mirror that will effectively represent Wikipedia's presentation of MENA content and MENA contributors in such a way as to facilitate more content, more accurate content and more effective knowledge transfer between MENA and other global regions.
We intend to do this through both community outreach workshops and a website resource that enables individuals to compare the breadth and quality of articles on areas of similar population size across MENA.
- Per capita, Arabic is the most under-represented major world language on Wikipedia, which is why we focused on it in particular. We were also interested in Sub-Saharan Africa, which is woefully under-represented per capita in all major languages.
- Of the barriers we discovered, the lack of open government data was the one that surprised us, until we delved deep into the logic of Wikipedia. As a compendium of secondary data, Wikipedia depends on good sourcing. Without official statistics, lists of towns or sub-national facts to cite, it is hard to ensure that articles stay on Wikipedia. Similarly, there is a great deal of unfamiliarity with legitimate but small newspapers and with books in foreign languages.
- Arabic-speaking authors in particular face a number of hurdles. State actors in the Gulf have been known to meddle in Wikipedia and use it for surveillance. Arabic text is displayed on Wikipedia in a poor typeface. As Arabic speakers go online, they are leapfrogging laptops in favour of mobile devices, which are less amenable to content creation. Civil discourse between Muslims and non-Muslims suffers from disagreement about the veracity of religious texts. There is also a perception that Wikipedia is a Western enterprise that does not meet the needs of Arabic speakers or that is trying to co-opt their participation. Finally, a difference of opinion within the Arabic Wikipedia community has led to the splintering of the site into Arabic and Egyptian Arabic editions.
- We combined big data analysis (primarily GIS and spatial analysis) of Wikipedia data dumps with qualitative analysis based on focus groups and interviews with key Wikipedians. Together, these gave us a comprehensive picture that shows both the scale of variation in representation and many of the micro-level processes that either caused or reinforced this variation.
- From a global analysis, we discovered that one of the most significant factors in geographic representation on Wikipedia is broadband Internet access. It is a stronger predictor than population, education or GDP. Connectivity is critical.
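One simple way to compare how strongly candidate factors such as broadband, population, education and GDP track article counts is to correlate each with the outcome on a log scale. The sketch below is only an illustration under that assumption: the project's actual statistical model is not described here, and the names `pearson` and `rank_predictors`, the row format and the column names are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def rank_predictors(rows, outcome, predictors):
    """Rank predictors by the absolute correlation of log(1 + predictor)
    with log(1 + outcome).

    `rows` is a list of dicts, one per country or region (assumed format).
    Returns (predictor, r) pairs, strongest association first.
    """
    y = [math.log1p(row[outcome]) for row in rows]
    scored = []
    for name in predictors:
        x = [math.log1p(row[name]) for row in rows]
        scored.append((name, pearson(x, y)))
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)
```

With one dict per country, a call such as `rank_predictors(rows, "articles", ["broadband", "population", "education", "gdp"])` would return the factors ordered by strength of association. A pairwise correlation does not control for the other factors, so a multivariate regression would be needed to support a claim that one predictor outweighs the rest.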
We have now published several papers from this project, most notably: Graham, M., Hogan, B., Straumann, R.K., and Medhat, A. (2014) Uneven Geographies of User-Generated Information: Patterns of Increasing Informational Poverty. Annals of the Association of American Geographers.
This project is supported by the International Development Research Centre (IDRC).