The term ‘crowdsourcing’ is increasingly used to define online projects entailing the active contribution of an undefined public. But what does that notion mean? And what is the value of crowdsourcing within digital humanities? 

Howe (2006), who coined the term ‘crowdsourcing’, defines it as “the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call. This can take the form of peer-production (when the job is performed collaboratively), but is also often undertaken by sole individuals. The crucial prerequisite is the use of the open call format and the large network of potential labourers”.

‘Crowdsourcing’ is an evolving phenomenon and an exhaustive definition has yet to be identified. Recent research found 40 original definitions of ‘crowdsourcing’ in 32 articles published between 2006 and 2011 (Estelles-Arolas & Gonzalez-Ladron-de-Guevara, 2012). The term derives from business, where it identifies the process of outsourcing part of an activity to an external provider, but it is currently applied to a wide array of initiatives, both commercial (e.g. Amazon Mechanical Turk) and non-commercial (e.g. Wikipedia).

As a recent ‘practice’, crowdsourcing is still debated. For instance, Wikipedia is commonly cited as an exemplary crowdsourcing initiative, despite the scepticism of Jimmy Wales, one of its co-founders, who considers ‘crowdsourcing’ merely a business model for getting the public to do work cheaply or for free.

While critical issues may arise in the for-profit sector, the perspective seems quite different in the public and non-profit sectors, where volunteering has a long and consolidated tradition and unpaid work is done for a common good.

Within the Art Maps research project, a web survey was carried out on 36 crowdsourcing projects promoted by galleries, libraries, archives, museums, and education and research institutions. The findings suggest that the initiatives are undertaken for two main purposes: enriching or creating cultural resources, and exploring new forms of public engagement.

Although crowdsourcing projects rely on free contributions, the notion of crowdsourcing is broader than that of digital volunteering. Volunteering refers to people who freely offer to undertake a task or work for an organization without being paid. In digital humanities, crowdsourcing refers to the process of aggregating distributed resources (e.g. information, artefacts, data) to improve existing assets or to create new ones.

Crowdsourcing projects in digital humanities can be seen as novel paths of collaboration between institutions and their audiences. In fact, institutions are not merely “taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people” (Howe, 2006). They are collaborating with the public to augment or build digital assets through the aggregation of dispersed resources.

Our research indicates that crowdsourcing can be particularly helpful for tasks that are difficult or impossible to perform computationally, but are simple for humans (e.g. spotting objects in artworks, transcribing handwritten documents).

For instance, the Victoria and Albert Museum in London has a collection of 140,000 images that were selected from a database automatically; as a result, some of them may not be the best view of the object to display on the homepage of ‘Search the Collections’. The public is therefore invited to select the best images to use in the collections database through a bespoke application. The Ancient Lives project, promoted by the Citizen Science Alliance (CSA), asks the public to transcribe Greek papyri fragments (no knowledge of ancient Greek is needed). Another CSA initiative, Old Weather, asks participants to transcribe handwritten weather observations made by Royal Navy ships around the time of World War I. Optical Character Recognition (OCR) software can deliver poor results when dealing with historical newspapers, so the National Library of Australia invites the public to correct OCR texts as part of the national newspaper digitisation programme. In these cases, the crowdsourcing process is carried out for tasks requiring general human skills.

Crowdsourcing can also be useful for gathering resources that are ‘owned’ by the public (e.g. storytelling, personal memorabilia, and information), in order to enrich existing collections or to build new ones.

1001 Stories of Denmark, curated by the Danish Agency for Culture, displays stories about places, written by 180 of the country’s experts on cultural heritage and history. The website is user-driven, so participants can contribute photos, stories and recommendations, creating additional visiting routes. Europeana 1914-1918 is collecting material (e.g. pictures, letters, postcards, souvenirs, and anecdotes) across Europe within the Your Family History of World War I initiative. The 9/11 Memorial Museum is actively acquiring materials (e.g. photos, videos, voice messages, personal effects, workplace memorabilia) for its permanent collection through the Make History initiative, a collective telling of the events of 9/11 through the eyes of those who experienced them, both at the attack sites and around the world. Pin a Tale, promoted by the British Library, invites the public to curate a comprehensive literary landscape of the British and Irish Isles: participants can choose a piece of writing that they know, from any period and in any form (e.g. a novel, a poem, a song lyric or a play), relating to a specific location, and then, through a bespoke web application, reference the location and drag a pin to refine it. In these cases, the crowdsourcing process is undertaken to seek information, stories and items provided by specific groups of people, who are motivated by a sort of ‘relation’ with the resource provided (e.g. personal stories, family memorabilia, familiar locations, known literature).

Research on motivational factors highlights the distinction between intrinsic and extrinsic motivations. Intrinsically motivated activities are defined as those that individuals find interesting and would do in the absence of operationally separable consequences. Intrinsic motivation concerns active engagement with tasks that people find interesting. When extrinsically motivated, people behave in a manner that attains a desired consequence, such as gaining a tangible reward or avoiding a threatened punishment. Previous research suggests that motivational factors in crowdsourcing initiatives in the cultural domain are largely intrinsic and need to be investigated further, separately from those in crowdsourcing initiatives for commercial purposes (Carletti, 2011).

In commercial crowdsourcing initiatives (e.g. Amazon Mechanical Turk, InnoCentive), the original business-related meaning of the term is evident: organisations outsource a task to an undefined crowd that is paid to solve it (extrinsic motivation). The ‘crowd’ is a source: a place, person, or thing from which something originates or can be obtained (Oxford Dictionaries Online). In the 36 projects promoted by cultural and education institutions and analysed here, no material reward is involved (intrinsic motivation). Hence, the ‘crowd’ represents a resource: a stock or supply of money, materials, staff, and other assets that can be drawn on by a person or organization in order to function effectively (Oxford Dictionaries Online).

The crowdsourcing projects investigated seem closer to collaborative endeavours between institutions and their public towards a shared and mutual objective than to a simple process “once performed by employees and then outsourced to an undefined network of people” (Howe, 2006).

Accordingly, the Art Maps project aims both to explore new forms of public engagement and to enrich part of the Tate online collection with specific geographical information. Tate launched its new digital collection in April 2012, comprising over 70,000 artworks. One third of them have been indexed with information about locations, typically the site represented. For some works this information is quite specific, but in many cases it is quite general, referring only to a city, region or major geographical feature.

As a crowdsourcing project, Art Maps was conceived to fill this gap through the contribution of the public. The Art Maps application will be available by the end of 2012 and will allow people to view artworks displayed on a map, and then to suggest specific locations for the artworks.

For instance, the public will be invited to search for familiar locations in the Tate digital collection and to help position the works shown in the Art Maps platform more precisely.

In the medium to long term, public contributions within Art Maps will enrich the Tate digital collection with a large set of new geographical information. Making sense of patterns in large sets of data is an emerging challenge, as well as an opportunity both for creating new knowledge in the humanities and for exploring how researchers interpret and use data (Chang et al., 2009).

In digital humanities, the value of crowdsourcing for research, for the institutions and for the public is still an open question. Crowdsourcing practices require a mutual exchange between institution and public, and the social history of knowledge is defined as a “history of interaction between outsiders and establishments, between amateur and professionals, intellectual entrepreneurs and intellectual rentiers, […] official and unofficial knowledge” (Burke, 2000: 51). 

Crowdsourcing is a recent and debated phenomenon. Nonetheless, some ‘older’ experiences (e.g. Wikipedia, the Galaxy Zoo project) suggest that collaborative projects between institutions and their audiences are mutually beneficial, and that research can also gain from these novel forms of collaboration.