It’s a problem that all social scientists face. You have a brilliant idea for a study. You have the experimental design all worked out, and your university’s review board has approved it. But you still have to recruit hundreds of people as subjects for the experiment.
Gabriel Lenz, a political scientist at the University of California, Berkeley, faced this problem last year when he and collaborators wanted to follow up on another group’s study of voting behavior (Science, 10 June 2005, p. 1623). For that study, Americans were shown photographs of past U.S. congressional candidates and asked to rate the politicians on various characteristics, such as competence and attractiveness. Even though the study subjects had no information beyond an image of the candidates’ faces, their snap judgments were a significant predictor of who actually won the races. Lenz wanted to see if that surprising result collapsed when those evaluating the photos came from cultures different from those of the candidates. But how to recruit people living in multiple countries?
Lenz and his research assistant Michael Myers had an idea: Why not order research subjects through Amazon.com? The company runs an online marketplace called Mechanical Turk (MTurk), where people across the world make themselves available for computer-based work. (The name is a reference to an 18th-century chess-playing “machine” that actually worked by virtue of a man hidden inside.) For tiny sums, anyone can hire people to perform almost any kind of simple task, such as tagging items in images. Lenz’s experiment required people to look at photographs of Brazilian political candidates and fill in a data sheet.
But first, he and his colleagues had to decide on how much they would pay each participant. Those offering a job through MTurk, known as requestors, compete with each other to recruit Turkers, the 500,000 people currently registered with the MTurk site as available for work. The task of rating the political candidate photos required about 4 minutes. “We played around with various payment rates,” Lenz says. For Turkers based in India, the researchers started low, offering 15 cents. In just 4 days, they received data from 100 people. Then for a control group, they recruited more than 300 Americans for between 20 and 50 cents each. The total cost? About $160, and that includes the 10% fee Amazon charges.
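The figures in that paragraph can be checked with a rough back-of-the-envelope calculation. The sketch below is illustrative only: it assumes exactly 300 American participants (the article says "more than 300"), assumes the 10% fee is charged on top of each payment, and uses a hypothetical helper named `mturk_cost` that is not part of any Amazon tool.

```python
def mturk_cost(n_workers, pay_per_worker, fee_rate=0.10):
    """Total cost in dollars: worker payments plus a proportional fee.

    Assumption: the fee is a flat percentage added to each payment.
    """
    return n_workers * pay_per_worker * (1 + fee_rate)


# Indian group: 100 people at 15 cents each (figures from the article).
india = mturk_cost(100, 0.15)

# American control group: assumed to be exactly 300 people,
# paid somewhere between 20 and 50 cents each.
us_low = mturk_cost(300, 0.20)
us_high = mturk_cost(300, 0.50)

low_total = india + us_low
high_total = india + us_high

print(f"Estimated total: ${low_total:.2f} to ${high_total:.2f}")
```

Under these assumptions the total falls somewhere between roughly $80 and $180, a range that contains the reported figure of about $160.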
In just a few weeks, Lenz had all the data his group needed. In spite of the cultural differences, the snap-judgment effect persisted: American and Indian subjects predicted the winners of Brazilian political races based on nothing more than a mug shot, the researchers reported last year in the social science journal World Politics.
As others follow Lenz’s lead, many more social science papers using MTurk will appear in the coming years, predicts Adam Berinsky, a political scientist at the Massachusetts Institute of Technology in Cambridge. “Everyone I know is using it,” he says. For example, social scientists used 10,000 Turkers to create a tool for tracking the emotional content of Twitter messages (Science, 30 September, p. 1814).
For now, most researchers are using MTurk for pilot studies, quickly and cheaply testing online versions of experiments that they then perform with subjects face to face. But the use of MTurk subjects will eventually become mainstream, Berinsky says. The obvious advantage is the speed and cost. “Generally, we pay $8 for a 15- to 20-minute experiment in a lab. We can run the same study on MTurk for 75 cents to a dollar.”
There are other advantages. “Turkers are amazingly focused research subjects,” Berinsky says. Unlike the typical university undergraduates used for social science studies, Turkers get paid only if they generate usable data. This is necessary to eliminate not only people who don’t understand the task but also “spammers,” people who try to exploit MTurk by skimming through the jobs and giving random responses wherever possible to accelerate the process.
For example, Lenz had to reject about 20% of his American and 50% of his Indian Turkers for those reasons. But that is a manageable problem, Berinsky says. A counterintuitive solution is to keep the price low. “If you offer more than a dollar, you attract the spammers who sort jobs by level of pay,” he says. “You have to find the sweet spot where the payment is not too high but still attractive enough for most Turkers.” So far, that sweet spot seems to be between 15 and 50 cents for a 10-minute job.
Even if MTurk is cheap and fast, doubts will linger about interpreting data from research subjects whom you never meet. To address those concerns, Berinsky and Lenz are teaming up with Gregory Huber, a political scientist at Yale University, to study the Turker population. And of course, they are using MTurk to do so. They recently replicated two classic survey experiments and a political science experiment. In each case, the data obtained with MTurk were consistent with published studies that tested people in laboratories.
The scientists have found some differences, too. Turkers “are younger and more ideologically liberal than the U.S. public,” Berinsky says. However, they are more representative of the U.S. population than a typical cohort of university undergraduates.
There is one long-term concern: the “super-Turkers,” people who are essentially professional workers on MTurk, some of them logging more than 20 hours per week. Many social science experiments rely on the subjects not knowing the researchers’ intentions. Berinsky says super-Turkers could potentially skew experiments if they try too hard to please researchers. There is incentive to do that because MTurk uses a reputation system. If a Turker does not have at least a 95% positive approval rating from their requestors, they’ll often go unhired.
“Mechanical Turk seems like the proverbial goose that lays the golden eggs,” Berinsky says. “But I worry that in the rush for cheap research subjects, we’re going to trample the goose to death.”
SOURCE: SCIENCE MAGAZINE, VOL. 334