A beginner’s guide and best practices for using crowdsourcing platforms for survey research: The Case of Amazon Mechanical Turk (MTurk)
Abstract
Introduction
Researchers around the globe are using crowdsourcing tools to reach respondents for quantitative and qualitative research (Chambers & Nimon, 2019). Many social science and business journals are receiving studies that use crowdsourcing tools such as Amazon Mechanical Turk (MTurk), Qualtrics, MicroWorkers, ShortTask, ClickWorker, and Crowdsource (e.g., Ahn & Back, 2019; Ali et al., 2021; Esfahani & Ozturk, 2019; Jeong & Lee, 2017; Zhang et al., 2017). Even though these tools present a great opportunity for sharing large quantities of data quickly, some challenges must also be addressed. The purpose of this guide is to present the basic ideas behind the use of crowdsourcing for survey research and to provide a primer on best practices that will increase the validity and reliability of such research.
What is crowdsourcing research?
Crowdsourcing describes the collection of information, opinions, or other types of input from a large number of people, typically via the internet, who may or may not receive (financial) compensation (Hargrave, 2019; Oxford Dictionary, n.d.). Within the behavioral science realm, crowdsourcing is defined as the use of internet services for hosting research activities and for creating opportunities for a large population of participants. Applications of crowdsourcing techniques have evolved over the decades, establishing the strong informational power of crowds. The advent of Web 2.0 has expanded the possibilities of crowdsourcing with new online tools such as online reviews, forums, Wikipedia, Qualtrics, and MTurk, as well as other platforms such as Crowdflower and Prolific Academic (Peer et al., 2017; Sheehan, 2018). Crowdsourcing platforms in the age of Web 2.0 use remote labor recruited via the internet to help employers complete tasks that cannot be left to machines.
Key characteristics of crowdsourcing include payment for workers, their recruitment from any location, and the completion of tasks (Behrend et al., 2011). These platforms also allow for relatively quick data collection compared to data collection in the field, and participants are rewarded with an incentive, often financial compensation. Crowdsourcing offers not only a large participant pool but also a streamlined process for study design, participant recruitment, and data collection, as well as an integrated participant compensation system (Buhrmester et al., 2011). Also, compared to traditional marketing research firms, crowdsourcing makes it easier to detect possible sampling biases (Garrow et al., 2020). Due to advantages such as reduced costs, diversity of participants, and flexibility, crowdsourcing platforms have surged in popularity among researchers.
Advantages
MTurk is one of the most popular crowdsourcing platforms among researchers, allowing Requesters to submit tasks for Workers to complete (Cummings & Sibona, 2017). MTurk has been used as an online crowdsourcing platform for the recruitment of human subjects for research purposes (Paolacci & Chandler, 2014). Research has also shown MTurk to be a reliable and cost-effective tool, capable of providing representative data for research in the behavioral sciences (e.g., Crump et al., 2013; Goodman et al., 2013; Mason & Suri, 2012; Rand, 2012; Simcox & Fiez, 2014). In addition to its use in social science studies, the platform has been used in marketing, hospitality and tourism, psychology, political science, communication, and sociology contexts (Sheehan, 2018). To illustrate, between 2012 and 2017, more than 40% of the studies published in the Journal of Consumer Research used crowdsourcing websites for their data collection (Goodman & Paolacci, 2017).
Disadvantages
Although researchers have assessed crowdsourcing platforms as reliable and cost-effective for data collection in the behavioral sciences, these platforms are not free of flaws. One disadvantage is the possibility of unsatisfactory data quality. Because the survey is administered virtually, the investigator is physically separated from the participant, and this lack of monitoring could lead to data quality issues (Sheehan, 2018). In addition, participants in survey research on crowdsourcing platforms are not always who they claim to be, creating issues of trust in the data provided and, ultimately, in the quality of the research findings (McGonagle, 2015; Smith et al., 2016). A recurrent concern with MTurk workers, for instance, is that they are experienced survey takers (Chandler et al., 2015). This experience is mainly acquired by completing dozens of surveys per day, especially when workers encounter similar items and scales. Smith et al. (2016) identified two types of problematic respondents in data collection using MTurk, namely cheaters and speeders. Compared to Qualtrics, which has strict screening and quality-control processes to ensure that participants are who they claim to be, MTurk appears to be less demanding of its workers. However, a downside of data collection with Qualtrics is its higher fees: about $5.00 per questionnaire on Qualtrics, against $0.50 to $1.50 on MTurk (Ford, 2017). Hence, few researchers have been able to conduct surveys and compare respondent pools with Qualtrics or other traditional marketing research firms (Garrow et al., 2020). Another challenge with MTurk arises when trying to collect a desired number of responses from a population in a specific city or area (Ross et al., 2010).
The issues inherent to the selection process of MTurk have been investigated in several studies (e.g., Berinsky et al., 2012; Chandler et al., 2014; 2015; Harms & DeSimone, 2015; Paolacci et al., 2010; Rand, 2012). Feitosa et al. (2015) pointed out that international respondents may still identify themselves as U.S. respondents by using fake addresses and accounts. They found that 5% to 10% of participants identifying themselves as U.S. respondents were actually from overseas locations. Moreover, Babin et al. (2016) reported that trap questions allowed researchers to uncover that many respondents change their gender, age, career, or income within the course of a single survey. Two issues remain inherent to crowdsourcing platforms used for research purposes: (a) experienced workers, with respect to the quality control of questions, and (b) speeders, which on MTurk can be attributed to the platform being the main source of revenue for a given respondent.
Best practices
Several best practices can be recommended for using crowdsourcing platforms for data collection. Worker IDs can be matched with IDs from previous studies, allowing researchers to exclude responses from workers who answered previous similar studies (Goodman & Paolacci, 2017). Furthermore, researchers can manually assign qualifications on MTurk prior to data collection (Litman et al., 2015; Park & Park, 2020). When dealing with experienced workers, it is also recommended to use multiple attention checks and to design the survey so that participants are exposed to the stimuli long enough to answer the questions well (Sheehan, 2018). In this sense, shorter surveys are preferred to longer ones, which tax participants’ concentration and may, in turn, adversely affect the quality of their answers. Most importantly, pretest the survey to make sure that all parts are working as expected.
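The screening steps described above (matching Worker IDs against previous studies, checking attention-check results, and flagging speeders) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `Response` fields, the `screen_responses` helper, and the 120-second minimum duration are hypothetical and do not come from the guide or from MTurk itself; an appropriate time threshold would normally be set from a pretest of the survey.

```python
from dataclasses import dataclass


@dataclass
class Response:
    # Hypothetical record layout; adapt the fields to your own survey export.
    worker_id: str
    duration_seconds: float
    attention_checks_passed: int
    attention_checks_total: int


def screen_responses(responses, previous_worker_ids, min_duration_seconds):
    """Split responses into kept responses and (worker_id, reason) exclusions."""
    kept, excluded = [], []
    for r in responses:
        if r.worker_id in previous_worker_ids:
            # Worker already answered a previous similar study.
            excluded.append((r.worker_id, "took a previous similar study"))
        elif r.attention_checks_passed < r.attention_checks_total:
            # At least one attention check was failed.
            excluded.append((r.worker_id, "failed an attention check"))
        elif r.duration_seconds < min_duration_seconds:
            # Completed faster than the assumed minimum plausible time.
            excluded.append((r.worker_id, "speeder"))
        else:
            kept.append(r)
    return kept, excluded


sample = [
    Response("W001", 300, 3, 3),
    Response("W002", 280, 3, 3),  # appeared in an earlier study
    Response("W003", 310, 2, 3),  # failed one attention check
    Response("W004", 45, 3, 3),   # finished implausibly fast
]
kept, excluded = screen_responses(
    sample, previous_worker_ids={"W002"}, min_duration_seconds=120
)
print([r.worker_id for r in kept])
print(excluded)
```

Recording a reason for each exclusion makes it straightforward to report how many responses were dropped, and why, when describing the sample.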
Researchers should also keep in mind that, in the context of MTurk, the primary method of measurement is the web interface. Thus, to avoid method biases, researchers should consider whether method factors emerge in the latent measurement models (Podsakoff et al., 2012). As such, time-lagged research designs may be preferred, as predictor and criterion variables can be measured at different points in time or administered on different platforms, such as Qualtrics vs. MTurk (Cheung et al., 2017). In general, whether crowdsourcing platforms including MTurk are appropriate depends on the research question, and the quality of the data depends on the quality-control strategies researchers use. Trade-offs between various validity types need to be prioritized according to the research objectives (Cheung et al., 2017). Drawing on our experience using crowdsourcing tools for our own research and serving as editorial team members of several journals and chairs of several conferences, we provide the best practices outlined below:
MTurk Worker (Respondent) Selection: Researchers should consider their study population before using MTurk for data collection. The MTurk platform should be used only when it suits the study population. For example, if the study targets restaurant owners or company CEOs, MTurk workers may not be suitable for the study. However, if the target population is diners, hotel guests, grocery shoppers, online shoppers, students, or hourly employees, a sample from MTurk would be suitable. Researchers should use the selection tool in the software. For example, if you target workers from only one country, exclude responses that came from an internet protocol (IP) address outside the targeted country and report the results in the method section. Researchers should also consider the demographics of workers on MTurk, which must reflect the study’s target population.
For example, if the study focuses on baby boomers’ use of technology, then the MTurk sample should include only baby boomers. Similarly, the gender balance, racial composition, and income of people on MTurk should mirror the targeted population. Researchers should use multiple screening tools that identify quality respondents and avoid problematic response patterns. For example, MTurk provides the approval rate for the respondents. This rate reflects how often a respondent’s submissions have been rejected for various reasons (e.g., a wrong code entered). We recommend using a 90% or higher approv
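Two of the screening rules mentioned in this section, excluding responses from IP addresses outside the targeted country and applying a 90% approval-rate threshold, can be illustrated with a short Python sketch. The record keys (`ip_country`, `approval_rate`) and the `screen_workers` helper are hypothetical names chosen for illustration. The sketch assumes the country of each IP address has already been resolved by some geolocation lookup that is not shown, and that an approval rate is available for each record; in practice the approval rate may instead be enforced at recruitment time through the platform's own selection tools.

```python
def screen_workers(records, target_country="US", min_approval_rate=90.0):
    """Keep records from the target country that meet the approval threshold.

    Returns the kept records plus exclusion counts, which can be reported
    in the method section.
    """
    kept = []
    excluded_counts = {"outside_target_country": 0, "low_approval_rate": 0}
    for rec in records:
        if rec["ip_country"] != target_country:
            # IP address resolved to a country other than the targeted one.
            excluded_counts["outside_target_country"] += 1
        elif rec["approval_rate"] < min_approval_rate:
            # Approval rate below the assumed 90% cutoff.
            excluded_counts["low_approval_rate"] += 1
        else:
            kept.append(rec)
    return kept, excluded_counts


# Hypothetical example data.
records = [
    {"worker_id": "W001", "ip_country": "US", "approval_rate": 98.0},
    {"worker_id": "W002", "ip_country": "IN", "approval_rate": 99.0},
    {"worker_id": "W003", "ip_country": "US", "approval_rate": 85.0},
    {"worker_id": "W004", "ip_country": "US", "approval_rate": 90.0},
]
kept_records, exclusion_counts = screen_workers(records)
print([rec["worker_id"] for rec in kept_records])
print(exclusion_counts)
```

Returning the exclusion counts alongside the kept records keeps the filtering transparent and reportable.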
Authors (3)
C. Çobanoğlu
Muhittin Cavusoglu
Gozde Turktarhan
Quick Access
- Year Published: 2021
- Language: en
- Total Citations: 83×
- Source Database: Semantic Scholar
- DOI: 10.5038/2640-6489.6.1.1177
- Access: Open Access ✓