Those were the days: learning to rank social media posts for reminiscence

Social media posts are a great source for life summaries aggregating activities, events, interactions and thoughts of the last months or years. They can be used for personal reminiscence as well as for keeping track of developments in the lives of not-so-close friends. One of the core challenges of automatically creating such summaries is to decide which posts are memorable, i.e., should be considered for retention, and which ones to forget. To address this challenge, we design and conduct user evaluation studies and construct a corpus that captures human expectations towards content retention. We analyze this corpus to identify a small set of seed features that are most likely to characterize memorable posts. Next, we compile a broader set of features that are leveraged to build general and personalized machine-learning models to rank posts for retention. By applying feature selection, we identify a compact yet effective subset of these features. The models trained with the presented feature sets outperform baseline models that exploit an intuitive set of temporal and social features.

Authors: Kaweh Djafari Naini, Ricardo Kawase, Nattiya Kanhabua, Claudia Niederée, Ismail Sengor Altingovde
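
To make the ranking setup concrete, here is a minimal Python sketch of the kind of pipeline the abstract describes: a handful of post features feed a feature-selection step and a learned scorer, and posts are ranked by predicted retention score. The features, toy data and learner here are illustrative assumptions, not the paper's actual corpus or models.

```python
# A minimal sketch of pointwise ranking for retention with feature
# selection; all features and scores below are hypothetical.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.pipeline import Pipeline

# Toy training data: one row per post with illustrative features
# (number of likes, number of comments, post age in days, has photo).
X = np.array([
    [120, 14, 400, 1],
    [  3,  0,  12, 0],
    [ 45,  9, 900, 1],
    [  8,  1,  30, 0],
])
y = np.array([0.9, 0.1, 0.8, 0.2])  # crowd-assigned retention scores

# Feature selection + regressor, mirroring the "compact yet effective
# feature subset" idea; posts are then ranked by predicted score.
model = Pipeline([
    ("select", SelectKBest(f_regression, k=2)),
    ("score", GradientBoostingRegressor(n_estimators=50)),
]).fit(X, y)

ranking = np.argsort(-model.predict(X))  # post indices, best first
print(ranking)
```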

Crowd Anatomy Beyond the Good and Bad: Behavioral Traces for Crowd Worker Modeling and Pre-selection

The suitability of crowdsourcing to solve a variety of problems has been investigated widely. Yet, there is still a lack of understanding about the distinct behavior and performance of workers within microtasks. In this paper, we first introduce a fine-grained, data-driven worker typology based on different dimensions and derived from behavioral traces of workers. Next, we propose and evaluate novel models of crowd worker behavior and show the benefits of behavior-based worker pre-selection using machine learning models. We also study the effect of task complexity on worker behavior. Finally, we evaluate our novel typology-based worker pre-selection method in image transcription and information finding tasks involving crowd workers completing 1,800 HITs. Our proposed method for worker pre-selection leads to higher-quality results when compared to the standard practice of using qualification or pre-screening tests: an accuracy increase of nearly 7% over the baseline in image transcription tasks and of almost 10% in information finding tasks, without a significant difference in task completion time. Our findings have important implications for crowdsourcing systems where a worker’s behavioral type is unknown prior to participation in a task. We highlight the potential of leveraging worker types to identify and aid those workers who require further training to improve their performance. Having proposed a powerful automated mechanism to detect worker types, we reflect on promoting fairness, trust and transparency in microtask crowdsourcing platforms.

Authors: Ujwal Gadiraju, Gianluca Demartini, Ricardo Kawase, Stefan Dietze
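
The behavior-based pre-selection idea can be illustrated with a toy classifier over behavioral traces. The trace features and labels below are hypothetical stand-ins; the paper's typology draws on richer behavioral dimensions.

```python
# A toy sketch of behavior-based worker pre-selection; all traces,
# labels and candidates are made up for illustration.
from sklearn.ensemble import RandomForestClassifier

# One row per worker: [avg. seconds per HIT, fraction of HITs
# abandoned, avg. key presses per HIT] -- illustrative traces only.
traces = [
    [35.0, 0.05, 80],
    [ 4.0, 0.60,  3],
    [50.0, 0.02, 95],
    [ 6.0, 0.45,  5],
]
labels = [1, 0, 1, 0]  # 1 = delivered high-quality work in past tasks

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(traces, labels)

# Pre-select only workers whose traces predict high-quality work.
candidates = [[40.0, 0.03, 70], [5.0, 0.50, 4]]
admitted = [w for w, ok in zip(candidates, clf.predict(candidates)) if ok == 1]
print(admitted)
```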

Using Worker Self-Assessments for Competence-Based Pre-Selection in Crowdsourcing Microtasks

Paid crowdsourcing platforms have evolved into remarkable marketplaces where requesters can tap into human intelligence to serve a multitude of purposes, and the workforce can benefit through monetary returns for investing their efforts. In this work, we focus on individual crowd worker competencies. By drawing from self-assessment theories in psychology, we show that crowd workers often lack awareness about their true level of competence. Due to this, although workers intend to maintain a high reputation, they tend to participate in tasks that are beyond their competence. We reveal the diversity of individual worker competencies, and make a case for competence-based pre-selection in crowdsourcing marketplaces. We show the implications of flawed self-assessments on real-world microtasks, and propose a novel worker pre-selection method that considers the accuracy of worker self-assessments. We evaluated our method in a sentiment analysis task and observed an improvement in accuracy of over 15% compared to traditional performance-based worker pre-selection. Similarly, our proposed method resulted in an accuracy improvement of nearly 6% in an image validation task. Our results show that requesters in crowdsourcing platforms can benefit by considering worker self-assessments in addition to their performance for pre-selection.

Authors: Ujwal Gadiraju, Besnik Fetahu, Ricardo Kawase, Patrick Siehndel, Stefan Dietze
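
The gist of self-assessment-aware pre-selection can be sketched in a few lines: workers whose self-estimated accuracy diverges too far from their measured accuracy are filtered out. The records and threshold below are assumptions for illustration, not the paper's exact procedure.

```python
# A minimal illustration of pre-selection by self-assessment accuracy;
# worker records and the tolerated gap are hypothetical.
workers = [
    {"id": "w1", "self_assessed": 0.90, "measured": 0.85},
    {"id": "w2", "self_assessed": 0.95, "measured": 0.55},  # overconfident
    {"id": "w3", "self_assessed": 0.60, "measured": 0.62},
]

MAX_GAP = 0.15  # tolerated |self-assessment - performance| gap (assumed)

selected = [w["id"] for w in workers
            if abs(w["self_assessed"] - w["measured"]) <= MAX_GAP]
print(selected)  # ['w1', 'w3']
```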

Improving Reliability of Crowdsourced Results by Detecting Crowd Workers with Multiple Identities

Quality control in crowdsourcing marketplaces plays a vital role in ensuring useful outcomes. In this paper, we focus on tackling the issue of crowd workers participating in tasks multiple times using different worker IDs to maximize their earnings. Workers attempting to complete the same task repeatedly may not be harmful in cases where the aim of a requester is to gather data or annotations, wherein more contributions from a single worker are fruitful. However, in several cases where the outcomes are subjective, requesters prefer the participation of distinct crowd workers. We show that traditional means to identify unique crowd workers, such as worker IDs and IP addresses, are not sufficient. To overcome this problem, we propose the use of browser fingerprinting to ascertain the unique identities of crowd workers in paid crowdsourcing microtasks. By using browser fingerprinting across 8 different crowdsourced tasks with varying task difficulty, we found that 6.18% of crowd workers participate in the same task more than once, using different worker IDs to avoid detection. Moreover, nearly 95% of such workers in our experiments pass gold-standard questions and are deemed to be trustworthy, significantly biasing the results thus produced.

Authors: Ujwal Gadiraju, Ricardo Kawase
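
The core detection step can be sketched briefly in Python: hash a canonical string of browser attributes (collected client-side, e.g., via JavaScript) and flag distinct worker IDs that share a fingerprint. The attribute set below is a small illustrative subset; fingerprinting in practice combines many more signals.

```python
# A simplified sketch of browser fingerprinting for duplicate-worker
# detection; the attributes and submissions are hypothetical.
import hashlib
from collections import defaultdict

def fingerprint(attrs: dict) -> str:
    canonical = "|".join(f"{k}={attrs[k]}" for k in sorted(attrs))
    return hashlib.sha256(canonical.encode()).hexdigest()

submissions = [
    ("worker_A", {"user_agent": "Mozilla/5.0 ...", "screen": "1920x1080",
                  "timezone": "UTC+1", "fonts": "Arial,Calibri"}),
    ("worker_B", {"user_agent": "Mozilla/5.0 ...", "screen": "1920x1080",
                  "timezone": "UTC+1", "fonts": "Arial,Calibri"}),  # same browser
]

seen = defaultdict(set)
for worker_id, attrs in submissions:
    seen[fingerprint(attrs)].add(worker_id)

# Fingerprints shared by several worker IDs suggest one person behind them.
duplicates = {fp: ids for fp, ids in seen.items() if len(ids) > 1}
print(duplicates)
```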

Human Beyond the Machine: Challenges and Opportunities of Microtask Crowdsourcing

In the 21st century, where automated systems and artificial intelligence are replacing arduous manual labor by supporting data-intensive tasks, many problems still require human intelligence. Over the last decade, by tapping into human intelligence through microtasks, crowdsourcing has found remarkable applications in a wide range of domains. In this article, the authors discuss the growth of crowdsourcing systems since the term was coined by columnist Jeff Howe in 2006. They shed light on the evolution of crowdsourced microtasks in recent times. Next, they discuss a main challenge that hinders the quality of crowdsourced results: the prevalence of malicious behavior. They reflect on crowdsourcing’s advantages and disadvantages. Finally, they leave the reader with interesting avenues for future research.

Authors: Ujwal Gadiraju, Gianluca Demartini, Ricardo Kawase, Stefan Dietze

Analyzing and Predicting Privacy Settings in the Social Web

Social networks provide a platform for people to connect and share information and moments of their lives. With the increasing engagement of users in such platforms, the volume of personal information that is exposed online grows accordingly. Due to carelessness, unawareness or difficulties in defining adequate privacy settings, private or sensitive information may be exposed to a wider audience than intended or advisable, potentially causing serious problems in a user’s private and professional life. Although these cases usually receive public attention only when they involve senior company staff, athletes, politicians or artists, the general public is also subject to these issues. To address this problem, we envision a mechanism that suggests to users the appropriate privacy setting for their posts, taking into account their profiles. In this paper, we present a thorough analysis of privacy settings in Facebook posts and evaluate prediction models that anticipate the desired privacy settings with high accuracy, making use of users’ previous posts and preferences.

Authors: Kaweh Djafari Naini, Ismail Sengor Altingovde, Ricardo Kawase, Eelco Herder, Claudia Niederée

PDF: naini-umap2015.pdf
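
As a flavor of what such a prediction model could look like, here is a toy bag-of-words classifier that learns a user's privacy choices from past posts and suggests a setting for a new one. The posts, labels and model choice are illustrative assumptions; the paper's models draw on a broader set of profile and preference features.

```python
# A toy sketch of privacy-setting prediction from past posts;
# all data below is made up for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

past_posts = [
    "Family dinner photos from the weekend",
    "My new paper got accepted!",
    "Feeling unwell, staying home today",
    "Check out our project demo",
]
settings = ["friends", "public", "friends", "public"]  # user's past choices

model = make_pipeline(CountVectorizer(), MultinomialNB()).fit(past_posts, settings)
print(model.predict(["Photos from our family trip"]))  # likely 'friends'
```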

Methods for web revisitation prediction: survey and experimentation

More than 45% of the pages that we visit on the Web are pages that we have visited before. Browsers support revisits with various tools, including bookmarks, history views and URL auto-completion. However, these tools only support revisits to a small number of frequently and recently visited pages. Several browser plugins and extensions have been proposed to better support the long tail of less frequently visited pages, using recommendation and prediction techniques. In this article, we present a systematic overview of revisitation prediction techniques, distinguishing two main types and several subtypes. We also explain how the individual prediction techniques can be combined into comprehensive revisitation workflows that achieve higher accuracy. We investigate the performance of the most important workflows and provide a statistical analysis of the factors that affect their predictive accuracy. Further, we provide an upper bound for the accuracy of revisitation prediction using an ‘oracle’ that discards non-revisited pages.

Authors: George Papadakis, Ricardo Kawase, Eelco Herder, Wolfgang Nejdl
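
To give a flavor of the simplest technique family in this space, here is a toy predictor that scores URLs by a weighted mix of visit frequency and recency, two classic revisitation signals. The weighting scheme and decay are assumptions for illustration, not a workflow from the survey.

```python
# A minimal frequency/recency revisitation predictor; the scoring
# formula and weights are illustrative assumptions.
import time

def revisit_scores(history, now=None, alpha=0.7):
    """Score each URL by weighted visit frequency plus a recency bonus."""
    now = now if now is not None else time.time()
    stats = {}
    for url, ts in history:
        freq, last = stats.get(url, (0, 0.0))
        stats[url] = (freq + 1, max(last, ts))
    # Recency bonus decays with hours since the last visit.
    return {
        url: alpha * freq + (1 - alpha) / (1 + (now - last) / 3600)
        for url, (freq, last) in stats.items()
    }

history = [("a.com", 1000.0), ("b.com", 2000.0), ("a.com", 3000.0)]
scores = revisit_scores(history, now=4000.0)
print(max(scores, key=scores.get))  # 'a.com': more frequent and more recent
```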

Understanding Malicious Behavior in Crowdsourcing Platforms: The Case of Online Surveys

Crowdsourcing is increasingly being used as a means to tackle problems requiring human intelligence. With the ever-growing worker base that aims to complete microtasks on crowdsourcing platforms in exchange for financial gains, there is a need for stringent mechanisms to prevent exploitation of deployed tasks. Quality control mechanisms need to accommodate a diverse pool of workers, exhibiting a wide range of behavior. A pivotal step towards fraud-proof task design is understanding the behavioral patterns of microtask workers. In this paper, we analyze the prevalent malicious activity on crowdsourcing platforms and study the behavior exhibited by trustworthy and untrustworthy workers, particularly on crowdsourced surveys. Based on our analysis of the typical malicious activity, we define and identify different types of workers in the crowd, propose a method to measure malicious activity, and finally present guidelines for the efficient design of crowdsourced surveys.

Authors: Ujwal Gadiraju, Ricardo Kawase, Stefan Dietze, Gianluca Demartini
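
A measure of malicious activity could, in its simplest form, flag responses by behavioral cues such as implausibly fast completion or failed attention checks. The sketch below illustrates that idea with hypothetical records and thresholds; it is not the paper's actual measure.

```python
# An illustrative flagging rule for potentially untrustworthy survey
# responses; records and thresholds are assumed for this sketch.
responses = [
    {"worker": "w1", "seconds": 310, "attention_checks_passed": 2},
    {"worker": "w2", "seconds":  25, "attention_checks_passed": 0},
    {"worker": "w3", "seconds": 280, "attention_checks_passed": 1},
]

MIN_SECONDS = 60        # assumed minimum plausible completion time
MIN_CHECKS_PASSED = 1   # assumed attention-check threshold

flagged = [r["worker"] for r in responses
           if r["seconds"] < MIN_SECONDS
           or r["attention_checks_passed"] < MIN_CHECKS_PASSED]
print(flagged)  # ['w2']
```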

Breaking Bad: Understanding Behavior of Crowd Workers in Categorization Microtasks

Crowdsourcing systems are being widely used to overcome several challenges that require human intervention. While the adoption of the crowdsourcing paradigm as a solution is increasing, there are no established guidelines or tangible recommendations for task design with respect to key parameters such as task length, monetary incentive and time required for task completion. In this paper, we propose the tuning of these parameters based on our findings from extensive experiments and analysis of categorization tasks. We delve into the behavior of workers who complete categorization tasks to determine measures that can make task design more effective.

Authors: Ujwal Gadiraju, Patrick Siehndel, Besnik Fetahu, Ricardo Kawase