oalib
Search Results: 1 - 10 of 100 matches
All listed articles are free for downloading (OA Articles)
Adaptive Crowdsourcing Algorithms for the Bandit Survey Problem  [PDF]
Ittai Abraham, Omar Alonso, Vasilis Kandylas, Aleksandrs Slivkins
Computer Science, 2013
Abstract: Very recently, crowdsourcing has become the de facto platform for distributing and collecting human computation for a wide range of tasks and applications such as information retrieval, natural language processing and machine learning. Current crowdsourcing platforms have some limitations in the area of quality control. Most of the effort to ensure good quality has to be done by the experimenter, who has to manage the number of workers needed to reach good results. We propose a simple model for adaptive quality control in crowdsourced multiple-choice tasks, which we call the "bandit survey problem". This model is related to, but technically different from, the well-known multi-armed bandit problem. We present several algorithms for this problem, and support them with analysis and simulations. Our approach is based on our experience conducting relevance evaluation for a large commercial search engine.
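The abstract does not reproduce the algorithms themselves; as a rough illustration of the adaptive quality-control idea (not the authors' bandit-survey algorithms), the sketch below queries workers one at a time and stops once the leading answer is sufficiently dominant. The worker model, threshold, and budget are all hypothetical:

```python
import random
from collections import Counter

def adaptive_label(ask_worker, confidence=0.8, max_workers=15):
    """Ask workers one at a time; stop early once one answer clearly
    dominates, instead of always paying for a fixed number of workers."""
    votes = Counter()
    for n in range(1, max_workers + 1):
        votes[ask_worker()] += 1
        answer, count = votes.most_common(1)[0]
        if n > 1 and count / n >= confidence:
            return answer, n
    return votes.most_common(1)[0][0], max_workers

# Hypothetical worker pool: answers correctly with probability 0.7.
def noisy_worker(truth=2, accuracy=0.7, num_choices=4):
    return truth if random.random() < accuracy else random.randrange(num_choices)

answer, used = adaptive_label(noisy_worker)
print(f"majority answer={answer}, workers used={used}")
```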
Bayesian Incentive-Compatible Bandit Exploration  [PDF]
Yishay Mansour, Aleksandrs Slivkins, Vasilis Syrgkanis
Computer Science, 2015
Abstract: Individual decision-makers consume information revealed by previous decision-makers, and produce information that may help in future decisions. This phenomenon is common in a wide range of scenarios in the Internet economy, as well as in other domains such as medical decisions. Each decision-maker would individually prefer to "exploit": select an action with the highest expected reward given her current information. At the same time, each decision-maker would prefer previous decision-makers to "explore", producing information about the rewards of various actions. A social planner, by means of carefully designed information disclosure, can incentivize the agents to balance exploration and exploitation so as to maximize social welfare. We formulate this problem as a multi-armed bandit problem (and various generalizations thereof) under incentive-compatibility constraints induced by the agents' Bayesian priors. We design an incentive-compatible bandit algorithm for the social planner whose regret is asymptotically optimal among all bandit algorithms (incentive-compatible or not). Further, we provide a black-box reduction from an arbitrary multi-armed bandit algorithm to an incentive-compatible one, with only a constant multiplicative increase in regret. This reduction works for a very general bandit setting that incorporates contexts and arbitrary auxiliary feedback.
The Multi-Armed Bandit, with Constraints  [PDF]
Eric V. Denardo, Eugene A. Feinberg, Uriel G. Rothblum
Mathematics, 2012
Abstract: The early sections of this paper present an analysis of a Markov decision model that is known as the multi-armed bandit under the assumption that the utility function of the decision maker is either linear or exponential. The analysis includes efficient procedures for computing the expected utility associated with the use of a priority policy and for identifying a priority policy that is optimal. The methodology in these sections is novel, building on the use of elementary row operations. In the later sections of this paper, the analysis is adapted to accommodate constraints that link the bandits.
Bandit-Based Task Assignment for Heterogeneous Crowdsourcing  [PDF]
Hao Zhang, Yao Ma, Masashi Sugiyama
Computer Science, 2015
Abstract: We consider a task assignment problem in crowdsourcing, which is aimed at collecting as many reliable labels as possible within a limited budget. A challenge in this scenario is how to cope with the diversity of tasks and the task-dependent reliability of workers; e.g., a worker may be good at recognizing the names of sports teams, but not familiar with cosmetics brands. We refer to this practical setting as heterogeneous crowdsourcing. In this paper, we propose a contextual bandit formulation for task assignment in heterogeneous crowdsourcing, which is able to deal with the exploration-exploitation trade-off in worker selection. We also theoretically investigate the regret bounds of the proposed method, and demonstrate its practical usefulness experimentally.
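The abstract leaves the concrete algorithm unspecified; a standard contextual-bandit choice for this kind of worker selection is a LinUCB-style index, sketched below. The per-worker linear model and the reward definition (e.g., agreement with consensus or gold labels) are assumptions for illustration, not the paper's method:

```python
import numpy as np

class LinUCBWorkerSelector:
    """LinUCB-style contextual bandit: one linear model per worker;
    the context vector x describes the task (e.g., topic features)."""
    def __init__(self, n_workers, dim, alpha=1.0):
        self.alpha = alpha
        self.A = [np.eye(dim) for _ in range(n_workers)]    # per-worker Gram matrix
        self.b = [np.zeros(dim) for _ in range(n_workers)]  # per-worker reward vector

    def select(self, x):
        """Assign the task to the worker with the highest upper confidence bound."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                       # ridge-regression estimate
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, worker, x, reward):
        """Reward could be 1 if the worker's label matched consensus/gold, else 0."""
        self.A[worker] += np.outer(x, x)
        self.b[worker] += reward * x
```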
Multi-armed Bandit Problem with Known Trend  [PDF]
Djallel Bouneffouf, Raphaël Feraud
Computer Science, 2015
Abstract: We consider a variant of the multi-armed bandit model, which we call the multi-armed bandit problem with known trend, where the gambler knows the shape of the reward function of each arm but not its distribution. This new problem is motivated by online problems such as active learning and music and interface recommendation, where the reward received when an arm is sampled changes according to a known trend. By adapting the standard multi-armed bandit algorithm UCB1 to take advantage of this setting, we propose a new algorithm named A-UCB that assumes a stochastic model. We provide upper bounds on the regret which compare favourably with those of UCB1. We also confirm this experimentally with different simulations.
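For reference, a minimal sketch of the UCB1 selection rule that A-UCB adapts; the trend-aware adjustment itself is not given in the abstract, so only the standard index is shown here:

```python
import math

def ucb1_select(counts, means, t):
    """UCB1: play each arm once, then pick the arm maximizing
    empirical mean + sqrt(2 ln t / n_i), where t is the total number
    of plays so far and n_i the number of plays of arm i."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(range(len(counts)),
               key=lambda i: means[i] + math.sqrt(2 * math.log(t) / counts[i]))
```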
A two armed bandit type problem revisited  [PDF]
Gilles Pagès
Mathematics, 2005
Abstract: In a recent paper, M. Benaïm and G. Ben Arous solve a multi-armed bandit problem arising in the theory of learning in games. We propose a short elementary proof of this result based on a variant of the Kronecker Lemma.
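For reference, the classical Kronecker Lemma, of which the proof uses a variant:

```latex
% Classical Kronecker Lemma: if (b_n) increases to infinity and the series
% sum_n x_n / b_n converges, then the b_n-scaled partial sums of (x_n) vanish.
\[
  b_n \uparrow \infty
  \quad\text{and}\quad
  \sum_{n \ge 1} \frac{x_n}{b_n} \ \text{converges}
  \quad\Longrightarrow\quad
  \frac{1}{b_n} \sum_{k=1}^{n} x_k \;\longrightarrow\; 0
  \quad (n \to \infty).
\]
```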
On Cost-Effective Incentive Mechanisms in Microtask Crowdsourcing  [PDF]
Yang Gao, Yan Chen, K. J. Ray Liu
Computer Science, 2013
Abstract: While microtask crowdsourcing provides a new way to solve large volumes of small tasks at a much lower price than traditional in-house solutions, it suffers from quality problems due to the lack of incentives. On the other hand, providing incentives for microtask crowdsourcing is challenging, since verifying the quality of submitted solutions is so expensive that it would negate the advantage of microtask crowdsourcing. In this paper, we study cost-effective incentive mechanisms for microtask crowdsourcing. In particular, we consider a model with strategic workers, where the primary objective of a worker is to maximize his own utility. Based on this model, we analyze two basic mechanisms widely adopted in existing microtask crowdsourcing applications and show that, to obtain high-quality solutions from workers, their costs are constrained by certain lower bounds. We then propose a cost-effective mechanism that employs quality-aware worker training as a tool to stimulate workers to provide high-quality solutions. We prove theoretically that the proposed mechanism, when properly designed, can obtain high-quality solutions at an arbitrarily low cost. Beyond its theoretical guarantees, we further demonstrate the effectiveness of the proposed mechanism through a set of behavioral experiments.
Algorithms for multi-armed bandit problems  [PDF]
Volodymyr Kuleshov, Doina Precup
Computer Science, 2014
Abstract: Although many algorithms for the multi-armed bandit problem are well understood theoretically, empirical confirmation of their effectiveness is generally scarce. This paper presents a thorough empirical study of the most popular multi-armed bandit algorithms. Three important observations can be made from our results. Firstly, simple heuristics such as epsilon-greedy and Boltzmann exploration outperform theoretically sound algorithms in most settings by a significant margin. Secondly, the performance of most algorithms varies dramatically with the parameters of the bandit problem. Our study identifies for each algorithm the settings where it performs well, and the settings where it performs poorly. Thirdly, the algorithms' performance relative to each other is affected only by the number of bandit arms and the variance of the rewards. This finding may guide the design of subsequent empirical evaluations. In the second part of the paper, we turn our attention to an important application area of bandit algorithms: clinical trials. Although the design of clinical trials has been one of the principal practical problems motivating research on multi-armed bandits, bandit algorithms have never been evaluated as potential treatment allocation strategies. Using data from a real study, we simulate the outcome that a 2001-2002 clinical trial would have had if bandit algorithms had been used to allocate patients to treatments. We find that an adaptive trial would have successfully treated at least 50% more patients, while significantly reducing the number of adverse effects and increasing patient retention. At the end of the trial, the best treatment could still have been identified with a high level of statistical confidence. Our findings demonstrate that bandit algorithms are attractive alternatives to current adaptive treatment allocation strategies.
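For concreteness, the two simple heuristics the study highlights, in their textbook form (parameter values are illustrative, not the paper's tuned settings):

```python
import math
import random

def epsilon_greedy(means, epsilon=0.1):
    """With probability epsilon pick a uniformly random arm (explore);
    otherwise pick the arm with the highest empirical mean (exploit)."""
    if random.random() < epsilon:
        return random.randrange(len(means))
    return max(range(len(means)), key=lambda i: means[i])

def boltzmann(means, temperature=0.1):
    """Sample an arm with probability proportional to exp(mean / temperature);
    lower temperature concentrates mass on the best-looking arm."""
    m = max(means)  # subtract the max for numerical stability
    weights = [math.exp((mu - m) / temperature) for mu in means]
    return random.choices(range(len(means)), weights=weights)[0]
```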
An Optimal Dynamic Mechanism for Multi-Armed Bandit Processes  [PDF]
Sham M. Kakade, Ilan Lobel, Hamid Nazerzadeh
Computer Science, 2010
Abstract: We consider the problem of revenue-optimal dynamic mechanism design in settings where agents' types evolve over time as a function of their (both public and private) experience with items that are auctioned repeatedly over an infinite horizon. A central question here is understanding what natural restrictions on the environment permit the design of optimal mechanisms (note that even in the simpler static setting, optimal mechanisms are characterized only under certain restrictions). We provide a structural characterization of a natural "separable" multi-armed bandit environment (where the evolution and incentive structure of the a-priori type is decoupled from the subsequent experience in a precise sense) in which optimal dynamic mechanism design is possible. Here we present the Virtual Index Mechanism, an optimal dynamic mechanism, which maximizes the (long-term) virtual surplus using the classical Gittins algorithm. The mechanism optimally balances exploration and exploitation, taking incentives into account.
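For reference, the classical Gittins index that, per the abstract, the Virtual Index Mechanism applies to virtual (revenue-adjusted) rather than raw rewards:

```latex
% Gittins index of an arm in state x, with discount beta in (0,1):
% the best achievable ratio of expected discounted reward to expected
% discounted time over all stopping times tau > 0.
\[
  \nu(x) \;=\; \sup_{\tau > 0}
  \frac{\mathbb{E}\!\left[\sum_{t=0}^{\tau-1} \beta^{t}\, r(x_t) \,\middle|\, x_0 = x\right]}
       {\mathbb{E}\!\left[\sum_{t=0}^{\tau-1} \beta^{t} \,\middle|\, x_0 = x\right]} .
\]
```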
Double or Nothing: Multiplicative Incentive Mechanisms for Crowdsourcing  [PDF]
Nihar B. Shah, Dengyong Zhou
Computer Science, 2014
Abstract: Crowdsourcing has gained immense popularity in machine learning applications for obtaining large amounts of labeled data. Crowdsourcing is cheap and fast, but suffers from the problem of low-quality data. To address this fundamental challenge in crowdsourcing, we propose a simple payment mechanism to incentivize workers to answer only the questions that they are sure of and skip the rest. We show that, surprisingly, under a mild and natural "no-free-lunch" requirement, this mechanism is the one and only incentive-compatible payment mechanism possible. We also show that among all possible incentive-compatible mechanisms (that may or may not satisfy no-free-lunch), our mechanism makes the smallest possible payment to spammers. We further extend our results to a more general setting in which workers are required to provide a quantized confidence for each question. Interestingly, this unique mechanism takes a "multiplicative" form. The simplicity of the mechanism is an added benefit. In preliminary experiments involving over 900 worker-task pairs, we observe a significant drop in the error rates under this unique mechanism for the same or lower monetary expenditure.
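The abstract describes the mechanism only as "multiplicative"; the sketch below is one plausible reading of the title, not the paper's exact payment rule. The constants, the gold-question setup, and the skip handling are all assumptions:

```python
def double_or_nothing_payment(gold_results, base=0.1, max_payment=1.0):
    """Illustrative 'double or nothing'-style rule scored on gold-standard
    questions embedded in the task. Each correct answer doubles the running
    payment; a skip leaves it unchanged; any wrong answer pays nothing
    (so a random spammer earns essentially nothing, in the spirit of the
    no-free-lunch requirement). The paper's exact constants and
    normalization may differ."""
    payment = base
    for outcome in gold_results:
        if outcome == "wrong":
            return 0.0
        if outcome == "correct":
            payment *= 2
    return min(payment, max_payment)

print(double_or_nothing_payment(["correct", "skip", "correct"]))  # 0.4
```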