Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost
Citations over time: top 10% of 1988 papers
Abstract
The authors consider multiarmed bandit problems with switching cost, define uniformly good allocation rules, and restrict attention to such rules. They present a lower bound on the asymptotic performance of uniformly good allocation rules and construct an allocation scheme that achieves the bound. Despite the inclusion of a switching cost, the proposed allocation scheme achieves the same asymptotic performance as the optimal rule for the bandit problem without switching cost. This is made possible by grouping samples into blocks of increasing size, thereby reducing the number of switches to O(log n). Finally, an optimal allocation scheme is illustrated for a large class of distributions that includes members of the exponential family.
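
The effect of block sampling on the number of switches can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the scheme constructed in the paper: the arms are Bernoulli, the index is the well-known UCB1 index (a stand-in for the paper's allocation rule), and each arm's block length simply doubles every time that arm is chosen. The function name `block_ucb` and its parameters are hypothetical. With per-arm doubling, an arm that has been chosen b times has been sampled at least 2^(b-1) times, so K arms over n samples produce at most K(log2(n) + 1) blocks, and therefore at most that many switches. The sketch shows only this switch-count effect and makes no claim about asymptotic efficiency.

```python
import math
import random


def block_ucb(means, horizon, seed=0):
    """Sample Bernoulli arms in doubling blocks; return (pulls per arm, switches)."""
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k    # samples taken from each arm so far
    wins = [0] * k     # total reward collected from each arm
    blocks = [0] * k   # number of blocks already played on each arm
    switches = 0
    last = None
    t = 0
    while t < horizon:
        untried = [a for a in range(k) if pulls[a] == 0]
        if untried:
            arm = untried[0]
        else:
            # UCB1 index: an assumption standing in for the paper's rule.
            arm = max(
                range(k),
                key=lambda a: wins[a] / pulls[a]
                + math.sqrt(2 * math.log(t) / pulls[a]),
            )
        if last is not None and arm != last:
            switches += 1
        last = arm
        # Block length doubles each time this arm is chosen (assumed schedule),
        # truncated so the total never exceeds the horizon.
        length = min(2 ** blocks[arm], horizon - t)
        for _ in range(length):
            wins[arm] += 1 if rng.random() < means[arm] else 0
            pulls[arm] += 1
        blocks[arm] += 1
        t += length
    return pulls, switches


if __name__ == "__main__":
    n = 10_000
    pulls, switches = block_ucb([0.3, 0.5, 0.7], n)
    bound = 3 * (math.log2(n) + 1)
    print(f"pulls={pulls} switches={switches} bound={bound:.1f}")
```

The switch bound holds for any reward sequence because it depends only on the block schedule, not on which arm the index favors. A schedule that grows this aggressively may sample a poor arm for longer than necessary; choosing block sizes that keep the regret at the lower bound is the part that the paper's construction addresses.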