Stochastic bandits robust to adversarial corruptions

Renato Paes Leme

Thodoris Lykouris

Vahab Mirrokni

Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, ?114-122

Google Scholar

Abstract

We introduce a new model of stochastic bandits with adversarial corruptions which aims to capture settings where most of the input follows a stochastic pattern but some fraction of it can be adversarially changed to trick the algorithm, e.g., click fraud, fake reviews and email spam. The goal of this model is to encourage the design of bandit algorithms that (i) work well in mixed adversarial and stochastic models, and (ii) whose performance deteriorates gracefully as we move from fully stochastic to fully adversarial models. We consider a setting with $K$ arms with underlying stochastic distributions such that $\Delta(a)$ is the difference between the mean of arm $a$ and the optimal arm $a^*$. The total corruption $C$ corresponds to the total perturbation that an adversary can add to the stochastic input. Our main result is an algorithm with regret bounded by {$\bigO\prn*{\sum_{a{\neq a^*}}\frac{K\cdot C\log\prn*{\nicefrac{KT}{\delta}}+\log\prn*{T}}{\Delta(a)}\cdot\log\prn*{\nicefrac{KT}{\delta}}}$ with $1-\delta$ probability and pseudoregret $\bigO\prn*{\sum_{a{\neq a^*}}\frac{K\cdot C+\log\prn*{T}}{\Delta(a)}\cdot\log\prn*{KT}}$.} Notably, our algorithm is agnostic to the total corruption and the bound holds with respect to the total corruption that was added in retrospect. We also provide a lower bound showing that the linear dependency on the corruption level is necessary.

Research Areas

Machine Intelligence

Defining the technology of today and tomorrow.

Philosophy

People

Teams

AI/ML Foundations  & Capabilities

Algorithms & Optimization

Computing Paradigms

Responsible Human-Centric Technology

Science & Societal Impact

Projects

Publications

Resources

Shaping the future, together.

Student programs

Faculty programs

Conferences & events

Stochastic bandits robust to adversarial corruptions

Abstract

Research Areas

Learn more about how we conduct our research

Defining the technology of today and tomorrow.

Philosophy

People

Teams

AI/ML Foundations & Capabilities

Algorithms & Optimization

Computing Paradigms

Responsible Human-Centric Technology

Science & Societal Impact

Projects

Publications

Resources

Shaping the future, together.

Student programs

Faculty programs

Conferences & events

Stochastic bandits robust to adversarial corruptions

Abstract

Research Areas

Learn more about how we conduct our research

AI/ML Foundations  & Capabilities