Heterogeneous Stochastic Interactions for Multiple Agents in a Multi-armed Bandit Problem.

Udari Madhushani
Princeton University
Naomi Leonard
Princeton University
European Control Conference ECC 2019

Overview

We define and analyze a multi-agent multi-armed bandit problem in which decision-making agents can observe the choices and rewards of their neighbors. Neighbors are defined by a network graph with heterogeneous and stochastic interconnections. These interactions are determined by the sociability of each agent, which corresponds to the probability that the agent observes its neighbors. We design an algorithm for each agent to maximize its own expected cumulative reward and prove performance bounds that depend on the sociability of the agents and the network structure. We use the bounds to predict the rank ordering of agents according to their performance and verify the accuracy analytically and computationally.
Average expected cumulative group regret for agent 1 with observation probability \(p_1=1\). Agent 1 performs better when its neighbors do not observe their neighbors.
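The observation model described in the overview can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the algorithm from the paper: it uses a standard UCB1 index as a stand-in decision rule, Bernoulli rewards, and a single coin flip per agent per step that decides whether the agent sees all of its neighbors or none of them. The function name `simulate` and its parameters (`adjacency`, `sociability`, `arm_means`, `horizon`, `seed`) are hypothetical names chosen for illustration.

```python
import math
import random


def simulate(adjacency, sociability, arm_means, horizon, seed=0):
    """Return each agent's cumulative expected regret after `horizon` steps.

    adjacency[i] lists the neighbors of agent i; sociability[i] is the
    probability that agent i observes its neighbors at a given step.
    """
    rng = random.Random(seed)
    n_agents, n_arms = len(adjacency), len(arm_means)
    best_mean = max(arm_means)
    # Per-agent statistics pool the agent's own pulls and observed neighbor pulls.
    counts = [[0] * n_arms for _ in range(n_agents)]
    sums = [[0.0] * n_arms for _ in range(n_agents)]
    regret = [0.0] * n_agents

    def choose(agent, t):
        # Pull any arm with no samples yet, then fall back to the UCB1 index
        # (assumption: UCB1 stands in for the paper's decision rule).
        for arm in range(n_arms):
            if counts[agent][arm] == 0:
                return arm
        return max(
            range(n_arms),
            key=lambda arm: sums[agent][arm] / counts[agent][arm]
            + math.sqrt(2.0 * math.log(t) / counts[agent][arm]),
        )

    for t in range(1, horizon + 1):
        # All agents choose simultaneously, using statistics from earlier steps.
        arms = [choose(agent, t) for agent in range(n_agents)]
        rewards = [1.0 if rng.random() < arm_means[a] else 0.0 for a in arms]
        for agent in range(n_agents):
            regret[agent] += best_mean - arm_means[arms[agent]]
            sources = [agent]
            # Assumption: one coin flip per step decides whether the agent
            # sees all of its neighbors or none of them.
            if rng.random() < sociability[agent]:
                sources += adjacency[agent]
            for src in sources:
                counts[agent][arms[src]] += 1
                sums[agent][arms[src]] += rewards[src]
    return regret
```

For example, calling `simulate([[1, 2], [0], [0]], [1.0, 0.0, 0.0], [0.2, 0.8], 1000)` models a three-agent star graph in which the center agent always observes its neighbors and the two leaf agents never observe anyone. Averaging the returned regrets over many seeds gives a rough way to compare agents with different sociability values.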

BibTeX

@inproceedings{madhushani2019heterogeneous,
  title={Heterogeneous stochastic interactions for multiple agents in a multi-armed bandit problem},
  author={Madhushani, Udari and Leonard, Naomi Ehrich},
  booktitle={2019 18th European Control Conference (ECC)},
  pages={3502--3507},
  year={2019},
  organization={IEEE}
}