Work in Progress
Multi-robot optimal coverage in unknown fields.
Improving sample efficiency of multi-agent \(Q\)-Learning.
Bio-inspired risk aversion in sequential decision making.
A fundamental drawback in natural extensions of UCB to multi-agent multi-armed bandits (Workshop paper).
Theme: Importance-sampling for data-efficient RL (2020)
Hamiltonian \(Q\)-Learning: Leveraging Importance-sampling for Data Efficient RL
We propose a data-efficient modification of \(Q\)-learning that uses Hamiltonian Monte Carlo to compute the \(Q\) function for problems with stochastic, high-dimensional dynamics.
Theme: Cost-effective communication protocols for distributed bandits (2019-2020)
We study how agents can minimize communication cost by deciding when and what to communicate depending on the sequence of options they chose.
Broadcast when Exploring: Cost-effective Communication in Distributed Stochastic Bandits
Cooperative Bandits: A Class of Communication Protocols with Logarithmic Communication Cost
Distributed Learning: Sequential Decision Making in Resource-Constrained Environments
PML4DC workshop, ICLR 2020
We design a partial communication protocol that obtains the same order of performance as full communication at a significantly smaller communication cost.
A Dynamic Observation Strategy for Multi-agent Multi-armed Bandit Problem
We propose a new communication protocol for the multi-agent multi-armed bandit problem that improves group performance with only a logarithmic communication cost.
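To make the idea of "deciding when to communicate" concrete, here is a minimal simulation sketch. It is not the protocol from any of the papers above. It assumes one hypothetical rule loosely suggested by the title "Broadcast when Exploring": an agent running standard UCB1 broadcasts its arm and reward only when the arm it pulls is not its current empirical best. The function name, the Gaussian rewards with unit variance, and the all-to-all broadcast are all assumptions made for illustration.

```python
import math
import random


def simulate_broadcast_when_exploring(means, n_agents, horizon, seed=0):
    """Count broadcasts when agents share a sample only on exploratory pulls.

    Hypothetical rule (an assumption, not taken from the papers): a pull is
    "exploring" when the chosen arm differs from the agent's empirical best.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    counts = [[0] * n_arms for _ in range(n_agents)]
    sums = [[0.0] * n_arms for _ in range(n_agents)]
    broadcasts = 0

    for t in range(1, horizon + 1):
        messages = []
        for k in range(n_agents):
            untried = [a for a in range(n_arms) if counts[k][a] == 0]
            if untried:
                # An arm with no samples yet is always an exploratory pull.
                arm, exploring = untried[0], True
            else:
                est = [sums[k][a] / counts[k][a] for a in range(n_arms)]
                ucb = [est[a] + math.sqrt(2 * math.log(t) / counts[k][a])
                       for a in range(n_arms)]
                arm = max(range(n_arms), key=ucb.__getitem__)
                exploring = arm != max(range(n_arms), key=est.__getitem__)
            reward = rng.gauss(means[arm], 1.0)
            counts[k][arm] += 1
            sums[k][arm] += reward
            if exploring:
                messages.append((k, arm, reward))

        # Deliver this round's broadcasts to every other agent.
        for sender, arm, reward in messages:
            for k in range(n_agents):
                if k != sender:
                    counts[k][arm] += 1
                    sums[k][arm] += reward
        broadcasts += len(messages)

    return broadcasts
```

Calling simulate_broadcast_when_exploring([0.1, 0.9], n_agents=3, horizon=500) returns the total number of messages sent, which can be compared against the n_agents * horizon messages that full communication would use.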
Theme: Role of network structure and agent heterogeneity in Multi-agent Bandits (2017-2020)
We study the multi-agent multi-armed bandit problem in which agents observe their neighbors probabilistically.
Decentralized Stochastic Bandits with Probabilistic Communications
Distributed Bandits: Probabilistic Communication on \(d\)-regular Graphs
We analyze how agent-based strategies contribute to minimizing group regret under communication failures.
Heterogeneous Explore-Exploit Strategies on Multi-Star Networks
IEEE Control Systems Letters, 2020
For distributed bandits with a multi-star communication graph, we show how sampling rules for center agents that favor exploring over exploiting make the information that center agents broadcast to their neighbors more useful and improve group performance.
Heterogeneous Stochastic Interactions for Multiple Agents in a Multi-armed Bandit Problem
We consider the case where each agent observes all its neighbors independently with the same probability. We show that the performance of each agent depends on its own observation probability and on those of its neighbors.
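The probabilistic-observation setting in this theme can be sketched as a small simulation. This is an illustrative sketch only, not the algorithm analyzed in the papers. It assumes standard UCB1 at each agent, Gaussian rewards with unit variance, and one reading of the observation model: agent k sees each neighbor's arm and reward independently with probability obs_prob[k]. The function and parameter names are hypothetical.

```python
import math
import random


def run_probabilistic_ucb(means, neighbors, obs_prob, horizon, seed=0):
    """Return per-agent expected regret under probabilistic observations.

    neighbors[k] lists the agents that agent k may observe; obs_prob[k] is
    the (assumed) probability that agent k sees any one neighbor's sample.
    """
    rng = random.Random(seed)
    n_agents, n_arms = len(neighbors), len(means)
    counts = [[0] * n_arms for _ in range(n_agents)]
    sums = [[0.0] * n_arms for _ in range(n_agents)]
    regret = [0.0] * n_agents
    best = max(means)

    for t in range(1, horizon + 1):
        choices, rewards = [], []
        for k in range(n_agents):
            untried = [a for a in range(n_arms) if counts[k][a] == 0]
            if untried:
                arm = untried[0]
            else:
                # UCB1 index built from own and observed samples.
                arm = max(
                    range(n_arms),
                    key=lambda a: sums[k][a] / counts[k][a]
                    + math.sqrt(2 * math.log(t) / counts[k][a]),
                )
            choices.append(arm)
            rewards.append(rng.gauss(means[arm], 1.0))
            regret[k] += best - means[arm]

        for k in range(n_agents):
            # Each agent always sees its own sample.
            counts[k][choices[k]] += 1
            sums[k][choices[k]] += rewards[k]
            # It sees each neighbor's sample with probability obs_prob[k].
            for j in neighbors[k]:
                if rng.random() < obs_prob[k]:
                    counts[k][choices[j]] += 1
                    sums[k][choices[j]] += rewards[j]

    return regret
```

For example, run_probabilistic_ucb([0.2, 0.8], neighbors=[[1], [0, 2], [1]], obs_prob=[0.9, 0.5, 0.1], horizon=1000) simulates three agents on a line graph with different observation probabilities and returns one regret value per agent.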
Theme: Geometric Controls for Path Planning (2015-2018)
Feedback Regularization and Geometric PID Control for Robust Stabilization of a Planar Three-link Hybrid Bipedal Walking Model
We propose a geometric PID controller to stabilize a three-link planar bipedal hybrid dynamic walking robot.
Semi-globally Exponential Trajectory Tracking for a Class of Spherical Robots
We propose a geometric feedback controller for spherical robots that can track a desired position on an inclined plane in the presence of parameter uncertainty and uncertainty in the inclination of the rolling surface.
Feedback Regularization and Geometric PID Control for Trajectory Tracking of Mechanical Systems: Hoop Robots on an Inclined Plane
We propose a geometric control strategy for semi-almost global output tracking for a class of interconnected underactuated mechanical systems.
I work as an assistant-in-teaching at Princeton University.