Udari Madhushani Sehwag

Visiting Postdoc, Stanford University

AI Research Scientist, JPMorgan

I am a research scientist at JPMorgan AI Research, where I lead an effort on developing socially intelligent and aligned generative models. I am also a visiting postdoctoral researcher in the Department of Computer Science at Stanford University, working with Prof. Diyi Yang and Prof. Jeannette Bohg. I completed my PhD at Princeton University, where I was advised by Prof. Naomi Leonard. My research vision is to give AI agents the ability to enhance their capabilities through collective intelligence, ultimately enabling them to coexist seamlessly with humans and augment human cognitive and physical capabilities. My work focuses primarily on developing socially intelligent and aligned AI agents. Before joining Princeton, I completed my undergraduate degree at the University of Peradeniya, Sri Lanka.
Research Experience
Visiting postdoc
Stanford University
August 2023 - Present

Research scientist
JPMorgan AI Research
July 2023 - Present

Graduate student
Princeton University
September 2017 - May 2023

Research scientist intern
DeepMind
May 2022 - September 2022

Research scientist intern
FAIR
May 2021 - August 2021

Summer research intern
Siemens
May 2020 - August 2020


News

02/2024
Paper on "A Heterogeneous Agent Model of Mortgage Servicing: An Income-based Relief Analysis" at AIFinSi workshop, AAAI 2024.
12/2023
Paper on "O3D: Offline Data-driven Discovery and Distillation for Sequential Decision-Making with Large Language Models" at FMDM workshop, NeurIPS 2023.
07/2023
Defended my PhD.
05/2023
Paper on "Heterogeneous Social Value Orientation Leads to Meaningful Diversity in Sequential Social Dilemmas" at ALA workshop, AAMAS 2023.
05/2023
Defended my PhD.
12/2022
Presented our work "On Using Hamiltonian Monte Carlo Sampling for Reinforcement Learning" at CDC 2022.
12/2022
Organized a PNAS special issue symposium on "Collective Artificial Intelligence and Evolutionary Dynamics".
09/2022
Finished summer internship (Research Scientist Intern: Game Theory and Multi-agent team) at DeepMind.
07/2022
Paper on "A Regret Minimization Approach to Multi-Agent Control" at ICML 2022.
06/2022
Paper on "Provably Efficient Multi-Agent Reinforcement Learning with Fully Decentralized Communication" at ACC 2022.
12/2021
Presented our work "One More Step Towards Reality: Cooperative Bandits with Imperfect Communication" at NeurIPS 2021.
08/2021
Finished summer internship (Research Scientist Intern: FAIR labs) at Meta AI Research.
06/2021
Presented our work "Distributed Bandits: Probabilistic Communication on \(d\)-regular Graphs" at ECC 2021.
06/2021
Presented our work "Cost-effective Communication Strategies for Distributed Learning Systems" at ECC 2021.
05/2021
Presented our work "Heterogeneous Explore-Exploit Strategies on Multi-Star Networks" at ACC 2021.
12/2020
Presented our work "It Doesn't Get Better and Here's Why: A Fundamental Drawback in Natural Extensions of UCB to Multi-agent Bandits" at ICBINB workshop, NeurIPS 2020.
11/2020
Our paper "Heterogeneous Explore-Exploit Strategies on Multi-Star Networks" got accepted to IEEE Control Systems Letters.
09/2020
Received Britt and Eli Harari Fellowship from the Department of Mechanical and Aerospace Engineering, Princeton University.
08/2020
Finished summer internship (Graduate Intern: AI/Deep Learning for Predictive Analytics) at Siemens.
05/2020
Presented our work "A Dynamic Observation Strategy for Multi-agent Multi-armed Bandit Problem" at ECC 2020.
04/2020
Presented our work "Distributed Learning: Sequential Decision Making in Resource-Constrained Environments" at PML4DC workshop, ICLR 2020.
09/2019
Received Larisse Rosentweig Klein Memorial Award from the Department of Mechanical and Aerospace Engineering, Princeton University.
08/2019
Received a Presidential Award for Scientific Publication from the Sri Lankan National Research Council.
06/2019
Presented our work "Heterogeneous Stochastic Interactions for Multiple Agents in a Multi-armed Bandit Problem" at ECC 2019.
09/2018
Received Martin Summerfield Graduate Fellowship from the Department of Mechanical and Aerospace Engineering, Princeton University.
09/2018
Received Athena-Feron Prize from the Department of Mechanical and Aerospace Engineering, Princeton University.
06/2018
Presented our work "Feedback Regularization and Geometric PID Control for Robust Stabilization of a Planar Three-link Hybrid Bipedal Walking Model" at ACC 2018.
03/2018
Received Elliotte Robinson Little '25 Student Aid Fund Fellowship from the School of Engineering and Applied Science, Princeton University.

Recent Projects

Work in Progress

I am co-organizing a Proceedings of the National Academy of Sciences (PNAS) special issue on "Collective Artificial Intelligence".

with Karl Tuyls, Jakob Foerster, Joshua Plotkin and Arne Traulsen

A Holistic Approach to Collective Alignment.

with Alex McAvoy, Weiyan Shi, Diyi Yang, Sayash Kapoor, Peter Henderson, Arvind Narayanan

Socially Intelligent LLMs: Can LLMs Understand Reputation?

with Weiyan Shi

Generative policy.

with Arjun Karanam, José Enríquez, Sanmi Koyejo, Michael Bernstein

Coordinating Collaborative, Multi-Agent Manipulation through Large Language Models.

with Arjun Karanam, Rika Antonova, Mandi Zhao, Shuran Song, Jeannette Bohg


Generative AI: Social and responsible generative AI (2022-present)

Perspective on managing risks in Generative AI.

AI Risk Management Should Understand and Account for Both Safety and Security

Xiangyu Qi, ....., Udari Madhushani Sehwag, ....., Prateek Mittal

under review at ICML 2024

We develop methods for effective task decomposition using LLMs.

O3D: Offline Data-Driven Discovery and Distillation for Sequential Decision Making with Large Language Models

Yuchen Xiao, Yanchao Sun, Mengda Xu, Udari Madhushani, Jared Vann, Deepeka Garg, Sumitra Ganesh

under review at COLM 2024

FMDM workshop at NeurIPS 2023

Multi-agent Learning: Social intelligence in multi-agent RL (2015-present)

Autocratic Learning and Unilateral Incentive Alignment in Two-player Stochastic Games

Alex McAvoy, Udari Madhushani, Christian Hilbe, Wolfram Barfuss, Krishnendu Chatterjee, Qi Su, Naomi Ehrich Leonard, Joshua B. Plotkin

conditionally accepted at PNAS 2024

Collective Cooperative Intelligence

Wolfram Barfuss, Jessica Flack, Chaitanya S. Gokhale, Lewis Hammond, Christian Hilbe, Joel Leibo, Tom Lenaerts, Naomi Leonard, Simon Levin, Udari Madhushani, Alex McAvoy, Janusz M. Meylahn, Fernando P. Santos

under review at PNAS 2024

Zero-shot generalization: We develop methods that allow agents to interact successfully with novel partners at test time in mixed-motive games.

Heterogeneous Social Value Orientation Leads to Meaningful Diversity in Sequential Social Dilemmas

Udari Madhushani, Kevin McKee, John Agapiou, Joel Z Leibo, Thomas Anthony, Richard Everett, Edward Hughes, Karl Tuyls, Edgar Duéñez-Guzmán

AAMAS 2023

Effective communication: We study how agents can minimize communication cost by deciding when and what to communicate based on the sequence of options they choose.

One More Step Towards Reality: Cooperative Bandits with Imperfect Communication

Udari Madhushani, Abhimanyu Dubey, Naomi Leonard, Alex Pentland

NeurIPS 2021

Distributed Bandits: Probabilistic Communication on \(d\)-regular Graphs

Udari Madhushani, Naomi Ehrich Leonard

ECC 2021

We analyze how agent-based strategies contribute to minimizing group regret under communication failures.

Distributed Learning: Sequential Decision Making in Resource-Constrained Environments

Udari Madhushani, Naomi Ehrich Leonard

PML4DC workshop, ICLR 2020

We design a partial communication protocol that achieves the same order of performance as full communication at a significantly smaller communication cost.

A Dynamic Observation Strategy for Multi-agent Multi-armed Bandit Problem

Udari Madhushani, Naomi Ehrich Leonard

ECC, 2020

We propose a new communication protocol for the multi-agent multi-armed bandit problem that improves group performance with only a logarithmic communication cost.

Heterogeneous Explore-Exploit Strategies on Multi-Star Networks

Udari Madhushani, Naomi Ehrich Leonard

IEEE Control Systems Letters, 2020

For distributed bandits with a multi-star communication graph, we show how sampling rules for center agents that favor exploring over exploiting make the information that center agents broadcast to their neighbors more useful and improve group performance.

Heterogeneous Stochastic Interactions for Multiple Agents in a Multi-armed Bandit Problem

Udari Madhushani, Naomi Ehrich Leonard

ECC, 2019

We consider the case where each agent observes all its neighbors independently with the same probability. We show that the performance of each agent depends on its own observation probability and those of its neighbors.



Embodied AI: Human-robot coordination, control and planning (2015-present)

Multi-robot Learning and Coverage of Unknown Spatial Fields

Maria Santos, Udari Madhushani, Alessia Benevento, Naomi Leonard

MRS, 2021

We propose a novel explore-exploit based method for coverage in unknown spatial fields.

Feedback Regularization and Geometric PID Control for Robust Stabilization of a Planar Three-link Hybrid Bipedal Walking Model

Lasitha Weerakoon, Udari Madhushani, Sanjeeva Maithripala, Jordan Berg

ACC, 2018

We propose a geometric PID controller to stabilize a three-link planar bipedal hybrid dynamic walking robot.

Semi-globally Exponential Trajectory Tracking for a Class of Spherical Robots

Udari Madhushani, Sanjeeva Maithripala, Janaka Wijayakulasooriya, Jordan Berg

Automatica, 2017

We propose a geometric feedback controller for spherical robots capable of tracking a desired position on an inclined plane, in the presence of parameter uncertainty and uncertainty of the inclination of the rolling surface.

Feedback Regularization and Geometric PID Control for Trajectory Tracking of Mechanical Systems: Hoop Robots on an Inclined Plane

Udari Madhushani, Sanjeeva Maithripala, Jordan Berg

ACC, 2017

We propose a geometric control strategy for semi-almost global output tracking for a class of interconnected underactuated mechanical systems.


Deep RL: Importance-sampling for data-efficient RL (2020-2022)

On Using Hamiltonian Monte Carlo Sampling for Reinforcement Learning Problems in High-dimension

Udari Madhushani, Biswadip Dey, Naomi Ehrich Leonard, Amit Chakraborty

CDC 2022

We propose a data-efficient modification of \(Q\)-learning that uses Hamiltonian Monte Carlo to compute the \(Q\) function for problems with stochastic, high-dimensional dynamics.


Teaching

I worked as an assistant-in-teaching at Princeton University.

Fall '20
MAE 542 Advanced Dynamics.
Instructor: Naomi Leonard
Spring '20
MAE 502/APC 506 Mathematical Methods of Engineering Analysis.
Instructor: Clarence Rowley
Fall '19
MAE 345/MAE 549 Introduction to Robotics.
Instructor: Anirudha Majumdar