Udari Madhushani Sehwag

AI Research Scientist, JP Morgan

Visiting Researcher, Stanford University

I am a research scientist at JPMorgan AI Research, where I lead efforts on improving the capabilities of generative models and on evaluating and aligning them. I am also a visiting postdoctoral researcher at the Department of Computer Science, Stanford University. I completed my PhD at Princeton University, where I was advised by Prof. Naomi Leonard. My research vision is to give AI agents the ability to enhance their capabilities through collective intelligence, ultimately enabling them to coexist seamlessly with humans. My research focuses primarily on developing capable and aligned AI agents safely and responsibly. Before joining Princeton, I completed my undergraduate degree at the University of Peradeniya, Sri Lanka.
Research Experience
Visiting researcher
Stanford University
August 2023 - Present

Research scientist
JPMorgan AI Research
July 2023 - Present

Graduate student
Princeton University
September 2017 - May 2023

Research scientist intern
DeepMind
May 2022 - September 2022

Research scientist intern
FAIR
May 2021 - August 2021

Summer research intern
Siemens
May 2020 - August 2020


News

02/2024
Paper on "A Heterogeneous Agent Model of Mortgage Servicing: An Income-based Relief Analysis" at AIFinSi workshop, AAAI 2024.
12/2023
Paper on "O3D: Offline Data-driven Discovery and Distillation for Sequential Decision-Making with Large Language Models" at FMDM workshop, NeurIPS 2023.
05/2023
Paper on "Heterogeneous Social Value Orientation Leads to Meaningful Diversity in Sequential Social Dilemmas" at ALA workshop, AAMAS 2023.
05/2023
Defended my PhD.
12/2022
Presented our work "On Using Hamiltonian Monte Carlo Sampling for Reinforcement Learning" at CDC 2022.
12/2022
Organized a PNAS special issue symposium on "Collective Artificial Intelligence and Evolutionary Dynamics".
09/2022
Finished summer internship (Research Scientist Intern: Game Theory and Multi-agent team) at DeepMind.
07/2022
Paper on "A Regret Minimization Approach to Multi-Agent Control" at ICML 2022.
06/2022
Paper on "Provably Efficient Multi-Agent Reinforcement Learning with Fully Decentralized Communication" at ACC 2022.
12/2021
Presented our work "One More Step Towards Reality: Cooperative Bandits with Imperfect Communication" at NeurIPS 2021.
08/2021
Finished summer internship (Research Scientist Intern: FAIR labs) at Meta AI Research.
06/2021
Presented our work "Distributed Bandits: Probabilistic Communication on \(d\)-regular Graphs" at ECC 2021.
06/2021
Presented our work "Cost-effective Communication Strategies for Distributed Learning Systems" at ECC 2021.
05/2021
Presented our work "Heterogeneous Explore-Exploit Strategies on Multi-Star Networks" at ACC 2021.
12/2020
Presented our work "It Doesn't Get Better and Here's Why: A Fundamental Drawback in Natural Extensions of UCB to Multi-agent Bandits" at ICBINB workshop, NeurIPS 2020.
11/2020
Our paper "Heterogeneous Explore-Exploit Strategies on Multi-Star Networks" got accepted to IEEE Control Systems Letters.
09/2020
Received Britt and Eli Harari Fellowship from the Department of Mechanical and Aerospace Engineering, Princeton University.
08/2020
Finished summer internship (Graduate Intern: AI/Deep Learning for Predictive Analytics) at Siemens.
05/2020
Presented our work "A Dynamic Observation Strategy for Multi-agent Multi-armed Bandit Problem" at ECC 2020.
04/2020
Presented our work "Distributed Learning: Sequential Decision Making in Resource-Constrained Environments" at PML4DC workshop, ICLR 2020.
09/2019
Received Larisse Rosentweig Klein Memorial Award from the Department of Mechanical and Aerospace Engineering, Princeton University.
08/2019
Received a Presidential Award for Scientific Publication from the Sri Lankan National Research Council.
06/2019
Presented our work "Heterogeneous Stochastic Interactions for Multiple Agents in a Multi-armed Bandit Problem" at ECC 2019.
09/2018
Received Martin Summerfield Graduate Fellowship from the Department of Mechanical and Aerospace Engineering, Princeton University.
09/2018
Received Athena-Feron Prize from the Department of Mechanical and Aerospace Engineering, Princeton University.
06/2018
Presented our work "Feedback Regularization and Geometric PID Control for Robust Stabilization of a Planar Three-link Hybrid Bipedal Walking Model" at ACC 2018.
03/2018
Received Elliotte Robinson Little '25 Student Aid Fund Fellowship from the School of Engineering and Applied Science, Princeton University.

Recent Projects

Work in Progress

I am co-organizing a Proceedings of the National Academy of Sciences (PNAS) special issue on "Collective Artificial Intelligence".

with Karl Tuyls, Jakob Foerster, Joshua Plotkin and Arne Traulsen

Agentyx: A Framework for Building LLM Based Multi-agent Systems

with Leo Ardon, Jared Vann, Sivapriya Vellaichamy, Mani Ganapathy, Sumitra Ganesh

A Holistic Approach for Evaluating Agentic Systems

with Francesca Mosca, Deepeka Garg, Leo Ardon, Sumitra Ganesh

Weak to Strong Generalization

with Aakriti Agrawal, Mucong Ding, Furong Huang

Enhanced Embodied Intelligence Through Multimodal Foundation Model Based Agents

with Arjun Karanam, Rika Antonova, Shuran Song, Jeannette Bohg


Generative AI: Social and responsible generative AI (2022-present)

GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-Time Alignment

Yuancheng Xu, Udari Madhushani Sehwag, Alec Koppel, Sicheng Zhu, Bang An, Furong Huang, Sumitra Ganesh

under review at Pluralistic Alignment workshop at NeurIPS 2024

under review at ICLR 2025

Collab: Controlled Decoding using Mixture of Agents for LLM Alignment

Souradip Chakraborty, Sujay Bhatt, Udari Madhushani Sehwag, Soumya Suvra Ghosal, Jiahao Qiu, Mengdi Wang, Dinesh Manocha, Furong Huang, Alec Koppel, Sumitra Ganesh

under review at Pluralistic Alignment workshop at NeurIPS 2024

under review at ICLR 2025

AdvBDGen: Adversarially Fortified Prompt-Specific Fuzzy Backdoor Generator Against LLM Alignment

Pankayaraj Pathmanathan, Udari Madhushani Sehwag, Michael-Andrei Panaitescu-Liess, Furong Huang

under review at Safe Generative AI workshop at NeurIPS 2024

under review at ICLR 2025

Policy Dreamer: Diverse Public Policy Generation Via Elicitation and Simulation of Human Preferences

Arjun Karanam, Jose Ramon Enriquez, Udari Madhushani Sehwag, Kanishk Gandhi, Michael Elabd, Noah Goodman, Sanmi Koyejo

under review at Pluralistic Alignment workshop at NeurIPS 2024

Generative AI Agents for Knowledge Work Augmentation in Finance

Sumitra Ganesh, Leo Ardon, Daniel Borrajo, Deepeka Garg, Udari Madhushani Sehwag, Annapoorni Narayanan, Giuseppe Canonaco, Manuela Veloso

under review at Annual Reviews 2024

Can LLMs be Scammed? A Baseline Measurement Study

Udari Madhushani Sehwag*, Kelly Patel*, Francesca Mosca*, Vineeth Ravi, Jessica Staddon

under review at EMNLP 2024

In-Context Learning with Topological Information for LLM-Based Knowledge Graph Completion

Udari Madhushani Sehwag*, Kassiani Papasotiriou*, Jared Vann, Sumitra Ganesh

under review at EMNLP 2024

SPIGM workshop at ICML 2024

O3D: Offline Data-Driven Discovery and Distillation for Sequential Decision Making with Large Language Models

Yuchen Xiao, Yanchao Sun, Mengda Xu, Udari Madhushani Sehwag, Jared Vann, Deepeka Garg, Sumitra Ganesh

COLM 2024

FMDM workshop at NeurIPS 2023

SORRY-Bench: A Systematic Evaluation on Large Language Model Safety Refusal Behaviors

Tinghao Xie, Xiangyu Qi, Yi Zeng, Yangsibo Huang, Udari Madhushani Sehwag, Boyi Wei, Luxi He, Kaixuan Huang, Dacheng Li, Ying Sheng, Bo Li, Danqi Chen, Kai Li, Peter Henderson, Prateek Mittal

under review at NeurIPS 2024

AI Risk Management Should Understand and Account for Both Safety and Security

Xiangyu Qi, ....., Udari Madhushani Sehwag, ....., Prateek Mittal

under review 2024

Multi-agent Learning: Social intelligence in multi-agent RL (2015-present)

Autocratic Learning and Unilateral Incentive Alignment in Two-player Stochastic Games

Alex McAvoy, Udari Madhushani, Christian Hilbe, Wolfram Barfuss, Krishnendu Chatterjee, Qi Su, Naomi Ehrich Leonard, Joshua B. Plotkin

Accepted at PNAS 2024

Collective Cooperative Intelligence

Wolfram Barfuss, Jessica Flack, Chaitanya S. Gokhale, Lewis Hammond, Christian Hilbe, Joel Leibo, Tom Lenaerts, Naomi Leonard, Simon Levin, Udari Madhushani, Alex McAvoy, Janusz M. Meylahn, Fernando P. Santos

Accepted at PNAS 2024

Zero-shot generalization: We develop methods that allow agents to interact successfully with novel partners at test time in mixed-motive games.

Heterogeneous Social Value Orientation Leads to Meaningful Diversity in Sequential Social Dilemmas

Udari Madhushani, Kevin McKee, John Agapiou, Joel Z Leibo, Thomas Anthony, Richard Everett, Edward Hughes, Karl Tuyls, and Edgar Duéñez-Guzmán

AAMAS 2023

Effective communication: We study how agents can minimize communication cost by deciding when and what to communicate depending on the sequence of options they chose.

One More Step Towards Reality: Cooperative Bandits with Imperfect Communication

Udari Madhushani, Abhimanyu Dubey, Naomi Leonard, Alex Pentland

NeurIPS 2021

Distributed Bandits: Probabilistic Communication on \(d\)-regular Graphs

Udari Madhushani, Naomi Ehrich Leonard

ECC 2021

We analyze how agent-based strategies contribute to minimizing group regret under communication failures.

Distributed Learning: Sequential Decision Making in Resource-Constrained Environments

Udari Madhushani, Naomi Ehrich Leonard

PML4DC workshop, ICLR 2020

We design a partial communication protocol that obtains the same order of performance as full communication for a significantly smaller communication cost.

A Dynamic Observation Strategy for Multi-agent Multi-armed Bandit Problem

Udari Madhushani, Naomi Ehrich Leonard

ECC, 2020

We propose a new communication protocol for the multi-agent multi-armed bandit problem that improves group performance with only a logarithmic communication cost.

Heterogeneous Explore-Exploit Strategies on Multi-Star Networks

Udari Madhushani, Naomi Ehrich Leonard

IEEE Control Systems Letters, 2020

For distributed bandits with a multi-star communication graph, we show how sampling rules for center agents that favor exploring over exploiting make the information that center agents broadcast to their neighbors more useful and improve group performance.

Heterogeneous Stochastic Interactions for Multiple Agents in a Multi-armed Bandit Problem

Udari Madhushani, Naomi Ehrich Leonard

ECC, 2019

We consider the case where each agent observes all its neighbors independently with the same probability. We show that the performance of each agent depends on its own observation probability and on those of its neighbors.



Embodied AI: Human-robot coordination, control and planning (2015-present)

Multi-robot Learning and Coverage of Unknown Spatial Fields

Maria Santos, Udari Madhushani, Alessia Benevento, Naomi Leonard

MRS, 2021

We propose a novel explore-exploit based method for coverage in unknown spatial fields.

Feedback Regularization and Geometric PID Control for Robust Stabilization of a Planar Three-link Hybrid Bipedal Walking Model

Lasitha Weerakoon, Udari Madhushani, Sanjeeva Maithripala, Jordan Berg

ACC, 2018

We propose a geometric PID controller to stabilize a three-link planar bipedal hybrid dynamic walking robot.

Semi-globally Exponential Trajectory Tracking for a Class of Spherical Robots

Udari Madhushani, Sanjeeva Maithripala, Janaka Wijayakulasooriya, Jordan Berg

Automatica, 2017

We propose a geometric feedback controller for spherical robots capable of tracking a desired position on an inclined plane, in the presence of parameter uncertainty and uncertainty of the inclination of the rolling surface.

Feedback Regularization and Geometric PID Control for Trajectory Tracking of Mechanical Systems: Hoop Robots on an Inclined Plane

Udari Madhushani, Sanjeeva Maithripala, Jordan Berg

ACC, 2017

We propose a geometric control strategy for semi-almost global output tracking for a class of interconnected underactuated mechanical systems.


Deep RL: Importance-sampling for data-efficient RL (2020-2022)

On Using Hamiltonian Monte Carlo Sampling for Reinforcement Learning Problems in High-dimension

Udari Madhushani, Biswadip Dey, Naomi Ehrich Leonard, Amit Chakraborty

CDC 2022

We propose a data-efficient modification of the \(Q\)-learning approach which uses Hamiltonian Monte Carlo to compute the \(Q\) function for problems with stochastic, high-dimensional dynamics.


Teaching

I worked as an assistant-in-teaching at Princeton University.

Fall '20
MAE 542 Advanced Dynamics.
Instructor: Naomi Leonard
Spring '20
MAE 502/APC 506 Mathematical Methods of Engineering Analysis.
Instructor: Clarence Rowley
Fall '19
MAE 345/MAE 549 Introduction to Robotics.
Instructor: Anirudha Majumdar