Research interests: Reinforcement learning; large language models; code generation and reasoning; value alignment; artificial general intelligence.
Post-training of vision-language models for physical AI reasoning.
Developed a code agent that plans, verifies, and discovers new skills and knowledge.
Research on reinforcement learning and post-training of language models, focusing on code generation and reinforcement learning from human feedback (RLHF).
Research on meta-reinforcement learning and AI for scientific discovery.
Built an internal WebRTC-based tool to streamline cross-departmental communication.
Built a visualization interface for a programming-language analysis tool.
Advisors: Prof. Satinder Singh and Prof. Ed Durfee.
Research on value alignment and AI safety in reinforcement learning.
Undergraduate/master research advisors: Prof. Peter Stone and Prof. Dana Ballard.
Improving Reinforcement Learning from Human Feedback with Efficient Reward Model Ensemble
arXiv, 2024
paper
Graph-Transformer-based Surrogate Model for Accelerated Converter Circuit Topology Design
Design Automation Conference (DAC), 2024
paper
Adaptive Online Replanning with Diffusion Models
Conference on Neural Information Processing Systems (NeurIPS), 2023
paper
Planning with Large Language Models for Code Generation
International Conference on Learning Representations (ICLR), 2023
paper
Hyper-Decision Transformer for Efficient Online Policy Adaptation
International Conference on Learning Representations (ICLR), 2023
paper
Prompting Decision Transformer for Few-shot Policy Generalization
International Conference on Machine Learning (ICML), 2022
paper
Power Converter Circuit Design Automation using Parallel Monte Carlo Tree Search
ACM Transactions on Design Automation of Electronic Systems (TODAES), 2022
paper
From Specification to Topology: Automatic Power Converter Design via Reinforcement Learning
International Conference on Computer Aided Design (ICCAD), 2021
paper
Efficiently Finding Approximately-Optimal Queries for Improving Policies and Guaranteeing Safety
Ph.D. Dissertation, 2020
paper
Querying to Find a Safe Policy Under Uncertain Safety Constraints in Markov Decision Processes
AAAI Conference on Artificial Intelligence (AAAI), 2020
paper
Minimax-Regret Querying on Side Effects for Safe Optimality in Factored Markov Decision Processes
International Joint Conference on Artificial Intelligence (IJCAI), 2018
paper
Modeling Sensory-Motor Decisions in Natural Behavior
PLoS Computational Biology, 2018
paper
Approximately-Optimal Queries for Planning in Reward-Uncertain Markov Decision Processes
International Conference on Automated Planning and Scheduling (ICAPS), 2017
paper
Determining Placements of Influencing Agents in a Flock
International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2015
paper
Autonomous Intersection Management for Semi-Autonomous Vehicles
Handbook of Transportation, 2015
paper
AAAI 2019, AISTATS 2023-24, CVPR 2023, ICML 2023-24, NeurIPS 2023-25, ICLR 2024-25.
Style modified from mikepqr/resume-markdown