All Publications

  1. Panprediction: optimal predictions for any downstream task and loss
    Sivaraman Balakrishnan, Nika Haghtalab, Daniel Hsu, Brian Lee, Eric Zhao
    AISTATS 2026
  2. Prior makes it possible: from sublinear graph algorithms to LLM test-time methods
    Avrim Blum, Daniel Hsu, Cyrus Rashtchian, Donya Saless
    AISTATS 2026
  3. Group-realizable multi-group learning by minimizing empirical risk
    Navid Ardeshir, Samuel Deng, Daniel Hsu, Jingwen Liu
    ALT 2026
  4. Please Don't Kill My Vibe: Empowering Agents with Data Flow Control
    Charlie Summers, Haneen Mohammed, Eugene Wu
    CIDR 2026
  5. LLM Generated Persona is a Promise with a Catch
    Ang Li, Haozhe Chen, Hongseok Namkoong, Tianyi Peng
    NeurIPS 2025 Position Paper
  6. Agents for Web Testing: A Case Study in the Wild
    Naimeng Ye, Xiao Yu, Ruize Xu, Tianyi Peng, Zhou Yu
    LAW Workshop at NeurIPS 2025
  7. Data Mixture Optimization: A Multi-Fidelity Multi-Scale Bayesian Framework
    Tzu-Ching Yen, Andrew Wei Tung Siah, Haozhe Chen, C. Daniel Guetta, Tianyi Peng, Hongseok Namkoong
    NeurIPS 2025
  8. Tail-Optimized Caching for LLM Inference
    Wenxin Zhang, Yueying Li, Ciamac C. Moallemi, Tianyi Peng
    NeurIPS 2025
  9. Multi-Agent Markov Entanglement
    Shuze Chen, Tianyi Peng
    NeurIPS 2025 (Spotlight)
  10. Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper
    Xinyue Zhu*, Binghao Huang*, Yunzhu Li
    NeurIPS 2025
  11. Q-learning with Posterior Sampling
    Priyank Agrawal, Shipra Agrawal, Azmat Azati
    NeurIPS 2025
  12. RAISE: Reliable Agent Improvement via Simulated Experience
    Sahar Omidi Shayegan, Joshua Meyer, Victor Shih, Sebastian Sosa, Tianyi Peng, Kostis Kaffes, Eugene Wu, Andi Partovi, Mehdi Jamei
    NeurIPS 2025 (SEA Workshop)
  13. LLM Agents for Always-On Operating System Tuning
    Georgios Liargkovas, Vahab Jabrayilov, Hubertus Franke, Kostis Kaffes
    NeurIPS 2025
  14. A Decade of Systems for Human Data Interaction
    Eugene Wu, Yiru Chen, Haneen Mohammed, Zezhou Huang
    arXiv 2025
  15. SAGE: A Top-Down Bottom-Up Knowledge-Grounded User Simulator for Multi-turn AGent Evaluation
    Ryan Shea, Yunan Lu, Liang Qiu, Zhou Yu
    EACL 2026
  16. Set It and Forget It: Zero-Mod ML Magic for Linux Tuning
    Georgios Liargkovas, Prabhpreet Singh Sodhi, Kostis Kaffes
    PACMI Workshop at SOSP 2025
  17. Cortex: Workflow-Aware Resource Pooling and Scheduling for Agentic Serving
    Nikos Pagonas, Yeounoh Chung, Kostis Kaffes, Arvind Krishnamurthy
    SAA Workshop at SOSP 2025
  18. Toward Systems Foundations for Agentic Exploration
    Jiakai Xu, Tianle Zhou, Eugene Wu, Kostis Kaffes
    SAA Workshop at SOSP 2025
  19. Suna: Scalable Causal Confounder Discovery over Relational Data
    Jiaxiang Liu, Siyuan Xia, Daniel Alabi, Eugene Wu
    VLDB 2025
  20. Performance of LLMs on Stochastic Modeling Operations Research Problems: From Theory to Practice
    Akshit Kumar, Tianyi Peng, Yuhang Wu, Assaf Zeevi
    Winter Simulation Conference 2025
  21. Prompt Editor: A Taxonomy-driven System for Guided LLM Prompt Development in Enterprise Settings
    Jeffery Cao, Lampros Flokas, Yujian Xu, Eugene Wu, Xu Chu, Cong Yu
    SIGMOD Demo 2025
  22. Towards a Framework for Optimizing Hierarchical Text Segmentation using LLMs
    Lampros Flokas, Jeffrey Cao, Yujian Xu, Eugene Wu, Xu Chu, Cong Yu
    DEEM Workshop at SIGMOD 2025
  23. Position Paper: A System-Centric Approach is Necessary for AI Agents
    Nikos Pagonas, Haonan Wang, Jiaxiang Liu, Tianle Zhou, Deepak Dastrala, Raman Jatkar, Anirudh Sivaraman, Zhou Yu, Kostis Kaffes, Eugene Wu
    arXiv 2025
  24. Twin-2K-500: A dataset for building digital twins of over 2,000 people based on their answers to over 500 questions
    Olivier Toubia, George Z. Gui, Tianyi Peng, Daniel J. Merlau, Ang Li, Haozhe Chen
    Marketing Science
  25. Diversity Helps Jailbreak Large Language Models
    Weiliang Zhao, Daniel Ben-Levi, Wei Hao, Junfeng Yang, Chengzhi Mao
    NAACL 2025
  26. CrashFixer: A Crash Resolution Agent for the Linux Kernel
    Alex Mathai, Chenxi Huang, Suwei Ma, Jihwan Kim, Hailie Mitchell, Aleksandr Nogikh, Petros Maniatis, Franjo Ivančić, Junfeng Yang, Baishakhi Ray
    arXiv 2025
  27. FeedQUAC: Quick Unobtrusive Agent-Generated Commentary
    Tao Long, Kendra Wannamaker, Jo Vermeulen, George Fitzmaurice, Justin Matejka
    arXiv 2025
  28. Steering Semantic Data Processing With DocWrangler
    Shreya Shankar, Bhavya Chopra, Mawil Hasan, Stephen Lee, Bjoern Hartmann, Joseph Hellerstein, Aditya Parameswaran, Eugene Wu
    UIST 2025
  29. Throughput-Optimal Scheduling Algorithms for LLM Inference and AI Agents
    Yueying Li, Jim Dai, Tianyi Peng
    arXiv 2025
  30. AgentDynEx: Nudging the Mechanics and Dynamics of Multi-Agent Simulations
    Jenny Ma, Riya Sahni, Karthik Sreedhar, Lydia B. Chilton
    Under Submission
  31. DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing
    Shreya Shankar, Tristan Chambers, Tarak Shah, Aditya G. Parameswaran, Eugene Wu
    VLDB 2025
  32. Program Synthesis Dialog Agents for Interactive Decision-Making
    Matthew Toles, Nikhil Balwani, Rattandeep Singh, Valentina Giulia Sartori Rodriguez, Zhou Yu
    arXiv 2025
  33. How Well do LLMs Compress their Own Chain-of-Thought? A Token Complexity Approach
    Ayeong Lee, Ethan Che, Tianyi Peng
    ICML, Efficient Systems for Foundation Models Workshop 2025
  34. ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning
    Xiao Yu, Baolin Peng, Vineeth Vajipey, Hao Cheng, Michel Galley, Jianfeng Gao, Zhou Yu
    ICLR 2025
  35. AnimationAgents: A Multi-Modal Team of Agents for Generating, Debugging, and Human Editing of Animation Code
    Vivian Liu, Rubaiat Habib Kazi, Li-Yi Wei, Matthew Fisher, Timothy Langlois, Seth Walker, Lydia B. Chilton
    CHI 2025
  36. ACE: A LLM Agent-based Negotiation Coaching System
    Ryan Shea, Aymen Kallala, Xin Lucy Liu, Michael W. Morris, Zhou Yu
    EMNLP 2024
  37. Fast Userspace Networking for the Rest of Us
    Alireza Sanaee, Vahab Jabrayilov, Ilias Marinos, Anuj Kalia, Divyanshu Saxena, Prateesh Goyal, Kostis Kaffes, Gianni Antichi
    arXiv 2025
  38. DynEx: Agentic Assistance to Bridge Design and Code
    Jenny Ma, Karthik Sreedhar, Vivian Liu, Pedro Alejandro Perez, Sitong Wang, Riya Sahni, Lydia B. Chilton
    CHI 2025
  39. Data Cleaning Using Large Language Models
    Shuo Zhang, Zezhou Huang, Eugene Wu
    DAIS Workshop at ICDE 2025
  40. Alexpaca: Learning Factual Clarification Question Generation Without Examples
    Matthew Toles, Yukun Huang, Zhou Yu, Luis Gravano
    GEM^2 Workshop at ACL 2025
  41. KGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution
    Alex Mathai, Chenxi Huang, Petros Maniatis, Aleksandr Nogikh, Franjo Ivančić, Junfeng Yang, Baishakhi Ray
    NeurIPS 2024
  42. Simulating Cooperative Prosocial Behavior with Multi-Agent LLMs
    Karthik Sreedhar, Alice Cai, Jenny Ma, Jeffrey V. Nickerson, Lydia B. Chilton
    IUI 2025