Past Events

From Serving LLMs to Serving Agents on the Cloud - Xiaozhe Yao

12PM July 24, 2025

In this talk, I will discuss key challenges in building agentic AI systems in the cloud. I will highlight DeltaZip, our recent work on efficiently deploying multiple fine-tuned models – a step we believe is essential toward enabling future AI systems. The core insight behind DeltaZip is that fine-tuning often introduces small-magnitude changes to a pre-trained model. By co-designing the serving system and compression algorithm, DeltaZip achieves a 2x to 12x throughput improvement over state-of-the-art systems. In addition to this project, I will share some ongoing challenges we are tackling in this space.
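As a rough sketch of the delta-compression idea (the function names and the simple per-tensor symmetric quantizer below are illustrative assumptions, not DeltaZip's actual algorithm or code):

```python
# Illustrative sketch of weight-delta compression between a base and a fine-tuned model.
# Assumes two state dicts with identical parameter names and shapes.
import torch

def compress_delta(base: dict, finetuned: dict, bits: int = 4):
    """Quantize (finetuned - base) per tensor with a simple symmetric scheme."""
    qmax = 2 ** (bits - 1) - 1
    compressed = {}
    for name, w_base in base.items():
        delta = finetuned[name] - w_base            # fine-tuning changes are typically small
        scale = delta.abs().max() / qmax + 1e-12    # per-tensor scale (illustrative choice)
        q = torch.clamp(torch.round(delta / scale), -qmax, qmax).to(torch.int8)
        compressed[name] = (q, scale)
    return compressed

def reconstruct(base: dict, compressed: dict) -> dict:
    """Recover an approximate fine-tuned model as base + dequantized delta."""
    return {name: w + compressed[name][0].float() * compressed[name][1]
            for name, w in base.items()}
```

Because the deltas are small in magnitude, they tolerate aggressive quantization, which is what allows many fine-tuned variants to share a single full-precision base model at serving time.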

Xiaozhe Yao is a third-year doctoral student in the Systems Group, Department of Computer Science, ETH Zurich, advised by Prof. Dr. Ana Klimović. His research explores the fundamental tensions among three pillars: optimizing systems for efficient ML, improving data quality and organization for ML, and developing frameworks that bridge the gap between algorithms and their practical deployment. Through this multi-faceted approach, his work aims to better understand and build AI systems.

The Road to High-Quality LLM Inference Services: System, Data, and Context - Yizheng Jiao

12PM July 17, 2025

This talk shares experience from building enterprise LLM inference services, including

  1. high-level principles for improving performance and reducing costs
  2. a data selection algorithm for fine-tuning LLMs to improve accuracy on domain-specific questions
  3. a method to enhance users' prompts with domain-specific knowledge bases (see the sketch after this list).
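A toy illustration of the third point, assuming a simple term-overlap retriever and prompt template (a production system would use embedding-based retrieval; none of this reflects the speaker's actual method):

```python
# Toy sketch of enriching a user prompt with snippets from a domain knowledge base.
def enhance_prompt(user_prompt: str, knowledge_base: list[str], top_k: int = 3) -> str:
    """Prepend the most relevant knowledge-base snippets to the user's prompt."""
    query_terms = set(user_prompt.lower().split())
    # Rank snippets by naive term overlap; a real system would use embeddings.
    ranked = sorted(knowledge_base,
                    key=lambda doc: len(query_terms & set(doc.lower().split())),
                    reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Use the following domain knowledge:\n{context}\n\nQuestion: {user_prompt}"
```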

Yizheng Jiao graduated from UNC Chapel Hill with a doctoral degree in 2022. He joined ByteDance after graduation and does research on LLM systems. His goal is to build efficient and accurate LLM services; his experience includes LLM inference systems, data selection for LLM fine-tuning, and prompt optimization.

Multi-Agent Systems in the Era of LLMs: Testbeds, Applications, and Beyond - Yusen Zhang

12PM July 10, 2025

Autonomous agents powered by large language models (LLMs) are emerging as powerful tools for a wide range of tasks. However, a single agent often faces performance ceilings, especially when tackling complex workflows like running an AI company or AI4Research, and is inherently limited in scenarios that involve multiple instances, such as simulations, embodied agents, and digital twins. In this talk, I will present Multi-Agent Large Language Models (MA-LLMs), a promising paradigm designed to overcome the fundamental limitations of single-agent systems. I will begin by highlighting three threads of my previous work that lay the groundwork for MA-LLMs. Next, I’ll introduce our research on fairness summarization, which demonstrates challenges that a single agent struggles to handle well. Then, I will present how agents can collaborate in a chain-of-agent manner to solve difficult tasks, such as long-document summarization and multi-step reasoning. Finally, I will reflect on current limitations in MA-LLMs and outline my long-term vision of building Agent Societies: a human-centric society consisting of scalable, trustworthy, and collaborative intelligent agents and humans.
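As a minimal sketch of chain-of-agents style collaboration on long-document summarization (the `llm` callable and prompt wording are placeholders, not the system presented in the talk):

```python
# Minimal sketch of a chain-of-agents pass over a long document (illustrative only).
# `llm` stands in for any text-completion callable, e.g. a wrapper around an API client.
from typing import Callable, List

def chain_of_agents_summarize(chunks: List[str], llm: Callable[[str], str]) -> str:
    """Each 'worker' agent sees its chunk plus the summary handed off by the previous agent."""
    running_summary = ""
    for i, chunk in enumerate(chunks):
        prompt = (f"Summary so far:\n{running_summary}\n\n"
                  f"New text (part {i + 1} of {len(chunks)}):\n{chunk}\n\n"
                  "Update the summary to cover everything seen so far.")
        running_summary = llm(prompt)
    # A final 'manager' agent condenses the accumulated summary into the answer.
    return llm(f"Rewrite this as a concise final summary:\n{running_summary}")
```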

Yusen Zhang is a fourth-year CS PhD student at Penn State University, advised by Dr. Rui Zhang. He has done industry research internships at Amazon, Microsoft, and Google. He also worked closely with Dr. Dragomir Radev. He received his master’s degree from Emory University, advised by Dr. Jinho D. Choi.

Efficient Fine-Tuning and Compression of Large Language Models: Towards Low-bit and Ultra-Low Parameter Solutions - Jiajun Zhou

12PM July 07, 2025

Efficient fine-tuning of Large Language Models (LLMs) is crucial due to their substantial memory and computational demands. This seminar discusses recent advancements in techniques aimed at significantly reducing these costs, enabling effective adaptation of large-scale models even on resource-constrained hardware. The talk will begin with an overview of current challenges and mainstream approaches to compressing and fine-tuning LLMs, highlighting trade-offs between model size, accuracy, and efficiency. Subsequently, the speaker will introduce novel approaches that enable fine-tuning at extremely low precision and ultra-low parameter regimes, significantly reducing memory requirements without compromising performance. Finally, the discussion will cover recent progress and future directions for achieving efficient deployment of LLMs in real-world applications.

Jiajun Zhou is currently a Ph.D. student in the Department of Electrical and Electronic Engineering at the University of Hong Kong (HKU), supervised by Prof. Ngai Wong, and a visiting scholar at the University of California, Santa Barbara (UCSB). He received his Master’s degree in IC Design Engineering from the Hong Kong University of Science and Technology (HKUST) in 2019. He previously worked as a Research Assistant at the Chinese University of Hong Kong (CUHK). His research primarily focuses on developing innovative frameworks for efficient training and inference of Large Language Models (LLMs), particularly through quantization, low-bit optimization, and tensor decomposition. He has published extensively in AI and hardware acceleration venues, including NAACL, IEEE FCCM, and IEEE TCAD.

March 2025 Workshop: AI Agents for Work

12PM March 12, 2025

On March 12, 2025, DAPLab ran its first annual workshop at Columbia Business School. The one-day workshop brought together over 200 industry leaders, Columbia faculty and students, and technologists interested in the concept of AI agents. Speakers and panelists came from enterprises deploying agentic solutions, technology and infrastructure leaders, and researchers at leading AI labs as well as Columbia. They included Jason Wei from OpenAI, who led their chain-of-thought and agentic work; Danielle Perszyk from Amazon AGI; Jonathan Frankle from Databricks; Deepak Dastrala from Intellect; Cong Yu, who leads AI at Celonis; and more.