Cortex

Workflow-Aware Resource Pooling and Scheduling for Agentic Serving
Posted: October 13, 2025
Tags: Workflow-awareness, Agentic orchestration, Stage isolation
Cortex is a prototype workflow-aware serving platform for agentic workloads. Its core principle is stage isolation: Cortex provisions a dedicated resource pool for each distinct stage of an agentic workflow. Isolating stages mitigates inter-stage interference in compute and memory, which leads to better KV cache utilization, higher throughput, and more predictable performance. By tailoring resource allocation and scheduling to each stage, Cortex lays the groundwork for more advanced, agent-native serving paradigms, including malleable resource management, speculative execution of workflow branches, and a shared, multi-tiered cache for “agentic state.”
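
To make the idea of stage isolation concrete, here is a minimal Python sketch of per-stage pooling. It is an illustration under stated assumptions, not Cortex's actual implementation: the class names (`StagePool`, `StageIsolatedRouter`), the stage names (`plan`, `tool_call`), the worker counts, and the FIFO policy are all hypothetical. Each stage gets its own queue and its own worker budget, so a burst in one stage cannot consume capacity reserved for another.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class StagePool:
    """Hypothetical dedicated pool: a private queue plus a fixed worker budget."""
    stage: str
    num_workers: int
    queue: deque = field(default_factory=deque)

    def submit(self, request_id: str) -> None:
        self.queue.append(request_id)

    def schedule(self) -> list[str]:
        """Dispatch up to num_workers queued requests, FIFO within this stage."""
        batch = []
        while self.queue and len(batch) < self.num_workers:
            batch.append(self.queue.popleft())
        return batch


class StageIsolatedRouter:
    """Routes each request to the pool owned by its workflow stage."""

    def __init__(self, pool_sizes: dict[str, int]):
        self.pools = {s: StagePool(s, n) for s, n in pool_sizes.items()}

    def submit(self, stage: str, request_id: str) -> None:
        # Raises KeyError for a stage that has no provisioned pool.
        self.pools[stage].submit(request_id)

    def step(self) -> dict[str, list[str]]:
        # Each pool schedules independently of every other pool.
        return {s: p.schedule() for s, p in self.pools.items()}


# Assumed example workload: a burst of tool calls plus one planning request.
router = StageIsolatedRouter({"plan": 1, "tool_call": 2})
for i in range(3):
    router.submit("tool_call", f"t{i}")
router.submit("plan", "p0")

dispatched = router.step()
print(dispatched)  # {'plan': ['p0'], 'tool_call': ['t0', 't1']}
```

In this toy run the planning request is dispatched immediately even though the `tool_call` pool is saturated and still has one request queued. A real system would size pools from measured per-stage demand and could apply a different scheduling policy inside each pool.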

Contributors

  • Nikos Pagonas
  • Yeounoh Chung
  • Kostis Kaffes
  • Arvind Krishnamurthy

Publications

  • Cortex: Workflow-Aware Resource Pooling and Scheduling for Agentic Serving
    1st Workshop on Systems for Agentic AI (SAA 2025) - 2025