Schedule AI Workloads at Scale

KAI Scheduler is a Kubernetes-native scheduler for AI workloads that optimises GPU allocation across the full AI lifecycle, from data processing through training to inference, while keeping resource shares fair across teams.

Why KAI

01

Efficient

Bin-packing and gang scheduling place workloads tightly on GPUs, which reduces fragmentation so the same nodes can run more workloads at once.
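To make the fragmentation point concrete, here is a minimal toy model of bin-packing versus spreading. It is a sketch and not KAI Scheduler's code: the function `place`, the node names, and the best-fit rule are all assumptions chosen for illustration.

```python
def place(jobs, nodes, pack=True):
    """Place each GPU request on one node and return the free GPUs per node.

    pack=True  -> best fit: pick the node with the least free capacity that still fits.
    pack=False -> spread: pick the node with the most free capacity.
    Hypothetical helper for illustration only.
    """
    free = dict(nodes)  # copy so the caller's mapping is not modified
    for gpus in jobs:
        candidates = [n for n, f in free.items() if f >= gpus]
        if not candidates:
            continue  # the request stays pending in this toy model
        chooser = min if pack else max
        # Ties are broken by node name so the result is deterministic.
        target = chooser(candidates, key=lambda n: (free[n], n))
        free[target] -= gpus
    return free


nodes = {"node-a": 8, "node-b": 8}
small_jobs = [2, 2, 2, 2]  # four 2-GPU requests

packed = place(small_jobs, nodes, pack=True)
spread = place(small_jobs, nodes, pack=False)

# After packing, one node is still completely free for an 8-GPU job.
fits_after_packing = max(packed.values()) >= 8
# After spreading, both nodes are half used and the 8-GPU job cannot fit.
fits_after_spreading = max(spread.values()) >= 8
```

In this model both strategies consume the same eight GPUs, but only packing leaves a whole node available for a large request.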

02

Fair

Hierarchical queues with deserved shares and time-based fairness give every team its share of GPUs, so no team is starved.

03

Scalable

Designed and continuously tested to manage large-scale GPU clusters with thousands of nodes and high-throughput workloads, with topology awareness and GPU sharing.

How it works

One scheduler. Every workload.

From quick interactive notebooks to multi-node distributed training, KAI keeps your GPU fleet busy without starving anyone.

[Diagram: workloads (hero-training-job, kind: PyTorchJob; inference-svc, kind: Deployment; notebook, kind: Pod; batch-eval, kind: Job; serving-cluster, kind: RayCluster) pass through KAI Scheduler (Queues · Actions · Plugins) onto a GPU cluster of 22 nodes in 3 node groups: A100 nodes (8 nodes, 6/8 allocated), H100 nodes (8 nodes, 7/8 allocated), and GB200 nodes (6 nodes, 4/6 allocated).]

Purpose-built for managing AI workloads on Kubernetes

Hierarchical Queues

Multi-level queue tree with quotas, limits, and over-quota borrowing across teams.
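The interaction of quotas, limits, and over-quota borrowing can be sketched with a small queue tree. This is an illustrative model only: the `Queue` class, its field names, and the admission rule are assumptions and do not reflect KAI Scheduler's actual resource definitions.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    """Hypothetical queue node: quota is the guaranteed share, limit the hard cap."""
    name: str
    quota: int
    limit: int
    used: int = 0
    children: list = field(default_factory=list, repr=False, compare=False)
    parent: "Queue | None" = field(default=None, repr=False, compare=False)

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def path_to_root(queue):
    """Yield the queue and each of its ancestors up to the root."""
    node = queue
    while node is not None:
        yield node
        node = node.parent


def try_admit(queue, gpus):
    """Admit the request only if no queue on the path to the root exceeds its limit."""
    if any(q.used + gpus > q.limit for q in path_to_root(queue)):
        return False
    for q in path_to_root(queue):
        q.used += gpus
    return True


def borrowed(queue):
    """GPUs a queue is using beyond its guaranteed quota."""
    return max(0, queue.used - queue.quota)


# The root limit stands in for the cluster size (16 GPUs in this example).
root = Queue("root", quota=16, limit=16)
research = root.add(Queue("research", quota=8, limit=12))
prod = root.add(Queue("prod", quota=8, limit=16))

first = try_admit(research, 10)   # within the limit of 12, so 2 GPUs are borrowed
second = try_admit(research, 4)   # would reach 14, above the limit of 12
third = try_admit(prod, 6)        # fills the cluster to 16
fourth = try_admit(prod, 1)       # cluster is full
```

The last request is refused even though `prod` is below its quota, because this sketch does not model reclaiming borrowed GPUs from `research`. A real scheduler would typically resolve that case through reclamation or preemption.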

Gang Scheduling & Elastic Workloads

All-or-nothing placement for distributed training; min and max replicas that grow and shrink with available capacity, governed by fairness rules.
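All-or-nothing placement with elastic replica counts can be sketched as follows. The function `gang_place` and its parameters are hypothetical, and the node ordering is an assumption made for the example. It is not KAI Scheduler's algorithm, and the fairness rules mentioned above are not modelled.

```python
def gang_place(min_replicas, max_replicas, gpus_per_pod, free):
    """Return {node: pod_count}, or None if the minimum gang does not fit.

    Places up to max_replicas pods, but commits nothing unless at least
    min_replicas pods can be placed together (all-or-nothing).
    """
    remaining = dict(free)  # work on a copy so a failed attempt changes nothing
    placement = {}
    placed = 0
    # Visit nodes with the least free capacity first to keep placement tight.
    for node in sorted(remaining, key=lambda n: (remaining[n], n)):
        while placed < max_replicas and remaining[node] >= gpus_per_pod:
            remaining[node] -= gpus_per_pod
            placement[node] = placement.get(node, 0) + 1
            placed += 1
    if placed < min_replicas:
        return None  # the gang cannot start, so no pod is bound
    return placement


free_gpus = {"n1": 4, "n2": 2}

# Needs at least 4 pods of 2 GPUs each, but only 3 fit: nothing is placed.
too_big = gang_place(min_replicas=4, max_replicas=8, gpus_per_pod=2, free=free_gpus)

# Needs at least 2 pods and may grow to 8: all 3 pods that fit are placed.
elastic = gang_place(min_replicas=2, max_replicas=8, gpus_per_pod=2, free=free_gpus)

# Fixed size of 2 pods: placement stops at max_replicas.
fixed = gang_place(min_replicas=2, max_replicas=2, gpus_per_pod=2, free=free_gpus)
```

Returning `None` rather than a partial placement is what prevents a distributed training job from holding some GPUs while it waits for the rest.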

GPU Sharing

Time slicing, MPS, and MIG so inference and dev workloads stop hoarding whole devices.

Topology-Aware

Placement that accounts for cluster topology, including hierarchical topology-aware scheduling for hierarchical PodGroups.

Queue & Workload Priority

Per-queue priority classes plus per-workload priority. Critical jobs preempt cleanly; non-critical jobs back off.

Time-based Fairshare

Tracks historical GPU usage over a configurable window, so over-quota resources are shared fairly over time rather than according to usage at a single moment.
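One way to picture time-based fair share is a sliding window of usage samples per queue, with spare GPUs split in favour of queues that used less recently. The sketch below is illustrative only: the `UsageWindow` class, the `split_over_quota` function, and the weighting formula `1 / (1 + usage)` are assumptions, not the formula KAI Scheduler uses.

```python
from collections import deque


class UsageWindow:
    """Keeps the most recent `window` usage samples (GPU-hours) for one queue."""

    def __init__(self, window):
        self.samples = deque(maxlen=window)  # older samples fall out automatically

    def record(self, gpu_hours):
        self.samples.append(gpu_hours)

    def total(self):
        return sum(self.samples)


def split_over_quota(spare_gpus, history):
    """Split spare GPUs so queues with less usage in the window receive more.

    The weight 1 / (1 + usage) is an assumption chosen for this example.
    """
    weights = {q: 1.0 / (1.0 + w.total()) for q, w in history.items()}
    norm = sum(weights.values())
    return {q: spare_gpus * wt / norm for q, wt in weights.items()}


team_a = UsageWindow(window=2)
team_b = UsageWindow(window=2)

# team-a has been busy; its oldest sample (10) falls outside the 2-sample window.
for gpu_hours in (10, 1, 2):
    team_a.record(gpu_hours)

# team-b has been idle over the same intervals.
for gpu_hours in (0, 0, 0):
    team_b.record(gpu_hours)

shares = split_over_quota(10, {"team-a": team_a, "team-b": team_b})
```

With these samples, team-a has 3 GPU-hours in the window and team-b has none, so team-b receives the larger part of the 10 spare GPUs. Changing the window length changes how quickly past usage stops counting against a queue.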


KAI Scheduler is a Cloud Native Computing Foundation sandbox project.

