Hierarchical Queues
Multi-level queue tree with quotas, limits, and over-quota borrowing across teams.
KAI Scheduler is a Kubernetes-native scheduler for AI workloads that optimises GPU allocation across the full AI lifecycle, from data processing through training to inference, while keeping resource shares fair across teams.
$ kubectl get queues
NAME             PRIORITY  PARENT        CHILDREN
department-a     medium                  team-research-a,team-inference
department-b     medium                  team-training,team-research-b
team-inference   high      department-a
team-research-a  medium    department-a
team-training    medium    department-b
team-research-b  low       department-b
$ kubectl describe queue team-research-a
Name: team-research-a
Spec:
  parentQueue: department-a
  resources:
    cpu: { quota: 64, limit: 128, overQuotaWeight: 1 }
    gpu: { quota: 8, limit: 16, overQuotaWeight: 1 }
Status:
  allocated:
    cpu: 52
    gpu: 14
$ kubectl get schedulingshards
NAME AGE
default 12d
Bin-packing and gang-scheduling pack GPUs tighter, so fewer fragmented nodes carry more workloads in flight.
Hierarchical queues with deserved-share and time-based fairness keep every team moving without throttling.
Designed and continuously tested to manage large-scale GPU clusters with thousands of nodes and high-throughput workloads, with topology awareness and GPU sharing.
From quick interactive notebooks to multi-node distributed training, KAI keeps your GPU fleet busy without starving anyone.
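To make the quota, limit, and over-quota terms concrete, here is a minimal Python sketch of one way spare capacity could be shared between queues. It is an illustration under stated assumptions, not KAI Scheduler's actual algorithm: the `Queue` class, the `allocate` function, the 16-GPU capacity, and the demand figures are all hypothetical. Only the quota, limit, and overQuotaWeight values for team-research-a come from the queue description above; team-inference is assumed to have the same settings.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    name: str
    quota: float              # guaranteed share of the resource
    limit: float              # hard cap on total allocation
    over_quota_weight: float  # relative claim on spare capacity
    requested: float          # current demand (hypothetical input)


def allocate(queues, capacity):
    """Return {queue name: allocation} for a single resource type."""
    # Pass 1: every queue receives its demand, up to its quota.
    alloc = {q.name: min(q.requested, q.quota) for q in queues}
    spare = capacity - sum(alloc.values())

    # Pass 2: share spare capacity by weight, capped at min(requested, limit).
    while spare > 1e-9:
        hungry = [
            q for q in queues
            if q.over_quota_weight > 0
            and alloc[q.name] < min(q.requested, q.limit) - 1e-9
        ]
        if not hungry:
            break
        total_weight = sum(q.over_quota_weight for q in hungry)
        handed_out = 0.0
        for q in hungry:
            share = spare * q.over_quota_weight / total_weight
            room = min(q.requested, q.limit) - alloc[q.name]
            grant = min(share, room)
            alloc[q.name] += grant
            handed_out += grant
        spare -= handed_out
    return alloc


# Assumed scenario: 16 GPUs under department-a, two child queues.
queues = [
    Queue("team-research-a", quota=8, limit=16, over_quota_weight=1, requested=20),
    Queue("team-inference", quota=8, limit=16, over_quota_weight=1, requested=2),
]
result = allocate(queues, capacity=16)
print(result)
```

With these assumed numbers, team-inference uses only 2 of its 8 guaranteed GPUs, so team-research-a borrows the idle 6 and reaches 14, above its quota of 8 but below its limit of 16. If team-inference later asks for its full quota, a real scheduler would have to reclaim the borrowed GPUs; that step is left out of the sketch.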
All-or-nothing placement for distributed training; min and max replicas that grow and shrink with available capacity, governed by fairness rules.
Time slicing, MPS, and MIG so inference and dev workloads stop hoarding whole devices.
Optimised placement with topology-aware scheduling, plus hierarchical topology-aware scheduling for hierarchical PodGroups.
Per-queue priority classes plus per-workload priority. Critical jobs preempt cleanly, non-critical jobs back off.
Tracks historical GPU usage over a configurable window so over-quota resources are distributed fairly across time, not at a single moment.
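The idea of fairness across time can be sketched with a sliding window of usage samples per queue. This is a simplified illustration, not KAI Scheduler's implementation: the `UsageWindow` class, the `next_in_line` function, the one-hour window, and the sample values are all assumptions made for the example.

```python
from collections import deque


class UsageWindow:
    """Sliding window of over-quota GPU usage samples for one queue."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.samples = deque()  # (timestamp, gpus_over_quota)

    def record(self, timestamp, gpus_over_quota):
        self.samples.append((timestamp, gpus_over_quota))
        self._evict(timestamp)

    def _evict(self, now):
        # Drop samples that have fallen out of the window.
        while self.samples and self.samples[0][0] <= now - self.window_seconds:
            self.samples.popleft()

    def average(self, now):
        self._evict(now)
        if not self.samples:
            return 0.0
        return sum(gpus for _, gpus in self.samples) / len(self.samples)


def next_in_line(windows, weights, now):
    """Pick the queue with the lowest windowed usage per unit of weight."""
    return min(windows, key=lambda name: windows[name].average(now) / weights[name])


# Assumed scenario: team-training has been borrowing 8 GPUs, team-research-b none.
windows = {
    "team-training": UsageWindow(window_seconds=3600),
    "team-research-b": UsageWindow(window_seconds=3600),
}
weights = {"team-training": 1, "team-research-b": 1}
for t in (0, 600, 1200):
    windows["team-training"].record(t, 8)
    windows["team-research-b"].record(t, 0)

winner = next_in_line(windows, weights, now=1800)
print(winner)
```

In this scenario team-research-b is served first, because team-training has consumed more over-quota GPU time within the window. Once the old samples age out of the window, the two queues are on equal footing again, which is what makes the window length a tuning knob.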
KAI Scheduler is a Cloud Native Computing Foundation sandbox project.