Kubernetes GPU virtualization and heterogeneous accelerator scheduling for AI infrastructure.
HAMi stands for Heterogeneous AI Computing Virtualization Middleware. Formerly known as k8s-vGPU-scheduler, HAMi helps platform teams share expensive GPUs and other AI accelerators across Kubernetes workloads, isolate device memory and compute, and schedule pods with device-aware policies without changing application code.
HAMi is a CNCF Sandbox and CNCF Landscape project. It is also listed in the CNAI Landscape.
AI infrastructure teams often run into the same Kubernetes accelerator problems: whole GPUs are allocated to small jobs, teams compete for scarce devices, different accelerator vendors expose different operational models, and schedulers lack enough device context to place workloads efficiently.
HAMi provides a Kubernetes-native layer for:
- Device sharing: allocate a fraction of a physical accelerator by memory, core, or device count.
- Resource isolation: enforce per-workload accelerator memory and compute limits where the device backend supports it.
- Device-aware scheduling: place pods with topology-aware, binpack, spread, and device-specific scheduling policies.
- Heterogeneous AI clusters: manage NVIDIA GPUs, NPUs, DCUs, MLUs, and other accelerator types through one scheduling and allocation workflow.
- Zero application changes: keep using standard Kubernetes resource requests and limits.
- Production operations: expose metrics, dashboards, WebUI, Helm installation, and community-supported deployment guidance.
Typical use cases include:
- Increase GPU utilization in shared Kubernetes AI clusters.
- Run multi-tenant notebook, training, and inference workloads on the same accelerator pool.
- Build private cloud AI platforms with fair device allocation and quota control.
- Operate heterogeneous accelerator clusters across NVIDIA, Ascend, Cambricon, Hygon, Iluvatar, MetaX, Moore Threads, and other vendors.
- Combine HAMi with Kubernetes schedulers such as kube-scheduler and Volcano for batch AI workloads.
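The quota-control use case above can lean on standard Kubernetes ResourceQuota, because HAMi resources are exposed as ordinary extended resources. A minimal sketch, assuming the `nvidia.com/gpu` and `nvidia.com/gpumem` resource names shown later in this README (the namespace and limits are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-gpu-quota
  namespace: team-a      # illustrative tenant namespace
spec:
  hard:
    requests.nvidia.com/gpu: "4"        # at most 4 physical GPUs across the namespace
    requests.nvidia.com/gpumem: "16000" # at most 16,000 MiB of vGPU memory in total
```

Extended resources are quota-tracked with the `requests.<resource-name>` prefix, so no HAMi-specific quota machinery is needed.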
HAMi is composed of a mutating webhook, scheduler extender, device plugins, and device-specific in-container virtualization components.
```
Pod submission
  -> HAMi mutating webhook
  -> HAMi scheduler filter / score / bind
  -> device allocation written to pod annotations
  -> device plugin Allocate()
  -> container runtime environment
  -> HAMi monitor and metrics
```
HAMi lets workloads request only the accelerator resources they need. For example, the following pod asks for one physical NVIDIA GPU with 3 GiB of GPU memory:
```yaml
resources:
  limits:
    nvidia.com/gpu: 1       # one physical GPU
    nvidia.com/gpumem: 3000 # 3,000 MiB (~3 GiB) of GPU memory
```

The workload sees the allocated device resources inside the container, while HAMi coordinates scheduling, allocation, and isolation.
Notes:
- After installing HAMi, the value of `nvidia.com/gpu` registered on a node defaults to the number of vGPUs, not physical GPUs.
- When requesting resources in a pod, `nvidia.com/gpu` refers to the number of physical GPUs required by that pod.
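Besides absolute memory in MiB, the HAMi documentation also lists percentage-based NVIDIA resource names. A hedged sketch, assuming `nvidia.com/gpumem-percentage` (percent of device memory) and `nvidia.com/gpucores` (percent of device compute) as described in the HAMi docs; check your installed version for the exact names:

```yaml
resources:
  limits:
    nvidia.com/gpu: 2                 # two physical GPUs for this pod
    nvidia.com/gpumem-percentage: 50  # each limited to 50% of device memory
    nvidia.com/gpucores: 30           # each limited to 30% of device compute
```

Percentage-based requests are convenient in clusters that mix GPU models with different memory sizes.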
HAMi supports multiple heterogeneous accelerator backends, including GPUs, NPUs, DCUs, MLUs, GCUs, XPUs, and more. Device capabilities vary by vendor, model, driver, and hardware generation.
See the current HAMi supported devices page for the maintained support matrix.
For the NVIDIA device plugin path, prepare:
- NVIDIA driver >= 440
- nvidia-docker version > 2.0
- NVIDIA configured as the default runtime for containerd, Docker, or CRI-O
- Kubernetes >= 1.23
- glibc >= 2.17 and < 2.30
- Linux kernel >= 3.10
- Helm > 3.0
Label GPU nodes so HAMi can manage them:
Label GPU nodes so HAMi can manage them:

```shell
kubectl label nodes <node-name> gpu=on
```

Add the HAMi Helm repository:

```shell
helm repo add hami-charts https://project-hami.github.io/HAMi/
helm repo update
```

Install HAMi:

```shell
helm install hami hami-charts/hami -n kube-system
```

Verify that the scheduler and device plugin are running:

```shell
kubectl get pods -n kube-system
```

When `hami-device-plugin` and `hami-scheduler` are both Running, submit an example workload:

```shell
kubectl apply -f examples/nvidia/default_use.yaml
```

For the complete installation guide and configuration options, see the HAMi documentation.
HAMi supports multiple scheduling modes for AI workloads:
- binpack: pack workloads onto fewer nodes or devices to improve consolidation.
- spread: distribute workloads across nodes or devices to reduce contention.
- topology-aware scheduling: choose device combinations based on GPU topology when supported.
- dynamic MIG: create and allocate NVIDIA MIG instances dynamically for supported cards and modes.
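The binpack and spread policies above can be chosen per pod through annotations. A sketch, assuming the `hami.io/node-scheduler-policy` and `hami.io/gpu-scheduler-policy` annotation keys from the HAMi scheduling documentation (verify the keys against your HAMi version):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: binpack-example
  annotations:
    hami.io/node-scheduler-policy: "binpack" # consolidate onto fewer nodes
    hami.io/gpu-scheduler-policy: "spread"   # spread across a node's devices
spec:
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.0-base-ubuntu22.04
      resources:
        limits:
          nvidia.com/gpu: 1
```

Node-level and device-level policies are independent, so a pod can pack onto busy nodes while still spreading across the GPUs within a node.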
HAMi works with the default Kubernetes scheduler path and can also be used with Volcano for batch-oriented AI workloads. See the HAMi website for current scheduler integration guides.
HAMi exposes metrics for monitoring cluster accelerator usage. After installation, metrics are available through the scheduler monitor endpoint:
```
http://<scheduler-ip>:<monitor-port>/metrics
```

The default monitor port is 31993. You can change it with Helm values such as `--set scheduler.service.monitorPort=<port>`.
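To collect these metrics with Prometheus, a static scrape job pointed at the monitor endpoint is enough. A minimal sketch, assuming the default port 31993 and a placeholder scheduler address:

```yaml
scrape_configs:
  - job_name: hami-scheduler
    metrics_path: /metrics
    static_configs:
      - targets: ["<scheduler-ip>:31993"] # default monitorPort
```

In a real deployment you would typically replace the static target with Kubernetes service discovery against the hami-scheduler Service.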
HAMi also provides:
- HAMi-WebUI for visual cluster and device management.
- Grafana dashboard examples for accelerator monitoring.
- Benchmark material for evaluating workload behavior and scheduling effects.
HAMi is governed by maintainers and contributors. Governance is described in the HAMi community repository.
To contribute code, documentation, tests, or device backend improvements, read CONTRIBUTING.md.
The HAMi community is open to users, contributors, hardware vendors, and platform teams building Kubernetes-based AI infrastructure.
- Website: project-hami.io
- Discord: Join the HAMi Discord (recommended)
- Slack: #hami-dev on CNCF Slack
- Mailing list: hami-project
- Meeting notes and agenda
- Chinese community meeting: Friday 16:00 UTC+8, weekly — Meeting link
- English community meeting: Wednesday 16:00 UTC+8, biweekly — Meeting link
| Event | Talk |
|---|---|
| CHINA CLOUD COMPUTING INFRASTRUCTURE DEVELOPER CONFERENCE, Beijing 2024 | Unlocking heterogeneous AI infrastructure on k8s clusters |
| KubeDay Japan 2024 | Unlocking Heterogeneous AI Infrastructure K8s Cluster: Leveraging the Power of HAMi |
| KubeCon + AI_dev Open Source GenAI & ML Summit China 2024 | Is Your GPU Really Working Efficiently in the Data Center? N Ways to Improve GPU Usage |
| KubeCon + AI_dev Open Source GenAI & ML Summit China 2024 | Unlocking Heterogeneous AI Infrastructure K8s Cluster |
| KubeCon Europe 2024 | Cloud Native Batch Computing with Volcano: Updates and Future |
HAMi is licensed under the Apache License 2.0. See LICENSE for details.
Copyright Contributors to HAMi, established as HAMi, a Series of LF Projects, LLC.