Repositories
Kubernetes enhancements for Network Topology Aware Gang Scheduling & Autoscaling
Go
agentic, auto-scaling, auto-scaling-group, disaggregated, gang-scheduling, gpu, grove, inference, kubernetes, leader-worker, multinode, operator, role-based, topology-aware-scheduling
NVIDIA AITune is an inference toolkit designed for tuning and deploying Deep Learning models with a focus on NVIDIA GPUs.
Python
deep-learning, inference, nvidia, nvidia-gpu
A Datacenter Scale Distributed Inference Serving Framework
Rust