Adopters

Where LWS is used and integrated in

Adopters, Integrations and Presentations

Adopters

This list is based on public documentation; please open an issue if you would like to be added to or removed from it.

AWS:

  • Amazon EKS supports running superpods with LeaderWorkerSet to serve large LLMs, see blog here.
  • A Terraform-based EKS Blueprints pattern can be found here. This pattern demonstrates an Amazon EKS cluster with an EFA-enabled node group that supports multi-node inference using vLLM and LeaderWorkerSet.

DaoCloud: LeaderWorkerSet is the default deployment method for running large models across multiple nodes on Kubernetes.

Google Cloud:

  • GKE leverages LeaderWorkerSet to deploy and serve multi-host generative AI large open models, see blog here.
  • A guide to serving DeepSeek-R1 671B or Llama 3.1 405B on GKE is available here.

Nvidia: LeaderWorkerSet deployments are the recommended method for deploying multi-node models with NIM, see documentation here.

Integrations

Feel free to submit a PR if you use LeaderWorkerSet in your project and want to be added here.

llmaz: llmaz, an easy-to-use and advanced inference platform, uses LeaderWorkerSet as the underlying workload to support both single-host and multi-host inference scenarios.

vLLM: vLLM is a fast and easy-to-use library for LLM inference. It can be deployed with LWS on Kubernetes for distributed model serving, see documentation here.

sglang: sglang is a fast serving framework for large language models and vision language models. It can be deployed with LWS on Kubernetes for distributed model serving, see documentation here.
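The integrations above all drive multi-node serving through the same LeaderWorkerSet API: a group of pods (one leader plus several workers) is replicated as a unit. A minimal manifest might look like the sketch below; the metadata name, container names, and images are illustrative placeholders, not taken from any of the projects listed here.

```yaml
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: inference-demo        # hypothetical name
spec:
  replicas: 2                 # number of leader/worker groups; each group serves the full model
  leaderWorkerTemplate:
    size: 3                   # pods per group: 1 leader + 2 workers
    leaderTemplate:           # pod template for the leader
      spec:
        containers:
        - name: leader
          image: registry.example.com/engine-leader:latest   # placeholder image
    workerTemplate:           # pod template for the workers
      spec:
        containers:
        - name: worker
          image: registry.example.com/engine-worker:latest   # placeholder image
```

Applying this with `kubectl apply -f` creates two groups of three pods each; scaling `replicas` adds or removes whole groups, so a model sharded across a group is always scheduled together.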

Talks and Presentations

Last modified March 14, 2025: Creating LWS site (#426) (884af7f)