Welcome to LWS


An API for deploying a group of pods as a unit of replication for AI/ML Inference Workloads

LeaderWorkerSet (LWS) is an API for deploying a group of pods as a unit of replication.

It addresses common deployment patterns of AI/ML inference workloads, especially multi-host inference, where an LLM is sharded and run across multiple devices on multiple nodes.

Use LWS to orchestrate distributed AI/ML inference workloads with out-of-the-box support for rolling updates, topology-aware placement, and all-or-nothing restart for failure handling.
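To make the "group as a unit of replication" idea and all-or-nothing restart concrete, here is a minimal conceptual sketch in Python. It is not the LWS API or its controller code: the `PodGroup` class, the `deploy` helper, and the pod naming scheme are hypothetical names introduced only for illustration. It assumes each replica is one leader plus its workers, and that a failure of any pod in a group recreates that whole group while other groups are left alone.

```python
from dataclasses import dataclass, field


@dataclass
class PodGroup:
    """One replica: a leader pod plus its workers, managed as a single unit.

    Hypothetical model for illustration; not part of the LWS API.
    """
    index: int          # which replica this group is
    size: int           # total pods in the group, leader included
    restarts: int = 0   # how many times the whole group was recreated
    pods: list = field(default_factory=list)

    def __post_init__(self):
        self.pods = self._make_pods()

    def _make_pods(self):
        # Made-up naming scheme: pod 0 plays the leader, the rest are workers.
        return [
            f"group-{self.index}-pod-{i}-gen{self.restarts}"
            for i in range(self.size)
        ]

    def on_pod_failure(self, pod_name):
        """All-or-nothing restart: one failed pod recreates the entire group."""
        if pod_name not in self.pods:
            raise ValueError(f"{pod_name} does not belong to group {self.index}")
        self.restarts += 1
        self.pods = self._make_pods()


def deploy(replicas, size):
    """Create `replicas` independent groups, each containing `size` pods."""
    return [PodGroup(index=r, size=size) for r in range(replicas)]
```

For example, after `groups = deploy(2, 4)`, calling `groups[0].on_pod_failure(groups[0].pods[2])` replaces all four pods of group 0 and leaves group 1 untouched, which is the behavior a sharded model needs when losing one shard makes the rest of the group unusable.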

Contributions welcome!

We use a pull request workflow on GitHub for contributions. New users are always welcome!


Connect with us

Talk to contributors on the #wg-batch channel.


Join the mailing group

Join the conversation on the mailing group
