llm-d


AI Infrastructure, Free

llm-d: Kubernetes-native LLM inference serving stack

Last updated Jun 11, 2026

What is llm-d?

llm-d is an open-source, Kubernetes-native stack for distributed inference serving of large language models on modern accelerators and production infrastructure. The project lives on GitHub, so the important facts are visible: source code, README, license, stars, forks, issues, and recent commit activity. At the time of this listing, the repository showed 3,341 stars, 521 forks, an Apache-2.0 license, and activity on 2026-06-10.

The core job is simple. llm-d provides distributed orchestration above model servers: the model servers run models on accelerators, and llm-d adds intelligent routing and KV-cache management patterns on top. It targets production LLM serving on Kubernetes and modern accelerators.

For day-to-day use, llm-d fits best inside an existing platform engineering loop. A developer can start from the GitHub repository, read the README, install the project following the documented setup, and connect it to the model servers and Kubernetes clusters already in use. The README points users to documentation at llm-d.ai and to production guides for routing and serving. Treat this page as a practical builder reference, not a vendor landing page. The source is the project repository, and the best next step is to inspect the README, configuration examples, and issue tracker before putting llm-d into a production workflow.

The strongest use case is platform engineers running LLM workloads on Kubernetes. These users already operate Kubernetes clusters, model servers, and similar AI infrastructure, and they need operational control more than hype.

Pricing is straightforward from the available source: llm-d is an open-source repository, and the GitHub record used for this listing shows no verified paid plan or hosted SaaS price. Teams should still budget for the surrounding systems they run with it, such as Kubernetes clusters, accelerators, and other compute. The tool is best evaluated by cloning the repo, running the documented setup, and testing it on a small internal workload before moving it into a larger deployment.

The main tradeoff is maturity. Fast-moving open-source AI tools can change quickly, and README instructions may drift as models, CLIs, and package versions change. Check the latest release notes, open issues, and recent commits before adoption. If the project fits your stack, llm-d gives you an orchestration layer for LLM serving without replacing the model servers you already run.

llm-d's Top Features

Key capabilities that make llm-d stand out.

Provides distributed orchestration above model servers

Adds intelligent routing and KV-cache management patterns

Targets production LLM serving on Kubernetes and modern accelerators
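
To make the routing and KV-cache feature above more concrete, here is a toy sketch of cache-aware routing in Python. It is not llm-d's scheduler, API, or configuration. The names `Replica`, `shared_prefix_len`, `best_cache_hit`, and `pick_replica` are hypothetical, and the scoring rule (longest cached token prefix first, then fewest active requests) is an assumption chosen only to illustrate the general idea of sending a request to the model server most likely to reuse cached work.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A hypothetical model-server replica (illustrative only)."""
    name: str
    active_requests: int = 0
    # Token prefixes this replica is assumed to still hold in its KV cache.
    cached_prefixes: list[tuple[int, ...]] = field(default_factory=list)


def shared_prefix_len(a: tuple[int, ...], b: tuple[int, ...]) -> int:
    """Count how many leading tokens two sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def best_cache_hit(replica: Replica, tokens: tuple[int, ...]) -> int:
    """Longest prefix of `tokens` that matches any cached prefix on the replica."""
    return max(
        (shared_prefix_len(p, tokens) for p in replica.cached_prefixes),
        default=0,
    )


def pick_replica(replicas: list[Replica], tokens: tuple[int, ...]) -> Replica:
    """Prefer the longest cache hit; break ties with the fewest active requests."""
    return max(
        replicas,
        key=lambda r: (best_cache_hit(r, tokens), -r.active_requests),
    )


replicas = [
    Replica("pod-a", active_requests=3, cached_prefixes=[(1, 2, 3, 4)]),
    Replica("pod-b", active_requests=1, cached_prefixes=[(1, 2, 9)]),
    Replica("pod-c", active_requests=0),
]

# A prompt sharing three tokens with pod-a's cache goes to pod-a despite its load.
print(pick_replica(replicas, (1, 2, 3, 7)).name)
# A prompt with no cache hit anywhere falls back to the least-loaded replica.
print(pick_replica(replicas, (5, 6)).name)
```

A real router would also weigh queue depth, memory pressure, and cache eviction; the project's own guides are the place to check how llm-d handles these.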

Use Cases

Who benefits most from this tool.

Platform engineers running LLM workloads on Kubernetes

Use llm-d to organize high-scale inference serving, routing, and performance work around existing model servers.

AI infrastructure teams benchmarking open models

Follow the guides and benchmarks to tune inference deployments across accelerators and providers.

llm-d's Pricing

Free plan available

Frequently Asked Questions

What is llm-d?

llm-d is an open-source distributed inference serving stack for running large language model workloads on Kubernetes.

Does llm-d replace vLLM or SGLang?

No. The README says model servers run models on accelerators while llm-d adds orchestration and optimizations above them.

Who founded llm-d?

The README states that llm-d is a CNCF sandbox project founded by Red Hat, Google Cloud, IBM Research, CoreWeave, and NVIDIA.
