Cactus Compute
startup
On-device AI with cloud fallback
Cactus Compute is an AI infrastructure company focused on running models close to users. Its Cactus engine targets low-latency inference on smartphones, laptops, wearables, and edge devices while keeping cloud fallback available for harder requests. The company positions Cactus as a single toolkit for speech, vision, and language models. Its public materials emphasize automatic routing between on-device and cloud execution, lower latency, privacy for sensitive workloads, and lower inference cost when simple requests can stay local. Cactus also publishes open-source work, including the Cactus runtime and Needle, a tiny tool-calling model. The company website lists Y Combinator backing and a team with prior experience at Oxford, Salesforce, Google, AWS, MIT, and other organizations.
AI Models by Cactus Compute
Large language models from the same organization.
| Model | Context Window | Price (Input / Output, per 1M tokens) |
|---|---|---|
| Needle (current) | -- | Free / Free |
Similar Companies
Other organizations building in the same space.
Meta
technology company
Build the future of human connection and the technology that makes it possible
xAI
ai lab
Understand the Universe
Tsinghua University
enterprise
Research university with active AI and foundation-model work
OpenBMB
open source
Open-source efficient AI models and agent infrastructure
Prior Labs
startup
Tabular foundation models for structured data
AI4Finance Foundation
open source
Open-source financial AI research and tooling