Cactus Compute
startup
On-device AI with cloud fallback
Cactus Compute is an AI infrastructure company focused on running models close to users. Its Cactus engine targets low-latency inference on smartphones, laptops, wearables, and edge devices while keeping cloud fallback available for harder requests. The company positions Cactus as a single toolkit for speech, vision, and language models. Its public materials emphasize automatic routing between on-device and cloud execution, lower latency, privacy for sensitive workloads, and lower inference cost when simple requests can stay local. Cactus also publishes open-source work, including the Cactus runtime and Needle, a tiny tool-calling model. The company website lists Y Combinator backing and a team with backgrounds from Oxford, Salesforce, Google, AWS, MIT, and other organizations.
AI Models by Cactus Compute
Large language models from the same organization.
| Model | Context Window | Price (input / output per 1M tokens) |
|---|---|---|
| Needle (current) | -- | Free / Free |
Similar Companies
Other organizations building in the same space.
xAI
ai lab
Understand the Universe
Prior Labs
startup
Tabular foundation models for structured data
OpenBMB
open source
Open-source efficient AI models and agent infrastructure
Intel
enterprise
Innovation starts here
Tsinghua University
enterprise
Research university with active AI and foundation-model work
Docker
enterprise
We simplify the lives of developers building world-changing apps