Mistral Inference

Developer · Application · Pricing unavailable

Mistral Inference - Run Mistral AI Models Locally on Your Hardware

Last updated May 4, 2026


What is Mistral Inference?

Mistral Inference is the official Python library from Mistral AI for running its open-weight language models locally. It provides a streamlined way to download, load, and run inference on Mistral's open-weight model family, including Mistral 7B, Mixtral 8x7B, Mixtral 8x22B, Codestral 22B, Mathstral 7B, Mistral Nemo, Mistral Large 2, Pixtral 12B, and Mistral Small 3.1, on your own hardware.

The library offers CLI demo commands for quick testing and a Python API for programmatic inference. Install it with pip (`pip install mistral-inference`) or from source using Poetry. Model weights can be downloaded from Mistral's CDN or the Hugging Face Hub, and the library works with Hugging Face's safetensors format for model weight management.

The library uses xformers for optimized attention and supports multi-GPU setups through torchrun for larger models such as the Mixtral 8x7B and 8x22B variants. All Mistral models support function calling, which enables structured output and tool use.

The library code is licensed under Apache 2.0, but model weights carry their own licenses. Several model variants use custom licenses (MNPL for Codestral, MRL for Mistral Large 2), so review the license terms before deployment.

Mistral Inference suits developers who need private, local AI inference without API dependency. It is actively maintained by Mistral AI, with 10,800+ GitHub stars and 1,000+ forks. The repository also includes getting-started tutorials as Jupyter notebooks that can be opened in Google Colab.
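As a rough illustration of the Python API described above, the sketch below follows the pattern shown in the project's README: a tokenizer from the companion `mistral_common` package, weights loaded with `Transformer.from_folder`, and a call to `generate`. The `~/mistral_models` location, the `tokenizer.model.v3` filename, and the sampling values are assumptions that depend on which model was downloaded. The third-party imports are deferred into the function so the snippet can be loaded on a machine without the library installed.

```python
from pathlib import Path


def default_model_dir(name: str) -> Path:
    """Return ~/mistral_models/<name>, an assumed (not required) download location."""
    return Path.home() / "mistral_models" / name


def run_chat(model_dir: Path, prompt: str, max_tokens: int = 64) -> str:
    """Generate one reply from an instruction-tuned model stored in model_dir."""
    # Imported lazily: these packages are only needed when inference actually runs.
    from mistral_common.protocol.instruct.messages import UserMessage
    from mistral_common.protocol.instruct.request import ChatCompletionRequest
    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
    from mistral_inference.generate import generate
    from mistral_inference.transformer import Transformer

    # Assumption: the tokenizer filename varies by model; adjust it to match
    # the file shipped with the weights you downloaded.
    tokenizer = MistralTokenizer.from_file(str(model_dir / "tokenizer.model.v3"))
    model = Transformer.from_folder(str(model_dir))

    # Wrap the prompt in a chat request and turn it into token ids.
    request = ChatCompletionRequest(messages=[UserMessage(content=prompt)])
    tokens = tokenizer.encode_chat_completion(request).tokens

    eos_id = tokenizer.instruct_tokenizer.tokenizer.eos_id
    out_tokens, _ = generate(
        [tokens], model, max_tokens=max_tokens, temperature=0.0, eos_id=eos_id
    )
    return tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])


# Example call (needs a GPU and downloaded weights):
# print(run_chat(default_model_dir("7B-Instruct-v0.3"), "Summarize safetensors in one line."))
```

Keeping the heavy imports inside `run_chat` is a convenience for this sketch only; in an application they would normally sit at the top of the module.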

Mistral Inference's Top Features

Key capabilities that make Mistral Inference stand out.

Download and run any Mistral AI open-weight model locally

CLI demo tool for quick model testing (mistral-demo)

Python API for programmatic inference

Multi-GPU support via torchrun for large models

Function calling support across all models

Hugging Face Hub integration for model weights

Optimized with xformers for efficient attention computation

Support for instruction-tuned and base model variants

Safetensors format for fast and safe model loading

Extended vocabulary (32K+ tokens) on newer model versions
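The Hugging Face Hub integration listed above can be sketched with `snapshot_download` from the `huggingface_hub` package. The file list below is the one given for Mistral 7B Instruct v0.3; it is an assumption that other repositories use the same filenames, so check each model's repository. `download_weights` is a hypothetical helper name, not part of Mistral Inference.

```python
from pathlib import Path

# Files needed for local inference with Mistral 7B Instruct v0.3.
# Assumption: other models may ship different filenames.
WANTED_FILES = ["params.json", "consolidated.safetensors", "tokenizer.model.v3"]


def download_weights(repo_id: str, target: Path) -> Path:
    """Fetch only the files needed for inference into the target directory."""
    # Third-party dependency, imported lazily so this module loads without it.
    from huggingface_hub import snapshot_download

    target.mkdir(parents=True, exist_ok=True)
    snapshot_download(repo_id=repo_id, allow_patterns=WANTED_FILES, local_dir=target)
    return target


# Example call (requires network access; some repositories also require
# accepting the model license on the Hub first):
# download_weights("mistralai/Mistral-7B-Instruct-v0.3",
#                  Path.home() / "mistral_models" / "7B-Instruct-v0.3")
```

Restricting the download with `allow_patterns` avoids pulling duplicate weight formats that the library does not read.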

Use Cases

Who benefits most from this tool.

AI Developers

Run Mistral models locally for prototyping, testing, and building LLM-powered applications without API dependency.

Researchers

Evaluate Mistral model performance, fine-tune open-weight variants, and conduct AI safety research on local hardware.

Privacy-Conscious Teams

Deploy Mistral models on-premise for applications requiring data sovereignty and zero data leakage to external APIs.


Mistral Inference's Pricing

Mistral AI (AI lab), 4 models


AI Models by Mistral AI

Large language models from the same organization.

Model	Status	Context Window	Price (In / Out per M tokens)	Capabilities
Mistral Small 4	Current	262K	$0.15 / $0.60	text, vision, code, tool use
Mistral Small Creative	Current	33K	$0.10 / $0.30	text, code, tool use
Devstral 2 2512	Current	262K	$0.40 / $2.00	text, code, tool use
Ministral 3 14B 2512	Current	262K	$0.20 / $0.20	text, vision, code, tool use
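To make the price column concrete, here is a small worked calculation. It assumes "per M" means per one million tokens and that these are hosted-model prices, separate from running weights locally with Mistral Inference. `estimate_cost` is a hypothetical helper written for this example.

```python
# (input, output) prices in USD per million tokens, copied from the table above.
PRICES_PER_MILLION = {
    "Mistral Small 4": (0.15, 0.60),
    "Mistral Small Creative": (0.10, 0.30),
    "Devstral 2 2512": (0.40, 2.00),
    "Ministral 3 14B 2512": (0.20, 0.20),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request for the given token counts."""
    price_in, price_out = PRICES_PER_MILLION[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# 10,000 input tokens and 2,000 output tokens on Mistral Small 4:
# (10,000 * 0.15 + 2,000 * 0.60) / 1,000,000 = 0.0027 USD
print(round(estimate_cost("Mistral Small 4", 10_000, 2_000), 6))
```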


Frequently Asked Questions

What hardware do I need to run Mistral Inference?

A GPU is required because the library depends on xformers for optimized attention. Smaller models like Mistral 7B run on a single GPU, while Mixtral 8x22B requires a multi-GPU setup launched via torchrun.
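The single-GPU and multi-GPU launch paths can be sketched as a small command builder. It assumes the `mistral-demo` entry point and the `torchrun --nproc-per-node N --no-python` launch pattern from the project's README; `demo_command` is a hypothetical helper, and the right GPU count for a given model depends on your hardware.

```python
import shlex


def demo_command(model_dir: str, num_gpus: int = 1) -> list[str]:
    """Build the argument list for running mistral-demo on one or more GPUs."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    base = ["mistral-demo", model_dir]
    if num_gpus == 1:
        # Small models: call the CLI entry point directly.
        return base
    # Larger models: let torchrun start one process per GPU.
    return ["torchrun", "--nproc-per-node", str(num_gpus), "--no-python", *base]


# Print the shell form of a two-GPU launch; pass the list to subprocess.run to execute it.
print(shlex.join(demo_command("/models/8x7B", num_gpus=2)))
```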

Does Mistral Inference support all Mistral models?

Yes, it supports the full Mistral model family including Mistral 7B, Mixtral 8x7B/8x22B, Codestral 22B, Codestral Mamba, Mathstral, Mistral Nemo, Mistral Large 2, Pixtral 12B, and Mistral Small 3.1.

What license applies to the Mistral Inference library?

The library code itself is Apache 2.0 licensed. However, some model weights have custom non-commercial licenses: Codestral uses the Mistral AI Non-Production License (MNPL), and Mistral Large 2 uses the Mistral AI Research License (MRL). Always check individual model license terms.

Can I use Mistral Inference for commercial applications?

Yes, for most models. Mistral 7B and the Mixtral models are Apache 2.0 licensed. Codestral and Mistral Large 2 have non-commercial restrictions. Check each model's license file before commercial use.
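One way to keep the license check above from being forgotten is to encode it as a lookup that a deployment script consults. The table below records only the models and licenses named in this FAQ; the key format and the `commercial_use_allowed` helper are assumptions made for illustration, and the model's own license file remains the authority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseInfo:
    name: str
    commercial_use: bool


# Only the models whose licenses are stated in the FAQ above.
LICENSES = {
    "mistral-7b": LicenseInfo("Apache 2.0", True),
    "mixtral-8x7b": LicenseInfo("Apache 2.0", True),
    "mixtral-8x22b": LicenseInfo("Apache 2.0", True),
    "codestral-22b": LicenseInfo("MNPL", False),
    "mistral-large-2": LicenseInfo("MRL", False),
}


def commercial_use_allowed(model: str) -> bool:
    """Return whether the recorded license permits commercial use."""
    key = model.strip().lower().replace(" ", "-")
    info = LICENSES.get(key)
    if info is None:
        # Unknown models fail loudly instead of defaulting to "allowed".
        raise KeyError(f"No license recorded for {model!r}; check its license file")
    return info.commercial_use


print(commercial_use_allowed("Mistral 7B"))     # True
print(commercial_use_allowed("Codestral 22B"))  # False
```

Raising on unknown models is deliberate: a missing entry should prompt a manual license review, not a silent pass.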