
Mistral Inference

By Mistral AI

Mistral Inference - Run Mistral AI Models Locally on Your Hardware

Last updated May 4, 2026


What is Mistral Inference?

Mistral Inference is the official Python library from Mistral AI for running their open-weight language models locally. It provides a streamlined way to download, load, and run inference on Mistral's model family: developers can run Mistral 7B, Mixtral 8x7B, Mixtral 8x22B, Codestral 22B, Mathstral 7B, Mistral Nemo, Mistral Large 2, Pixtral 12B, and Mistral Small 3.1 on their own hardware. The library offers both CLI demo commands for quick testing and Python APIs for programmatic inference.

Installation is straightforward via pip (`pip install mistral-inference`) or from source using Poetry. Models can be downloaded from Mistral's CDN or the Hugging Face Hub, and the library works with Hugging Face's safetensors format for easy model weight management. It uses xformers for optimized attention and supports multi-GPU setups through torchrun for larger models such as the Mixtral 8x7B and 8x22B variants. All Mistral models support function calling, enabling structured output and tool use.

Note that several model variants use custom licenses (MNPL for Codestral, MRL for Mistral Large 2), so review each model's license terms before deployment; the library code itself is Apache 2.0 licensed. Mistral Inference is ideal for developers who need private, local AI inference without an API dependency. It is actively maintained by Mistral AI, with 10,800+ GitHub stars and 1,000+ forks, and the repository includes Jupyter notebook tutorials runnable in Google Colab for getting started quickly.
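A typical programmatic use looks like the sketch below, which follows the pattern shown in the project's README. The weights directory path and generation parameters are illustrative assumptions, and the snippet requires downloaded model weights and a GPU to actually run:

```python
# Sketch of chat inference with mistral-inference (v1.x-style API).
# The model directory path is a placeholder; download weights first.
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate

model_path = "/path/to/mistral-7B-Instruct-v0.3"  # assumed local weights dir

# Load the tokenizer and model weights from the downloaded folder.
tokenizer = MistralTokenizer.from_file(f"{model_path}/tokenizer.model.v3")
model = Transformer.from_folder(model_path)

# Build a chat request and encode it to tokens.
request = ChatCompletionRequest(
    messages=[UserMessage(content="Explain KV caching in one paragraph.")]
)
tokens = tokenizer.encode_chat_completion(request).tokens

# Generate a completion and decode it back to text.
out_tokens, _ = generate(
    [tokens],
    model,
    max_tokens=256,
    temperature=0.0,
    eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id,
)
print(tokenizer.decode(out_tokens[0]))
```

For quick one-off checks, the bundled `mistral-demo` CLI covers the same path without any Python code.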

Mistral Inference's Top Features

Key capabilities that make Mistral Inference stand out.

Download and run any Mistral AI open-weight model locally

CLI demo tool for quick model testing (mistral-demo)

Python API for programmatic inference

Multi-GPU support via torchrun for large models

Function calling support across all models

Hugging Face Hub integration for model weights

Optimized with xformers for efficient attention computation

Support for instruction-tuned and base model variants

Safetensors format for fast and safe model loading

Extended 32K+ token vocabulary on newer model versions
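To illustrate the function-calling feature above: tools are described to the model as JSON schemas, and the model replies with a structured tool call the caller can dispatch. The plain-dict sketch below shows only the JSON shape; the tool name and fields are hypothetical, and in mistral-inference itself you would pass equivalent `Tool`/`Function` objects from `mistral_common` alongside the chat messages:

```python
# Hypothetical tool definition in the common JSON-schema shape.
import json

get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. Paris"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}

# A model that decides to use the tool emits a structured call like this:
tool_call = {"name": "get_current_weather", "arguments": {"city": "Paris"}}

# The caller matches the name against its registered tools and dispatches
# the arguments to a real implementation, then feeds the result back.
assert tool_call["name"] == get_weather_tool["function"]["name"]
print(json.dumps(tool_call))
```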

Use Cases

Who benefits most from this tool.

AI Developers

Run Mistral models locally for prototyping, testing, and building LLM-powered applications without API dependency.

Researchers

Evaluate Mistral model performance, fine-tune open-weight variants, and conduct AI safety research on local hardware.

Privacy-Conscious Teams

Deploy Mistral models on-premise for applications requiring data sovereignty and zero data leakage to external APIs.

Tags

mistral, llm-inference, open-source-llm, local-ai, python-library, mistral-7b, mixtral, model-inference, ai-models, open-weights

Mistral Inference's Pricing

The library itself is free and open source (Apache 2.0); you pay only for your own hardware. Model weights carry their own licenses (see the FAQ below).

AI Models by Mistral AI

Large language models from the same organization.

| Model | Status | Context Window | Price (In / Out per M tokens) |
|---|---|---|---|
| Mistral Small 4 | Current | 262K | $0.15 / $0.60 |
| Mistral Small Creative | Current | 33K | $0.10 / $0.30 |
| Devstral 2 2512 | Current | 262K | $0.40 / $2.00 |
| Ministral 3 14B 2512 | Current | 262K | $0.20 / $0.20 |


Frequently Asked Questions

What hardware do I need to run Mistral Inference?
A GPU is required because the library depends on xformers for optimized attention. Smaller models like Mistral 7B run on a single GPU, while Mixtral 8x22B requires a multi-GPU setup launched via torchrun.
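A multi-GPU launch follows the README-style torchrun invocation sketched below; the weights directory path and GPU count are assumptions for illustration:

```shell
# Launch the demo CLI across 2 GPUs for a Mixtral 8x7B checkpoint.
# $HOME/mistral_models/8x7b_instruct is a placeholder weights directory.
export M8x7B_DIR=$HOME/mistral_models/8x7b_instruct
torchrun --nproc-per-node 2 --no-python mistral-demo $M8x7B_DIR
```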
Does Mistral Inference support all Mistral models?
Yes, it supports the full Mistral model family including Mistral 7B, Mixtral 8x7B/8x22B, Codestral 22B, Codestral Mamba, Mathstral, Mistral Nemo, Mistral Large 2, Pixtral 12B, and Mistral Small 3.1.
What license applies to the Mistral Inference library?
The library code itself is Apache 2.0 licensed. However, some model weights have custom licenses (Codestral uses the non-commercial MNPL; Mistral Large 2 uses the research-oriented MRL). Always check the individual model's license terms.
Can I use Mistral Inference for commercial applications?
Yes, for most models: Mistral 7B and the Mixtral models are Apache 2.0, while Codestral and Mistral Large 2 carry non-commercial restrictions. Check each model's license file before commercial use.