LLM Comparison
Qwen3.5-9B vs Claude Opus 4.7
Side-by-side specs, pricing & capabilities · Updated May 2026
| | Qwen3.5-9B | Claude Opus 4.7 |
|---|---|---|
| Organization | Alibaba | Anthropic |
| OpenTools Score | 35 282 | 71 4.7 |
| Family | Qwen | Claude |
| Status | Current | Current |
| Release Date | Mar 2026 | Apr 2026 |
| Context Window | 262K tokens | 1.0M tokens |
| Input Price | $0.10/M tokens | $5.00/M tokens |
| Output Price | $0.15/M tokens | $25.00/M tokens |
| Pricing Notes | — | Cache read: $0.50/M tokens |
| Capabilities | text, vision, video, code | text, vision, code, tool-use |
| Max Output | — | 128K tokens |
| API Identifier | qwen/qwen3.5-9b | anthropic/claude-opus-4.7 |
| Benchmarks (reporting source in parentheses) | | |
| MMLU-Pro | 82.5 (Alibaba) | 78.1 (Anthropic) |
| GPQA Diamond | 81.7 (Alibaba) | 94.2 (Anthropic) |
| IFEval | 91.5 (Alibaba) | — |
| HMMT 2025 | 83.2 (Alibaba) | — |
| TAU2-Bench | 79.1 (Alibaba) | — |
| MMMU-Pro | 70.1 (Alibaba) | — |
| MMLU | — | 84.7 (Anthropic) |
| MMMLU | — | 92 (Anthropic) |
| HLE | — | 54.7 (Artificial Analysis) |
| SWE-bench Verified | — | 87.6 (Anthropic) |
| SWE-bench Pro | — | 64.3 (Anthropic) |
| SWE-bench Multilingual+Multimodal | — | 80.5 (Anthropic) |
| Terminal-Bench | — | 69.4 (Anthropic) |
| MCP-Atlas | — | 77.3 (Anthropic) |
| Berkeley Function Calling | — | 77.3 (Anthropic) |
| OSWorld-Verified | — | 78 (Anthropic) |
| BrowseComp | — | 79.3 (Anthropic) |
| CharXiv-R | — | 91 (Anthropic) |
| DocVQA | — | 93.1 (Anthropic) |
| CyberGym | — | 73.1 (Anthropic) |
| GDPVal-AA Elo | — | 1753 (Artificial Analysis) |
Cost Calculator
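Estimated monthly cost for a given token volume. The figures below are consistent with 1M input tokens and 0.5M output tokens per month at the listed prices.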
| Model | Input | Output | Total / mo | vs Best |
|---|---|---|---|---|
| Qwen3.5-9B (cheapest) | $0.10 | $0.08 | $0.18 | — |
| Claude Opus 4.7 | $5.00 | $12.50 | $17.50 | +9900% |
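The table above can be reproduced for any token volume with a few lines of arithmetic. The sketch below is illustrative only: the `Pricing` class and the `monthly_cost` and `compare` helpers are hypothetical names, not part of any vendor SDK. The per-million prices are copied from the spec table, and the example volume of 1M input and 0.5M output tokens is an assumption inferred from the table's figures. Cache-read discounts are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """List prices in USD per one million tokens (taken from the spec table)."""
    input_per_million: float
    output_per_million: float


# Prices as listed on this page; adjust if the providers change them.
PRICES = {
    "Qwen3.5-9B": Pricing(input_per_million=0.10, output_per_million=0.15),
    "Claude Opus 4.7": Pricing(input_per_million=5.00, output_per_million=25.00),
}


def monthly_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the monthly cost in USD for the given token volumes."""
    input_cost = input_tokens / 1_000_000 * pricing.input_per_million
    output_cost = output_tokens / 1_000_000 * pricing.output_per_million
    return input_cost + output_cost


def compare(input_tokens: int, output_tokens: int) -> dict[str, tuple[float, float]]:
    """Map each model to (monthly cost, percent above the cheapest model)."""
    costs = {
        name: monthly_cost(pricing, input_tokens, output_tokens)
        for name, pricing in PRICES.items()
    }
    best = min(costs.values())
    return {
        name: (cost, (cost / best - 1) * 100 if best > 0 else 0.0)
        for name, cost in costs.items()
    }


if __name__ == "__main__":
    # Assumed volume: 1M input tokens and 0.5M output tokens per month.
    for name, (cost, pct) in compare(1_000_000, 500_000).items():
        print(f"{name}: ${cost:.3f}/mo, +{pct:.0f}% vs cheapest")
```

At the assumed volume this gives about $0.175 per month for Qwen3.5-9B and $17.50 for Claude Opus 4.7, a difference of roughly +9900%. The table's $0.08 output cost and $0.18 total for Qwen3.5-9B are the rounded forms of $0.075 and $0.175.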
Alibaba
Qwen3.5-9B
Qwen3.5-9B is a multimodal LLM from Alibaba. It supports a context window of up to 262,144 tokens and achieves 87.5% on MMLU. Available from $0.10/M input tokens.
Anthropic
Claude Opus 4.7
Claude Opus 4.7 is Anthropic's most capable generally available model, with significant improvements in advanced software engineering, agentic tool use, and vision resolution. It achieves 87.6% on SWE-bench Verified and 94.2% on GPQA Diamond, and supports a context window of up to 1,000,000 tokens with 3.3x higher-resolution vision than Opus 4.6.