Agent Leaderboard — June 2026

Dynamic ranking based on real sessions in Agent Mode. Measures how well each model orchestrates tools, completes autonomous tasks and responds to user corrections.

Agent Leaderboard, 18 June 2026. Source: arena.ai/leaderboard/agent
✦ Available in AIDeskPro
2  Claude Opus 4.8 (Thinking)  Anthropic  8.85%
✦ AIDeskPro
1  Claude Fable 5 (High)  Anthropic  14.05%
3  GPT 5.5 (xHigh)  OpenAI  8.13%
 #  Model                       Vendor     Net Impr.  Confirmed   Praise   Steer.  Recovery  Halluc. ↓
 1  Claude Fable 5 (High)       Anthropic  ▲  14.05%     16.27%   29.86%   12.34%    10.04%      1.75%
 2  Claude Opus 4.8 (Thinking)  Anthropic  ▲   8.85%     10.65%   15.23%    9.13%     9.01%      0.22%
 3  GPT 5.5 (xHigh)             OpenAI     ▲   8.13%      5.13%   14.82%    3.95%    15.00%      1.75%
 4  Claude Opus 4.7 (Thinking)  Anthropic  ▲   7.98%      3.53%   11.79%    9.03%    13.91%      1.65%
 5  GPT 5.5 (High)              OpenAI     ▲   7.92%      6.33%   10.87%    7.24%    13.43%      1.75%
 6  Claude Opus 4.7             Anthropic  ▲   7.86%      5.00%   10.73%    9.38%    12.50%      1.69%
 7  Claude Opus 4.6             Anthropic  ▲   7.03%      5.11%    9.75%    7.03%    11.51%      1.74%
 8  GPT 5.5                     OpenAI     ▲   6.80%      4.97%    8.43%    7.27%    11.59%      1.75%
 9  GPT 5.4 (High)              OpenAI     ▲   6.61%      5.61%    5.28%    8.07%    12.36%      1.75%
10  GLM 5.2 (Max)               Z.ai       ▲   4.51%      9.96%   12.69%    5.45%     3.60%      1.75%
11  Claude Opus 4.8             Anthropic  ▲   3.14%      5.60%   11.95%    7.00%     8.86%     17.71%
12  Claude Sonnet 4.6           Anthropic  ▲   3.06%      1.46%   -3.46%    3.75%    11.85%      1.72%
13  GLM 5.1                     Z.ai       ▲   2.07%      3.30%    0.53%   -0.35%     5.12%      1.75%
14  Gemini 3.5 Flash            Google     ▲   0.03%     -1.06%   -1.71%   -1.11%     2.68%      1.36%
15  Gemini 3.1 Pro Preview      Google     ▼  -0.47%      0.26%   -0.98%    2.16%    -5.48%      1.69%
16  DeepSeek V4 Pro             DeepSeek   ▼  -0.76%     -0.75%   -1.84%   -3.64%     2.27%      0.13%
17  Kimi K2.6                   Moonshot   ▼  -1.01%     -0.43%   -2.97%   -3.51%     0.11%      1.75%
18  Kimi K2.7 Code              Moonshot   ▼  -1.11%      3.22%   -0.45%   -7.31%    -2.77%      1.75%
19  DeepSeek V4 Flash           DeepSeek   ▼  -1.70%      4.35%   -1.60%   -7.95%    -3.00%      0.29%
20  Minimax M3                  MiniMax    ▼  -2.04%     -1.17%   -6.97%   -7.42%     3.61%      1.75%
21  Qwen 3.6 Plus               Alibaba    ▼  -4.12%     -0.56%   -6.90%   -9.75%    -1.62%      1.76%
22  Grok Build 0.1              xAI        ▼  -6.26%     -6.85%  -11.34%   -9.37%    -2.00%      1.74%
23  Grok 4.3 (High)             xAI        ▼  -6.92%     -8.97%  -15.20%   -5.85%    -4.50%      0.11%
24  Nemotron 3 Ultra            Nvidia     ▼  -7.36%     -5.54%   -2.30%  -19.65%   -10.21%      0.90%
25  Minimax M2.7                MiniMax    ▼  -7.83%    -12.51%  -15.42%   -9.29%    -3.58%      1.65%
26  Gemini 3 Flash              Google     ▼  -8.28%    -11.32%  -13.09%   -4.46%   -13.74%      1.20%
27  Gemma 4 31B                 Google     ▼ -12.72%     -5.55%   -7.94%   -6.03%   -27.58%     16.50%
28  Grok 4.3                    xAI        ▼ -17.60%    -12.59%  -14.98%   -5.04%   -56.35%      0.95%
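
The leaderboard is sorted by Net Impr., but the other columns can give a different ordering. The following is a minimal sketch, not part of the source page, that re-ranks a handful of rows transcribed from the table above by the Recovery column and flags models whose Halluc. value exceeds a cutoff. The function names, the 5.0 cutoff and the reading of Halluc. as a "lower is better" rate are assumptions made for illustration only.

```python
# A few rows transcribed from the leaderboard: (model, net_impr, recovery, halluc).
# All values are percentages as published; the selection of rows is arbitrary.
ROWS = [
    ("Claude Fable 5 (High)", 14.05, 10.04, 1.75),
    ("Claude Opus 4.8 (Thinking)", 8.85, 9.01, 0.22),
    ("GPT 5.5 (xHigh)", 8.13, 15.00, 1.75),
    ("Claude Opus 4.7 (Thinking)", 7.98, 13.91, 1.65),
    ("Claude Opus 4.8", 3.14, 8.86, 17.71),
]

NET, RECOVERY, HALLUC = 1, 2, 3  # column indices within each row tuple


def rank_by(rows, column, descending=True):
    """Return model names ordered by the chosen column."""
    ordered = sorted(rows, key=lambda row: row[column], reverse=descending)
    return [row[0] for row in ordered]


def above_halluc(rows, threshold):
    """Return models whose Halluc. value is above a (hypothetical) cutoff."""
    return [row[0] for row in rows if row[HALLUC] > threshold]


by_recovery = rank_by(ROWS, RECOVERY)
flagged = above_halluc(ROWS, 5.0)  # 5.0 is an illustrative cutoff, not from the source

print(by_recovery[0])  # GPT 5.5 (xHigh)
print(flagged)         # ['Claude Opus 4.8']
```

On this subset, the model ranked first by Net Impr. is not first by Recovery, and the non-thinking Claude Opus 4.8 entry stands out on the Halluc. column.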

Cloud Providers

AIDeskPro integrates LLMs through the best cloud providers, guaranteeing EU data residency and zero data retention. Here are the partners that allow us to offer the best models on the market.

Google Cloud & Model Garden

As a Google Premier Partner, we guarantee that data is hosted in Italy (Milan) or elsewhere in Europe, with access to all Gemini versions and to Anthropic models via Model Garden, all with zero data retention.
Learn more →

Anthropic

Claude models (Sonnet, Opus, Haiku, Fable) are available in AIDeskPro via Google Model Garden with an EU data residency guarantee. On paid versions, data is not used for training.
Learn more →

OpenAI

GPT models are available through an Enterprise agreement with Zero Data Retention and European Data Residency. These guarantees do not apply to consumer ChatGPT accounts.
Learn more →

AWS — Amazon Web Services

AIDeskPro can be deployed on AWS in European Regions (Frankfurt, Ireland, Stockholm) with Amazon Bedrock integration, GDPR compliance and enterprise certifications.
Learn more →

Scaleway

100% European cloud provider, powered by renewable energy, with data centres in Paris, Amsterdam and Warsaw. Ideal for companies with European digital sovereignty requirements.
Learn more →

Nebius AI

Next-generation AI cloud platform with H100/H200 GPUs and data centres in Europe (Helsinki, Amsterdam). Ideal for running open source models (Llama, Mistral, Qwen…) in a private environment.
Learn more →

Privacy Guarantees across all Providers

Regardless of the provider chosen, AIDeskPro always guarantees:

EU DATA RESIDENCY

Your data never leaves the European Union. In some cases you can also choose the specific region (e.g. Milan with Google Cloud).

ZERO DATA RETENTION

No provider uses your data, documents or conversations to train their models. Contractual guarantee with all partners.

ENTERPRISE AGREEMENTS

All providers we work with have Enterprise agreements with Open Gate, with specific clauses on privacy, security and GDPR compliance.

Need a specific provider?