What is the best LLM for text, coding, vision, search, or X? LMArena knows, and the results may surprise you.

July 18, 2025

TL;DR

LMArena is a widely referenced platform that ranks large AI models across multiple areas, including text, coding, vision, and specialized tasks, using extensive human preference data. Its leaderboards reflect which models users find most capable in real-world tasks, based on head-to-head comparisons and live voting.
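
To make the head-to-head idea concrete, here is a minimal sketch of how pairwise votes can be turned into a ranking with a simple Elo-style update. It is an illustration only, not LMArena's actual pipeline, which this post does not describe. The model names, the K-factor of 32, and the starting rating of 1000 are all assumptions.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(ratings, votes, k=32.0, initial=1000.0):
    """Apply one Elo update per vote; each vote is a (winner, loser) pair."""
    for winner, loser in votes:
        r_w = ratings.setdefault(winner, initial)
        r_l = ratings.setdefault(loser, initial)
        # The winner gains more when the win was unexpected.
        delta = k * (1.0 - expected_score(r_w, r_l))
        ratings[winner] = r_w + delta
        ratings[loser] = r_l - delta
    return ratings


# Hypothetical votes between placeholder models, not real LMArena data.
votes = [("model-a", "model-b"), ("model-a", "model-c"), ("model-b", "model-c")]
ratings = update_ratings({}, votes)
leaderboard = sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)

for name, rating in leaderboard:
    print(f"{name}\t{rating:.0f}")
```

With many thousands of votes per model, as in the tables below, an update like this settles into a stable ordering. A sequential update is also sensitive to the order in which votes arrive, which is one reason a production leaderboard might fit all votes at once instead.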

LMArena assesses AI models in several distinct arenas:

Text: General language tasks (e.g., writing, reasoning, multi-turn dialogue)
WebDev: Coding and web development capabilities
Vision: Multimodal ability to handle images and text
Search: Information retrieval and synthesis
Copilot: Coding assistant and developer support
Text-to-Image: AI systems generating visual content from text prompts

Current Rankings by Area

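Here are the rankings as of July 2025. Ranks can repeat or skip numbers because LMArena ranks by statistical confidence: models whose scores are too close to separate share a rank.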

Text Arena

Rank	Model	Score	Votes
1	Gemini-2.5-Pro	1462	18,297
2	o3-2025-04-16	1452	24,554
2	ChatGPT-4o-Latest-20250326	1444	25,715
3	GPT-4.5-Preview-2025-02-27	1437	15,271
3	Grok-4-0709	1433	4,227
5	Claude-Opus-4-20250514-Thinking-16k	1419	13,018
6	Claude-Opus-4-20250514	1416	21,129
6	DeepSeek-R1-0528	1414	14,078
6	Gemini-2.5-Flash	1414	23,738
6	GPT-4.1-2025-04-14	1412	19,766
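
For a sense of what these score gaps mean, the sketch below converts a gap into an expected head-to-head win rate. It assumes the scores sit on an Elo-like scale (logistic curve, base 10, 400-point scale), which this article does not state. The function name is hypothetical, and the scores are taken from the Text Arena table above.

```python
def win_probability(score_a: float, score_b: float) -> float:
    """Expected share of votes for A over B, assuming an Elo-like scale."""
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / 400.0))


# Scores from the Text Arena table (July 2025).
gemini_25_pro = 1462
o3 = 1452
gpt_41 = 1412

p_top_two = win_probability(gemini_25_pro, o3)
p_first_vs_last = win_probability(gemini_25_pro, gpt_41)

print(f"Gemini-2.5-Pro vs o3: {p_top_two:.1%}")
print(f"Gemini-2.5-Pro vs GPT-4.1: {p_first_vs_last:.1%}")
```

Under that assumption, the 10-point gap between the top two models corresponds to roughly a 51% expected win rate for the leader, and the 50-point gap between the first and last rows to roughly 57%. The top ten are closer than the rank numbers suggest.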

WebDev Arena

Rank	Model	Score	Votes
1	Gemini-2.5-Pro	1423	3,010
1	DeepSeek-R1-0528	1407	1,978
1	Claude Opus 4 (20250514)	1404	4,322
3	Claude Sonnet 4 (20250514)	1378	3,258
4	Claude 3.7 Sonnet (20250219)	1357	7,481
6	Gemini-2.5-Flash	1299	3,681

Vision Arena

Rank	Model	Score	Votes
1	Gemini-2.5-Pro	1268	4,382
2	ChatGPT-4o-Latest-20250326	1249	6,271
2	o3-2025-04-16	1238	5,097
2	GPT-4.5-Preview-2025-02-27	1231	3,066
3	Gemini-2.5-Flash	1224	5,184

Search Arena

Rank	Model	Score	Votes
1	Gemini-2.5-Pro-Grounding	1142	1,215
1	PPL-Sonar-Reasoning-Pro-High	1136	861
3	PPL-Sonar-Reasoning	1097	1,644
3	PPL-Sonar	1072	1,208

Copilot Arena

Rank	Model	Score	Votes
1	DeepSeek V2.5 (FIM)	1028	2,292
1	Claude 3.5 Sonnet (06/20)	1012	3,544
1	Claude 3.5 Sonnet (10/22)	1004	3,596
1	Codestral (25.01)	1001	2,180
1	Qwen-2.5-Coder (FiM)	998	3,401

Text-to-Image Arena

Rank	Model	Score	Votes
1	GPT-Image-1	1148	22,691
2	Imagen-4.0-Ultra-Generate-Preview-06-06	1113	11,552
3	Imagen-4.0-Generate-Preview-05-20	1097	22,211

And the winner is…

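Gemini-2.5-Pro tops four of the six arenas above: Text, WebDev, Vision, and Search (as Gemini-2.5-Pro-Grounding), though it shares the top rank in WebDev and Search. DeepSeek V2.5 leads Copilot and GPT-Image-1 leads Text-to-Image. Google, again…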

Last updated on July 16, 2025

