jeinlee1991/chinese-llm-benchmark

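ReLE Evaluation: a capability benchmark for Chinese-language AI large models (continuously updated). It currently covers 374 large models, including commercial models such as chatgpt, gpt-5.4, Google gemini-3.1-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3.6-max, qwen3.6-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as open-source models such as step3.5-flash, kimi-k2.6, ernie4.5, MiniMax-M2.7, deepseek-v4, Qwen3.6, llama4, Zhipu GLM-5.1, MiMo-V2, LongCat, gemma4, and mistral. Beyond the leaderboards, it also provides a large-model defect library of more than 2 million entries, so the community can study, analyze, and improve large models.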

agentic-ai, artificial-intelligence, llm-agent, llm-evaluation

Stars

6.0k

+2 today, +3 /wk, +18 /mo

Forks

242

Issues

Watchers

6.0k


Repository Info

Created: Jun 4, 2023

Last push: 5/1/2026

Homepage: nonelinear.com
