A new-generation intelligent agent development platform that supports flexible creation of professional agents through prompts and workflows, with a wide range of models integrated.
Open-source prompt management & evals for AI teams
AGI-Eval, an evaluation community for large AI models
arena.ai Arena Rankings, an evaluation platform developed by researchers at UC Berkeley SkyLab and LMSYS that ranks AI models by human preference votes (a minimal rating sketch follows this list)
Artificial Analysis, a leading independent AI benchmarking and analysis platform
Smartest way to discover unseen literature
A leaderboard repository for evaluating the capabilities of Chinese large models, covering 195 commercial and open-source models across multiple capability dimensions (including medicine, education, and other domains)
Chinese language understanding evaluation benchmarks, including representative datasets, baseline (pre-trained) models, corpora, and leaderboards, built from a curated series of datasets corresponding to representative tasks
A smart platform to manage awards, contests and applications
The FlagEval (Libra) large model evaluation system and open platform, which aims to establish scientific, fair, and open evaluation benchmarks, methods, and toolsets to help researchers comprehensively assess the performance of foundation models
Find interesting community members and see how you stack up
LiveBench, a benchmarking platform for large language models (LLMs) that provides a fair, objective, and contamination-free environment for evaluating and comparing the performance of different models
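
For the arena-style platforms above, rankings are typically derived from pairwise human-preference votes. Below is a minimal sketch, not any platform's actual code, of an online Elo-style update over such votes; the model names, `K` step size, and starting rating of 1000 are all assumptions chosen for illustration (LMSYS's arena actually fits a Bradley-Terry model over all votes, of which this incremental update is a simpler cousin):

```python
from collections import defaultdict

K = 32      # update step size (assumed value)
BASE = 400  # Elo scale constant

def expected(r_a: float, r_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / BASE))

def update(ratings: dict, winner: str, loser: str) -> None:
    """Shift both ratings toward the observed human-preference outcome."""
    e = expected(ratings[winner], ratings[loser])
    ratings[winner] += K * (1 - e)
    ratings[loser]  -= K * (1 - e)

# Hypothetical votes: (preferred model, other model)
votes = [("model-a", "model-b"), ("model-a", "model-c"), ("model-c", "model-b")]

ratings = defaultdict(lambda: 1000.0)  # every model starts at 1000
for winner, loser in votes:
    update(ratings, winner, loser)

for name, r in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {r:.1f}")
```

Running this prints a small leaderboard with `model-a` on top, since it won both of its comparisons; with enough votes, ratings converge toward each model's relative win probability against the field.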