FDU-NLP LLMEval Team
Popular repositories
- LLMEval-Fair (Public)
  [ACL 2026] A large-scale longitudinal study on robust and fair evaluation of LLMs: 200K+ generative questions across 13 disciplines
- LLMEval-Med (Public)
  [EMNLP 2025] A real-world clinical benchmark for medical LLMs with physician validation: 2,996 questions from EHRs
- Llmeval-Gaokao2024-Math (Public)
  LLM evaluation on 2024 Chinese Gaokao Mathematics: zero-contamination benchmark with dual prompt formats
- llmeval.github.io (Public)
  Official website for the LLMEval research series, Fudan NLP Lab
  Language: TypeScript
Repositories
- Llmeval-Gaokao2024-Math (Public)
  LLM evaluation on 2024 Chinese Gaokao Mathematics: zero-contamination benchmark with dual prompt formats
- LLMEval-Med (Public)
  [EMNLP 2025] A real-world clinical benchmark for medical LLMs with physician validation: 2,996 questions from EHRs
- LLMEval-2 (Public)
  [AAAI 2024] LLMEval Phase II dataset: professional domain evaluation across 12 academic disciplines
- LLMEval-1 (Public)
  [AAAI 2024] LLMEval Phase I dataset: 17 categories, 453 questions, 2,186 annotators for Chinese LLM evaluation
- LLMEval-Fair (Public)
  [ACL 2026] A large-scale longitudinal study on robust and fair evaluation of LLMs: 200K+ generative questions across 13 disciplines