@llmeval

FDU-NLP LLMEval Team

Popular repositories

  1. LLMEval-1 (Public)

    [AAAI 2024] LLMEval Phase I dataset — 17 categories, 453 questions, 2186 annotators for Chinese LLM evaluation

    113 stars, 2 forks

  2. LLMEval-2 (Public)

    [AAAI 2024] LLMEval Phase II dataset — professional domain evaluation across 12 academic disciplines

    71 stars, 4 forks

  3. LLMEval-Fair (Public)

    [ACL 2026] A large-scale longitudinal study on robust and fair evaluation of LLMs — 200K+ generative questions across 13 disciplines

    37 stars, 2 forks

  4. LLMEval-Med (Public)

    [EMNLP 2025] A real-world clinical benchmark for medical LLMs with physician validation — 2,996 questions from EHRs

    Python, 25 stars, 1 fork

  5. Llmeval-Gaokao2024-Math (Public)

    LLM evaluation on 2024 Chinese Gaokao Mathematics — zero-contamination benchmark with dual prompt formats

    19 stars, 1 fork

  6. llmeval.github.io (Public)

    Official website for the LLMEval research series — Fudan NLP Lab

    TypeScript

