Become a sponsor to LongC

Motivation

Reading the ITBench paper was genuinely exciting for me. It immediately connected with my background in quantitative modelling, AI evaluation, and failure-mode analysis, and raised many questions I wanted to explore more deeply.

Goal

My goal is to study the paper in depth, covering its assumptions, evaluation design, data sources, and limitations, and to build open, reproducible implementations so that these ideas can be tested, extended, and better understood in practice.

Focus Areas

This work focuses on:

  • Deep reading and validation of the ITBench methodology and sources
  • Reproducing core evaluation setups using open-source models
  • Examining evaluation design choices, including why specific metrics and tasks were selected
  • Investigating robustness, edge cases, and failure modes
  • Identifying gaps in current AI evaluation and exploring how they could be improved

Perspective

I am particularly interested in the evaluation needs highlighted by the paper, and in learning how benchmarks can better reflect real-world reasoning, risk, and decision-making, rather than surface-level performance alone.

Outputs & Support

All outputs will be shared openly as notebooks, code, and clear documentation.
Sponsorship supports the time required for careful reading, experimentation, validation, and communication of insights.

Featured work

  1. longchung90/weather_forecast

    Originally developed as a Coursera assignment and expanded into a fully featured global weather application with enhanced UX, automation, and scalable city management.

    JavaScript 1
  2. longchung90/Portfolio_Project

    The portfolio began as a course project from the IBM Coursera program. By integrating AI-augmented workflows and JavaScript-driven interactivity, it evolved into a lively, dynamic webpage that goes…

    HTML 1

Select a tier

$5 a month

🟢 Tier 1 — AU$5 / month · Supporter

  • Supports open learning and exploration in AI evaluation
  • Helps sustain careful study of ITBench, reproducible notebooks, and clear documentation shared openly with the community

$15 a month

🔵 Tier 2 — AU$15 / month · Contributor

  • Supports hands-on reproduction of AI evaluation benchmarks
  • Enables deeper work on reproducing ITBench setups, analysing metrics, robustness, and failure modes, and publishing well-documented notebooks

$50 a month

🟣 Tier 3 — AU$50 / month · Research Supporter

  • Supports sustained, research-grade work in open AI evaluation
  • Extends benchmark analysis, robustness studies, and thoughtful documentation, translating ITBench insights into practical, real-world evaluation guidance