This documentation helps you run, reproduce and compare MLPerf benchmarks out of the box across different software, hardware, models and data sets. It relies on the open-source, technology-agnostic MLCommons Collective Mind automation language (CM) and the MLCommons Collective Knowledge Playground (CK).
Please choose which benchmark you want to run:
- MLPerf inference
- MLPerf training (prototyping phase)
- MLPerf tiny (prototyping phase)
- MLPerf mobile (preparation phase)
This project is under heavy development by the MLCommons Task Force on Automation and Reproducibility, cTuning.org and cKnowledge.org, led by Grigori Fursin and Arjun Suresh. You can learn more about our plans and long-term vision from our ACM REP'23 keynote.
Don't hesitate to get in touch with the CM/CK community via our public Discord server to provide feedback and ask questions. You are also welcome to join us there to add new benchmark implementations, models, data sets and hardware backends, prepare and optimize your MLPerf submissions, and participate in our reproducibility and optimization challenges.