Add model benchmark scores (MMLU, HumanEval, etc.) #19

## Context
The AI Models Catalog tracks pricing, context windows, and capabilities. A frequently requested addition is **benchmark scores** such as MMLU, HumanEval, MATH, and GPQA.

## What to do
1. Add an optional `benchmarks` field to the model schema in `types/model.ts`
2. Add the matching Zod validation in `types/schemas.ts`
3. Add benchmark entries for a few well-known models as examples
4. Update the interactive catalog to show benchmark scores
5. Create a `docs/benchmarks.md` page

## Suggested schema
```yaml
benchmarks:
  mmlu: 88.7
  humaneval: 92.0
  math: 78.3
  gpqa: 65.2
```
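
As a rough illustration of how the YAML above might map to a type, here is a minimal, dependency-free sketch. The names `Benchmarks`, `BENCHMARK_KEYS`, and `parseBenchmarks` are hypothetical and not taken from the repository. It also makes two assumptions: that scores are percentages between 0 and 100, and that unknown benchmark names should be rejected. The real validation would live in the Zod schema in `types/schemas.ts`.

```typescript
// Hypothetical shape mirroring the suggested YAML; every key is optional.
interface Benchmarks {
  mmlu?: number;
  humaneval?: number;
  math?: number;
  gpqa?: number;
}

// Assumption: only these benchmark names are accepted; extend as needed.
const BENCHMARK_KEYS: readonly string[] = ["mmlu", "humaneval", "math", "gpqa"];

// Validates an untrusted value (e.g. parsed YAML) and returns typed benchmarks.
// Returns undefined when the field is absent, because the field is optional.
function parseBenchmarks(input: unknown): Benchmarks | undefined {
  if (input === undefined) return undefined;
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    throw new Error("benchmarks must be a mapping of benchmark name to score");
  }
  const raw = input as Record<string, unknown>;
  const result: Benchmarks = {};
  for (const key of Object.keys(raw)) {
    if (BENCHMARK_KEYS.indexOf(key) === -1) {
      throw new Error(`unknown benchmark: ${key}`);
    }
    const value = raw[key];
    // Assumption: scores are percentages in the range [0, 100].
    if (typeof value !== "number" || !Number.isFinite(value) || value < 0 || value > 100) {
      throw new Error(`${key} must be a number between 0 and 100`);
    }
    result[key as keyof Benchmarks] = value;
  }
  return result;
}
```

Rejecting unknown keys catches typos such as `mmul` early, at the cost of requiring a code change whenever a new benchmark is added. A looser `Record<string, number>` would be the alternative.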

## Notes
- Benchmark data should come from official model cards/reports (first-party)
- Not all models have benchmarks — this is an optional field
- Follow the pattern of existing optional fields
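
Because the field is optional, the catalog view needs a fallback for models without scores. The following is a minimal sketch of one way to handle that. `ModelEntry` and `formatBenchmarks` are hypothetical names, and the `"n/a"` placeholder and one-decimal formatting are assumptions rather than existing conventions in the catalog.

```typescript
// Hypothetical, trimmed-down model entry; the real type lives in types/model.ts.
interface ModelEntry {
  name: string;
  benchmarks?: { [benchmark: string]: number | undefined };
}

// Renders benchmark scores as a single display string.
// Falls back to "n/a" when the model has no benchmark data at all.
function formatBenchmarks(model: ModelEntry): string {
  const scores = model.benchmarks ?? {};
  const parts: string[] = [];
  for (const key of Object.keys(scores)) {
    const score = scores[key];
    if (score !== undefined) {
      // Assumption: one decimal place is enough precision for display.
      parts.push(`${key}: ${score.toFixed(1)}`);
    }
  }
  return parts.length > 0 ? parts.join(", ") : "n/a";
}

// Usage with made-up entries (the scores are the placeholders from the suggested schema).
const withScores: ModelEntry = { name: "example-model", benchmarks: { mmlu: 88.7, humaneval: 92.0 } };
const withoutScores: ModelEntry = { name: "other-model" };
console.log(formatBenchmarks(withScores));
console.log(formatBenchmarks(withoutScores));
```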
