Commit 568e2b7

Merge pull request #13 from sharpninja/copilot/create-manual-trigger-pipeline
Add manual benchmark report pipeline with GitHub Pages publishing
2 parents 673b67f + 1161777 commit 568e2b7

6 files changed: +626 -0 lines changed

.github/workflows/benchmark-report.yml

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
+name: Benchmark report
+
+on:
+  workflow_dispatch:
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+concurrency:
+  group: benchmark-report-pages
+  cancel-in-progress: true
+
+jobs:
+  benchmark:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Setup .NET SDK
+        uses: actions/setup-dotnet@v4
+        with:
+          global-json-file: global.json
+
+      - name: Restore
+        run: dotnet restore BitNet-b1.58-Sharp.slnx
+
+      - name: Build
+        run: dotnet build BitNet-b1.58-Sharp.slnx --configuration Release --no-restore
+
+      - name: Test
+        run: dotnet test BitNet-b1.58-Sharp.slnx --configuration Release --no-build --no-restore
+
+      - name: Generate benchmark comparison report
+        run: >
+          dotnet run --configuration Release
+          --project "${{ github.workspace }}/src/BitNetSharp.App/BitNetSharp.App.csproj" --
+          benchmark-report
+          --model=bitnet-b1.58-sharp
+          --compare-model=traditional-local
+          --output="${{ github.workspace }}/artifacts/benchmark-report"
+
+      - name: Upload benchmark report artifact
+        uses: actions/upload-artifact@v4
+        with:
+          name: benchmark-report
+          path: ${{ github.workspace }}/artifacts/benchmark-report
+          if-no-files-found: error
+
+      - name: Setup GitHub Pages
+        uses: actions/configure-pages@v5
+
+      - name: Upload GitHub Pages artifact
+        uses: actions/upload-pages-artifact@v3
+        with:
+          path: ${{ github.workspace }}/artifacts/benchmark-report
+
+  deploy:
+    needs: benchmark
+    runs-on: ubuntu-latest
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+
+    steps:
+      - name: Deploy GitHub Pages artifact
+        id: deployment
+        uses: actions/deploy-pages@v4
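
The `benchmark` job's restore, build, test, and report steps can be approximated on a local checkout. The following is a minimal sketch, not part of the commit. It assumes it is run from the repository root with the .NET SDK selected by `global.json` installed. The `step` helper, the `EXECUTE` switch, and the `OUTPUT_DIR` variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Local dry run of the workflow's benchmark job (illustrative sketch only).

SOLUTION="BitNet-b1.58-Sharp.slnx"
PROJECT="src/BitNetSharp.App/BitNetSharp.App.csproj"
# Hypothetical default that mirrors the workflow's artifacts/benchmark-report path.
OUTPUT_DIR="${OUTPUT_DIR:-$PWD/artifacts/benchmark-report}"

# Print a command; execute it only when EXECUTE=1 (hypothetical switch).
step() {
  printf '+ %s\n' "$*"
  if [ "${EXECUTE:-0}" = "1" ]; then
    "$@"
  fi
}

# Same commands, in the same order, as the workflow's run steps.
# The && chain stops at the first failing command when EXECUTE=1.
benchmark_job() {
  step dotnet restore "$SOLUTION" &&
  step dotnet build "$SOLUTION" --configuration Release --no-restore &&
  step dotnet test "$SOLUTION" --configuration Release --no-build --no-restore &&
  step dotnet run --configuration Release --project "$PROJECT" -- \
    benchmark-report \
    --model=bitnet-b1.58-sharp \
    --compare-model=traditional-local \
    --output="$OUTPUT_DIR"
}

benchmark_job
```

Run as-is, the script only prints the four commands. Setting `EXECUTE=1` would run them. The artifact upload and GitHub Pages steps depend on the Actions runner and are left out of this sketch.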

docs/benchmarking.md

Lines changed: 20 additions & 0 deletions
@@ -15,6 +15,12 @@ The benchmark command uses BenchmarkDotNet to measure the same hosted-model oper
 - streaming a response for a prompt
 - building the agent host
 
+The manual GitHub Actions benchmark report workflow runs the same benchmark suite for both built-in models, then publishes a static comparison site through GitHub Pages. That report combines:
+
+- efficacy, measured as non-empty responses across the shared default query script
+- accuracy, measured as exact-match and expected-token recall against the default corpus responses
+- performance, measured from the exported BenchmarkDotNet results
+
 ## Run the built-in comparison benchmark
 
 ```bash
@@ -23,6 +29,20 @@ dotnet run --configuration Release --project /home/runner/work/BitNet-b1.58-Shar
 
 This runs the BenchmarkDotNet suite over both local models so their hosted response and host-construction costs can be compared directly.
 
+## Generate the comparison report site
+
+```bash
+dotnet run --configuration Release --project src/BitNetSharp.App/BitNetSharp.App.csproj -- benchmark-report --model=bitnet-b1.58-sharp --compare-model=traditional-local --output=/absolute/path/to/benchmark-report
+```
+
+This command writes a static report site with:
+
+- `index.html` for GitHub Pages publishing
+- `comparison-report.md` and `comparison-report.json` summaries
+- raw BenchmarkDotNet HTML, CSV, and GitHub-flavored Markdown exports under `BenchmarkDotNet.Artifacts/results/`
+
+The repository also includes a manual trigger workflow at `.github/workflows/benchmark-report.yml` that builds, tests, generates the same report, uploads it as an artifact, and deploys it with GitHub Pages.
+
 ## Train the traditional local model
 
 ```bash

docs/usage.md

Lines changed: 8 additions & 0 deletions
@@ -47,6 +47,14 @@ dotnet run --configuration Release --project /home/runner/work/BitNet-b1.58-Shar
 
 This command runs BenchmarkDotNet over the same hosted-model operations covered by the SpecFlow scenarios so you can compare local models under one agent wrapper.
 
+## Benchmark report
+
+```bash
+dotnet run --configuration Release --project src/BitNetSharp.App/BitNetSharp.App.csproj -- benchmark-report --model=bitnet-b1.58-sharp --compare-model=traditional-local --output=/absolute/path/to/benchmark-report
+```
+
+This command runs the BenchmarkDotNet suite, evaluates both built-in models against the shared default training corpus/query script, and writes HTML, Markdown, and JSON comparison reports to the selected output directory.
+
 ## Train the traditional comparison model
 
 ```bash
