In this tutorial you will follow a guided walkthrough and verify a working result.
- Developers who want to prove the full loop locally (serve → log → eval)
- A running recsys-service in DB-only mode (popularity baseline)
- An eval-compatible exposure log file
- A sample recsys-eval report you can share internally
Choose your data mode
This tutorial uses DB-only mode (fastest way to prove the loop locally).
- Choose DB-only to validate the full loop quickly: serve → log → eval.
- Choose artifact/manifest mode when you want pipelines to publish versioned artifacts and use the manifest as a ship/rollback lever.
See: Data modes: DB-only vs artifact/manifest. For artifact mode end-to-end, follow production-like run (pipelines → object store → ship/rollback).
- Docker + Docker Compose (v2)
- make
- curl
- POSIX shell
- python3 (used to parse the exposure log)
- Go toolchain (to build recsys-eval)
Verify you have them:
docker compose version
make --version
curl --version
python3 --version
go version
- POST /v1/recommend returns a non-empty list for tenant demo and surface home
- A local exposure log file exists (eval schema)
- recsys-eval run produces a Markdown report
From repo root:
test -f api/.env || cp api/.env.example api/.env
make dev
Apply database migrations (idempotent):
(cd api && make migrate-up)
Verify:
for _ in $(seq 1 60); do
if curl -fsS http://localhost:8000/healthz >/dev/null; then
break
fi
sleep 2
done
curl -fsS http://localhost:8000/healthz >/dev/null
Expected:
- make dev exits 0 and starts the local stack.
- The health check exits 0.
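The polling loop above retries the health endpoint up to 60 times, two seconds apart. The same idea can be sketched in Python for use in a test harness. This is a minimal sketch, not part of the repository: the function names `http_ok` and `wait_until_healthy` are hypothetical, and the only thing taken from the tutorial is the `/healthz` URL and the 60 × 2 s retry budget.

```python
import http.client
import time
import urllib.request


def http_ok(url, timeout=2.0):
    """Return True if a GET to url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and HTTP 4xx/5xx all count as "not healthy yet".
        return False


def wait_until_healthy(url, attempts=60, delay=2.0, probe=http_ok, sleep=time.sleep):
    """Poll url until probe(url) succeeds; return the attempt number, or 0 on failure."""
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            sleep(delay)
    return 0
```

Calling `wait_until_healthy("http://localhost:8000/healthz")` returns a non-zero attempt count once the service answers. The `probe` and `sleep` parameters exist only so the loop can be exercised without a running stack.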
This tutorial uses dev headers for auth and disables admin RBAC roles so you can call admin endpoints without JWT claims.
Apply these settings in api/.env:
Tutorial env settings (copy/paste)
# DB-only mode (no artifact manifest)
RECSYS_ARTIFACT_MODE_ENABLED=false
# Make requests deterministic
RECSYS_ALGO_MODE=popularity
# Enable rules so you can prove control-plane wiring (pin/exclude) works
RECSYS_ALGO_RULES_ENABLED=true
# Enable eval-compatible exposure logs
EXPOSURE_LOG_ENABLED=true
EXPOSURE_LOG_FORMAT=eval_v1
EXPOSURE_LOG_PATH=/app/tmp/exposures.eval.jsonl
# Local dev: disable admin RBAC roles (dev headers don’t carry roles)
AUTH_VIEWER_ROLE=
AUTH_OPERATOR_ROLE=
AUTH_ADMIN_ROLE=
Restart the service:
docker compose up -d --force-recreate api
Verify:
for _ in $(seq 1 60); do
if curl -fsS http://localhost:8000/healthz >/dev/null; then
break
fi
sleep 2
done
curl -fsS http://localhost:8000/healthz >/dev/null
Expected:
- The health check exits 0 after the restart.
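A mistyped or missing setting in api/.env is an easy way to lose time in later steps, so it can help to check the file before restarting. The sketch below compares a .env file's text against the values listed above. It is an illustration only: `parse_env` and `env_mismatches` are hypothetical helpers, and the parsing is deliberately simplified (no `export` prefixes, no inline comments, no multi-line values).

```python
# Settings this tutorial asks for, copied from the block above.
EXPECTED = {
    "RECSYS_ARTIFACT_MODE_ENABLED": "false",
    "RECSYS_ALGO_MODE": "popularity",
    "RECSYS_ALGO_RULES_ENABLED": "true",
    "EXPOSURE_LOG_ENABLED": "true",
    "EXPOSURE_LOG_FORMAT": "eval_v1",
    "EXPOSURE_LOG_PATH": "/app/tmp/exposures.eval.jsonl",
    "AUTH_VIEWER_ROLE": "",
    "AUTH_OPERATOR_ROLE": "",
    "AUTH_ADMIN_ROLE": "",
}


def parse_env(text):
    """Parse KEY=VALUE lines; later assignments override earlier ones."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def env_mismatches(text, expected=EXPECTED):
    """Return {key: (expected, actual_or_None)} for every setting that differs."""
    actual = parse_env(text)
    return {
        key: (want, actual.get(key))
        for key, want in expected.items()
        if actual.get(key) != want
    }
```

For example, `env_mismatches(open("api/.env").read())` returns an empty dict when every tutorial setting is present with the expected value. A value of `None` in the result means the key is missing from the file.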
Insert a tenant row:
docker exec -i recsys-db psql -U recsys-db -d recsys-db <<'SQL'
insert into tenants (external_id, name)
values ('demo', 'Demo Tenant')
on conflict (external_id) do nothing;
SQL
Expected:
- The psql command exits 0.
Create a small config document:
cat > /tmp/demo_config.json <<'JSON'
{
"weights": { "pop": 1.0, "cooc": 0.0, "emb": 0.0 },
"flags": { "enable_rules": true },
"limits": { "max_k": 50, "max_exclude_ids": 200 }
}
JSON
Upsert config:
curl -fsS -X PUT http://localhost:8000/v1/admin/tenants/demo/config \
-H 'Content-Type: application/json' \
-H 'X-Dev-User-Id: dev-user-1' \
-H 'X-Dev-Org-Id: demo' \
-H 'X-Org-Id: demo' \
-d @/tmp/demo_config.json
Create a small rules document (pin item_3 to prove control works):
cat > /tmp/demo_rules.json <<'JSON'
[
{
"action": "pin",
"target_type": "item",
"item_ids": ["item_3"],
"surface": "home",
"priority": 10
}
]
JSON
Upsert rules:
curl -fsS -X PUT http://localhost:8000/v1/admin/tenants/demo/rules \
-H 'Content-Type: application/json' \
-H 'X-Dev-User-Id: dev-user-1' \
-H 'X-Dev-Org-Id: demo' \
-H 'X-Org-Id: demo' \
-d @/tmp/demo_rules.json
Expected:
- Both admin PUT calls exit 0.
Seed item_tags and item_popularity_daily for surface home:
docker exec -i recsys-db psql -U recsys-db -d recsys-db <<'SQL'
with t as (
select id as tenant_id
from tenants
where external_id = 'demo'
)
insert into item_tags (tenant_id, namespace, item_id, tags, price, created_at)
select tenant_id, 'home', 'item_1', array['brand:nike','category:shoes'], 99.90, now() from t
union all
select tenant_id, 'home', 'item_2', array['brand:nike','category:shoes'], 79.00, now() from t
union all
select tenant_id, 'home', 'item_3', array['brand:acme','category:socks'], 12.00, now() from t
on conflict (tenant_id, namespace, item_id)
do update set tags = excluded.tags,
price = excluded.price,
created_at = excluded.created_at;
with t as (
select id as tenant_id
from tenants
where external_id = 'demo'
)
insert into item_popularity_daily (tenant_id, namespace, item_id, day, score)
select tenant_id, 'home', 'item_1', current_date, 10 from t
union all
select tenant_id, 'home', 'item_2', current_date, 7 from t
union all
select tenant_id, 'home', 'item_3', current_date, 3 from t
on conflict (tenant_id, namespace, item_id, day)
do update set score = excluded.score;
SQL
Expected:
- The psql command exits 0.
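With these rows seeded, the popularity scores for surface home are item_1 = 10, item_2 = 7, item_3 = 3, and the rule created earlier pins item_3. The sketch below shows one plausible way a pin rule could interact with a popularity ranking. It is an assumption for illustration, not the service's actual implementation: `rank_with_pins` is a hypothetical function, and it assumes pinned items are simply moved to the front.

```python
def rank_with_pins(scores, pinned, k):
    """Order item IDs by descending score, then move pinned IDs to the front."""
    # Ties are broken by item ID so the order is deterministic.
    by_score = sorted(scores, key=lambda item_id: (-scores[item_id], item_id))
    pinned_present = [item_id for item_id in pinned if item_id in scores]
    rest = [item_id for item_id in by_score if item_id not in pinned_present]
    return (pinned_present + rest)[:k]


seeded = {"item_1": 10, "item_2": 7, "item_3": 3}
print(rank_with_pins(seeded, pinned=["item_3"], k=5))
# ['item_3', 'item_1', 'item_2']
```

Without the pin, the same scores would put item_3 last, which is why seeing it first is useful evidence that the rules are being applied.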
Send a request with a deterministic request_id:
curl -fsS http://localhost:8000/v1/recommend \
-H 'Content-Type: application/json' \
-H 'X-Request-Id: req-1' \
-H 'X-Dev-User-Id: dev-user-1' \
-H 'X-Dev-Org-Id: demo' \
-H 'X-Org-Id: demo' \
-d '{"surface":"home","k":5,"user":{"user_id":"u_1","session_id":"s_1"}}'
You should see items with item_id values like item_1, item_2, item_3.
Because you pinned item_3, it should appear first in the list.
Example response shape:
{
"items": [{ "item_id": "item_3", "rank": 1, "score": 0.12 }],
"meta": {
"tenant_id": "demo",
"surface": "home",
"config_version": "W/\"...\"",
"rules_version": "W/\"...\"",
"request_id": "req-1"
},
"warnings": [
{ "code": "DEFAULT_APPLIED", "detail": "segment defaulted to 'default'" },
{ "code": "SIGNAL_UNAVAILABLE", "detail": "content unavailable: unavailable" }
]
}
You may see warnings[]; in this tutorial that is expected:
- DEFAULT_APPLIED means omitted request fields (for example segment or options) were filled with defaults.
- SIGNAL_UNAVAILABLE for session/collaborative/content means those optional signals were not seeded in this DB-only walkthrough.
- These warnings are non-fatal here. Treat the step as successful when the request returns HTTP 200 with a non-empty items list.
If you get an empty list, check:
- you inserted rows into item_popularity_daily for namespace='home'
- you are calling the API with surface=home
Expected:
- The response has a non-empty items list.
- item_3 appears first (pinned rule).
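The two expectations above can be checked mechanically instead of by eye. The sketch below parses a response body and reports whether the list is non-empty and whether the pinned item comes first. It relies only on the fields visible in the example response shape (items, item_id, warnings, code). `check_recommend_response` is a hypothetical helper name.

```python
import json


def check_recommend_response(body, pinned_id="item_3"):
    """Return (ok, problems, warning_codes) for a /v1/recommend response body."""
    data = json.loads(body)
    items = data.get("items") or []
    problems = []
    if not items:
        problems.append("items is empty")
    elif items[0].get("item_id") != pinned_id:
        problems.append(f"expected {pinned_id} first, got {items[0].get('item_id')}")
    # Warnings are collected for information only; they do not fail the check.
    warning_codes = [w.get("code") for w in data.get("warnings") or []]
    return (not problems, problems, warning_codes)
```

One way to use it is to save the curl output to a file and pass its contents to the function. Warning codes such as DEFAULT_APPLIED are returned but never treated as failures, matching the guidance above.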
Copy the exposure file out of the container:
docker compose cp api:/app/tmp/exposures.eval.jsonl /tmp/exposures.jsonl
Extract the hashed user_id from the exposure file (this is what recsys-service logs for eval format):
EXPOSURE_USER_ID="$(python3 -c 'import json; print(json.loads(open("/tmp/exposures.jsonl").readline())["user_id"])')"
Create a minimal outcome log that joins by request_id (and matches the exposure user_id):
OUTCOME_TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
cat > /tmp/outcomes.jsonl <<JSONL
{"request_id":"req-1","user_id":"${EXPOSURE_USER_ID}","item_id":"item_3","event_type":"click","ts":"${OUTCOME_TS}"}
JSONL
Expected:
- /tmp/exposures.jsonl and /tmp/outcomes.jsonl both exist and are non-empty.
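The shell steps above read the hashed user_id from the first exposure record and write a matching click outcome. The same join can be sketched in Python, which avoids shell quoting issues. This is a sketch under assumptions: `outcome_for_first_exposure` is a hypothetical helper, and it assumes each exposure record carries a `request_id` field alongside `user_id` (the shell version hardcodes req-1 instead).

```python
import json
from datetime import datetime, timezone


def outcome_for_first_exposure(exposure_line, item_id, event_type="click", now=None):
    """Build one outcome record (as a JSON line) that joins to an exposure record."""
    exposure = json.loads(exposure_line)
    ts = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
    outcome = {
        "request_id": exposure["request_id"],  # assumed field name, see lead-in
        "user_id": exposure["user_id"],        # hashed ID, copied as logged
        "item_id": item_id,
        "event_type": event_type,
        "ts": ts,
    }
    return json.dumps(outcome, separators=(",", ":"))
```

Passing the first line of /tmp/exposures.jsonl and `"item_3"` produces a line equivalent to the one written to /tmp/outcomes.jsonl above. Copying the user_id verbatim matters because the outcome must match the hashed value in the exposure log, not the raw u_1 from the request.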
Create a dataset config:
cat > /tmp/dataset.yaml <<'YAML'
exposures:
type: jsonl
path: /tmp/exposures.jsonl
outcomes:
type: jsonl
path: /tmp/outcomes.jsonl
YAML
Create a minimal offline config (slice keys match the service eval_v1 context keys):
cat > /tmp/eval.yaml <<'YAML'
mode: offline
offline:
metrics:
- name: hitrate
k: 5
- name: precision
k: 5
slice_keys: ["tenant_id", "surface"]
gates: []
scale:
mode: memory
YAML
Build + run:
(cd recsys-eval && make build)
(cd recsys-eval && ./bin/recsys-eval validate --schema exposure.v1 --input /tmp/exposures.jsonl)
(cd recsys-eval && ./bin/recsys-eval validate --schema outcome.v1 --input /tmp/outcomes.jsonl)
recsys-eval/bin/recsys-eval run \
--mode offline \
--dataset /tmp/dataset.yaml \
--config /tmp/eval.yaml \
--output /tmp/recsys_eval_report.md \
--output-format markdown
Inspect the report:
sed -n '1,80p' /tmp/recsys_eval_report.md
You should see an “Offline Metrics” table with values like:
| hitrate@5 | 1.000000 |
| precision@5 | 0.333333 |
Expected:
- Both recsys-eval ... validate ... commands exit 0.
- /tmp/recsys_eval_report.md exists and is non-empty.
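To see where the sample values come from, the two metrics can be worked out by hand for this walkthrough: three items were returned, and the single click was on item_3. The sketch below is not recsys-eval's code. In particular, it assumes precision is divided by the number of items actually returned (at most k), which is one definition that reproduces 0.333333. Dividing by k itself would give 0.2, so check the recsys-eval documentation for the definition it actually uses.

```python
def hitrate_at_k(ranked, relevant, k):
    """1.0 if any of the top-k ranked items is relevant, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def precision_at_k(ranked, relevant, k):
    """Share of the returned top-k items that are relevant.

    Assumption: the denominator is the number of items actually returned
    (at most k), not k itself.
    """
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for item in top if item in relevant) / len(top)


ranked = ["item_3", "item_1", "item_2"]  # order served in this tutorial
clicked = {"item_3"}                     # the single outcome logged above
print(f"hitrate@5   = {hitrate_at_k(ranked, clicked, 5):.6f}")    # 1.000000
print(f"precision@5 = {precision_at_k(ranked, clicked, 5):.6f}")  # 0.333333
```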
This step proves recsys-pipelines can produce artifacts and a manifest from events.
Ensure MinIO is up:
curl -fsS http://localhost:9000/minio/health/ready >/dev/null
Build + run one day from the tiny pipelines dataset:
(cd recsys-pipelines && make build)
(cd recsys-pipelines && ./bin/recsys-pipelines run \
--config configs/env/local.json \
--tenant demo \
--surface home \
--start 2026-01-01 \
--end 2026-01-01)
Verify the local manifest exists:
cat recsys-pipelines/.out/registry/current/demo/home/manifest.json
Verify artifacts exist in MinIO (paths are under the recsys/ prefix by default):
docker compose run --rm --entrypoint sh minio-init -c \
'mc alias set local http://minio:9000 minioadmin minioadmin >/dev/null && \
mc ls local/recsys-artifacts/recsys/demo/home/ | head'
- Service is healthy:
curl -fsS http://localhost:8000/readyz >/dev/null
- Tenant exists:
docker exec -i recsys-db psql -U recsys-db -d recsys-db -c "select external_id from tenants;"
- Config and rules exist:
curl -fsS http://localhost:8000/v1/admin/tenants/demo/config \
-H 'X-Dev-User-Id: dev-user-1' -H 'X-Dev-Org-Id: demo' -H 'X-Org-Id: demo'
curl -fsS http://localhost:8000/v1/admin/tenants/demo/rules \
-H 'X-Dev-User-Id: dev-user-1' -H 'X-Dev-Org-Id: demo' -H 'X-Org-Id: demo'
- Exposure log exists:
test -s /tmp/exposures.jsonl
- Eval report exists:
test -s /tmp/recsys_eval_report.md
- 401/403 from admin or recommend endpoints
  - Check that you set each AUTH_*_ROLE= to empty in api/.env and recreated the api container.
  - Ensure you send both X-Dev-Org-Id and X-Org-Id headers.
- Empty recommendations
  - Check that item_popularity_daily has rows for namespace='home' and for day=current_date.
- Pipelines cannot connect to MinIO
  - Ensure curl -fsS http://localhost:9000/minio/health/ready succeeds.
- Service not ready: Runbook: Service not ready
- Empty recs: Runbook: Empty recs
- Database migration issues: Runbook: Database migration issues
- Production-like suite tutorial: production-like run (pipelines → object store → ship/rollback)
- Integrate the serving API into your app: How-to: integrate recsys-service into an application
- Operate pipelines: How-to: operate recsys-pipelines