New Features added

Vish Devarajan · Vish Devarajan · commit 71b21930c10c · 2026-03-22T13:56:21.000+11:00
diff --git a/BENCHMARKS.md b/BENCHMARKS.md
@@ -0,0 +1,43 @@
+# Benchmarks and Regression Notes
+
+## Local Micro-benchmarks
+
+Baseline captured on March 22, 2026 from the local development environment with 500 iterations:
+
+- `guard_model_request()` average latency: `0.086 ms`
+- `OutputFirewall.inspect()` average latency: `0.026 ms`
+
+These numbers are for short text-only prompts and responses. Real latency will increase when you add:
+
+- retrieval grounding documents
+- custom prompt detectors
+- named-entity detection
+- larger multimodal message payloads
+
+## False-positive Rollout Guidance
+
+Recommended rollout order:
+
+1. Start with `preset="shadow_first"`
+2. Capture `report["telemetry"]` and `on_telemetry` output in structured logs
+3. Add route-level overrides for high-risk flows such as admin, billing, exports, and tool-calling
+4. Promote specific routes from shadow mode to blocking only after reviewing false-positive rates
+
+## Regression Expectations
+
+Current regression coverage includes:
+
+- prompt-injection overrides
+- system-prompt leakage attempts
+- token and secret leakage
+- Australian PII masking
+- route-policy suppression
+- custom prompt detectors
+- provider adapter wrappers
+- multimodal message-part masking
+
+Run the regression suite with:
+
+```bash
+python3 -m unittest discover -s tests
+```
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,18 @@
 # Changelog
 
+## 0.1.5
+
+- Added route-level operational telemetry summaries for easier rollout visibility
+- Added stronger rollout presets for RAG-safe and agent-tool workflows
+- Expanded enterprise-oriented rollout docs around provider coverage, observability, and control-plane usage
+
+## 0.1.4
+
+- Added richer multimodal message-part normalization and masking
+- Added provider adapters and stable wrapper guidance as first-class release docs
+- Added migration notes, benchmark notes, and rollout guidance for false-positive tuning
+- Expanded route-level and domain-level policy documentation for RAG and agent workflows
+
 ## 0.1.0
 
 - Initial public release
@@ -13,4 +26,3 @@
 - Canary tokens
 - Dashboard helpers
 - Red-team eval harness
-
diff --git a/MIGRATING.md b/MIGRATING.md
@@ -0,0 +1,33 @@
+# Migrating to 0.1.4
+
+## Stable Contracts
+
+The following APIs are intended to be the long-term integration surface for the 0.1.x line:
+
+- `guard_model_request()`
+- `review_model_response()`
+- `protect_model_call()`
+- `protect_with_adapter()`
+- `ToolPermissionFirewall`
+- `RetrievalSanitizer`
+
+These contracts are also exposed in `CORE_INTERFACES` so applications can log or assert the expected interface version.
+
+## What Changed in 0.1.4
+
+- Added richer multimodal/message-part handling for mixed text, image, and file content
+- Added provider adapters for OpenAI, Anthropic, Gemini, and OpenRouter
+- Added presets and route-level policy overrides
+- Added custom prompt detector hooks for domain tuning
+- Expanded rollout guidance, benchmarks, and regression notes
+
+## Migration Notes
+
+- If you previously passed message content as arrays of parts, 0.1.4 now preserves those parts in `content_parts` while still producing the text view in `content`.
+- If you were wrapping providers manually, prefer `protect_with_adapter()` plus the adapter factories in `blackwall_llm_shield.providers`.
+- If you want conservative rollout, switch to `preset="shadow_first"` before enabling hard blocking on every route.
+
+## Compatibility
+
+- Existing string-based `messages[].content` flows remain supported.
+- Existing `guard_model_request()` and `OutputFirewall` usage remain backward-compatible.
diff --git a/README.md b/README.md
@@ -14,6 +14,8 @@ Python security toolkit for AI applications and LLM-enabled services. Blackwall
 - Emits structured telemetry for prompt risk, masking volume, and output review outcomes
 - Includes first-class provider adapters for OpenAI, Anthropic, Gemini, and OpenRouter
 - Inspects outputs for leakage, unsafe code, grounding drift, and tone violations
+- Handles mixed text, image, and file message parts more gracefully in text-first multimodal flows
+- Adds operator-friendly telemetry summaries and stronger presets for RAG and agent-tool workflows
 - Ships drop-in FastAPI/Flask middleware and LangChain/LlamaIndex callback helpers
 - Enforces tool permissions and approval gates
 - Sanitizes retrieval documents for RAG pipelines
@@ -67,6 +69,10 @@ Use `shadow_mode` with `shadow_policy_packs` or `compare_policy_packs` to measur
 
 Use `create_openai_adapter()`, `create_anthropic_adapter()`, `create_gemini_adapter()`, or `create_openrouter_adapter()` with `protect_with_adapter()` when you want Blackwall to wrap the provider call end to end.
 
+### Observability and control-plane support
+
+Use `summarize_operational_telemetry()` with emitted telemetry events when you want route-level summaries, blocked-event counts, and rollout visibility for operators.
+
 ### Output grounding and tone review
 
 `OutputFirewall` can compare a response to retrieval documents and flag unsupported claims or unprofessional tone before the answer leaves your service.
@@ -101,6 +107,17 @@ Helps keep hostile or manipulative text in retrieved documents from becoming mod
 
 Pair it with `protect_model_call()` by passing sanitized documents into `firewall_options={"retrieval_documents": docs}` and gate any tool or admin action with `ToolPermissionFirewall`.
 
+### Contract Stability
+
+The 0.1.x line treats `guard_model_request()`, `protect_with_adapter()`, `review_model_response()`, `ToolPermissionFirewall`, and `RetrievalSanitizer` as the long-term integration contracts. The exported `CORE_INTERFACES` map can be logged or asserted by applications that want to pin expected behavior.
+
+Recommended presets:
+
+- `shadow_first` for low-friction rollout
+- `strict` for high-sensitivity routes
+- `rag_safe` for retrieval-heavy flows
+- `agent_tools` for tool-calling and approval-gated agent actions
+
 ## Example Workflow
 
 ```python
@@ -156,6 +173,44 @@ shield = BlackwallShield(
 )
 ```
 
+## Route and Domain Examples
+
+For RAG:
+
+```python
+shield = BlackwallShield(
+    preset="shadow_first",
+    route_policies=[
+        {
+            "route": "/api/rag/search",
+            "options": {
+                "policy_pack": "government",
+                "output_firewall_defaults": {
+                    "retrieval_documents": kb_docs,
+                },
+            },
+        },
+    ],
+)
+```
+
+For agent tool-calling:
+
+```python
+tool_firewall = ToolPermissionFirewall(
+    allowed_tools=["search", "lookup_customer", "create_refund"],
+    require_human_approval_for=["create_refund"],
+)
+```
+
+## Operational Telemetry Summaries
+
+```python
+summary = summarize_operational_telemetry(events)
+print(summary["by_route"])
+print(summary["highest_severity"])
+```
+
 ### `AuditTrail`
 
 Produces signed events you can summarize into operations dashboards or audit pipelines.
@@ -177,12 +232,18 @@ Produces signed events you can summarize into operations dashboards or audit pip
 - `make version-packages` explains the automated versioning flow for Python
 - merges to `main` trigger release automation that prepares version/release PRs and publishes to PyPI after merge
 
+## Migration and Benchmarks
+
+- See [MIGRATING.md](/Users/vishnu/Documents/blackwall-llm-shield/blackwall-llm-shield-python/MIGRATING.md) for compatibility notes and stable contract guidance
+- See [BENCHMARKS.md](/Users/vishnu/Documents/blackwall-llm-shield/blackwall-llm-shield-python/BENCHMARKS.md) for baseline latency numbers and regression coverage
+
 ## Rollout Notes
 
 - Start with `preset="shadow_first"` or `shadow_mode=True` and inspect `report["telemetry"]` plus `on_telemetry` events before enabling hard blocking.
 - Use `RetrievalSanitizer` and `ToolPermissionFirewall` in front of RAG, search, admin actions, and tool-calling flows.
 - Add regression prompts for instruction overrides, prompt leaks, token leaks, and Australian PII samples so upgrades stay safe.
 - Expect some latency increase from grounding checks, output review, and custom detectors; benchmark with your real prompt and response sizes before enforcing globally.
+- For agent workflows, keep approval-gated tools and route-specific presets separate from end-user chat routes so operators can see distinct risk patterns.
 
 ## New Modules
 
diff --git a/pyproject.toml b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "vpdeva-blackwall-llm-shield-python"
-version = "0.1.3"
+version = "0.1.5"
 description = "Open-source Python enterprise LLM protection toolkit for Python services"
 readme = "README.md"
 requires-python = ">=3.9"
diff --git a/src/blackwall_llm_shield/__init__.py b/src/blackwall_llm_shield/__init__.py
@@ -19,6 +19,7 @@
     CORE_INTERFACES,
     POLICY_PACKS,
     build_shield_options,
+    summarize_operational_telemetry,
     build_admin_dashboard_model,
     create_fastapi_guard,
     create_langchain_callbacks,
@@ -85,6 +86,7 @@
     "ProviderAdapter",
     "SHIELD_PRESETS",
     "build_shield_options",
+    "summarize_operational_telemetry",
     "build_admin_dashboard_model",
     "create_fastapi_guard",
     "create_langchain_callbacks",
diff --git a/src/blackwall_llm_shield/core.py b/src/blackwall_llm_shield/core.py
diff --git a/tests/test_core.py b/tests/test_core.py