Multi-cloud FinOps control plane for real cloud cost telemetry, deterministic optimization math, and OCI GenAI-assisted advisory workflows.
Current release: 0.9.3, covering advisor grounding, authenticated Advisor Conversation telemetry, decision-grade recommendation timeout alignment, exact provider console deep links, backend namespace cleanup, dashboard wiring, advisor polish, live rightsizing scan fix, realized savings scorecards, FinOps Control Tower, Inventory Explorer, 5-minute API response cache, UIX review, Terraform + Ansible OCI deployment wiring, and repository hygiene.
Current documentation baseline: May 11, 2026.
A quick, honest note: OptiOra is a personal project that I am actively building and refining. It is not a commercial product today, and some live-provider workflows still depend on cloud account details, permissions, billing exports, utilization telemetry, and recommendation APIs that are not available in every test environment. If you see potential here and are interested in running a pilot, validating it with real cloud data, or exploring how this could become a commercial product, please get in touch.
OptiOra helps FinOps, platform, and cloud operations teams turn cloud billing and resource telemetry into explainable actions:
- Real data only: provider APIs, persisted live scan snapshots, or customer-imported CSV billing data.
- Deterministic first: forecasts, savings, rightsizing, anomaly, and efficiency math stay authoritative.
- GenAI as an overlay: OCI Generative AI explains, prioritizes, and summarizes; it does not invent cost numbers.
- RAG-backed advisor context: Cost Advisor and backend GenAI endpoints retrieve curated FinOps guidance before composing prompts and narratives.
- Operator-ready workflows: scans, approvals, exports, alerts, routing policies, scorecards, and weekly operating review packs.
- OCI production path: repeatable deployment with Terraform infrastructure, Ansible runtime provisioning, systemd services, and smoke verification.
- Tenancy-level OCI inventory: live OCI resource scans discover the tenancy home region, walk tenancy subtree compartments, and scan subscribed regions rather than stopping at the deployment compartment.
- Architecture
- Core Capabilities
- Data Policy
- Repository Layout
- Local Development
- OCI Deployment
- Validation
- Cost Planning
- Documentation Map
- License
- Contact / Commercial Interest
Users / Operators
|
| browser
v
+-------------------------------------------------------------+
| Next.js Dashboard |
| - FinOps cockpit, Cost Advisor, inventory, Kubernetes |
| - Forecasting, scorecards, recommendations, operations |
| - Refresh controls can bypass backend response cache |
| - /api/ai/chat route for OCI GenAI advisor conversations |
+----------------------------+--------------------------------+
|
| REST /api/v1/*
v
+-------------------------------------------------------------+
| FastAPI Backend |
| - Costs, scans, imports, forecasts, anomalies, exports |
| - Rightsizing, recommendations, decision intelligence |
| - Hybrid advisor contract: deterministic data + narrative |
| - 5-minute JSON response cache + active-entry warmer |
+-------------+----------------------+------------------------+
| |
| SQLAlchemy | provider / GenAI/RAG APIs
v v
+-------------------------+ +--------------------------------+
| SQLite or PostgreSQL | | AWS, Azure, GCP, OCI |
| - snapshots | | cost, usage, recommendations |
| - imports | | OCI Generative AI Inference |
| - alerts/audit/exports | | default region: uk-london-1 |
+-------------------------+ +--------------------------------+
Deployment flow:
Local workspace
|
+--> Terraform
| +--> OCI network, compute, data volume, archive bucket, scheduler
|
+--> deploy/deploy-oci.sh
+--> read Terraform instance outputs
+--> source archive upload
+--> Ansible provisioning
+--> dashboard build + backend venv
+--> systemd restart
+--> smoke verification
For the deeper system topology, API surface, and data pipelines, see ARCHITECTURE.md.
- Production default: customers upload cloud credentials from the dashboard (/dashboard/settings), scoped per organization.
- Runtime behavior: live cost collection and provider diagnostics now resolve credentials from the organization runtime store first.
- Test-only VM mode: optional Ansible seed (optiora_test_seed_cloud_credentials=true) can inject AWS/Azure credentials into .env for temporary internal validation.
- Security guidance: keep test seed disabled in production and rely on dashboard onboarding.
| Area | What OptiOra Provides |
|---|---|
| Cost visibility | Billing & Allocation spend views, account hierarchy, service hotspots, Inventory Explorer, imported billing files |
| Forecasting | Baseline forecasts, percentile bands, budget risk, what-if scenarios, stress tests, model diagnostics |
| Optimization | Optimization Advisor with provider-native Cloud Advisor-style findings, stored/live scan modes, storage cleanup, rightsizing, recommendation ledger, commitment gaps, waste decomposition, savings sequencing |
| Unit economics | Cost allocation, business mapping, normalized dimensions, realized savings scorecards, showback/chargeback views |
| Kubernetes | Live OKE/Container Instance/OCIR inventory, OpenCost sync, cluster modeling, namespace/team/workload/node-pool allocation, optimization recommendations |
| Operations | Scan history, scan diffs, alert lifecycle, routing policy simulation, evidence exports, freshness telemetry |
| Intelligence | Cost Advisor, AI Insights, RAG-guided narratives, operating review packs, decision intelligence frontier |
| Governance | Virtual tags, tag quality, audit logs, data-source banners, export jobs, retention controls |
| Control tower | Unified FinOps Control Tower view for forecast risk, waste, commitment, governance, decision frontier, RAG evidence, and GenAI advisory prompts |
| Performance | Process-local API response cache for dashboard JSON GETs, refreshed every 5 minutes and bypassed by user Refresh actions |
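
As a rough illustration of the Performance row above, the sketch below shows one way a process-local cache with a 5-minute TTL and a refresh bypass could behave. The class name `ResponseCache`, the `max_entries` bound, and the `refresh` flag are assumptions made for this example, not OptiOra's actual implementation, and the active-entry warmer is not modeled.

```python
import time
from typing import Any, Callable, Dict, Tuple

CACHE_TTL_SECONDS = 300  # 5 minutes, matching the documented cache window
MAX_ENTRIES = 256        # assumed bound; the real limit is not documented here


class ResponseCache:
    """Illustrative process-local TTL cache for JSON GET responses."""

    def __init__(
        self,
        ttl: float = CACHE_TTL_SECONDS,
        max_entries: int = MAX_ENTRIES,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl
        self._max_entries = max_entries
        self._clock = clock
        # key -> (time stored, cached value)
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any], refresh: bool = False) -> Any:
        """Return a fresh cached value, or call loader() and store its result.

        refresh=True models a user Refresh action: the cached entry is ignored
        and replaced by a new backend read.
        """
        now = self._clock()
        cached = self._entries.get(key)
        if not refresh and cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        value = loader()
        if key not in self._entries and len(self._entries) >= self._max_entries:
            # Keep the cache bounded by evicting the oldest stored entry.
            oldest = min(self._entries, key=lambda k: self._entries[k][0])
            del self._entries[oldest]
        self._entries[key] = (now, value)
        return value
```

A caller would key entries by request path and query string, and pass `refresh=True` when the request comes from a Refresh button.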
Recent UIX and wiring updates:
- Cost Advisor chat now uses the server-side dashboard /api/ai/chat route to call OCI GenAI with signed requests and enriches answers with backend RAG guidance from /api/v1/genai/rag-guidance.
- Advisor Conversation now forwards the browser session into server-side backend telemetry calls, so authenticated cost, rightsizing, inventory, and RAG context can be used by the chat path.
- Decision-grade recommendation fetches now use the same longer live-data timeout as the surrounding Advanced FinOps page, avoiding early dashboard-side aborts while the backend gathers cost context.
- Advisor Conversation answers are intentionally English-only for now. Rightsizing, lifecycle, VM-cost, and RAG resource prompts are grounded in real AWS, Azure, GCP, and OCI resource candidates from provider-backed rightsizing, utilization, inventory, and resource-intelligence feeds. Tenancy, account, segment, service snapshot, imported aggregate, and non-VM storage rows are excluded from actionable VM answers.
- Backend GenAI narratives now inject retrieved RAG briefs into the OCI GenAI prompt path while preserving deterministic cost, savings, risk, and forecast numbers as the source of truth.
- OCI deployment is wired end to end through Terraform-managed infrastructure and Ansible-managed runtime provisioning, with the deploy script reading Terraform outputs for inventory, upload, provisioning, and smoke checks.
- Rightsizing is now surfaced as Optimization Advisor, with provider-native Cloud Advisor-style rows for recommendation type, count, service, category, estimated savings, importance, status, scope, and provider-console action.
- Rightsizing live provider scans now use a longer dashboard timeout for provider-native calls observed at about 50s in OCI, while still falling back to stored results if the live path fails.
- Optimization Advisor highlights storage cleanup opportunities such as unattached OCI block and boot volumes, while retaining per-resource execution details for implementation and rollback.
- Rightsizing recommendations now populate a finance-ready recommendation ledger with planned savings, realized savings, and variance, exposed through JSON, CSV, and the finance workbook.
- Scorecards now include realized savings scorecards by provider, owner, business unit, and realized month, backed by the recommendation ledger.
- Kubernetes now merges billing data with live OCI OKE, Container Instance, and OCIR inventory so newly launched container services appear before cost-management data catches up.
- The former Cloud Resources page has been repositioned as Inventory Explorer: it prefers real OCI tenancy-level provider resource/action rows, labels coverage explicitly, and falls back to account or imported cost scopes only when resource-level inventory is unavailable.
- Dashboard JSON GETs now use a bounded backend response cache so normal navigation is fast; active entries are warmed every 5 minutes and Refresh buttons request a fresh backend read.
- FinOps Control Tower now consolidates forecast risk, waste, commitment, governance, and decision-frontier signals into one control tower instead of forcing operators to stitch disconnected analytics together manually.
- Dashboard navigation is job-based: primary routes stay focused on Workspace, Intelligence, Optimize, and Operate, while specialist routes such as Saved Views, AI Insights, Action Ledger, Kubernetes, Virtual Tags, Scorecards, and Admin Diagnostics remain searchable and grouped under "More workflows".
- Billing & Allocation now owns finance spend, chargeback, mapping, and export workflows, removing confusing overlap with resource investigation.
- The legacy Kubernetes namespace route wiring was removed; /dashboard/kubernetes is the only Kubernetes/container/Docker page.
- Cost Advisor now separates deterministic decision snapshots, quick wins, provider evidence, and conversation starters into focused sections.
- Provider diagnostics now expose a per-cloud API capability envelope for AWS, Azure, GCP, and OCI: scope model, primary APIs, optimization APIs, telemetry APIs, safe page size, bounded parallelism, timeout, retryable statuses, and throttling signals. Backend provider recommendation collection uses those envelopes to avoid unbounded cross-cloud API fan-out.
- Billing and cost context now uses an explicit provider-data order across the platform: live cloud billing APIs first, latest provider-derived scan snapshots second, and optional CSV imports only as the manual/backfill source. Provider summaries include the concrete API source used, such as AWS Cost Explorer, Azure Cost Management, GCP BigQuery Cloud Billing export, or OCI Usage API.
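
To make the capability-envelope idea above more concrete, here is a minimal sketch of how bounded parallelism and a per-call timeout might be applied when collecting provider recommendations. The `CapabilityEnvelope` fields, the default retryable statuses, and the `collect_bounded` helper are hypothetical names and values chosen for illustration; the real envelope schema is not shown in this README.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class CapabilityEnvelope:
    """Hypothetical subset of a per-cloud API capability envelope."""

    provider: str
    safe_page_size: int
    max_parallel: int
    timeout_seconds: float
    # Assumed defaults; the actual retryable statuses may differ per provider.
    retryable_statuses: Tuple[int, ...] = (429, 500, 502, 503, 504)


async def collect_bounded(
    envelope: CapabilityEnvelope,
    calls: Sequence[Callable[[], Awaitable[Any]]],
) -> List[Any]:
    """Run provider calls with at most envelope.max_parallel in flight."""
    semaphore = asyncio.Semaphore(envelope.max_parallel)

    async def run_one(call: Callable[[], Awaitable[Any]]) -> Any:
        async with semaphore:
            # Each call is cut off after the envelope timeout.
            return await asyncio.wait_for(call(), timeout=envelope.timeout_seconds)

    # gather preserves the order of the input calls in its result list.
    return await asyncio.gather(*(run_one(call) for call in calls))
```

The semaphore is what prevents unbounded fan-out: however many calls are queued, only `max_parallel` of them talk to the provider at once.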
OptiOra must not fabricate production cost data.
Allowed runtime sources:
- Cloud provider billing APIs from AWS, Azure, GCP, and OCI.
- Persisted scan snapshots derived from those provider APIs.
- Customer-provided CSV imports for manual backfill or reconciliation.
Cost-source priority:
- Live provider billing API.
- Latest persisted live scan snapshot.
- Optional customer CSV import.
Disallowed runtime sources:
- hardcoded demo datasets
- synthetic cost records
- mock production payloads
- generated recommendations used to hide missing telemetry
If no real source exists, the application surfaces an explicit empty or unavailable state. The full policy is tracked in DATA_POLICY.md.
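
The cost-source priority above can be sketched as an ordered chain of loaders, where the first source that returns real rows wins and an empty chain produces an explicit "unavailable" result rather than fabricated data. The names `CostResult` and `resolve_cost_source`, and the source labels in the usage note, are assumptions for this example and not the backend's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class CostResult:
    """Rows plus the name of the source that produced them."""

    source: str
    rows: List[dict]


# A loader returns rows, or None / an empty list when its source has no data.
SourceLoader = Callable[[], Optional[List[dict]]]


def resolve_cost_source(
    loaders: Sequence[Tuple[str, SourceLoader]],
) -> Optional[CostResult]:
    """Try each source in priority order and return the first with real rows.

    Returning None signals that the caller should render an explicit empty
    or unavailable state; nothing is synthesized to fill the gap.
    """
    for name, load in loaders:
        rows = load()
        if rows:
            return CostResult(source=name, rows=rows)
    return None
```

In this sketch the chain would be passed in the documented order, for example `[("live_billing_api", ...), ("scan_snapshot", ...), ("csv_import", ...)]`, so the returned `source` also tells the dashboard which data-source banner to show.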
.
|-- optiora_backend/ FastAPI backend, analytics, auth, provider integrations
|-- dashboard/ Next.js dashboard and e2e tests
|-- deploy/deploy-oci.sh OCI deployment and operations entrypoint
|-- terraform/ OCI network, compute, data volume, archive bucket, scheduler resources
|-- ansible/ Oracle Linux host provisioning and service templates
|-- tests/ Backend, smoke, live-data, and deployment checks
|-- scripts/ Local bootstrap, cleanup, evidence, and wiring helpers
|-- ARCHITECTURE.md Runtime topology and pipeline diagrams
|-- DEPLOYMENT.md End-to-end OCI deployment guide
|-- COST_ESTIMATE.md Monthly planning ranges and cost drivers
|-- RELEASE_NOTES.md Release history and validation notes
Local commands are only for build, test, and optional CSV/import checks. Supported application runtime and deployment are OCI VM-only.
Requirements:
- Python 3.10 through 3.13
- Node.js 20.9.0 or newer
- npm
- Optional: OCI CLI, Terraform, and Ansible for infrastructure/deployment work
Bootstrap:
./setup.sh
Optional developer-only backend check, not a deployment path:
source .venv/bin/activate
optiora
Optional developer-only dashboard check, not a deployment path:
cd dashboard
npm install
npm run dev
Optional developer-only URLs:
| Service | URL |
|---|---|
| Dashboard | http://localhost:3000/dashboard |
| Backend health | http://localhost:8000/health |
| Backend docs | http://localhost:8000/docs |
For the easiest local quick start, copy .env.example to .env and leave cloud credentials blank. The example sets REQUIRE_LIVE_PROVIDER_DATA=false, allowing CSV/import workflows without pretending placeholder credentials are real.
Production/runtime policy: OCI VM-only.
The Ansible-rendered environment sets:
DEPLOYMENT_TARGET=oci
OCI_RUNTIME_REQUIRED=true
Both API and dashboard systemd units perform an OCI instance metadata preflight before starting. The operator workstation may run Terraform, Ansible, tests, and packaging commands, but the application services are deployed and run on the OCI VM.
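
For readers curious what such a preflight might look like, the following is a minimal sketch that assumes the check simply confirms the OCI instance metadata service answers with an instance OCID. The function names and the decision logic are assumptions made for illustration; the actual systemd preflight in this repository may work differently. The metadata URL and `Authorization: Bearer Oracle` header follow the OCI instance metadata service v2 convention.

```python
import json
import os
import urllib.request
from typing import Callable, Mapping

# OCI instance metadata service (v2) endpoint and its required header.
IMDS_URL = "http://169.254.169.254/opc/v2/instance/"
IMDS_HEADERS = {"Authorization": "Bearer Oracle"}


def fetch_instance_metadata(timeout: float = 2.0) -> dict:
    """Read instance metadata; only reachable from inside an OCI instance."""
    request = urllib.request.Request(IMDS_URL, headers=IMDS_HEADERS)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def preflight(
    env: Mapping[str, str] = os.environ,
    fetch: Callable[[], dict] = fetch_instance_metadata,
) -> bool:
    """Return True when the service may start under the (assumed) policy."""
    if env.get("OCI_RUNTIME_REQUIRED", "false").lower() != "true":
        return True  # runtime check not enforced, e.g. local build/test use
    if env.get("DEPLOYMENT_TARGET") != "oci":
        return False
    try:
        metadata = fetch()
    except OSError:
        # URLError and socket timeouts are OSError subclasses: not on OCI.
        return False
    return bool(metadata.get("id"))
```

Injecting `fetch` keeps the check testable on a workstation, where the metadata endpoint is not reachable.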
Recommended guided deployment:
./deploy/deploy-oci.sh menu
Recommended direct end-to-end deployment:
./deploy/deploy-oci.sh full
Common direct operations:
./deploy/deploy-oci.sh status
./deploy/deploy-oci.sh verify
./deploy/deploy-oci.sh logs
./deploy/deploy-oci.sh restart
Recommended release order:
terraform init/validate/plan
|
v
terraform apply
|
v
Ansible provisioning from deploy-oci.sh full
|
v
deploy-oci.sh verify
|
v
scripts/generate_evidence_pack.sh
Primary OCI region for hosting and GenAI inference: uk-london-1.
For prerequisites, environment variables, networking, Terraform/Ansible details, and troubleshooting, see DEPLOYMENT.md.
High-signal local gates:
python3 -m py_compile $(find ./optiora_backend -name '*.py')
.venv/bin/python -m unittest discover -s tests -p 'test_*.py'
cd dashboard
npm run build
npm run type-check
npm run lint
npm run test:e2e
Infrastructure gates:
terraform fmt -check terraform/*.tf
terraform -chdir=terraform init -backend=false
terraform -chdir=terraform validate
ansible-playbook --syntax-check -i ansible/inventory.example.yml ansible/playbooks/site.yml
The scoped Terraform format command intentionally ignores local terraform/terraform.tfvars and state files.
Wiring and cleanup:
./scripts/check-animated-svg-routes.sh
./scripts/cleanup-workspace.sh
Important Next.js note: run npm run build before standalone npm run type-check after cleanup so .next/types exists before TypeScript reads generated route types.
Current verified baseline is recorded in RELEASE_NOTES.md, TESTING.md, and E2E_WALKTHROUGH_NOTES.md.
Latest deployed OCI verification snapshot:
deploy/deploy-oci.sh verify
48 passed, 0 failed, 3 skipped
Health
http://140.238.90.95/health
{"status":"healthy","version":"0.9.3"}
Operator dashboard walkthrough
all 20 main screens passed route, heading, active-nav, and canonical Kubernetes checks
Live OCI browser walkthrough
npx playwright test e2e/live-operator-walkthrough.spec.ts --config playwright.live.config.ts
2 passed against http://140.238.90.95
Rightsizing live refresh
provider=oci, refresh_live=true
returned about 730 OCI recommendations in roughly 50 seconds
Advisor Conversation live smoke
POST /api/ai/chat on the OCI VM
German prior history + over-provisioning prompt still returned English
tenancy/account/service aggregates were excluded from actionable OCI VM answers
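
As a small companion to the health line in the snapshot above, this sketch shows how a smoke step could assert on the `/health` payload shape. The helper name `check_health` is an assumption for illustration and is not part of `deploy-oci.sh`; only the `status` and `version` fields shown in the snapshot are relied on.

```python
import json


def check_health(body: str, expected_version: str) -> bool:
    """Return True when the health payload reports healthy at the expected version."""
    payload = json.loads(body)
    return (
        payload.get("status") == "healthy"
        and payload.get("version") == expected_version
    )
```

A smoke script would fetch `/health` from the deployed host and pass the response body together with the release version it expects to see.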
Planning ranges are tracked in COST_ESTIMATE.md. Current profile guidance:
| Profile | Runtime Guidance | Estimated Monthly Range |
|---|---|---|
| Current OCI VM | Live VM.Standard.E4.Flex, 2 OCPU / 8 GiB, extra data volume disabled | $60-$120 infra baseline before GenAI/data add-ons |
| Small | 1 OCPU / 4 GiB, SQLite on VM, light telemetry | $85-$240 |
| Default | 2 OCPU / 8 GiB, PostgreSQL, medium telemetry and GenAI | $240-$620 |
| High Throughput | 4 OCPU / 16 GiB, PostgreSQL, heavier telemetry and GenAI | $675-$2120+ |
Treat these as planning bands. The current VM shape-only basis is about
$46/month; the broader baseline includes boot volume, logs, exports, normal
operations, and buffer. Verify region-specific list prices in the OCI cost
estimator before purchase.
| Document | Purpose |
|---|---|
| ARCHITECTURE.md | Runtime topology, APIs, analytics pipelines, ASCII diagrams |
| DEPLOYMENT.md | OCI deployment, operations, networking, troubleshooting |
| RELEASE_NOTES.md | Release history, fixes, validation commands |
| TESTING.md | Test strategy and coverage map |
| E2E_WALKTHROUGH_NOTES.md | Human operator walkthrough notes, process outcomes, and live OCI verification snapshot |
| UIX_REVIEW.md | Page-by-page UIX review, applied shell improvements, and UX backlog |
| ROADMAP.md | Product direction and capability gaps |
| NEXT_PHASE.md | Near-term implementation plan |
| COST_ESTIMATE.md | Monthly cost planning and cost drivers |
| DATA_POLICY.md | Real-data-only source policy and guardrails |
| terraform/README.md | OCI Terraform baseline |
| ansible/README.md | Oracle Linux provisioning and runtime configuration |
- Default deployed dashboard posture is public workspace mode unless auth/RBAC is intentionally enabled.
- Provider credentials are validated before use and stored on the API host for runtime scans.
- Saving valid credentials immediately starts a live provider fetch for every configured provider in that workspace.
- Direct app ports can be closed behind the nginx front door.
- Smoke verification auto-detects direct-port versus front-door exposure.
- Workspace cleanup preserves dependency/runtime state while removing generated build, cache, report, and scratch artifacts.
- OCI deploy archives exclude local secrets, Terraform state/tfvars, databases, build outputs, node_modules, reports, and scratch folders; Ansible also removes stale generated deploy artifacts before unpacking new source.
Apache-2.0. See LICENSE.
OptiOra is currently a personal project by Leandro Michelino. If you are interested in a pilot, want to validate it with real cloud data, or would like to discuss making it a commercial product, get in touch:
Leandro Michelino - ACE - chord.sighted0m@icloud.com