A production-style, end-to-end churn analytics project:
- Frontend: Dark-themed Shiny dashboard with filters, KPIs, insights, and a live prediction form.
- Backend: plumber REST API serving a tidymodels gradient boosting model.
- Data: PostgreSQL with reproducible load/feature engineering scripts.
- Reports: Confusion matrix, ROC curve, and a markdown report for the trained model.
PostgreSQL ← data/load-data.R
↑
│ ┌──────────────────────────────┐
│ │ Shiny Frontend │
│ │ KPIs, filters, insights, │
│ │ predictions (dark theme) │
│ └──────────────┬───────────────┘
│ │ HTTP (JSON)
│ /predict (plumber)
│ │
└────────── backend/api/train-model.R ──► model.rds
- Modeling: tidymodels workflow with xgboost tuning, F1-optimized threshold, saved as model.rds.
- API: plumber exposes /health and /predict, loading model.rds.
- UI: Shiny dashboard (dark, compact), filterable, with insights plots and a guided prediction form.
- Interactive dashboard: KPIs, churn mix, tenure/charges visuals, contract vs churn, “quick insights” chips, and a downloadable filtered CSV.
- Live predictions: Enter customer attributes → get class & probability from the API.
- Reproducible training: Train script generates reports (reports/) and the serialized model.
- Database-backed: Load Telco churn CSV into Postgres and query from the app.
- R ≥ 4.3 with packages listed in frontend/R/requirements.txt and backend/api/requirements.txt (if you maintain one there).
- PostgreSQL ≥ 13
- Optional: Docker / docker-compose (a docker-compose.yml is included, but a local run works fine).
R style in this repo uses = for assignment.
backend/
  api/
    main.R              # plumber API (serves /health, /predict)
    train-model.R       # tidymodels + xgboost training
    model.rds           # trained model artifact
  reports/              # confusion matrix, ROC, report.md
data/
  load-data.R           # create schema & load Telco CSV into Postgres
  Telco-Customer-Churn.csv
frontend/
  app.R                 # Shiny app entry
  R/
    about.R             # About tab (cards)
    insights.R          # Correlation / feature-importance
    metrics.R           # KPI value boxes
    plots.R             # Plot helpers + dark theming
    prediction.R        # Prediction form + API call
    retrain.R           # (optional) hooks to retrain
    sidebar.R           # Filters module (selectize + slider)
    utils.R             # helpers
postgres/
  create_schema.sql     # optional schema helper
media/
  demo.gif
Create a local .Renviron (or export in shell) with your DB settings:
export DB_USER="user"
export DB_PASSWORD="password"
export DB_HOST="localhost" # or 'postgres' if running under docker-compose
export DB_PORT="5432"
export DB_NAME="churn_db"
In RStudio, you can also use Tools → Global Options → Environment or put these in ~/.Renviron (there, write plain NAME=value lines without the export keyword).
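A minimal sketch of how an R script could read these settings, assuming the variable names above. The helper name db_config() is hypothetical, and the DBI/RPostgres usage in the trailing comment is an assumption about the driver, not necessarily what data/load-data.R does.

```r
# Hypothetical helper: collect the DB_* environment variables into a named
# list of connection arguments. Stops early if any variable is unset or empty.
db_config = function() {
  keys = c(user = "DB_USER", password = "DB_PASSWORD", host = "DB_HOST",
           port = "DB_PORT", dbname = "DB_NAME")
  vals = Sys.getenv(keys, unset = NA)
  unset_keys = keys[is.na(vals) | vals == ""]
  if (length(unset_keys) > 0) {
    stop("Missing environment variables: ", paste(unset_keys, collapse = ", "))
  }
  cfg = as.list(stats::setNames(vals, names(keys)))
  cfg$port = as.integer(cfg$port)
  cfg
}

# Possible usage (assumes the DBI and RPostgres packages are installed):
# con = do.call(DBI::dbConnect, c(list(RPostgres::Postgres()), db_config()))
```

Failing fast on a missing variable tends to give a clearer message than a connection error from the driver.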
# from repo root or the data/ folder
source("data/load-data.R")
# This reads Telco-Customer-Churn.csv and populates the 'customers' table.
# from backend/api/
setwd("backend/api")
source("train-model.R")
# Outputs:
# - backend/api/model.rds
# - backend/reports/{confusion_matrix.png, roc_curve.png, classification_report.txt, model_report.md}
# from backend/api/
setwd("backend/api")
library(plumber)
pr = plumb("main.R")
pr$run(host = "0.0.0.0", port = 8000)
# Swagger UI: http://127.0.0.1:8000/__docs__/
Sanity test:
curl -X POST "http://127.0.0.1:8000/predict" \
-H "Content-Type: application/json" \
-d '{
"tenure": 15,
"monthly_charges": 70,
"contract": "Month-to-month",
"gender": "Female",
"senior_citizen": 0,
"partner": "No",
"dependents": "No",
"internet_service": "Fiber optic",
"paperless_billing": "Yes",
"payment_method": "Electronic check",
"services_list": ["Multiple Lines","Streaming TV"]
}'
# from frontend/
setwd("frontend")
shiny::runApp("app.R", launch.browser = TRUE)
Base: http://127.0.0.1:8000
/health
- 200 → { "status": "ok" }
/predict (POST)
Body (JSON):
{
"tenure": 12,
"monthly_charges": 70,
"contract": "Month-to-month",
"gender": "Female",
"senior_citizen": 0,
"partner": "Yes",
"dependents": "No",
"internet_service": "DSL",
"paperless_billing": "Yes",
"payment_method": "Electronic check",
"services_list": ["Streaming TV", "Online Security"]
}
Response (JSON):
{
"prediction": "No",
"probability": 0.1673
}
The API auto-derives engineered fields (e.g., total/avg charges, tenure buckets, service flags) to match the training pipeline.
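To illustrate what request-time derivation can look like, here is a rough sketch in base R. It is not the code in main.R: the function name derive_features(), the output column names, the tenure cut points, and the approximation of total charges as tenure × monthly charges are all assumptions made for the example.

```r
# Hypothetical sketch of request-time feature derivation.
# Column names, bucket cut points, and the total-charges approximation
# are assumptions, not the repo's actual pipeline.
derive_features = function(req) {
  tenure = as.numeric(req$tenure)
  monthly = as.numeric(req$monthly_charges)
  services = unlist(req$services_list)
  if (is.null(services)) services = character(0)

  data.frame(
    tenure = tenure,
    monthly_charges = monthly,
    # Approximate lifetime charges from the two numeric inputs
    total_charges = tenure * monthly,
    # Right-closed intervals: 12 falls in "0-12", 13 in "13-24"
    tenure_bucket = as.character(cut(
      tenure,
      breaks = c(-Inf, 12, 24, 48, Inf),
      labels = c("0-12", "13-24", "25-48", "49+")
    )),
    n_services = length(services),
    # One 0/1 flag per service of interest
    has_streaming_tv = as.integer("Streaming TV" %in% services),
    has_online_security = as.integer("Online Security" %in% services),
    stringsAsFactors = FALSE
  )
}

features = derive_features(list(
  tenure = 12,
  monthly_charges = 70,
  services_list = list("Streaming TV", "Online Security")
))
```

Whatever the real derivation is, it needs to produce the same columns and levels the training recipe saw; otherwise predict() on the saved workflow will fail or silently mis-score.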
- Filters: Multi-select dropdowns accept empty selection = “All”. Tenure range slider updates the dataset reactively.
- Dark theme: Custom CSS fixes for selectize ensure selected chips remain visible.
- Model threshold: Training step chooses a probability threshold that maximizes F1 on the test split; the API uses 0.50 by default (you can expose the tuned threshold if desired).
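The threshold note can be made concrete with a small base-R sketch of an F1 grid search. This is an illustration under assumptions, not the code in train-model.R: the helper name best_f1_threshold(), the 0.05-step grid, and the toy data are all made up for the example.

```r
# Hypothetical helper: pick the probability cutoff that maximizes F1.
# truth: 0/1 vector (1 = churn); prob: predicted churn probabilities.
best_f1_threshold = function(truth, prob, grid = seq(0.05, 0.95, by = 0.05)) {
  f1_at = function(t) {
    pred = as.integer(prob >= t)
    tp = sum(pred == 1 & truth == 1)
    fp = sum(pred == 1 & truth == 0)
    fn = sum(pred == 0 & truth == 1)
    if (tp == 0) return(0)  # F1 is defined as 0 when nothing is caught
    2 * tp / (2 * tp + fp + fn)
  }
  scores = vapply(grid, f1_at, numeric(1))
  list(threshold = grid[which.max(scores)], f1 = max(scores))
}

# Toy held-out data (made up for illustration)
truth = c(0, 0, 0, 0, 1, 0, 1, 1)
prob = c(0.07, 0.12, 0.22, 0.31, 0.37, 0.42, 0.63, 0.81)
best = best_f1_threshold(truth, prob)

# Applying a cutoff at prediction time
classify = function(p, threshold = 0.5) ifelse(p >= threshold, "Yes", "No")
```

On this toy data the tuned cutoff lands below 0.50, which is the kind of gap that exposing the tuned threshold in the API would close.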
- “Could not resolve host: backend” in Shiny prediction: Use the full URL http://127.0.0.1:8000/predict in prediction_server(..., api_url = ...) when running locally. Use http://backend:8000/predict only inside docker-compose networks.
- Swagger “Invalid JSON body”: Click Try it out → paste a valid JSON object in the request body (don’t post empty).
- “Could not derive probability from model prediction”: Ensure the trained workflow is saved as model.rds and predictions support type="prob". This repo uses tidymodels (.pred_1) and includes a robust get_positive_proba() in main.R.
- Connection refused on port 5432: Make sure Postgres is running and credentials/host (localhost vs postgres) match how you launched the DB.
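For the probability error in particular, it can help to see what a defensive column lookup might look like. The sketch below is not the repo's get_positive_proba(); the helper name positive_prob() and the list of candidate column names other than .pred_1 are assumptions.

```r
# Hypothetical helper (not the repo's get_positive_proba()): locate the
# positive-class probability column in a tidymodels-style prediction frame,
# as returned by predict(..., type = "prob").
positive_prob = function(pred, candidates = c(".pred_1", ".pred_Yes", ".pred_TRUE")) {
  hit = intersect(candidates, names(pred))
  if (length(hit) == 0) {
    stop("Could not derive probability; columns seen: ",
         paste(names(pred), collapse = ", "))
  }
  # Use the first candidate that is present, in the order given above
  as.numeric(pred[[hit[1]]])
}

# Stand-in for predict(model, new_data, type = "prob") output
pred = data.frame(.pred_0 = 0.8327, .pred_1 = 0.1673, check.names = FALSE)
p = positive_prob(pred)
```

Listing the columns that were actually seen in the error message makes it easier to spot a model trained with different outcome levels.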
- Add segmented lift/ICE plots for top features.
- Expose tuned threshold from training into API response.
- Optional authentication for the API.
- Dockerized one-click stack (compose: db + API + Shiny).
PRs and issues are welcome. Keep styles consistent with the repo (e.g., R assignments with =). If you add a feature in the UI, please include a short GIF and update this README.
See LICENSE in the repository root.
- R, Shiny, bslib (darkly)
- plumber
- tidymodels (recipes, workflows, tune, yardstick), xgboost
- dplyr, ggplot2, plotly, DT
- PostgreSQL
