AI-augmented data storytelling for CSV and Excel datasets.
DataStory takes an uploaded dataset, runs it through a backend analysis pipeline, and returns:
- dataset metadata and quality signals
- suggested visualizations
- audience-specific narrative stories
- an optional preprocessing report
- a rendered HTML report
The frontend provides the upload flow and live pipeline status. The backend performs file validation, preprocessing, analysis, chart generation, story generation, and report assembly.
- frontend/ - React + Vite UI
- backend/ - FastAPI service and agent pipeline
- WebSocket updates for live job progress
- Optional AI provider support through Groq, Gemini, or Ollama
- Upload CSV and Excel datasets
- Optional preprocessing before analysis
- Live progress tracking across the pipeline
- Dataset metadata summary
- Automated chart selection and rendering
- Narrative generation for executive, analyst, investor, or general audiences
- Cleaned dataset download when preprocessing is enabled
- Full HTML report output
Frontend:
- React 18
- Vite
- Tailwind CSS
- Axios
- Plotly
Backend:
- FastAPI
- Pydantic
- Pandas
- Plotly
- Jinja2
- Uvicorn
AI providers:
- Groq
- Gemini
- Ollama
- The user uploads a dataset in the frontend.
- The backend validates the file and creates a job.
- The pipeline can optionally preprocess the data.
- The backend analyzes structure, generates chart specs, renders charts, and creates narrative stories.
- The final report and results are stored in the job store and shown in the dashboard.
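The flow above can be pictured as an ordered list of stages that report progress as they finish. The following is a minimal sketch under stated assumptions, not the project's actual code: the stage names, `run_pipeline`, and the `on_progress` callback are all hypothetical, and each stage body is a placeholder.

```python
from typing import Callable, Dict, List

# ASSUMPTION: hypothetical stage names mirroring the steps listed above;
# the real backend may name or split its stages differently.
STAGES: List[str] = [
    "validate",
    "preprocess",
    "analyze",
    "chart_specs",
    "render_charts",
    "stories",
    "report",
]


def run_pipeline(
    dataset: Dict[str, object],
    preprocess: bool,
    on_progress: Callable[[str, float], None],
) -> Dict[str, object]:
    """Run each stage in order and report fractional progress after each one."""
    # Preprocessing is optional, so drop that stage when it is disabled.
    stages = [s for s in STAGES if preprocess or s != "preprocess"]
    completed: List[str] = []
    for index, stage in enumerate(stages, start=1):
        # Placeholder: a real stage would transform or analyze the data here.
        completed.append(stage)
        on_progress(stage, index / len(stages))
    return {"input": dataset, "completed": completed}
```

In a setup like this, `on_progress` is where a WebSocket broadcast or a job-store update would plug in, which keeps the stages themselves unaware of how progress is delivered.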
cd backend
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
uvicorn main:app --reload --host 0.0.0.0 --port 8000

cd frontend
npm install
npm run dev

If your backend is not running on the same origin as the frontend, set:
VITE_API_URL=http://localhost:8000

Backend .env:
AI_PROVIDER=groq
GROQ_API_KEY=your_key_here
GEMINI_API_KEY=your_key_here
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.2
CORS_ORIGINS=http://localhost:5173
MAX_FILE_SIZE_MB=200
UPLOAD_DIR=./uploads

The backend supports these providers:
- groq
- gemini
- ollama
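To illustrate how these environment settings could be read and checked at startup, here is a minimal sketch. It is not the backend's actual configuration code: `Settings`, `load_settings`, and `SUPPORTED_PROVIDERS` are hypothetical names, the defaults simply mirror the sample values above, and treating `CORS_ORIGINS` as a comma-separated list is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple

SUPPORTED_PROVIDERS = ("groq", "gemini", "ollama")


@dataclass(frozen=True)
class Settings:
    ai_provider: str
    ollama_base_url: str
    ollama_model: str
    cors_origins: Tuple[str, ...]
    max_file_size_mb: int
    upload_dir: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from an environment mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    provider = env.get("AI_PROVIDER", "groq").strip().lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"Unsupported AI_PROVIDER: {provider!r}")
    # ASSUMPTION: multiple origins are separated by commas.
    raw_origins = env.get("CORS_ORIGINS", "http://localhost:5173")
    origins = tuple(o.strip() for o in raw_origins.split(",") if o.strip())
    return Settings(
        ai_provider=provider,
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        ollama_model=env.get("OLLAMA_MODEL", "llama3.2"),
        cors_origins=origins,
        max_file_size_mb=int(env.get("MAX_FILE_SIZE_MB", "200")),
        upload_dir=env.get("UPLOAD_DIR", "./uploads"),
    )
```

Rejecting an unknown `AI_PROVIDER` up front means a typo surfaces at startup rather than in the middle of a job.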
- POST /api/upload - upload a dataset and start a job
- GET /api/jobs/{job_id}/status - poll job status
- GET /api/jobs/{job_id}/results - fetch completed results
- GET /api/jobs/{job_id}/report - open the rendered HTML report
- GET /api/jobs/{job_id}/cleaned - download the cleaned CSV
- GET /api/jobs/{job_id}/preprocess-report - fetch preprocessing details
- WS /api/ws/{job_id} - receive live progress updates
- GET /health - health check
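As a rough illustration of how a client might use the per-job endpoints, the sketch below builds the URLs for one job and polls until the job finishes. The helper names `job_urls` and `wait_for_job` are hypothetical, and the response shape (a JSON object with a `status` field equal to `completed` or `failed` when done) is an assumption, since the payload format is not documented here.

```python
import time
from typing import Callable, Dict


def job_urls(base_url: str, job_id: str) -> Dict[str, str]:
    """Build the per-job endpoint URLs listed above for one job id."""
    root = f"{base_url.rstrip('/')}/api/jobs/{job_id}"
    return {
        "status": f"{root}/status",
        "results": f"{root}/results",
        "report": f"{root}/report",
        "cleaned": f"{root}/cleaned",
        "preprocess_report": f"{root}/preprocess-report",
    }


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval_s: float = 1.0,
    max_polls: int = 60,
) -> dict:
    """Call fetch_status repeatedly until the job reports a terminal state."""
    for _ in range(max_polls):
        payload = fetch_status()
        # ASSUMPTION: terminal states are named "completed" and "failed".
        if payload.get("status") in ("completed", "failed"):
            return payload
        time.sleep(interval_s)
    raise TimeoutError(f"Job did not finish after {max_polls} polls")
```

Passing the fetch step in as a callable keeps the polling loop independent of any HTTP library. With the standard library, one option would be `lambda: json.load(urllib.request.urlopen(job_urls(base, job_id)["status"]))`.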
- Frontend: Vercel
- Backend: Docker or any FastAPI-compatible host
- Set VITE_API_URL in the frontend deployment to point at the backend API
datastory/
  frontend/
  backend/
  README.md
- Dataset uploads are limited to CSV and Excel formats.
- The backend is designed to be provider-agnostic, so you can switch AI providers through environment settings.
- Chart and story generation are handled in the backend pipeline, not in the browser.
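The format restriction in the first note, together with the MAX_FILE_SIZE_MB setting, suggests an upload check along the following lines. This is a sketch under assumptions rather than the backend's actual validator: `validate_upload` and `ALLOWED_EXTENSIONS` are hypothetical names, and the exact set of accepted Excel extensions is a guess.

```python
from pathlib import Path

# ASSUMPTION: "CSV and Excel" is taken to mean these extensions;
# the real backend may accept a different set.
ALLOWED_EXTENSIONS = {".csv", ".xlsx", ".xls"}


def validate_upload(filename: str, size_bytes: int, max_file_size_mb: int = 200) -> None:
    """Raise ValueError if the file type or size is not acceptable."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    if size_bytes > max_file_size_mb * 1024 * 1024:
        raise ValueError(f"File exceeds the {max_file_size_mb} MB limit")
```

An extension check is only a first filter; a renamed file can still fail when it is parsed, so the parsing step needs its own error handling.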