Skip to content

SankaiAI/crushdataai-agent

Repository files navigation

CrushData AI

Data Analyst Intelligence for AI IDEs

An AI skill that provides structured, professional data analysis workflows with built-in validation - helping AI coding assistants perform data analysis like a careful human analyst.

CrushData AI Landing Page

🎯 What It Does

CrushData AI provides:

  • 10 Analysis Workflows - EDA, Dashboard, A/B Test, Cohort, Funnel, Time Series, Segmentation, Data Cleaning, Ad-hoc, KPI Reporting
  • 400+ Searchable Patterns - Metrics, SQL, Python, Charts, Database Tips, Common Mistakes
  • Context-Building Protocol - Forces AI to ask questions and validate before delivering results
  • 4 Industry Modules - SaaS, E-commerce, Finance, Marketing specific metrics

πŸš€ Quick Start

Install via CLI

npm install -g crushdataai

What npm install -g crushdataai Does

The -g flag means Global Install:

Local Install (npm install) Global Install (npm install -g)
Location ./node_modules/ in current folder System-wide (e.g., %APPDATA%\npm\)
Scope Only available in that project Available everywhere on your computer
Use Case Libraries for your project CLI tools you want to run anywhere

Then in any project:

cd your-project
crushdataai init --ai all    # All AI IDEs
crushdataai init --ai claude # Claude Code only

What crushdataai init Does

When you run crushdataai init --ai all, the CLI:

  1. Creates .shared/data-analyst/ - Contains the BM25 search engine and 13 CSV knowledge databases (~400 rows of data analyst patterns)

  2. Creates AI IDE config files based on --ai flag:

    Flag Creates
    --ai claude .claude/skills/data-analyst/SKILL.md
    --ai cursor .cursor/commands/data-analyst.md
    --ai windsurf .windsurf/workflows/data-analyst.md
    --ai antigravity .agent/workflows/data-analyst.md
    --ai copilot .github/prompts/data-analyst.prompt.md
    --ai kiro .kiro/steering/data-analyst.md
    --ai all All of the above
  3. Your AI IDE automatically detects the config files and enables the /data-analyst command

Updating

To update the CLI and refresh your project's AI skill files:

npm install -g crushdataai@latest
# Update specific IDE (recommended):
crushdataai init --ai cursor --force

# Or update everything:
crushdataai init --force

πŸ”Œ Data Connections (New in v1.2)

CrushData AI now features a Connection Manager to securely handle your data credentials.

1. Add Data Sources

Run the connect command to open the management UI:

crushdataai connect
  • Supported Types: CSV, MySQL, PostgreSQL, Shopify, BigQuery, Snowflake
  • Private & Secure: Credentials are stored locally on your machine (~/.crushdataai/connections.json). They are never uploaded to any server or included in the npm package.

CrushData AI Connection Manager

Note

Persistence: Once you add a connection, you can close the UI (Ctrl+C). The AI IDE reads the saved connection details directly from your local config file, so the server does NOT need to keep running.

2. View Saved Connections

crushdataai connections

πŸ’» Usage

Step 1: Initialize

crushdataai init --ai all

Step 2: Use in AI IDE

The skill activates automatically (Claude) or via slash command (others).

Example Workflow:

  1. User Request: "Analyze the sales trends in my-shop-data"
  2. AI Action: The AI checks your saved connections.
  3. AI Action: The AI runs:
    npx crushdataai snippet my-shop-data --lang python
  4. Result: The AI receives the secure code to connect to your data (read-only) and proceeds with analysis.

Claude Code

Claude Code

The skill activates automatically when you request data analysis work. Just chat naturally:

Analyze customer churn for my SaaS product

Cursor / Windsurf / Antigravity

Use the slash command to invoke the skill:

/data-analyst Analyze customer churn for my SaaS product

Kiro

Type / in chat to see available commands, then select data-analyst:

/data-analyst Analyze customer churn for my SaaS product

GitHub Copilot

In VS Code with Copilot, type / in chat to see available prompts, then select data-analyst:

/data-analyst Analyze customer churn for my SaaS product

Example Prompts

Analyze customer churn for my SaaS product
Create a dashboard for e-commerce analytics
Calculate MRR and ARR from subscription data
Build a cohort retention analysis
Perform A/B test analysis on conversion rates

Search Directly

# Search workflows
python3 .shared/data-analyst/scripts/search.py "EDA" --domain workflow

# Search metrics
python3 .shared/data-analyst/scripts/search.py "churn" --domain metric

# Search SQL patterns
python3 .shared/data-analyst/scripts/search.py "cohort" --domain sql

# Industry-specific
python3 .shared/data-analyst/scripts/search.py "MRR" --industry saas

πŸ“Š Search Domains

Domain Content
workflow Step-by-step analysis processes
metric Metric definitions with formulas
chart Visualization recommendations
cleaning Data quality patterns
sql SQL patterns (window functions, cohorts)
python pandas/polars code snippets
database PostgreSQL, BigQuery, Snowflake tips
report Dashboard UX guidelines
validation Common mistakes to avoid

🏭 Industry Modules

Industry Key Metrics
saas MRR, ARR, Churn, CAC, LTV, NRR
ecommerce Conversion, AOV, Cart Abandonment
finance Margins, ROI, Cash Flow, Ratios
marketing CTR, CPA, ROAS, Lead Conversion

πŸ”’ How It Works

Context-Building Protocol

  1. Discovery - AI asks about business context before coding
  2. Data Profiling - Mandatory checks before analysis
  3. Data Cleaning (ETL) - Handle missing values/duplicates in etl/ folder
  4. Validation - Verify JOINs, aggregations, and totals
  5. Sanity Checks - Compare to benchmarks before delivery

Python Environment

To prevent global conflicts, the AI is instructed to:

  1. Check: Look for existing venv or .venv.
  2. Create: If missing, run python3 -m venv venv.
  3. Reports: Save all validation/profiling outputs to reports/ folder. Create if missing.

This prevents the common AI mistakes:

  • ❌ Wrong metric definitions
  • ❌ Duplicate row inflation
  • ❌ Incorrect JOIN types
  • ❌ Unreasonable totals
  • ❌ Cluttered workspaces (scripts are organized in analysis/ and etl/)

πŸ“ License

Apache 2.0

About

This is an agent with skills that most AI IDEs can implement for data analytics.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published