- Overview
- Features
- System Architecture
- Tech Stack
- Project Structure
- Installation & Setup
- API Documentation
- Frontend Features
- AI & Backend Logic
- Dashboard Screenshots
- Deployment
- Demo Video
- Contributing
## Overview

NarrativeScope is a social media narrative intelligence platform that analyzes digital conversations, tracks influence operations, and visualizes community sentiment across Reddit data. It combines semantic search, RAG (Retrieval-Augmented Generation), network analysis, and AI-powered insights to provide deep narrative understanding.
## Features

- Semantic search across millions of posts
- AI-powered conversational analysis
- Real-time time series analytics
- Network graph visualization of narrative connections
- Topic clustering and detection
- Advanced sentiment and trend analysis
| Feature | Description |
|---|---|
| Semantic Search | Find narratives and discussions using natural language queries |
| RAG Chat Interface | Ask questions and get AI-synthesized answers grounded in source data |
| Network Analysis | Visualize author connections and narrative propagation patterns |
| Time Series Analysis | Track narrative evolution and sentiment over time |
| Topic Clustering | Discover and analyze emerging topics and narrative themes |
| Advanced Analytics | Comprehensive metrics on engagement, reach, and influence |
- Responsive React dashboard with Tailwind CSS styling
- Interactive visualizations with D3.js and Recharts
- Real-time data updates
- Intuitive navigation and filtering
- Export-ready analytics reports
## Tech Stack

Backend:

Framework: FastAPI 0.115.0 + Uvicorn
Language: Python 3.9+
LLM: Groq (llama-3.3-70b-versatile)
Embeddings: Sentence-Transformers
Vector DB: ChromaDB
Search: BM25 + Semantic Search
Graph: NetworkX + python-louvain
Clustering: HDBSCAN + UMAP
Data: Pandas + DuckDB
Frontend:

Framework: React 19.2.4
Build Tool: Vite 8.0.4
Styling: Tailwind CSS 4.2
Routing: React Router 7.14
State: Zustand 5.0
HTTP: Axios 1.14
Viz: D3.js 7.9 + Recharts 3.8
Icons: Lucide React 1.7
Infrastructure:

API Protocol: REST (HTTP/HTTPS)
Deployment: Vercel (Frontend) + Server (Backend)
CORS: Enabled for all origins
Database: ChromaDB + PostgreSQL
## Project Structure

Simppl/
├── backend/                      # FastAPI backend
│   ├── main.py                   # App entry point
│   │
│   ├── routes/                   # API endpoints
│   │   ├── chat.py               # Chat/RAG endpoint
│   │   ├── search.py             # Semantic search endpoint
│   │   ├── timeseries.py         # Time series analytics
│   │   ├── network.py            # Network graph endpoint
│   │   ├── clusters.py           # Topic clustering endpoint
│   │   ├── posts.py              # Post details endpoint
│   │   └── analytics.py          # Advanced analytics endpoint
│   │
│   └── service/                  # Business logic
│       ├── rag_service.py        # RAG with LLM
│       ├── search_service.py     # Semantic search logic
│       └── analytics_service.py  # Analytics processing
│
├── frontend/                     # React frontend
│   ├── package.json              # Dependencies
│   ├── vite.config.js            # Vite configuration
│   ├── tailwind.config.js        # Tailwind config
│   │
│   └── src/
│       ├── main.jsx              # React entry point
│       ├── App.jsx               # Main app component
│       │
│       ├── pages/                # Page components
│       │   ├── ChatPage.jsx      # Chat/RAG interface
│       │   ├── SearchPage.jsx    # Search results
│       │   ├── TimeSeriesPage.jsx  # Time series charts
│       │   ├── NetworkPage.jsx   # Network visualization
│       │   ├── ClustersPage.jsx  # Topic clusters
│       │   └── AnalysisPage.jsx  # Analytics dashboard
│       │
│       ├── components/           # Reusable components
│       │   ├── layout/
│       │   │   ├── Sidebar.jsx
│       │   │   └── PageShell.jsx
│       │   │
│       │   └── ui/
│       │       └── index.jsx
│       │
│       ├── api/                  # API client
│       │   └── client.js         # Axios instance
│       │
│       ├── hooks/                # Custom React hooks
│       │   └── useFetch.js       # Data fetching hook
│       │
│       ├── store/                # Global state
│       │   └── appStore.js       # Zustand store
│       │
│       ├── styles/
│       │   └── index.css         # Global styles
│       │
│       └── index.css             # Tailwind imports
│
├── Data/                         # Data storage
│   ├── posts.parquet             # Post dataset
│   ├── topics_data.json          # Topic metadata
│   ├── network_graph.json        # Graph structure
│   │
│   └── chroma_db/                # Vector database
│       └── embeddings/           # Stored embeddings
│
├── Scripts/                      # Data processing
│   ├── ingest.py                 # Data ingestion
│   ├── embed_all_hf.py           # Generate embeddings
│   ├── build_graph.py            # Build network graphs
│   ├── train_topics.py           # Topic modeling
│   └── precompute_topics.py      # Precompute topics
│
├── Analysis/                     # Notebooks
│   └── main.ipynb                # Analysis notebook
│
├── docs/                         # Documentation
│   └── API.md                    # API documentation
│
├── requirements.txt              # Python dependencies
├── .env                          # Environment variables
├── .gitignore                    # Git config
└── README.md                     # This file
## Installation & Setup

Prerequisites:

- Python 3.9+
- Node.js 18+

Backend setup:
# 1. Clone repository
git clone <repository-url>
cd Simppl
# 2. Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Setup environment variables
cp .env.example .env
# Edit .env with your API keys:
# GROQ_API_KEY=your_groq_key
# HUGGINGFACE_API_KEY=your_hf_key
# 5. Prepare data (optional - if starting fresh)
python Scripts/ingest.py # Ingest posts
python Scripts/embed_all_hf.py # Generate embeddings
python Scripts/build_graph.py # Build network graphs
python Scripts/train_topics.py # Train topic models
# 6. Start backend server
uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000
# Backend will be available at: http://localhost:8000
# API docs: http://localhost:8000/docs

Frontend setup:

# 1. Navigate to frontend directory
cd frontend
# 2. Install dependencies
npm install
# 3. Create .env file
echo "VITE_API_BASE_URL=http://localhost:8000" > .env
# 4. Start development server
npm run dev
# Frontend will be available at: http://localhost:5173
# 5. Build for production
npm run build
# 6. Preview production build
npm run preview

## API Documentation

Base URL: https://simppl-reasearch.vercel.app/api/v1
GET /health

POST /chat/message
Content-Type: application/json
{
"message": "What are the main narratives about AI safety?"
}
Response:
{
"reply": "Based on the retrieved data...",
"sources": [
{
"author": "username",
"score": 450,
"domain": "reddit.com"
}
]
}

POST /search/semantic
Content-Type: application/json
{
"query": "blockchain technology discussion",
"top_k": 10
}
Response:
{
"results": [
{
"document": "Post content...",
"metadata": {
"author": "user123",
"score": 300,
"timestamp": "2024-01-15"
}
}
],
"total": 10
}

GET /timeseries/narrative-trend?topic=AI&days=30

GET /network/graph?limit=100

GET /clusters/topics?top_n=20

GET /posts/{post_id}

GET /analytics/dashboard?date_range=30days

## Frontend Features

Chat Interface:

- Natural language query interface
- RAG-powered responses with citations
- Source attribution and credibility metrics
- Multi-turn conversation support
Semantic Search:

- Semantic search across all posts
- Filter by author, date, score
- Result preview and detailed view
- Relevance scoring
Time Series Analytics:

- Narrative trend visualization
- Engagement metrics over time
- Peak detection and anomalies
- Recharts-powered interactive charts
Network Analysis:

- Author connection visualization
- Community detection
- Influence measurement
- D3.js force-directed graphs
Topic Clustering:

- Automatic topic detection
- Cluster composition analysis
- Topic evolution tracking
- Community sentiment
Advanced Analytics:

- Comprehensive dashboard
- Multi-metric KPIs
- Export capabilities
- Custom date ranges
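Every dashboard above is backed by one of the REST endpoints documented earlier, so the same data can be pulled from any HTTP client. A minimal stdlib sketch for the semantic search endpoint (the local dev URL and the request/response shape follow the API docs above; this is illustrative, not the project's own client code):

```python
import json
from urllib import request

def build_request(query: str, top_k: int = 10,
                  base: str = "http://localhost:8000") -> request.Request:
    """Prepare the POST /search/semantic call shown in the API docs."""
    body = json.dumps({"query": query, "top_k": top_k}).encode()
    return request.Request(f"{base}/search/semantic", data=body,
                           headers={"Content-Type": "application/json"},
                           method="POST")

def semantic_search(query: str, top_k: int = 10) -> dict:
    """Send the request against a running backend and decode the JSON reply."""
    with request.urlopen(build_request(query, top_k)) as resp:
        return json.load(resp)
```

Swap `base` for the deployed backend URL in production; the frontend does the equivalent through its Axios client.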
## AI & Backend Logic

# Location: backend/service/rag_service.py
SYSTEM_PROMPT = """
You are the Lead Intelligence Analyst for NarrativeScope...
- STRICT GROUNDING: Only use provided context data
- MANDATORY CITATIONS: Every claim must be cited [Author: name | Score: X]
- NARRATIVE FORMAT: Write flowing analytical paragraphs
- HANDLING MISSING DATA: State when insufficient data exists
- ANALYTICAL TONE: Like an investigative journalist
"""
Process Flow:
1. User Query → Semantic Search (retrieve top-8 relevant posts)
2. Context Building → Format posts with metadata
3. LLM Call → Groq llama-3.3-70b-versatile (with system prompt)
4. Response → Return answer with sources

Hybrid Search Pipeline:

User Query
↓
[Semantic Embedding] - Sentence-Transformers
↓
ChromaDB Vector Store (similarity search)
↓
BM25 Ranking (keyword relevance)
↓
Combined Results (semantic + lexical)
↓
Ranked Top-K Posts
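The final "combined results" step blends the two signals into one ranking. A minimal sketch of such a fusion step, assuming min-max normalization and a fixed blend weight (the project's actual weighting and normalization are not specified here):

```python
def hybrid_rank(semantic: dict[str, float], lexical: dict[str, float],
                alpha: float = 0.6, top_k: int = 5) -> list[str]:
    """Blend semantic-similarity and BM25-style scores into one ranking.
    `alpha` weights the semantic signal; both signals are min-max normalized
    so their different scales (cosine vs. BM25) become comparable."""
    def norm(scores: dict) -> dict:
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {k: (v - lo) / span for k, v in scores.items()}
    s, l = norm(semantic), norm(lexical)
    ids = set(s) | set(l)
    combined = {i: alpha * s.get(i, 0.0) + (1 - alpha) * l.get(i, 0.0)
                for i in ids}
    return sorted(combined, key=combined.get, reverse=True)[:top_k]
```

Documents found by only one signal still get a partial score, so purely lexical or purely semantic hits are not dropped.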
Topic Clustering Pipeline:

Data Ingestion
↓
Text Preprocessing (cleaning, normalization)
↓
Embedding Generation (Sentence-Transformers)
↓
UMAP Dimensionality Reduction
↓
HDBSCAN Clustering
↓
Topic Extraction & Labeling
↓
Output: Topic assignments + metadata
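The real pipeline uses UMAP and HDBSCAN (neither reproduced here). As a dependency-free stand-in for the "embed → cluster → label" shape, this sketch groups vectors by a naive distance threshold; it is an illustration of density-style grouping, not HDBSCAN itself:

```python
import math

def cluster(vectors: list, eps: float = 0.5) -> list:
    """Assign a cluster id to each vector: join the first earlier vector
    within `eps`, otherwise start a new cluster. (HDBSCAN additionally
    marks sparse points as noise with label -1.)"""
    labels = [-1] * len(vectors)
    next_id = 0
    for i, v in enumerate(vectors):
        for j in range(i):
            if math.dist(v, vectors[j]) <= eps:
                labels[i] = labels[j]
                break
        if labels[i] == -1:
            labels[i] = next_id
            next_id += 1
    return labels
```

In the actual pipeline the inputs would be UMAP-reduced sentence embeddings rather than raw 2D points.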
Network Analysis Pipeline:

Posts Data
↓
Extract Author Mentions
↓
Build Interaction Graph (NetworkX)
↓
Community Detection (python-louvain)
↓
Calculate Centrality Metrics
↓
Output: Network structure + influence scores
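The project builds this graph with NetworkX and runs python-louvain on it; as a dependency-free illustration of the last step, here is degree centrality (one of the standard influence metrics) computed on a plain adjacency structure:

```python
from collections import defaultdict

def degree_centrality(edges: list) -> dict:
    """For each author, the fraction of other authors they interact with,
    i.e. degree / (n - 1), matching NetworkX's degree_centrality definition."""
    adj = defaultdict(set)
    for a, b in edges:  # undirected interaction edges (author, author)
        adj[a].add(b)
        adj[b].add(a)
    n = len(adj)
    return {node: len(nbrs) / max(n - 1, 1) for node, nbrs in adj.items()}
```

With NetworkX the equivalent is `nx.degree_centrality(G)`; louvain community labels would then be attached per node alongside these scores.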
## Deployment

Website: https://simppl-reasearch.vercel.app/
- Hosting: Vercel / Render
- Framework: React 19 + Vite
- Build: npm run build
- Deploy: Automatic via git push
Backend (.env)
GROQ_API_KEY=gsk_xxxxx
HUGGINGFACE_API_KEY=hf_xxxxx
Frontend (.env)
VITE_API_BASE_URL=https://your-backend-api.com
VITE_ENV=production
## Demo Video

Project Walkthrough & Explanation

Watch the complete project demo: Watch explanation
Video Contents:
- System overview and architecture
- Live demonstration of all features
- RAG in action - asking intelligent questions
- Network visualization explained
- Time series trend analysis
- Topic clustering results
- Backend API walkthrough
- Deployment process
Performance:

| Metric | Value |
|---|---|
| Search Latency | < 500ms (semantic) |
| RAG Response Time | < 3s (with LLM) |
| Embedding Generation | ~100 posts/sec |
| Network Graph Render | < 2s (1000 nodes) |
| Concurrent Users | 100+ |
Configuration:

# Semantic search
top_k = 10 # Number of results
similarity_threshold = 0.5 # Min relevance score
# BM25 ranking
bm25_k1 = 1.5 # Term frequency saturation
bm25_b = 0.75  # Length normalization

# Clustering
hdbscan_min_cluster_size = 5
umap_n_neighbors = 15
umap_min_dist = 0.1

# LLM / RAG
llm_model = "llama-3.3-70b-versatile"
llm_temperature = 0.7
context_window = 4000  # tokens
top_k_context = 8      # posts

## Contributing

- Follow PEP 8 for Python code
- Use ES6+ for JavaScript
- Add tests for new features
- Update documentation
- Keep commits atomic and descriptive
Built with ❤️ for narrative intelligence & social media analysis
For questions, issues, or suggestions:
- Email: kukadiyarishi895@gmail.com
Last Updated: April 2024 | Version 1.0.0





