Live Sentient is a full-stack application for ingesting, enriching, and exploring news articles with advanced NLP techniques. The backend processes a large news dataset, classifies emotion, generates embeddings, and stores enriched records in MongoDB. The frontend provides a modern interface for searching and visualizing news by location, sentiment, and more.
- Dataset Ingestion: Reads and processes a large news dataset (public_dataset/news_category.json), cleaning and enriching each article.
- NLP Enrichment:
  - Summarization: Generates concise summaries of articles.
  - Emotion Classification: Detects the primary emotion of each article using a transformer model.
  - Embeddings: Generates semantic vector embeddings for similarity search and clustering.
- MongoDB Storage: Stores enriched articles in a MongoDB Atlas collection.
- REST API: Exposes endpoints for querying news, searching by location, and more.
- Frontend: React-based UI for searching, filtering, and visualizing news articles.
- Testing: Includes unit and integration tests for backend services.
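The enrichment step described in the feature list can be pictured as one function that attaches a summary, an emotion label, and an embedding to each article. The sketch below is only an illustration, not the project's actual code: the `text` field and all function names are hypothetical, and the three stand-in callables take the place of real models (for example a HuggingFace Transformers pipeline or a Sentence Transformers encoder) so that the example runs without downloading anything.

```python
from typing import Any, Callable, Dict, List


def enrich_article(
    article: Dict[str, Any],
    summarize: Callable[[str], str],
    classify_emotion: Callable[[str], str],
    embed: Callable[[str], List[float]],
) -> Dict[str, Any]:
    """Return a copy of `article` with summary, emotion, and embedding added."""
    text = article["text"]  # hypothetical field name, not confirmed by the project
    enriched = dict(article)  # shallow copy so the input record is left untouched
    enriched["summary"] = summarize(text)
    enriched["emotion"] = classify_emotion(text)
    enriched["embedding"] = embed(text)
    return enriched


# Stand-in callables (assumptions for illustration only, not real models).
def first_sentence(text: str) -> str:
    """Toy summarizer: keep only the first sentence."""
    return text.split(".")[0].strip() + "."


def keyword_emotion(text: str) -> str:
    """Toy classifier: a single keyword rule instead of a transformer."""
    return "joy" if "celebrate" in text.lower() else "neutral"


def length_embedding(text: str) -> List[float]:
    """Toy embedding: character count and word count."""
    return [float(len(text)), float(len(text.split()))]
```

Passing the models in as callables keeps the pipeline logic testable on its own; real model wrappers can be swapped in without changing `enrich_article`.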
- Python 3.8+
- Node.js 16+
- MongoDB Atlas account (or local MongoDB)
- HuggingFace Transformers
- Sentence Transformers
- Vite (for frontend)
git clone https://github.com/yourusername/Live_Sentient.git
cd Live_Sentient
Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
Install dependencies:
pip install -r requirements.txt
Configure environment variables:
- Copy .env.example to .env and fill in your MongoDB URI and any other secrets.
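To show how the backend might pick up these settings, here is a minimal standard-library sketch that parses KEY=VALUE lines and fails fast when `MONGO_URI` is missing. It is an assumption about how configuration could be read, not the project's actual loader (which may well rely on a library such as python-dotenv); the function names `parse_env_lines` and `get_mongo_uri` are hypothetical.

```python
from typing import Dict, Iterable


def parse_env_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_mongo_uri(env: Dict[str, str]) -> str:
    """Return MONGO_URI, or raise with a hint if it is missing or empty."""
    uri = env.get("MONGO_URI", "")
    if not uri:
        raise RuntimeError(
            "MONGO_URI is not set; copy .env.example to .env and fill it in"
        )
    return uri


# Example usage (assumes a .env file in the current directory):
# with open(".env", encoding="utf-8") as handle:
#     mongo_uri = get_mongo_uri(parse_env_lines(handle))
```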
Ingest and enrich the dataset:
python app/services/data_ingest.py
This will process public_dataset/news_category.json, enrich each article, and store results in MongoDB.
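As a rough picture of the reading and cleaning stage, the sketch below streams records and normalizes whitespace before enrichment. It assumes the dataset is in JSON Lines format (one JSON object per line) with `headline` and `short_description` fields; both the file layout and the field names are assumptions, and `iter_clean_articles` is a hypothetical name rather than a function from data_ingest.py.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_clean_articles(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield cleaned article dicts from JSON Lines input, skipping bad rows."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed rows instead of aborting the whole run
        if not isinstance(record, dict):
            continue
        # Collapse runs of whitespace (including newlines) into single spaces.
        headline = " ".join(str(record.get("headline", "")).split())
        description = " ".join(str(record.get("short_description", "")).split())
        if not headline:
            continue  # an article without a headline is not useful downstream
        yield {**record, "headline": headline, "short_description": description}


# Example usage (path taken from this README):
# with open("public_dataset/news_category.json", encoding="utf-8") as handle:
#     for article in iter_clean_articles(handle):
#         ...  # enrich and store each article
```

Streaming line by line keeps memory use flat even for a large dataset, since no more than one record is held at a time.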
Run the backend server:
python app/main.py
The API will be available at http://localhost:8000 (or as configured).
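Once the server is running, a client could query it along these lines. The endpoint path `/news` and the query parameters `location` and `emotion` are hypothetical placeholders, since this README does not list the actual routes; check the backend code for the real ones before relying on this sketch.

```python
import json
from typing import Any, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # default address given in this README


def build_search_url(
    location: str,
    emotion: Optional[str] = None,
    base_url: str = BASE_URL,
    path: str = "/news",  # hypothetical route, not confirmed by the project
) -> str:
    """Build a search URL with properly encoded query parameters."""
    params = {"location": location}
    if emotion:
        params["emotion"] = emotion
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}"


def fetch_news(url: str, timeout: float = 10.0) -> Any:
    """GET the URL and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (requires the backend to be running):
# articles = fetch_news(build_search_url("New York", emotion="joy"))
```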
Install frontend dependencies:
cd frontend
npm install
Configure environment variables:
- Copy .env.example to .env and set your API endpoints and RapidAPI keys.
Run the frontend:
npm run dev
The app will be available at http://localhost:5173 (or as configured).
- Search News: Use the search bar to find news by city, region, or country.
- Filter by Sentiment/Emotion: Filter articles by detected emotion or sentiment.
- Explore Categories: Browse articles by category (e.g., Politics, Entertainment, Wellness).
- View Details: Click on an article to see its summary, emotion, and metadata.
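On the backend, filters like these could translate into a MongoDB query document. The sketch below is one possible mapping, not the project's implementation: the field names `emotion`, `category`, and `location` are assumed, and `build_article_filter` is a hypothetical helper. The `$regex` and `$options` operators are standard MongoDB query operators.

```python
import re
from typing import Any, Dict, Optional


def build_article_filter(
    emotion: Optional[str] = None,
    category: Optional[str] = None,
    location: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a MongoDB filter document from optional search criteria."""
    query: Dict[str, Any] = {}
    if emotion:
        query["emotion"] = emotion  # assumed field name
    if category:
        query["category"] = category  # assumed field name
    if location:
        # Case-insensitive substring match; escape the input so user text
        # is treated literally rather than as a regular expression.
        query["location"] = {"$regex": re.escape(location), "$options": "i"}
    return query


# Example usage with a pymongo collection (connection setup not shown):
# results = collection.find(build_article_filter(emotion="joy", category="Politics"))
```

An empty filter matches every document, so calling the helper with no arguments returns all articles.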
Run backend tests with:
pytest tests/
- Use run.sh to automate backend and frontend startup.
- For production, deploy the backend (e.g., with Gunicorn/Uvicorn) and the frontend (e.g., Vercel, Netlify, or static hosting).
- Configure CI/CD with .github/workflows/ci.yml.
- .env (backend): MONGO_URI, model paths, etc.
- frontend/.env: VITE_GEODB_KEY, API URLs, etc.
- news_category.json: Source dataset.
- preprocessor.py: Standalone enrichment script.
- data_ingest.py: Main ingestion and enrichment pipeline.
MIT License. See LICENSE for details.