WikiFetch is a modern web application for searching, saving, and managing Wikipedia articles offline. It features a SQLite database for article storage, a REST API for programmatic access, and Docker support for easy deployment.
- Search Wikipedia: Search and save Wikipedia articles with full content
- SQLite Database: Store articles with metadata (word count, dates, tags)
- Offline API: REST API endpoints for accessing saved articles without internet
- Modern UI: Split dashboard layout with sidebar navigation
- Migration Tool: Import existing text files into the database
- Docker Support: Containerized deployment with dev and prod configurations
- Cross-Platform: Works on Windows, Linux, and macOS
- Git (for cloning repository)
- Internet connection (for initial Wikipedia fetches)
- Windows/macOS: Docker Desktop
- Linux: Docker Engine and Docker Compose
- Python 3.9 or higher
- pip (Python package manager)
- Install Docker Desktop from https://www.docker.com/products/docker-desktop
- Start Docker Desktop from Start menu
- Open PowerShell and verify installation:
docker --version
docker-compose --version
# Install Docker
sudo apt update
sudo apt install docker.io docker-compose
sudo systemctl start docker
sudo systemctl enable docker
sudo usermod -aG docker $USER
# Log out and back in for group changes
docker --version
sudo dnf install docker docker-compose
sudo systemctl start docker
sudo systemctl enable docker
sudo usermod -aG docker $USER
# Log out and back in
docker --version
- Install Docker Desktop from https://www.docker.com/products/docker-desktop
- Open Docker.app from Applications
- Verify in Terminal:
docker --version
docker-compose --version
# Clone repository
git clone https://github.com/NikolisSecurity/WikiFetch.git
cd WikiFetch
# Development mode (with hot reload)
docker-compose up --build
# Production mode
docker-compose -f docker-compose.prod.yml up -d --build
Access the application:
- Development: http://localhost:5000
- Production: http://localhost:8000
# Check Python version (should be 3.9+)
python --version
# Clone repository
git clone https://github.com/NikolisSecurity/WikiFetch.git
cd WikiFetch
# Create virtual environment
python -m venv venv
venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Run application
python app.py
# Check Python version (should be 3.9+)
python3 --version
# Clone repository
git clone https://github.com/NikolisSecurity/WikiFetch.git
cd WikiFetch
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Run application
python app.py
Access the application at http://localhost:5000 (or on the port set in the PORT environment variable).
On first run, the database file wikifetch.db is created automatically in the data/ directory (or at the path specified by the DATABASE_PATH environment variable).
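The first-run behaviour described here could be implemented roughly as follows. This is a minimal sketch using Python's standard `sqlite3` module, not the project's actual `database.py`: the `init_db` helper, the `articles` table, and its columns are hypothetical, modelled on the metadata named in the feature list (word count, dates, tags).

```python
import os
import sqlite3


def init_db(db_path=None):
    """Create the database file and a (hypothetical) articles table if missing."""
    # Fall back to the documented default location when DATABASE_PATH is unset.
    db_path = db_path or os.environ.get("DATABASE_PATH", "./data/wikifetch.db")
    parent = os.path.dirname(db_path)
    if parent:
        os.makedirs(parent, exist_ok=True)  # make sure data/ exists
    conn = sqlite3.connect(db_path)  # creates the file if it is missing
    # Assumed schema; the real column names may differ.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS articles (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            title TEXT NOT NULL UNIQUE,
            content TEXT NOT NULL,
            word_count INTEGER NOT NULL,
            tags TEXT DEFAULT '',
            created_at TEXT DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.commit()
    return conn
```

Because the statement uses `CREATE TABLE IF NOT EXISTS`, calling the helper on every start is harmless: an existing database is opened unchanged.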
If you have existing .txt files in the downloaded_data/ directory:
- Navigate to the application in your browser
- Look for the "Import from Files" section in the sidebar
- Select files you want to import
- Click "Import Selected" or "Import All"
- Optionally check "Delete files after import" to remove text files after migration
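For readers curious what such an import involves, here is a hedged sketch of the idea in Python. The `import_text_files` function and the simplified `articles` table are illustrative assumptions, as is the choice to use each file name (without `.txt`) as the article title; the application's own migration code may work differently.

```python
import sqlite3
from pathlib import Path


def import_text_files(conn, source_dir="downloaded_data", delete_after=False):
    """Import every .txt file in source_dir as one article; return the new titles."""
    # Simplified, assumed table layout for this sketch.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        "title TEXT PRIMARY KEY, content TEXT NOT NULL, word_count INTEGER NOT NULL)"
    )
    imported, processed = [], []
    for path in sorted(Path(source_dir).glob("*.txt")):
        content = path.read_text(encoding="utf-8")
        title = path.stem  # assumption: file name without .txt is the title
        # INSERT OR IGNORE skips titles that are already in the database.
        cur = conn.execute(
            "INSERT OR IGNORE INTO articles (title, content, word_count) VALUES (?, ?, ?)",
            (title, content, len(content.split())),
        )
        if cur.rowcount == 1:
            imported.append(title)
        processed.append(path)
    conn.commit()
    if delete_after:
        # Mirrors the "Delete files after import" option; runs only after the commit.
        for path in processed:
            path.unlink()
    return imported
```

Deleting files only after the commit means a failed import leaves the original text files in place.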
- Open browser to http://localhost:5000 (or :8000 for production)
- You should see the WikiFetch dashboard
- Try searching for "Python programming"
- Article should appear and be saved to database
- Check sidebar for saved articles list
Create a .env file in the WikiFetch directory for custom configuration:
FLASK_ENV=development
PORT=5000
DATABASE_PATH=./data/wikifetch.db
Docker environment:
Set environment variables in docker-compose.yml or create a .env file.
Available variables:
- FLASK_ENV: development or production
- PORT: Port number (default: 5000 for dev, 8000 for prod)
- DATABASE_PATH: Path to SQLite database file
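As an illustration of how these variables and their defaults fit together, the sketch below reads them in Python. The `load_config` function is hypothetical, and treating an unset FLASK_ENV as development is an assumption rather than documented behaviour.

```python
import os


def load_config(env=None):
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    # Assumption: an unset FLASK_ENV is treated as development.
    flask_env = env.get("FLASK_ENV", "development")
    if flask_env not in ("development", "production"):
        raise ValueError(
            f"FLASK_ENV must be 'development' or 'production', got {flask_env!r}"
        )
    # Documented defaults: port 5000 for development, 8000 for production.
    default_port = 5000 if flask_env == "development" else 8000
    return {
        "FLASK_ENV": flask_env,
        "PORT": int(env.get("PORT", default_port)),
        "DATABASE_PATH": env.get("DATABASE_PATH", "./data/wikifetch.db"),
    }
```

Passing a plain dictionary instead of `os.environ` makes the function easy to try out, for example `load_config({"FLASK_ENV": "production"})`.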
- Search Wikipedia: Enter article name in search box and click "Search"
- View Saved Articles: Click any article in the sidebar to view it
- Import Files: Use the "Import from Files" section to migrate text files
- Browse Offline: All saved articles are accessible without internet
The application provides REST API endpoints for programmatic access.
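# List all saved articles
curl http://localhost:5000/api/articles
# List with pagination (quote the URL so the shell does not interpret "&")
curl "http://localhost:5000/api/articles?limit=20&offset=0"
# Get a specific article by ID
curl http://localhost:5000/api/articles/1
# Search saved articles
curl -X POST http://localhost:5000/api/search \
-H "Content-Type: application/json" \
-d '{"query": "python"}'
# Database statistics
curl http://localhost:5000/api/stats
# Delete an article
curl -X DELETE http://localhost:5000/api/articles/1
Symptom: Error message about port 5000 or 8000 already in use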
Windows:
netstat -ano | findstr :5000
Linux/macOS:
lsof -i :5000
Solution: Either stop the conflicting service or change the PORT environment variable.
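The same check can be done from Python on any platform. The following is a small sketch using only the standard `socket` module; the helper names and the list of candidate ports are illustrative and not part of WikiFetch.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds, i.e. a listener exists.
        return sock.connect_ex((host, port)) == 0


def first_free_port(candidates=(5000, 8000, 5001, 8001)):
    """Return the first candidate port with no listener, or None if all are taken."""
    for port in candidates:
        if not port_in_use(port):
            return port
    return None
```

A port reported as free can then be passed to the application through the PORT environment variable.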
Symptom: "Cannot connect to Docker daemon"
Windows/macOS: Ensure Docker Desktop is started.
Linux:
sudo systemctl start docker
Symptom: Permission denied when running Docker commands
Solution:
sudo usermod -aG docker $USER
# Log out and back in for changes to take effect
Symptom: "ModuleNotFoundError" when running Python
Solution:
- Ensure virtual environment is activated:
  - Windows: venv\Scripts\activate
  - Linux/macOS: source venv/bin/activate
- Reinstall dependencies: pip install -r requirements.txt
Symptom: "Database is locked" error
Solution:
- Close any other processes accessing the database
- Restart the application
- If using Docker, restart the container
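If the error keeps returning, two standard SQLite settings often reduce lock contention: a longer busy timeout and write-ahead logging (WAL). The sketch below shows both with Python's `sqlite3` module. Whether WikiFetch already applies them is not stated here, and `connect_tolerant` is a hypothetical helper name.

```python
import sqlite3


def connect_tolerant(db_path, busy_timeout_s=10.0):
    """Open a connection that waits longer for locks and uses WAL journaling."""
    # timeout: how long sqlite3 waits on a locked database before raising
    # sqlite3.OperationalError ("database is locked"). Python's default is 5 seconds.
    conn = sqlite3.connect(db_path, timeout=busy_timeout_s)
    # WAL lets readers proceed while a single writer is active.
    conn.execute("PRAGMA journal_mode=WAL")
    return conn
```

WAL relies on shared memory next to the database file, so it may not work on network filesystems; in that case keep only the longer timeout.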
Symptom: Errors when searching multiple articles quickly
Solution: Wait a few seconds between requests. The Wikipedia API has rate limits.
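When fetching many articles from a script, the waiting can be automated. Below is a minimal throttling sketch; the `Throttle` class is illustrative, and the 2-second default is an assumption standing in for "a few seconds", not a limit published by Wikipedia.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive calls."""

    def __init__(self, min_interval_s=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock  # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Sleep just long enough to keep calls min_interval_s apart."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval_s - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Call `throttle.wait()` immediately before each search request; the first call returns at once and later calls pause only when requests arrive too quickly.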
# Pull latest changes
git pull
# Rebuild and restart
docker-compose down
docker-compose up --build
# Pull latest changes
git pull
# Activate virtual environment
source venv/bin/activate # or venv\Scripts\activate on Windows
# Update dependencies
pip install -r requirements.txt --upgrade
# Restart application
python app.py
# Stop and remove containers
docker-compose down
# Remove volumes (WARNING: deletes database)
docker-compose down -v
# Remove images
docker rmi wikifetch_wikifetch
# Deactivate virtual environment
deactivate
# Remove project directory
cd ..
rm -rf WikiFetch # Linux/macOS
# or: rmdir /s WikiFetch # Windows
Docker:
docker-compose up
Hot reload is enabled: code changes automatically restart the server.
Python:
source venv/bin/activate
FLASK_ENV=development python app.py
WikiFetch/
├── app.py # Main Flask application
├── database.py # SQLite database module
├── requirements.txt # Python dependencies
├── Dockerfile # Multi-stage Docker configuration
├── docker-compose.yml # Development Docker Compose
├── docker-compose.prod.yml # Production Docker Compose
├── .dockerignore # Docker build exclusions
├── templates/
│ └── index.html # Web UI template
├── data/
│ └── wikifetch.db # SQLite database (created at runtime)
└── downloaded_data/ # Legacy text files (optional)
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/articles | List all saved articles (pagination supported) |
| GET | /api/articles/:id | Get specific article by ID |
| POST | /api/search | Search saved articles |
| DELETE | /api/articles/:id | Delete article |
| GET | /api/stats | Database statistics |
| GET | /migration-status | Check migration status |
| POST | /migrate | Migrate text files to database |
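The endpoints in this table can also be called from Python. The sketch below uses only the standard library; `BASE_URL`, `build_request`, and `call` are illustrative names, and the assumption that every endpoint responds with JSON is not confirmed by this document.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:5000"  # development port; use 8000 for production


def build_request(method, path, params=None, body=None):
    """Build a urllib Request for one of the documented endpoints."""
    url = BASE_URL + path
    if params:
        url += "?" + urlencode(params)  # e.g. limit and offset for pagination
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return Request(url, data=data, headers=headers, method=method)


def call(method, path, params=None, body=None):
    """Send the request and decode the response (assumed to be JSON)."""
    with urlopen(build_request(method, path, params, body), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the application running, `call("GET", "/api/articles", params={"limit": 20, "offset": 0})` lists saved articles and `call("POST", "/api/search", body={"query": "python"})` searches them.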
- Use docker-compose.prod.yml for production
- Set FLASK_ENV=production
- Use gunicorn (included in production Docker setup)
- Place behind reverse proxy (nginx) for SSL/TLS
- Regular database backups
- Monitor disk space for database growth
- Set up log rotation
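For the regular database backups recommended above, copying the file while the application is writing can produce an inconsistent copy. One hedged alternative is the online backup API in Python's `sqlite3` module (`Connection.backup`, available since Python 3.7). The `backup_database` helper below is a hypothetical sketch; its default paths follow the project layout in this document.

```python
import sqlite3
from datetime import date
from pathlib import Path


def backup_database(db_path="data/wikifetch.db", backup_dir="backup", today=None):
    """Copy the live database to backup_dir/wikifetch_YYYYMMDD.db; return that path."""
    if not Path(db_path).exists():
        raise FileNotFoundError(db_path)  # avoid silently creating an empty database
    today = today or date.today()
    target = Path(backup_dir) / f"wikifetch_{today:%Y%m%d}.db"
    target.parent.mkdir(parents=True, exist_ok=True)
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(target)
    try:
        # Connection.backup produces a consistent snapshot of the source database.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
    return target
```

Running this once a day from cron or another scheduler yields one dated file per day, matching the naming used by the manual backup commands.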
server {
listen 80;
server_name your-domain.com;
location / {
proxy_pass http://localhost:8000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
}
Docker:
docker cp wikifetch_wikifetch_1:/app/data/wikifetch.db ./backup/wikifetch_$(date +%Y%m%d).db
Python:
cp data/wikifetch.db backup/wikifetch_$(date +%Y%m%d).db
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch: git checkout -b feature-name
- Make your changes
- Test locally (both Docker and Python methods)
- Commit your changes: git commit -m "Description"
- Push to your fork: git push origin feature-name
- Open a Pull Request
This project is provided as-is for educational and personal use.
For issues, questions, or feature requests, please open an issue on the GitHub repository.
- Added SQLite database for article storage
- Implemented REST API for offline access
- Modern split dashboard UI
- Migration tool for importing text files
- Docker support with dev and prod configurations
- Cross-platform setup documentation
- Basic Wikipedia search
- Text file storage
- Simple web interface
Enjoy using WikiFetch!