The local index. A self-hosted BitTorrent DHT spider and search engine.
Serma autonomously discovers torrents from the BitTorrent DHT network, enriches metadata, and provides a clean web interface for searching your personal torrent index.
- 🕷️ Autonomous DHT Spider: Crawls the BitTorrent DHT network to discover new torrents
- 🔍 Full-Text Search: Fast search powered by Tantivy (Rust's Lucene alternative)
- 📊 Metadata Enrichment: Automatically fetches torrent metadata using the ut_metadata extension
- 🧹 Automatic Cleanup: Removes inactive/low-seed torrents to keep the index fresh
- 🌐 Clean Web UI: Minimalist dark-mode interface for browsing and searching
- 🚀 High Performance: Built in Rust for speed and efficiency
- 💾 Embedded Storage: Uses Sled (embedded database) and Tantivy (search index)
Serma consists of several background tasks:
- Spider (spider.rs): BEP-5 DHT crawler that discovers info hashes from DHT traffic
- Enricher (enrich.rs): Fetches full torrent metadata via DHT peer lookup and the ut_metadata protocol
- Indexer (index.rs): Maintains a full-text search index using Tantivy
- Cleanup (cleanup.rs): Periodic task to remove stale/low-quality torrents
- Web Server (web.rs): Axum-based HTTP server providing the search UI and API
- Rust 1.85 or later (the first release that supports edition 2024)
- Linux, macOS, or Windows (tested on Linux)
- ~16-32 GB disk space for a meaningful index (grows over time)
- Open UDP port (optional, but recommended for better DHT connectivity)
git clone <repository-url> serma
cd serma
cargo build --release
./target/release/serma

By default, Serma:
- Stores data in the ./data directory
- Serves the web UI on http://localhost:3000
- Uses an ephemeral UDP port for DHT traffic
Navigate to http://localhost:3000 in your browser to start searching.
Serma is configured via environment variables, optionally loaded from a local .env file.
cp .env.example .env
$EDITOR .env
./target/release/serma

Precedence:
- Process environment variables override .env
- .env overrides built-in defaults
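
The precedence rules above can be sketched as a small lookup. This is only an illustration, not Serma's actual loader: `resolve_setting` and `parse_dotenv` are hypothetical helpers, and the `.env` parsing here assumes plain `KEY=VALUE` lines without quoting or escapes.

```rust
use std::collections::HashMap;

/// Resolve one setting with the documented precedence:
/// process environment, then `.env`, then the built-in default.
/// Both maps are passed in explicitly so the logic is easy to test
/// (hypothetical helper, not taken from Serma's source).
fn resolve_setting(
    key: &str,
    process_env: &HashMap<String, String>,
    dotenv: &HashMap<String, String>,
    default: &str,
) -> String {
    process_env
        .get(key)
        .or_else(|| dotenv.get(key))
        .cloned()
        .unwrap_or_else(|| default.to_string())
}

/// Parse simple `KEY=VALUE` lines, skipping blank lines and `#` comments.
/// Quoting and escape sequences are deliberately not handled.
fn parse_dotenv(contents: &str) -> HashMap<String, String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .map(|(k, v)| (k.trim().to_string(), v.trim().to_string()))
        .collect()
}
```

With `SERMA_WEB_PORT=4000` in `.env` and `SERMA_WEB_PORT=8080` in the process environment, `resolve_setting` returns `8080`; with neither set, it falls back to the default passed in.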
The complete, up-to-date list of configuration options lives in .env.example.
| Variable | Default | Description |
|---|---|---|
| SERMA_DATA_DIR | data | Directory for database and index storage |
| SERMA_ADDR | (unset) | HTTP server bind address (if unset, dual loopback is used) |
| SERMA_WEB_PORT | 3000 | Web port used when SERMA_ADDR is unset (binds 127.0.0.1 and ::1) |
| SERMA_SPIDER | enabled | Set to 0, false, off, or no to disable DHT spider |
| SERMA_SPIDER_BIND | 0.0.0.0:0 | UDP bind address for DHT spider |
| SERMA_SPIDER_BOOTSTRAP | built-in list | Comma-separated DHT bootstrap nodes |
| SERMA_CLEANUP | enabled | Set to 0, false, off, or no to disable cleanup |
| SERMA_SOCKS5_PROXY | (unset) | Optional SOCKS5 proxy for DHT UDP traffic (e.g. socks5://127.0.0.1:1080 or socks5://user:pass@host:1080) |
| SERMA_SOCKS5_USERNAME | (unset) | SOCKS5 username (if not provided in URL) |
| SERMA_SOCKS5_PASSWORD | (unset) | SOCKS5 password (if not provided in URL) |
| RUST_LOG | info | Log level (trace, debug, info, warn, error) |
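
As a rough illustration of how two of the rules in this table could be interpreted, the sketch below handles the on/off switches and the HTTP bind fallback. The function names `is_enabled` and `http_bind_addrs` are hypothetical, and the case-insensitive matching and whitespace trimming are assumptions; Serma's own parsing may differ.

```rust
use std::net::{AddrParseError, IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr};

/// Interpret an on/off variable such as SERMA_SPIDER or SERMA_CLEANUP.
/// Unset means enabled; `0`, `false`, `off`, or `no` disables the feature.
/// Trimming and case-insensitivity are assumptions of this sketch.
fn is_enabled(value: Option<&str>) -> bool {
    match value {
        None => true,
        Some(v) => !matches!(
            v.trim().to_ascii_lowercase().as_str(),
            "0" | "false" | "off" | "no"
        ),
    }
}

/// Choose HTTP bind addresses: an explicit SERMA_ADDR wins; otherwise
/// both loopback addresses (127.0.0.1 and ::1) are used with SERMA_WEB_PORT.
fn http_bind_addrs(addr: Option<&str>, web_port: u16) -> Result<Vec<SocketAddr>, AddrParseError> {
    match addr {
        Some(a) => Ok(vec![a.parse::<SocketAddr>()?]),
        None => Ok(vec![
            SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), web_port),
            SocketAddr::new(IpAddr::V6(Ipv6Addr::LOCALHOST), web_port),
        ]),
    }
}
```

Returning a `Result` keeps a malformed SERMA_ADDR visible as a startup error instead of silently falling back to loopback.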
Run on a custom port:
SERMA_ADDR=0.0.0.0:8080 ./target/release/serma

Use a specific DHT port:
SERMA_SPIDER_BIND=0.0.0.0:6881 ./target/release/serma

Increase logging verbosity:
RUST_LOG=debug ./target/release/serma

Disable the DHT spider (search-only mode):
SERMA_SPIDER=false ./target/release/serma

Proxy DHT traffic via SOCKS5 (privacy):
SERMA_SOCKS5_PROXY=socks5://127.0.0.1:1080 ./target/release/serma

Serma exposes a simple HTTP API:
GET /api/search?q=<query>&limit=<limit>&offset=<offset>
Parameters:
- q: Search query (required)
- limit: Results per page (default: 50, max: 500)
- offset: Pagination offset (default: 0)
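
A client might assemble the search URL along these lines. This is a sketch under assumptions: `search_url` and `percent_encode` are hypothetical helpers, and clamping `limit` on the client side is a choice made here, since how the server treats values above 500 is not documented.

```rust
/// Percent-encode a query value, leaving only RFC 3986 unreserved
/// characters untouched.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Build a search URL. `base` is whatever address the server listens on.
/// `limit` falls back to 50 and is clamped to 500, mirroring the
/// documented default and maximum.
fn search_url(base: &str, query: &str, limit: Option<u32>, offset: u32) -> String {
    let limit = limit.unwrap_or(50).min(500);
    format!(
        "{}/api/search?q={}&limit={}&offset={}",
        base.trim_end_matches('/'),
        percent_encode(query),
        limit,
        offset
    )
}
```

For example, `search_url("http://localhost:3000", "ubuntu 24.04 iso", None, 0)` yields `http://localhost:3000/api/search?q=ubuntu%2024.04%20iso&limit=50&offset=0`.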
Response:
{
  "results": [
    {
      "info_hash": "abc123...",
      "title": "Example Torrent",
      "magnet": "magnet:?xt=urn:btih:...",
      "seeders": 42
    }
  ],
  "total": 1234,
  "limit": 50,
  "offset": 0
}

GET /api/torrent/<info_hash>
Response:
{
  "info_hash": "abc123...",
  "title": "Example Torrent",
  "magnet": "magnet:?xt=urn:btih:...",
  "seeders": 42,
  "first_seen": 1704931200000,
  "last_seen": 1704931200000
}

All data is stored in the SERMA_DATA_DIR (default: ./data):
data/
├── sled/ # Embedded key-value database (torrent metadata)
└── tantivy/ # Full-text search index
Backup: Simply copy the entire data/ directory to back up your index.
- Discovery: The DHT spider joins the BitTorrent DHT network by connecting to bootstrap nodes
- Harvesting: Listens for announce_peer and get_peers queries to discover info hashes
- Enrichment: For each discovered hash:
  - Performs DHT peer lookup
  - Connects to peers and requests metadata via BEP-9 (ut_metadata)
  - Extracts torrent name and file information
- Indexing: Stores metadata in Sled and indexes it in Tantivy for fast search
- Cleanup: Periodically removes torrents with low seeders or inactivity
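
To make the harvesting step more concrete, here is a minimal sketch of pulling an info hash out of a raw DHT query and turning it into a magnet link. It is not Serma's implementation: `extract_info_hash`, `to_hex`, and `magnet_uri` are hypothetical names, and scanning for the bencoded key marker is a shortcut. A robust crawler would decode the whole bencoded message, because the marker bytes could also appear inside another string.

```rust
/// Bencoded key marker that precedes the 20-byte info hash in BEP-5
/// `get_peers` and `announce_peer` queries.
const MARKER: &[u8] = b"9:info_hash20:";

/// Extract the info hash from a raw KRPC packet by scanning for the marker.
/// Returns None if the marker is missing or the packet is truncated.
fn extract_info_hash(packet: &[u8]) -> Option<[u8; 20]> {
    let start = packet
        .windows(MARKER.len())
        .position(|window| window == MARKER)?
        + MARKER.len();
    let bytes = packet.get(start..start + 20)?;
    let mut hash = [0u8; 20];
    hash.copy_from_slice(bytes);
    Some(hash)
}

/// Lowercase hex encoding, as commonly used in `urn:btih:` magnet links.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Build a minimal magnet URI carrying only the info hash.
fn magnet_uri(info_hash: &[u8; 20]) -> String {
    format!("magnet:?xt=urn:btih:{}", to_hex(info_hash))
}
```

The extracted hash would then be handed to the enrichment step, which needs a peer lookup and a BEP-9 exchange that are beyond the scope of this sketch.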
- Initial seeding: It may take 1-2 hours to discover your first 1,000 torrents
- Index growth: Expect ~10-50k new torrents per day depending on DHT traffic
- Memory usage: ~100-300 MB RAM typical, ~500 MB during heavy indexing
- Disk I/O: Mostly sequential writes, SSD recommended but not required
- This software interacts with the public BitTorrent DHT network
- You are discovering content that others are sharing; you are not hosting or distributing it
- Be aware of the legal implications in your jurisdiction
- Consider using a VPN if privacy is a concern
- Alternatively, set SERMA_SOCKS5_PROXY to route DHT UDP traffic via a SOCKS5 proxy
- Do not expose the web interface to the public internet without authentication
See LICENSE for the full disclaimer.
src/
├── main.rs # Application entry point
├── spider.rs # DHT spider implementation
├── enrich.rs # Metadata fetcher
├── index.rs # Tantivy search index wrapper
├── storage.rs # Sled database operations
├── cleanup.rs # Cleanup task
└── web.rs # Axum web server and UI
cargo run
cargo test

- Check logs: Ensure the spider is running (RUST_LOG=debug)
- Wait: Initial discovery can take 30-60 minutes
- Network: Ensure UDP traffic isn't blocked by firewall
- DHT: Try specifying a fixed port with SERMA_SPIDER_BIND
- The in-memory bloom filter uses ~16 MB for deduplication
- Tantivy's index writer may use up to 500 MB during heavy writes
- Consider reducing SERMA_SPIDER traffic or increasing system resources
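
For intuition about the deduplication filter mentioned above, here is a minimal Bloom filter sketch. It assumes a 16 MiB bit array, four probes per item, and double hashing built on the standard library's `DefaultHasher`; none of these details are confirmed by Serma's source, and the `BloomFilter` type is hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A fixed-size Bloom filter for "have we seen this info hash?" checks.
struct BloomFilter {
    bits: Vec<u8>,
    num_bits: u64,
    num_hashes: u32,
}

impl BloomFilter {
    fn new(size_bytes: usize, num_hashes: u32) -> Self {
        assert!(size_bytes > 0, "filter must hold at least one byte");
        BloomFilter {
            bits: vec![0u8; size_bytes],
            num_bits: size_bytes as u64 * 8,
            num_hashes,
        }
    }

    /// Derive two base hashes; the i-th probe is h1 + i * h2 (double hashing).
    fn base_hashes(item: &[u8]) -> (u64, u64) {
        let mut first = DefaultHasher::new();
        item.hash(&mut first);
        let h1 = first.finish();
        let mut second = DefaultHasher::new();
        (item, 0x9e37_79b9u32).hash(&mut second);
        // An odd stride keeps probes distinct when num_bits is a power of two.
        let h2 = second.finish() | 1;
        (h1, h2)
    }

    fn bit_index(&self, h1: u64, h2: u64, i: u32) -> u64 {
        h1.wrapping_add((i as u64).wrapping_mul(h2)) % self.num_bits
    }

    /// True if the item may have been inserted before. False positives
    /// are possible; false negatives are not.
    fn contains(&self, item: &[u8]) -> bool {
        let (h1, h2) = Self::base_hashes(item);
        (0..self.num_hashes).all(|i| {
            let bit = self.bit_index(h1, h2, i);
            (self.bits[(bit / 8) as usize] & (1u8 << (bit % 8))) != 0
        })
    }

    /// Sets the item's bits; returns true if the item looked new.
    fn insert(&mut self, item: &[u8]) -> bool {
        let was_present = self.contains(item);
        let (h1, h2) = Self::base_hashes(item);
        for i in 0..self.num_hashes {
            let bit = self.bit_index(h1, h2, i);
            self.bits[(bit / 8) as usize] |= 1u8 << (bit % 8);
        }
        !was_present
    }
}
```

`BloomFilter::new(16 * 1024 * 1024, 4)` allocates the 16 MiB array up front, so memory use stays flat no matter how many hashes pass through, at the cost of occasional false positives that cause a genuinely new hash to be skipped.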
- Serma includes automatic cleanup (default: every 10s; see .env.example)
- Adjust cleanup thresholds in .env if needed
- Manually delete data/ and restart to reset the index
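
The kind of rule a cleanup threshold expresses can be sketched as a simple predicate. Everything named here is hypothetical: `CleanupPolicy`, `TorrentRecord`, and the threshold fields are illustrative and are not the variables listed in .env.example. The sketch also assumes `last_seen` is a Unix timestamp in milliseconds, which is what the API example values suggest.

```rust
/// Hypothetical cleanup policy; field names and values are illustrative.
struct CleanupPolicy {
    min_seeders: u32,
    max_idle_ms: u64,
}

/// The subset of a torrent record this check needs.
struct TorrentRecord {
    seeders: u32,
    last_seen_ms: u64,
}

/// A record counts as stale when it has too few seeders or has not
/// been seen within the allowed idle window.
fn is_stale(record: &TorrentRecord, policy: &CleanupPolicy, now_ms: u64) -> bool {
    // saturating_sub guards against clock skew making last_seen > now.
    let idle_ms = now_ms.saturating_sub(record.last_seen_ms);
    record.seeders < policy.min_seeders || idle_ms > policy.max_idle_ms
}
```

A periodic task would apply a check like this to each stored record and remove the stale ones from both the database and the search index.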
This is a personal project, but issues and pull requests are welcome for:
- Bug fixes
- Performance improvements
- Documentation improvements
Please note that this project is provided as-is with no warranty.
See LICENSE file for details.
This software is provided for educational and research purposes only. The authors and contributors:
- Do not endorse, encourage, or facilitate copyright infringement
- Are not responsible for how you use this software
- Are not liable for any legal consequences resulting from its use
- Make no warranties about the software's fitness for any purpose
Use at your own risk and in accordance with your local laws.
Made with ❤️ and Rust
