👹 Samurai Telegram Bot

A simple yet effective auto-moderator bot for Telegram.
Comes with reports, logs, a profanity filter, anti-spam AI, NSFW detection, a reputation system and more :3

What does Samurai do?

Anti-Profanity: Automatically detects and removes messages containing profanity (Russian/English)
Anti-Spam: ML-based spam detection for new users
NSFW Detection: Profile photo analysis for NSFW accounts
Reputation System: Users gain reputation through positive participation
Report System: Users can report messages to admins
Scheduled Announcements: Periodic automated messages

Code Hierarchy

samurai/
├── bot.py                 # Main entry point
├── config/
│   ├── __init__.py
│   ├── settings.py        # Pydantic configuration
│   └── config.toml        # Configuration file
├── core/
│   ├── __init__.py
│   └── i18n.py            # Fluent internationalization
├── db/
│   ├── __init__.py
│   ├── database.py        # Database setup
│   └── models/
│       ├── member.py      # Member model
│       └── spam.py        # Spam record model
├── filters/
│   ├── is_owner.py
│   ├── is_admin.py
│   ├── throttle.py
│   └── .. other useful filters
├── handlers/
│   ├── admin_actions.py   # Ban/unban commands
│   ├── callbacks.py       # Inline button handlers
│   ├── exceptions.py      # Error handler
│   ├── group_events.py    # Main message processing
│   ├── personal_actions.py # Ping, profanity check
│   └── user_actions.py    # Report command
├── locales/
│   ├── en/
│   │   ├── strings.ftl    # English translations
│   │   └── announcements.ftl
│   └── ru/
│       ├── strings.ftl    # Russian translations
│       └── announcements.ftl
├── middlewares/
│   ├── __init__.py
│   ├── throttling.py      # Middleware for rate limiting
│   └── i18n.py            # I18n middleware
├── services/
│   ├── announcements.py   # Scheduled announcements
│   ├── cache.py           # LRU caching
│   ├── gender.py          # Gender detection
│   ├── nsfw.py            # NSFW detection
│   ├── profanity.py       # Profanity detection
│   ├── healthcheck.py     # Healthcheck server for container orchestration
│   ├── ml_manager.py      # Unloads unused ML models from memory after some time
│   └── spam.py            # Spam detection
├── utils/
│   ├── helpers.py         # Utility functions
│   ├── enums.py           # Some useful enums to keep the codebase consistent
│   └── localization.py    # Localization exports
├── libs/                  # External libraries (censure, gender_extractor)
├── ruspam_model/          # ML model for spam detection
├── requirements.txt
├── Dockerfile
├── config.py              # Configuration of the bot
├── db_init.py             # Use this to initialize your database tables
└── .env.example
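
The tree lists services/ml_manager.py as unloading unused ML models from memory after some time. One way such idle unloading could work is sketched below. `IdleModelManager`, its methods, and the injectable clock are hypothetical names chosen for illustration and are not taken from the bot's code.

```python
import time
from typing import Any, Callable, Optional


class IdleModelManager:
    """Hypothetical sketch: load a model lazily, drop it after a period without use."""

    def __init__(
        self,
        loader: Callable[[], Any],
        idle_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader              # function that builds the model
        self._idle_seconds = idle_seconds  # allowed idle time before unloading
        self._clock = clock                # injectable so the logic can be tested
        self._model: Optional[Any] = None
        self._last_used = 0.0

    @property
    def loaded(self) -> bool:
        return self._model is not None

    def get(self) -> Any:
        """Return the model, loading it on first use, and record the access time."""
        if self._model is None:
            self._model = self._loader()
        self._last_used = self._clock()
        return self._model

    def unload_if_idle(self) -> bool:
        """Drop the model reference if it has been idle long enough."""
        if self._model is not None and self._clock() - self._last_used >= self._idle_seconds:
            self._model = None
            return True
        return False


# Usage with a fake clock so no real waiting is needed
fake_now = [0.0]
manager = IdleModelManager(lambda: "spam-model", idle_seconds=300, clock=lambda: fake_now[0])
manager.get()
fake_now[0] = 301.0
print(manager.unload_if_idle(), manager.loaded)
```

In a real bot, `unload_if_idle()` would be called periodically (for example from a background task), and dropping the reference only frees memory once nothing else holds the model.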

Internationalization (i18n)

The bot uses Project Fluent for translations.

Usage in handlers

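# Method 1: Import _ function directly
from typing import Callable          # used by Method 2 below

from aiogram.types import Message    # assumption: handlers receive aiogram's Message type
from core.i18n import _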

async def handler(message: Message) -> None:
    text = _("error-no-reply")
    await message.reply(text)

# Method 2: Use i18n from middleware (user's locale)
async def handler(message: Message, i18n: Callable) -> None:
    text = i18n("error-no-reply")
    await message.reply(text)

# With variables
text = _("report-message", date="2024-01-01", chat_id="123", msg_id="456")
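
Method 2 relies on the middleware handing each handler a callable that is already bound to the user's locale. A minimal sketch of that idea follows, using plain dictionaries instead of Fluent. `CATALOGS`, `make_translator`, the English string, and the fallback to English are assumptions made for illustration, not the bot's actual code.

```python
from typing import Callable

# Hypothetical in-memory catalogs; the real bot loads .ftl files through Fluent.
CATALOGS: dict[str, dict[str, str]] = {
    "en": {"error-no-reply": "This command must be a reply to a message!"},
    "ru": {"error-no-reply": "Эта команда должна быть ответом на сообщение!"},
}
DEFAULT_LOCALE = "en"  # assumed fallback locale


def make_translator(locale: str) -> Callable[[str], str]:
    """Return a callable bound to one locale, similar to the injected i18n argument."""
    catalog = CATALOGS.get(locale, CATALOGS[DEFAULT_LOCALE])

    def translate(key: str) -> str:
        # Try the user's locale, then the default locale, then return the key itself.
        return catalog.get(key, CATALOGS[DEFAULT_LOCALE].get(key, key))

    return translate


i18n = make_translator("ru")
print(i18n("error-no-reply"))
```

Binding the locale once per update keeps handlers free of any language-selection logic.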

Adding new translations

Create/edit .ftl files in locales/{lang}/
Use hyphenated keys: error-no-reply
Variables use { $var } syntax

Example locales/ru/strings.ftl:

error-no-reply = Эта команда должна быть ответом на сообщение!
report-message = 👆 Отправлено { $date }
    <a href="https://t.me/c/{ $chat_id }/{ $msg_id }">Перейти</a>
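
To show what happens to the { $var } placeables when a message is formatted, here is a toy substitution written with the standard library only. It is not the Fluent runtime and ignores the rest of Fluent's syntax (selectors, terms, functions). The `render` helper is a hypothetical name.

```python
import re

# Matches placeables such as "{ $chat_id }" or "{$chat_id}"
_PLACEABLE = re.compile(r"\{\s*\$(\w+)\s*\}")


def render(pattern: str, **variables: object) -> str:
    """Toy stand-in for Fluent formatting: fill in { $var } placeables."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        # Unknown variables are left visible instead of raising an error.
        return str(variables.get(name, match.group(0)))

    return _PLACEABLE.sub(replace, pattern)


link = render("https://t.me/c/{ $chat_id }/{ $msg_id }", chat_id="123", msg_id="456")
print(link)
```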

Installation

Prerequisites

Python 3.11+ is required
Bot token from @BotFather

Setup process

Clone the repository
Install dependencies:
```
pip install -r requirements.txt
```
Copy .env.example to .env and fill in your values:
```
cp .env.example .env
```
Configure config.toml with your group IDs and any other settings you want to change
Run the bot:
```
python bot.py
```
Enjoy!

Environment Variables in Production

For production deployments, you can also set environment variables directly instead of using a .env file:

# Export variables directly
export BOT_TOKEN="your_bot_token"
export BOT_OWNER="your_user_id"
export GROUPS_MAIN="-1001234567890"
export DB_URL="sqlite:///./samurai.db"

# Or pass them inline
BOT_TOKEN="..." BOT_OWNER="..." python bot.py

For systemd services, add them to the unit file:

[Service]
Environment="BOT_TOKEN=your_token"
Environment="BOT_OWNER=123456789"

For Docker, use -e flags or --env-file:

docker run -e BOT_TOKEN="..." -e BOT_OWNER="..." samurai-bot
# or
docker run --env-file .env samurai-bot

Database Initialization

The db_init.py script can be used to create or recreate database tables.

⚠️ WARNING: This script will DROP ALL DATA in the tables!
Make sure to back up your data first if you run it on an existing database.

# 1. Open db_init.py and comment out or delete this line:
#    exit("COMMENT THIS LINE IN ORDER TO RE-INIT DATABASE TABLES")

# 2. Run the script
python db_init.py

# 3. Uncomment the exit() line again to prevent accidental runs, or delete this file after use

Use this script ONLY when:

Setting up the bot for the first time
Migrating to a new database
Resetting all data (development only)
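
To make the warning concrete, the sketch below runs a drop-and-recreate cycle on an in-memory SQLite database. The `members` table and its columns are made up for illustration. The real models and the logic inside db_init.py may differ.

```python
import sqlite3


def reinit_tables(conn: sqlite3.Connection) -> None:
    """Drop and recreate a hypothetical table; every existing row is lost."""
    conn.execute("DROP TABLE IF EXISTS members")
    conn.execute(
        "CREATE TABLE members ("
        "user_id INTEGER PRIMARY KEY, "
        "reputation INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()


def count_members(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM members").fetchone()[0]


conn = sqlite3.connect(":memory:")
reinit_tables(conn)  # first-time setup: creates the empty table
conn.execute("INSERT INTO members (user_id, reputation) VALUES (1, 42)")
rows_before = count_members(conn)
reinit_tables(conn)  # running it again wipes the stored row
rows_after = count_members(conn)
print(rows_before, rows_after)
```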

Docker

docker build -t samurai-bot .
docker run -d --name samurai-bot -v "$(pwd)/config.toml:/app/config.toml" samurai-bot

RAM usage

Currently the bot uses ~800 MB of RAM for ML models and data caching.
~~Probably we could reduce ML models RAM usage by implementing ONNX runtime models, but that's plans for future updates.~~
That didn't work; the only viable solution would be to quantize the models :3

For now, if your server can't cope and the process gets killed by the out-of-memory killer (check with dmesg | grep -i "killed process"), a simple solution is to add swap:

# Create a 2 GB swap file (these commands need root)
fallocate -l 2G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile

# Make permanent
echo '/swapfile none swap sw 0 0' >> /etc/fstab

Configuration

Environment Variables

| Variable | Description |
| --- | --- |
| `BOT_TOKEN` | Telegram bot token |
| `BOT_OWNER` | Owner's Telegram user ID |
| `GROUPS_MAIN` | Main group chat ID (can be a comma separated list) |
| `GROUPS_REPORTS` | Reports group chat ID |
| `GROUPS_LOGS` | Logs group chat ID |
| `LINKED_CHANNEL` | Linked channel ID (can be a comma separated list) |
| `DB_URL` | Database URL |
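
Since some variables accept a comma-separated list of IDs, the following standard-library sketch shows one way to read and parse them. According to the code tree the bot itself uses Pydantic in config/settings.py, so `parse_id_list` and `load_settings` are hypothetical helpers, and tolerating spaces around commas is an assumption.

```python
import os
from typing import Mapping


def parse_id_list(raw: str) -> list[int]:
    """Turn "-100123, -100456" into [-100123, -100456]; empty parts are skipped."""
    return [int(part) for part in raw.split(",") if part.strip()]


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read a few of the documented variables; a missing one raises KeyError."""
    return {
        "bot_token": env["BOT_TOKEN"],
        "bot_owner": int(env["BOT_OWNER"]),
        "groups_main": parse_id_list(env["GROUPS_MAIN"]),
        "db_url": env["DB_URL"],
    }


# Example with an explicit mapping instead of the real environment
example_env = {
    "BOT_TOKEN": "your_bot_token",
    "BOT_OWNER": "123456789",
    "GROUPS_MAIN": "-1001234567890, -1009876543210",
    "DB_URL": "sqlite:///./samurai.db",
}
settings = load_settings(example_env)
print(settings["groups_main"])
```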

Built-in Commands

User Commands

| Command | Description |
| --- | --- |
| `!rules` / `/rules` | Request the chat rules |
| `!report` / `/report` | Report a message (reply) |
| `!me` / `!info` | Show user info |
| `!бу` | Fun command (bot pretends to be scared lol) |
| `@admin` | Call admin attention |

Admin Commands

| Command | Description |
| --- | --- |
| `!ban` | Ban user (reply) |
| `!unban` | Unban user (reply) |
| `!ping` | Check bot status |
| `!prof <text>` | Check text for profanity |

Owner Commands

| Command | Description |
| --- | --- |
| `!spam` | Mark message as spam (reply) |
| `!reward <points>` | Add reputation points |
| `!punish <points>` | Remove reputation points |
| `!setlvl <level>` | Set user level |
| `!rreset` | Reset user reputation |
| `!msg <text>` | Send message from bot |
| `!chatid` | Get current chat ID |
| `!reload` | Reload announcements from localization files |
| `!log <text>` | Write test log |
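
Commands such as `!reward <points>` consist of a prefix, a name, and an optional argument. The sketch below shows one possible way to split such a message. `parse_command` and `ParsedCommand` are hypothetical names, and the bot's own filters may handle this differently. Note that per the tables only some commands accept both the `!` and `/` prefixes, which this sketch does not enforce.

```python
from typing import NamedTuple, Optional

PREFIXES = ("!", "/")  # prefixes seen in the command tables


class ParsedCommand(NamedTuple):
    name: str
    argument: str


def parse_command(text: str) -> Optional[ParsedCommand]:
    """Split "!reward 10" into name "reward" and argument "10"; None if not a command."""
    text = text.strip()
    if len(text) < 2 or not text.startswith(PREFIXES):
        return None
    name, _sep, argument = text[1:].partition(" ")
    if not name:
        return None  # a bare prefix followed by a space is not a command
    return ParsedCommand(name.lower(), argument.strip())


print(parse_command("!reward 10"))
print(parse_command("just a normal message"))
```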

External Libraries

The bot uses two external libraries in the libs/ folder:

censure: Russian/English profanity detection
gender_extractor: Gender detection from names

Credits

https://github.com/masteroncluster/py-censure - Profanity filter we used as a base
https://github.com/MasterGroosha/telegram-report-bot - Reports system we used as a base
https://huggingface.co/RUSpam/spam_deberta_v4 - Anti-Spam AI model we used as a base
https://github.com/wwydmanski/gender-extractor - Gender detection we used as a base
https://huggingface.co/prithivMLmods/siglip2-x256-explicit-content - Our current NSFW detection model

Priler/samurai
