Professional Indonesian Text-to-Speech API built by ARSA Technology using Microsoft Edge TTS.
Perfect for content creators, developers, and businesses needing high-quality Indonesian voice synthesis for videos, applications, and automation workflows.
🇮🇩 Native Indonesian Support - Natural sounding Indonesian voices (female & male)
🇺🇸 English Support - Professional US English voices
🚀 High Performance - Fast generation with concurrent request handling
📦 Batch Processing - Generate multiple audio files simultaneously
🔄 Auto Cleanup - Automatic file management and cleanup
📊 Analytics - Built-in statistics and monitoring
🐳 Docker Ready - One-command deployment
🌐 Remote Access - API accessible from anywhere
📖 Interactive Docs - Auto-generated API documentation
🏥 Health Monitoring - Built-in health checks and status monitoring
🔑 API Key Authentication - Production-ready auth via X-API-Key header (single or multi-key)
🛡️ Rate Limiting - Per-route quotas keyed by API key (fallback to IP), backed by in-memory or Redis
- Video Content Creation - Generate narration for educational videos
- E-Learning Platforms - Create course audio in Indonesian
- Marketing Automation - Automated voice-overs for social media
- Accessibility Tools - Text-to-speech for Indonesian applications
- IoT & AI Projects - Voice responses for smart devices
- Content Localization - Convert text content to Indonesian audio
git clone https://github.com/arsa-technology/edge-tts-api.git
cd edge-tts-api
docker-compose up -d

docker run -d \
  --name arsa-edge-tts \
  -p 8021:8021 \
  -v $(pwd)/output:/app/output \
  arsa/edge-tts-api:latest

git clone https://github.com/arsa-technology/edge-tts-api.git
cd edge-tts-api
pip install -r requirements.txt
python main.py

mkdir -p /www/wwwroot/nama_project_tts/audio
chmod -R 755 /www/wwwroot/nama_project_tts/audio
chown -R www:www /www/wwwroot/nama_project_tts/audio
location /audio/ {
    alias /www/wwwroot/nama_project_tts/audio/;
    add_header Access-Control-Allow-Origin *;
    add_header Cache-Control "public, max-age=86400";
    try_files $uri $uri/ =404;
}

| Voice ID | Language | Gender | Description |
|---|---|---|---|
| `female` | Indonesian | Female | Professional, clear pronunciation |
| `male` | Indonesian | Male | Authoritative, business tone |
| `female_us` | English | Female | Natural US English |
| `male_us` | English | Male | Professional US English |
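
As a minimal sketch of how a client might use the voice table above, the helper below maps each voice ID to its language and builds a request body. `VOICE_LANGUAGES` and `build_tts_payload` are hypothetical names, and the idea that `language` should always match the voice's language is an assumption for illustration, not something the API is confirmed to enforce.

```python
# Voice IDs and languages copied from the table above.
VOICE_LANGUAGES = {
    "female": "indonesian",
    "male": "indonesian",
    "female_us": "english",
    "male_us": "english",
}


def build_tts_payload(text: str, voice: str = "female") -> dict:
    """Return a JSON-serializable body for POST /tts (hypothetical helper)."""
    if voice not in VOICE_LANGUAGES:
        raise ValueError(f"unknown voice: {voice!r}")
    if not text.strip():
        raise ValueError("text must not be empty")
    # Assumption: the language field follows from the chosen voice.
    return {"text": text, "voice": voice, "language": VOICE_LANGUAGES[voice]}
```

The returned dictionary can be passed as `json=` to `requests.post`, as in the Python examples in this README.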
Note: When `API_KEY` is configured, add `-H "X-API-Key: <your-key>"` to every protected request. Most examples below omit it for brevity; see Authentication for details.
curl -X POST http://localhost:8021/tts \
-H "Content-Type: application/json" \
-H "X-API-Key: $API_KEY" \
-d '{
"text": "Selamat datang di ARSA Technology, perusahaan AI terdepan di Indonesia",
"voice": "female",
"language": "indonesian"
}'

curl -X POST http://localhost:8021/tts \
-H "Content-Type: application/json" \
-d '{
"text": "ARSA Technology menghadirkan solusi AI dengan akurasi 99,67 persen",
"voice": "female",
"rate": "+15%",
"pitch": "+30Hz",
"volume": "+10%",
"language": "indonesian",
"output_format": "wav"
}'

curl -X POST http://localhost:8021/tts/batch \
-H "Content-Type: application/json" \
-d '[
{
"text": "Selamat pagi, Indonesia!",
"voice": "female",
"language": "indonesian"
},
{
"text": "Good morning, world!",
"voice": "female_us",
"language": "english"
}
]'

import requests

# Generate Indonesian speech
response = requests.post('http://localhost:8021/tts', json={
    "text": "Teknologi AI untuk masa depan Indonesia",
    "voice": "female",
    "rate": "+10%",
    "language": "indonesian"
})
result = response.json()

if result["success"]:
    # Download the audio file
    audio_response = requests.get(f"http://localhost:8021{result['audio_url']}")
    with open("output.wav", "wb") as f:
        f.write(audio_response.content)

const axios = require('axios');
const fs = require('fs');

async function generateIndonesianTTS() {
  try {
    // Generate speech
    const response = await axios.post('http://localhost:8021/tts', {
      text: 'ARSA Technology menghadirkan inovasi AI terdepan',
      voice: 'female',
      rate: '+10%',
      language: 'indonesian'
    });

    if (response.data.success) {
      // Download audio
      const audioResponse = await axios.get(
        `http://localhost:8021${response.data.audio_url}`,
        { responseType: 'arraybuffer' }
      );
      fs.writeFileSync('output.wav', audioResponse.data);
      console.log('Audio generated successfully!');
    }
  } catch (error) {
    console.error('Error:', error);
  }
}

generateIndonesianTTS();

Authentication is opt-in via environment variable:
- Set `API_KEY` (single) or `API_KEYS` (comma-separated) → auth enabled.
- Leave both empty → auth disabled (development mode; a warning is logged on startup).
Public endpoints (no key required): GET /, GET /health, GET /voices, /docs, /redoc.
Protected endpoints (key required): POST /tts, POST /tts/batch, GET /audio/{id}, GET /stats.
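
The opt-in rule above can be sketched as a small key loader. This is an illustration, not the service's actual code: `load_api_keys` and `auth_enabled` are hypothetical names, the precedence of `API_KEYS` over `API_KEY` is taken from the configuration table in this README, and trimming whitespace around keys is an assumption.

```python
import os
from typing import FrozenSet, Mapping, Optional


def load_api_keys(env: Optional[Mapping[str, str]] = None) -> FrozenSet[str]:
    """Collect configured keys; an empty result means auth is disabled."""
    env = os.environ if env is None else env
    # API_KEYS (comma-separated) takes precedence over API_KEY.
    multi = [k.strip() for k in env.get("API_KEYS", "").split(",") if k.strip()]
    if multi:
        return frozenset(multi)
    single = env.get("API_KEY", "").strip()
    return frozenset([single]) if single else frozenset()


def auth_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when at least one key is configured."""
    return bool(load_api_keys(env))
```

Passing a plain dictionary instead of `os.environ` makes the rule easy to check in isolation.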
Enable for production:
# Generate a strong key
python -c "import secrets; print(secrets.token_urlsafe(32))"
# .env
API_KEY=paste-the-generated-key-here
# or rotate / per-tenant
API_KEYS=key-for-mobile-app,key-for-internal-batch,key-for-partner-x

Send the key on every protected request:
curl -X POST http://localhost:8021/tts \
-H "Content-Type: application/json" \
-H "X-API-Key: paste-the-generated-key-here" \
-d '{"text": "Halo dunia", "voice": "female", "language": "indonesian"}'

import os
import requests

HEADERS = {"X-API-Key": os.environ["API_KEY"]}
r = requests.post("http://localhost:8021/tts", headers=HEADERS, json={
    "text": "Teknologi AI untuk masa depan Indonesia",
    "voice": "female",
    "language": "indonesian",
})

Verify auth status from the public health check:
curl http://localhost:8021/health
# { "status": "healthy", ..., "auth_enabled": true }

Responses on failure: 401 Unauthorized (header missing) · 403 Forbidden (key invalid).
Keys are compared in constant time (secrets.compare_digest) to avoid timing leaks.
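
A minimal sketch of the check described above, assuming a plain function rather than the service's real FastAPI dependency. `check_api_key` is a hypothetical name; only `secrets.compare_digest` and the 401/403 split come from this README.

```python
import secrets
from typing import Iterable, Optional


def check_api_key(provided: Optional[str], valid_keys: Iterable[str]) -> int:
    """Return the HTTP status a request would receive: 200, 401 or 403."""
    if not provided:
        return 401  # header missing
    candidate = provided.encode("utf-8")
    matched = False
    for key in valid_keys:
        # Check every key without an early exit, using a constant-time
        # comparison for each one.
        if secrets.compare_digest(candidate, key.encode("utf-8")):
            matched = True
    return 200 if matched else 403  # 403: key present but invalid
```

Comparing encoded bytes avoids the restriction that `compare_digest` only accepts ASCII when given `str` arguments.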
All routes are rate-limited via slowapi. Limits are keyed by API key when present, otherwise client IP — so each authenticated caller gets its own quota and one noisy IP cannot starve others.
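
The keying rule can be illustrated with a small pure function. This is a sketch under assumptions, not slowapi's API: `rate_limit_key` and the `key:` / `ip:` prefixes are hypothetical and only show how an API key, when present, takes priority over the client IP.

```python
from typing import Mapping


def rate_limit_key(headers: Mapping[str, str], client_ip: str,
                   header_name: str = "X-API-Key") -> str:
    """Pick the bucket a request is counted against."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    api_key = lowered.get(header_name.lower(), "").strip()
    # Authenticated callers get their own quota; everyone else shares by IP.
    return f"key:{api_key}" if api_key else f"ip:{client_ip}"
```

Two callers behind the same IP with different keys therefore land in different buckets.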
Default per-route limits (override via env, syntax <count>/<period>, period = second|minute|hour|day):
| Route | Env Var | Default |
|---|---|---|
| Global default (all routes) | `RATE_LIMIT_DEFAULT` | `60/minute` |
| `POST /tts` | `RATE_LIMIT_TTS` | `30/minute` |
| `POST /tts/batch` | `RATE_LIMIT_TTS_BATCH` | `5/minute` |
| `GET /audio/{id}` | `RATE_LIMIT_AUDIO` | `120/minute` |
| `GET /stats` | `RATE_LIMIT_STATS` | `30/minute` |
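
To make the `<count>/<period>` syntax concrete, here is a small validator that could be used to check an override before deploying it. `parse_limit` is a hypothetical helper covering only the four periods listed in this README; the limiter library itself may accept other forms.

```python
import re
from typing import Tuple

PERIOD_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}
_LIMIT_RE = re.compile(r"^\s*(\d+)\s*/\s*(second|minute|hour|day)\s*$")


def parse_limit(value: str) -> Tuple[int, int]:
    """Parse '<count>/<period>' into (count, window length in seconds)."""
    match = _LIMIT_RE.match(value)
    if match is None:
        raise ValueError(f"invalid rate limit: {value!r}")
    count = int(match.group(1))
    if count <= 0:
        raise ValueError("count must be positive")
    return count, PERIOD_SECONDS[match.group(2)]
```

For example, `parse_limit("30/minute")` yields `(30, 60)`, meaning 30 requests per 60-second window.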
Every response includes:
X-RateLimit-Limit: 30
X-RateLimit-Remaining: 27
X-RateLimit-Reset: 1716100000
When a limit is exceeded, the API returns 429 Too Many Requests with a Retry-After header.
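
On the client side, a 429 response can be handled by waiting for the advertised interval before retrying. The sketch below assumes `Retry-After` carries a whole number of seconds and falls back to exponential backoff otherwise; `retry_delay`, `base`, and `cap` are hypothetical names and values.

```python
from typing import Mapping


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retrying after a 429 response."""
    # Note: a plain dict lookup is case-sensitive; the headers object
    # returned by `requests` is case-insensitive.
    raw = headers.get("Retry-After", "").strip()
    if raw.isdigit():
        delay = float(raw)
    else:
        # Header missing or not a number of seconds: back off exponentially.
        delay = base * (2 ** attempt)
    return min(delay, cap)
```

A caller would sleep for `retry_delay(response.headers, attempt)` seconds whenever `response.status_code == 429`, then retry with `attempt` incremented.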
Multi-replica / multi-worker deployments: the default in-memory store is per-process, so quotas won't be shared. Point all instances at Redis:
RATE_LIMIT_STORAGE_URI=redis://redis:6379

…and add a `redis` service in your compose file.
| Variable | Default | Description |
|---|---|---|
| `TTS_MAX_TEXT_LENGTH` | `5000` | Maximum characters per request |
| `TTS_CLEANUP_INTERVAL` | `3600` | File cleanup interval (seconds) |
| `PYTHONUNBUFFERED` | `1` | Python output buffering |
| `OUTPUT_DIR` | `./app/output` | Where generated audio is stored inside the container |
| `HOST_OUTPUT_DIR` | (unset) | Host path bind-mounted to `OUTPUT_DIR` (used by `docker-compose.yml`) |
| `API_KEY` | (empty) | Single API key. Empty = auth disabled (dev mode). |
| `API_KEYS` | (empty) | Comma-separated keys (multi-tenant / rotation). Takes precedence over `API_KEY`. |
| `API_KEY_HEADER` | `X-API-Key` | HTTP header clients must use to send the key |
| `RATE_LIMIT_DEFAULT` | `60/minute` | Default limit for all routes |
| `RATE_LIMIT_TTS` | `30/minute` | Limit for `POST /tts` |
| `RATE_LIMIT_TTS_BATCH` | `5/minute` | Limit for `POST /tts/batch` |
| `RATE_LIMIT_AUDIO` | `120/minute` | Limit for `GET /audio/{id}` |
| `RATE_LIMIT_STATS` | `30/minute` | Limit for `GET /stats` |
| `RATE_LIMIT_STORAGE_URI` | `memory://` | Storage for limiter; use `redis://host:6379` for multi-replica setups |
See `.env.example` for a ready-to-copy template.
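
As an illustration of how these variables and defaults fit together, the sketch below reads a few of them into a typed object. The `Settings` class, its field names, and `load_settings` are hypothetical; treating an empty value as unset is an assumption that mirrors the `${VAR:-default}` pattern used in the compose file.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    max_text_length: int
    cleanup_interval: int
    api_key_header: str
    rate_limit_tts: str
    rate_limit_storage_uri: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read configuration, falling back to the documented defaults."""
    env = os.environ if env is None else env

    def get(name: str, default: str) -> str:
        # Empty strings fall back to the default, like ${VAR:-default}.
        return env.get(name) or default

    return Settings(
        max_text_length=int(get("TTS_MAX_TEXT_LENGTH", "5000")),
        cleanup_interval=int(get("TTS_CLEANUP_INTERVAL", "3600")),
        api_key_header=get("API_KEY_HEADER", "X-API-Key"),
        rate_limit_tts=get("RATE_LIMIT_TTS", "30/minute"),
        rate_limit_storage_uri=get("RATE_LIMIT_STORAGE_URI", "memory://"),
    )
```

Calling `load_settings({})` returns the defaults from the table, which is a quick way to confirm what an unconfigured deployment would use.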
version: '3.8'
services:
  edge-tts:
    build: .
    ports:
      - "8021:8021"
    environment:
      - TTS_MAX_TEXT_LENGTH=5000
      - TTS_CLEANUP_INTERVAL=3600
      # Authentication (leave API_KEY empty to disable for dev)
      - API_KEY=${API_KEY:-}
      - API_KEYS=${API_KEYS:-}
      - API_KEY_HEADER=${API_KEY_HEADER:-X-API-Key}
      # Rate limiting
      - RATE_LIMIT_DEFAULT=${RATE_LIMIT_DEFAULT:-60/minute}
      - RATE_LIMIT_TTS=${RATE_LIMIT_TTS:-30/minute}
      - RATE_LIMIT_TTS_BATCH=${RATE_LIMIT_TTS_BATCH:-5/minute}
      - RATE_LIMIT_AUDIO=${RATE_LIMIT_AUDIO:-120/minute}
      - RATE_LIMIT_STATS=${RATE_LIMIT_STATS:-30/minute}
      - RATE_LIMIT_STORAGE_URI=${RATE_LIMIT_STORAGE_URI:-memory://}
    volumes:
      - ./output:/app/output
    restart: unless-stopped

| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Service information |
| `/health` | GET | Health check |
| `/voices` | GET | List available voices |
| `/tts` | POST | Generate single audio |
| `/tts/batch` | POST | Generate multiple audios |
| `/audio/{audio_id}` | GET | Download audio file |
| `/stats` | GET | Service statistics |
| `/docs` | GET | Interactive API documentation |
Run the comprehensive test suite:
# If auth is enabled on the server, export the key first:
export API_KEY=your-key-here # Linux/Mac
$env:API_KEY = "your-key-here" # Windows PowerShell
# Test locally
python test_client.py
# Test remote server
python test_client.py YOUR_SERVER_IP
# Expected output:
# ✅ Health Check: healthy
# ✅ Voice Listing: 4 voices available
# ✅ Indonesian TTS: Generated successfully
# ✅ English TTS: Generated successfully
# ✅ Batch TTS: 3/3 successful
# ✅ Service Stats: All metrics available

# Ubuntu/Debian
sudo ufw allow 8021/tcp
# CentOS/RHEL
sudo firewall-cmd --permanent --add-port=8021/tcp
sudo firewall-cmd --reload

AWS Security Group:
Type: Custom TCP
Port: 8021
Source: 0.0.0.0/0 (or specific IPs)
Google Cloud Firewall:
gcloud compute firewall-rules create edge-tts-api \
  --allow tcp:8021 \
  --source-ranges 0.0.0.0/0

server {
    listen 80;
    server_name your-domain.com;

    location /api/tts/ {
        proxy_pass http://localhost:8021/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

- Generation Speed: ~2-5 seconds per 100 words
- Concurrent Requests: Handles multiple simultaneous requests
- Memory Usage: ~100-200MB per container
- File Size: ~1MB per minute of audio (WAV format)
- Text Length: Max 5,000 characters per request (`TTS_MAX_TEXT_LENGTH`)
- Batch Size: Max 10 requests per batch
- File Retention: Auto-cleanup after 1 hour (`TTS_CLEANUP_INTERVAL`)
- Request Throughput: Per-key/IP rate limits (see Rate Limiting section above)
- Output: WAV (default), MP3
- Sample Rate: 22kHz (Edge TTS default)
- Channels: Mono
- Bit Depth: 16-bit
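
Given the limits listed above (5,000 characters per request, 10 requests per batch), a client with a long script may need to split it first. The helpers below are a sketch under assumptions: `split_text` and `batches` are hypothetical names, and whether the service counts characters exactly as Python's `len` does is not confirmed.

```python
from typing import Iterator, List

MAX_TEXT_LENGTH = 5000  # default TTS_MAX_TEXT_LENGTH
MAX_BATCH_SIZE = 10     # max requests per batch


def split_text(text: str, limit: int = MAX_TEXT_LENGTH) -> List[str]:
    """Split text into chunks of at most `limit` characters, preferring spaces."""
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Look for the last space that keeps the chunk within the limit.
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no space available: hard cut mid-word
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks


def batches(items: List[dict], size: int = MAX_BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive groups of at most `size` request bodies."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each chunk can be wrapped in a request body and each group sent as one call to `POST /tts/batch`, keeping the per-route rate limit in mind.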
edge-tts-service/
├── main.py # FastAPI application
├── requirements.txt # Python dependencies
├── Dockerfile # Container configuration
├── docker-compose.yml # Service orchestration
├── test_client.py # Test suite
├── .env # Environment variables
└── output/ # Generated audio files
# Clone repository
git clone https://github.com/arsa-technology/edge-tts-api.git
cd edge-tts-api
# Create virtual environment
python -m venv venv
source venv/bin/activate # Linux/Mac
# or
venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
# Run development server
uvicorn main:app --reload --host 0.0.0.0 --port 8021

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Authentication: ✅ Built-in; set `API_KEY` / `API_KEYS` (see Authentication section)
- Rate Limiting: ✅ Built-in; per-key/IP throttling (see Rate Limiting section)
- Input Validation: Pydantic schema + length cap via `TTS_MAX_TEXT_LENGTH`
- CORS: Default is `allow_origins=["*"]`; restrict to your domains for browser clients
- Network Security: Terminate TLS at a reverse proxy (Nginx/Caddy) and restrict access by IP/firewall
- Resource Limits: Set container memory/CPU limits (see Docker example below)
- Shared Rate Limit Store: If you run multiple replicas, point `RATE_LIMIT_STORAGE_URI` at Redis so quotas are global
# Example production configuration
services:
  edge-tts:
    build: .
    user: "1000:1000"  # Non-root user
    read_only: true    # Read-only filesystem
    tmpfs:
      - /tmp:rw,noexec,nosuid,size=100m
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: '1.0'

This project is licensed under the MIT License - see the LICENSE file for details.
- 📖 Documentation: API Docs
- 🐛 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
For enterprise support, custom development, or consulting services:
- 🌐 Website: arsa.technology
- 📧 Email: support@arsa.technology
- 📱 WhatsApp: Contact Us
ARSA Technology is Indonesia's leading AI and IoT solutions provider, specializing in:
- 🤖 Artificial Intelligence - Face recognition, computer vision, NLP
- 🌐 Internet of Things - Industrial monitoring, smart city solutions
- 🏭 Industry 4.0 - Manufacturing automation and optimization
- 🏥 Digital Health - Medical AI and self-service health platforms
- 🎓 Virtual Reality - Immersive training and simulation
Trusted by: Ministry of Defense of Indonesia, Indonesian National Police, and leading enterprises across Southeast Asia.
- ARSACA: Advanced vision AI analytics for human recognition and safety
- AKSAYANA: Vehicle analytics and license plate recognition
- SYNAPTA: Medical AI platform for diagnostics and health monitoring
- ANIYATA: VR solutions for industrial training and simulation
Video Content Creation

import requests

# Generate educational content in Indonesian
educational_script = """
Teknologi AI ARSA telah membantu berbagai industri di Indonesia.
Dengan akurasi 99,67 persen dalam pengenalan wajah,
sistem kami mengamankan fasilitas strategis negara.
"""
tts_response = requests.post('http://localhost:8021/tts', json={
    "text": educational_script,
    "voice": "female",
    "rate": "+10%",
    "language": "indonesian"
})

Multilingual Content

# Create bilingual content for international audience
contents = [
    {
        "text": "Selamat datang di masa depan teknologi Indonesia",
        "voice": "female",
        "language": "indonesian"
    },
    {
        "text": "Welcome to the future of Indonesian technology",
        "voice": "female_us",
        "language": "english"
    }
]
batch_response = requests.post('http://localhost:8021/tts/batch', json=contents)

- ✅ Indonesian and English TTS
- ✅ Batch processing
- ✅ Docker deployment
- ✅ Remote access
- ✅ Auto cleanup
- ✅ API key authentication (single + multi-key, constant-time compare)
- ✅ Per-route rate limiting (per-key/IP, in-memory or Redis-backed)
- 🔄 Regional Indonesian Dialects - Javanese, Sundanese voices
- 🔑 JWT/OAuth2 - Token-based auth as an alternative to API keys
- 📊 Advanced Analytics - Usage metrics and reporting per API key
- 🎛️ Voice Customization - Emotion and style controls
- 📱 Mobile SDK - iOS and Android libraries
- 🧠 AI Voice Cloning - Custom voice training
- 🎵 SSML Support - Advanced speech markup
- ☁️ Cloud Integration - AWS, GCP, Azure deployments
- 🔄 Real-time Streaming - Live TTS streaming
Made with ❤️ by ARSA Technology
⭐ Star this repository if it helped you! ⭐