Comprehensive guide for creating professional video presentations with SlideStream's AI-powered tools.
- Quick Start
- Installation
- Configuration System
- Creating Your First Video
- Working with Providers
- Advanced Workflows
- Troubleshooting
- Best Practices
# Install with all AI providers
pip install slide-stream[all-ai]

slide-stream init

This creates a slidestream.yaml file in your current directory with an example configuration.
Edit your configuration file or set environment variables:
export OPENAI_API_KEY="your-openai-key"
export ELEVENLABS_API_KEY="your-elevenlabs-key"

Create a video:
slide-stream create presentation.md output.mp4

Basic installation:
pip install slide-stream

# For OpenAI (DALL-E, GPT, TTS)
pip install slide-stream[openai]
# For ElevenLabs premium voices
pip install slide-stream[elevenlabs]
# For Google Gemini
pip install slide-stream[gemini]
# For Anthropic Claude
pip install slide-stream[claude]
# For Groq (fast inference)
pip install slide-stream[groq]
# All AI providers
pip install slide-stream[all-ai]

Requirements:
- Python 3.10+
- FFmpeg (for video processing)
macOS:
brew install ffmpeg
Ubuntu/Debian:
sudo apt update && sudo apt install ffmpeg
Windows: Download from the FFmpeg website
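The two requirements above can be verified before a first run. The following is a minimal sketch, not part of SlideStream: the helper name `check_prerequisites` is hypothetical, and it only checks the Python version and whether an `ffmpeg` executable is on the PATH.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10)):
    """Return a list of problems; an empty list means both checks passed.

    Hypothetical helper for illustration only, not a SlideStream API.
    """
    problems = []
    if sys.version_info < min_python:
        found = sys.version.split()[0]
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ is required, found {found}"
        )
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

An empty result only means the tools were found; it does not confirm that the installed FFmpeg build supports every codec you plan to use.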
SlideStream 2.0 uses YAML configuration files for maximum flexibility and maintainability.
SlideStream searches for configuration in this order:
1. ./slidestream.yaml (current directory)
2. ~/.slidestream.yaml (home directory)
3. Built-in defaults
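The search order above can be pictured as a simple first-match lookup. This is an illustrative sketch under that assumption; `find_config` is a hypothetical name, and SlideStream's actual lookup code may differ.

```python
from pathlib import Path


def find_config(cwd=None, home=None):
    """Return the first configuration file found, or None.

    Hypothetical helper mirroring the documented search order:
    ./slidestream.yaml, then ~/.slidestream.yaml. A None result
    stands for "use the built-in defaults".
    """
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    home = Path(home) if home is not None else Path.home()
    candidates = (cwd / "slidestream.yaml", home / ".slidestream.yaml")
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Because the current directory is checked first, a project-level file overrides a user-level one.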
# slidestream.yaml
providers:
  llm:
    provider: openai
    model: gpt-4o-mini
  images:
    provider: dalle3
    fallback: text
  tts:
    provider: elevenlabs
    voice: rachel
api_keys:
  openai: "${OPENAI_API_KEY}"
  elevenlabs: "${ELEVENLABS_API_KEY}"
settings:
  video:
    resolution: [1920, 1080]
    fps: 24
  cleanup: true

Use environment variables for secure API key management:
# OpenAI (for DALL-E 3, GPT, and OpenAI TTS)
export OPENAI_API_KEY="sk-..."
# ElevenLabs (for premium TTS)
export ELEVENLABS_API_KEY="..."
# Stock photo providers (optional)
export PEXELS_API_KEY="..."
export UNSPLASH_ACCESS_KEY="..."
# Other LLM providers
export GEMINI_API_KEY="..."
export ANTHROPIC_API_KEY="..."
export GROQ_API_KEY="..."

Create different configurations for different use cases:
Basic Profile (basic.yaml):
providers:
  llm:
    provider: none
  images:
    provider: text
  tts:
    provider: gtts

Professional Profile (pro.yaml):
providers:
  llm:
    provider: openai
    model: gpt-4o
  images:
    provider: dalle3
    fallback: pexels
  tts:
    provider: elevenlabs
    voice: rachel
api_keys:
  openai: "${OPENAI_API_KEY}"
  elevenlabs: "${ELEVENLABS_API_KEY}"
  pexels: "${PEXELS_API_KEY}"

Use with:
slide-stream create --config pro.yaml presentation.md video.mp4

Create a simple Markdown file:
# Welcome to SlideStream
- Create professional video presentations
- Use AI to enhance your content
- Generate videos automatically
# Key Features
- AI-powered image generation
- Premium text-to-speech voices
- Smart content enhancement
- Professional video output
# Getting Started
- Install SlideStream
- Configure your providers
- Create your first video

Generate the video:
slide-stream create presentation.md my-video.mp4

SlideStream 2.0 supports PowerPoint files with enhanced features:
slide-stream create slides.pptx presentation.mp4

PowerPoint Features:
- Extracts slide titles and content
- Uses speaker notes for enhanced narration
- Preserves slide structure
- Supports complex layouts
When using PowerPoint files, add speaker notes for better AI narration:
Slide Content: Key Benefits
• Faster development
• Better user experience
• Competitive advantage
Speaker Notes: In this slide, we'll explore the three main benefits of adopting our solution. First, you'll see dramatically faster development cycles, allowing your team to ship features in weeks rather than months. Second, your users will experience a more intuitive and responsive interface. Finally, these improvements will give you a significant competitive advantage in your market.
The AI will use these notes to create natural, flowing narration.
DALL-E 3:
providers:
  images:
    provider: dalle3
    fallback: text
api_keys:
  openai: "${OPENAI_API_KEY}"

Benefits:
- Custom images for each slide
- Relevant to your content
- Professional quality
- No licensing concerns
Requirements:
- OpenAI API key
- Pay-per-image pricing
Pexels:
providers:
  images:
    provider: pexels
    fallback: text
api_keys:
  pexels: "${PEXELS_API_KEY}"

Unsplash:
providers:
  images:
    provider: unsplash
    fallback: text
api_keys:
  unsplash: "${UNSPLASH_ACCESS_KEY}"

Text-based slides:
providers:
  images:
    provider: text

Always available as a fallback. Creates clean, professional text-based slides.
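The `fallback` setting can be read as "try the primary image provider, and use the fallback if it fails". The sketch below illustrates that idea only. The function name and the callable-based interface are assumptions made for this example and do not describe SlideStream's internals.

```python
from typing import Callable, Optional


def generate_with_fallback(
    primary: Callable[[str], bytes],
    fallback: Optional[Callable[[str], bytes]],
    prompt: str,
) -> bytes:
    """Call the primary provider; on any error, use the fallback if one is set.

    Hypothetical illustration of a provider/fallback pair.
    """
    try:
        return primary(prompt)
    except Exception:
        # Without a fallback there is nothing left to try, so re-raise.
        if fallback is None:
            raise
        return fallback(prompt)
```

Keeping `text` as the fallback means a failed API call or missing key degrades to a plain slide instead of stopping the whole render.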
ElevenLabs:
providers:
  tts:
    provider: elevenlabs
    voice: rachel # or adam, aria, etc.
api_keys:
  elevenlabs: "${ELEVENLABS_API_KEY}"

Available Voices:
- rachel: Professional female voice
- adam: Clear male voice
- aria: Expressive female voice
- josh: Warm male voice
- And 900+ more voices
OpenAI TTS:
providers:
  tts:
    provider: openai
    voice: nova # alloy, echo, fable, nova, onyx, shimmer
api_keys:
  openai: "${OPENAI_API_KEY}"

gTTS:
providers:
  tts:
    provider: gtts

Always available, no API key required.
LLMs improve your slide content by:
- Making bullet points flow naturally
- Creating engaging narratives
- Improving clarity and structure
- Generating better image search queries
OpenAI GPT:
providers:
  llm:
    provider: openai
    model: gpt-4o-mini # or gpt-4o for higher quality

Google Gemini:
providers:
  llm:
    provider: gemini
    model: gemini-1.5-flash

Anthropic Claude:
providers:
  llm:
    provider: claude
    model: claude-3-5-sonnet-20241022

Create different configurations for different scenarios:
# Quick prototype with free services
slide-stream create --config basic.yaml draft.md prototype.mp4
# High-quality final version
slide-stream create --config premium.yaml final.md presentation.mp4
# Client-specific branding
slide-stream create --config client-brand.yaml proposal.md client-video.mp4

Process multiple presentations:
# Create multiple videos
for file in *.md; do
  output="${file%.md}.mp4"
  slide-stream create "$file" "$output"
done

Fine-tune video output:
settings:
  video:
    resolution: [1920, 1080] # 4K: [3840, 2160]
    fps: 30 # Smooth playback
    codec: libx264 # Compatibility
    audio_codec: aac
    slide_duration_padding: 2.0 # More time per slide
    default_slide_duration: 8.0
  image:
    bg_color: "#1a1a1a" # Dark theme
    font_color: "#ffffff"
    title_font_size: 120
    content_font_size: 80

Automate presentation generation:
# .github/workflows/presentations.yml
name: Generate Presentations
on:
  push:
    paths: ['presentations/*.md']
jobs:
  generate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install SlideStream
        run: pip install slide-stream[all-ai]
      - name: Generate Videos
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          ELEVENLABS_API_KEY: ${{ secrets.ELEVENLABS_API_KEY }}
        run: |
          slide-stream create presentations/quarterly-review.md output/q4-review.mp4
          slide-stream create presentations/product-launch.md output/launch.mp4

slide-stream providers

Check which providers are available and their status.
- Verify environment variables:
  echo $OPENAI_API_KEY
- Check configuration file syntax
- Ensure API keys are valid and have sufficient credits
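The first check in the list above can also be done without printing the secret itself. This is a small sketch; `report_api_keys` is a hypothetical helper, and the variable names are the ones used elsewhere in this guide.

```python
import os


def report_api_keys(names, env=None):
    """Map each variable name to True if it is set to a non-empty value.

    Hypothetical helper: it reports presence only and never returns the
    key material, so the output is safe to paste into a bug report.
    """
    env = os.environ if env is None else env
    return {name: bool(env.get(name, "").strip()) for name in names}


if __name__ == "__main__":
    status = report_api_keys(["OPENAI_API_KEY", "ELEVENLABS_API_KEY"])
    for name, is_set in status.items():
        print(f"{name}: {'set' if is_set else 'missing'}")
```

A variable reported as set can still hold an invalid or expired key, so this complements rather than replaces the provider status check.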
Install FFmpeg:
# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg

- Check disk space in temp directory
- Verify input file format (.md or .pptx)
- Try with --config to use a specific configuration
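The first two checks above can be scripted. The sketch below is an illustration, not a SlideStream feature: `preflight` is a hypothetical name, the 1 GB free-space threshold is an arbitrary assumption, and the temp directory is the one Python reports, which may not be the one SlideStream writes to.

```python
import shutil
import tempfile
from pathlib import Path

SUPPORTED_SUFFIXES = {".md", ".pptx"}


def preflight(input_path, min_free_bytes=1_000_000_000):
    """Return a list of problems found before rendering; empty means none.

    Hypothetical helper; the threshold is an assumed default.
    """
    problems = []
    path = Path(input_path)
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        problems.append(f"Unsupported input format: {path.suffix or '(none)'}")
    elif not path.is_file():
        problems.append(f"Input file not found: {path}")
    tmp = tempfile.gettempdir()
    free = shutil.disk_usage(tmp).free
    if free < min_free_bytes:
        problems.append(f"Only {free} bytes free in {tmp}")
    return problems
```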
Adjust timing settings:
settings:
  video:
    slide_duration_padding: 2.0 # More padding
    default_slide_duration: 6.0 # Longer default

For detailed error information:
PYTHONPATH=. python -m slide_stream.cli create --help

Check what's working:
slide-stream providers

Output shows each provider's availability and requirements.
Markdown Structure:
# Clear, Descriptive Titles
- Use bullet points for key ideas
- Keep points concise and focused
- Aim for 3-5 points per slide
# Logical Flow
- Structure your presentation logically
- Use consistent formatting
- Include call-to-action slides

PowerPoint Tips:
- Use speaker notes for detailed explanations
- Keep slide content brief
- Use consistent layouts
- Include relevant images in slides
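The Markdown guideline of 3-5 points per slide can be checked mechanically. This sketch assumes that each line starting with `# ` begins a new slide and each line starting with `- ` is one point, which matches the examples in this guide but is not a statement about SlideStream's parser. Both function names are hypothetical.

```python
def count_bullets_per_slide(markdown_text):
    """Return (title, bullet_count) pairs, one per '# ' heading.

    Assumes one slide per top-level heading; bullets that appear before
    the first heading are ignored.
    """
    slides = []
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("# "):
            slides.append([stripped[2:].strip(), 0])
        elif stripped.startswith("- ") and slides:
            slides[-1][1] += 1
    return [(title, count) for title, count in slides]


def slides_outside_range(markdown_text, low=3, high=5):
    """Return titles of slides with fewer than `low` or more than `high` points."""
    return [
        title
        for title, count in count_bullets_per_slide(markdown_text)
        if not low <= count <= high
    ]
```

Running `slides_outside_range` on a draft before rendering is a cheap way to spot overloaded or nearly empty slides.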
Environment Variables:
# .env file (don't commit to git)
OPENAI_API_KEY=sk-...
ELEVENLABS_API_KEY=...
PEXELS_API_KEY=...

Version Control:
- Commit configuration files
- Use environment variables for secrets
- Create different configs for different environments
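Committing the configuration while keeping secrets in the environment works because the YAML holds only placeholders such as `"${OPENAI_API_KEY}"`. How SlideStream resolves these is not described here; the sketch below shows one plausible way to expand them, with `expand_placeholders` as a hypothetical name.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value, env=None):
    """Replace each ${NAME} in `value` with env[NAME].

    Raises KeyError for an unset variable, so a missing secret fails
    loudly instead of being sent to a provider as a literal string.
    Illustration only; not SlideStream's implementation.
    """
    env = os.environ if env is None else env

    def replace(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(replace, value)
```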
Fast Generation:
providers:
  llm:
    provider: groq # Fastest inference
    model: llama-3.1-8b-instant
  images:
    provider: text # No API calls
  tts:
    provider: gtts # Free and fast

High Quality:
providers:
  llm:
    provider: openai
    model: gpt-4o # Best quality
  images:
    provider: dalle3
  tts:
    provider: elevenlabs
    voice: rachel

Monitor Usage:
- OpenAI: Check usage dashboard
- ElevenLabs: Monitor character usage
- Set usage alerts
Optimize Costs:
- Use text images for drafts
- Switch to premium providers for final versions
- Batch process multiple presentations
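Batch processing and config switching combine naturally: render drafts with a cheap profile, then rerun the same batch with a premium one. The sketch below builds the command lines using only the `slide-stream create [--config FILE] INPUT OUTPUT` form shown in this guide. The function names and directory layout are assumptions.

```python
import subprocess
from pathlib import Path


def build_commands(source_dir, output_dir, config=None):
    """Return one slide-stream command (as an argument list) per .md file.

    Hypothetical helper; files are processed in sorted order.
    """
    commands = []
    for md_file in sorted(Path(source_dir).glob("*.md")):
        command = ["slide-stream", "create"]
        if config:
            command += ["--config", str(config)]
        output = Path(output_dir) / f"{md_file.stem}.mp4"
        command += [str(md_file), str(output)]
        commands.append(command)
    return commands


def run_batch(source_dir, output_dir, config=None):
    """Run each command in turn, stopping at the first failure."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for command in build_commands(source_dir, output_dir, config):
        subprocess.run(command, check=True)
```

For example, `run_batch("presentations", "drafts", config="basic.yaml")` would render every Markdown file with the basic profile, and a second call with `pro.yaml` would produce the final versions.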
For Professional Presentations:
- Use DALL-E 3 or stock photos for images
- Use ElevenLabs or OpenAI TTS for voices
- Enable LLM content enhancement
- Use higher resolution (1920x1080 or 4K)
For Internal/Draft Use:
- Text-based images are sufficient
- gTTS provides adequate quality
- Disable LLM enhancement for speed
- Never commit API keys to version control
- Use environment variables or secure vaults
- Rotate API keys regularly
- Monitor API usage for unusual activity
- Use least-privilege API permissions
This guide covers the essential workflows for SlideStream 2.0. For development setup and contributing, see DEVELOPMENT_WORKFLOW.md.