A condensed, production-ready guide to generating professional images with
gpt-image-2, distilled from the OpenAI Cookbook into a working playbook. EN + DE.
Most image-generation prompts fail in the same three ways: wrong block order, marketing adjectives instead of concrete materials, and missing exclusion lists. This repo is a working playbook — paste templates, swap your subject, ship.
Scene → Subject → Composition → Lighting → Color → Style → Atmosphere → Constraints → Production
That order, every time. Eight ready-to-fire templates included.
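One way to enforce that order mechanically is to assemble the prompt from named blocks instead of typing it free-form. The sketch below is a minimal illustration, not part of the OpenAI SDK: `BLOCK_ORDER` and `build_prompt` are hypothetical names, and the block labels are taken from the sequence above.

```python
# Block names in the order listed above. The labels and the helper name
# are illustrative assumptions, not part of any OpenAI API.
BLOCK_ORDER = [
    "Scene", "Subject", "Composition", "Lighting", "Color",
    "Style", "Atmosphere", "Constraints", "Production",
]


def build_prompt(blocks: dict[str, str]) -> str:
    """Join the supplied blocks into one prompt, always in BLOCK_ORDER."""
    unknown = set(blocks) - set(BLOCK_ORDER)
    if unknown:
        raise ValueError(f"Unknown block(s): {sorted(unknown)}")
    # Empty or missing blocks are skipped; the rest keep the fixed order.
    lines = [
        f"{name}: {blocks[name].strip()}"
        for name in BLOCK_ORDER
        if blocks.get(name, "").strip()
    ]
    return "\n".join(lines)


# Blocks can be supplied in any order; the output order stays fixed.
prompt = build_prompt({
    "Constraints": "No people, no logos.",
    "Scene": "a bright modern workshop room",
    "Lighting": "warm natural daylight from window left",
})
print(prompt)
```

Because the order lives in one list, a misplaced block cannot slip in through copy-paste, and a typo in a block name fails loudly instead of silently dropping content.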
from openai import OpenAI
import base64
client = OpenAI()
# Pick a template from guide-en.md Section 5, drop in your subject
prompt = """
Editorial photograph for a B2B pitch deck slide.
Scene: a bright modern workshop room with a wooden conference table,
empty chairs arranged loosely, a single coffee cup, scattered notes.
Subject: the room itself — no people visible.
Composition: medium-wide shot, eye-level, slightly off-axis.
Lighting: warm natural daylight from window left, soft diffuse.
Color palette: warm whites, oak wood tones, subtle blue-grey shadows.
Visual style: photorealistic, taken on a 35mm camera with 50mm lens.
Constraints:
- No people, no faces, no logos.
- Documentary feel, not staged. No HDR.
- Generous negative space upper third for slide title overlay.
Quality: high. Size: 1536x1024.
"""
result = client.images.generate(
    model="gpt-image-2",
    prompt=prompt,
    size="1536x1024",
    quality="high",
    n=2,
)
# Each item carries the image as base64; decode it and write one PNG per variant
for i, item in enumerate(result.data, start=1):
    image_bytes = base64.b64decode(item.b64_json)
    with open(f"output-{i}.png", "wb") as f:
        f.write(image_bytes)

That's it. Two variants in ~30 seconds. Pick the better one.
- The 9-block universal template — the one structure that holds across all use cases
- Eight ready-to-fire templates — pitch slide, social visual, infographic, hero with people, methodology diagram, brand mark guidance, background texture, headline slide
- Power-modifier cheatsheet — concrete words that measurably improve output quality, grouped by effect (realism, depth-of-field, light, composition, atmosphere)
- Anti-patterns — the vocabulary that actively makes output worse (yes, "epic" and "cinematic" belong here)
- Brand-constants block — drop-in template for keeping outputs on-brand across iterations
- Edit workflows — for the text+image cases (background swap, lighting change, element removal, translation)
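The brand-constants idea from the list above can be sketched as a small helper that appends the same block to every subject prompt. Everything here is an assumption for illustration: the `BRAND_CONSTANTS` values are placeholders borrowed from the quick-start prompt, and `with_brand_constants` is a hypothetical name, not the template from the guide.

```python
from collections.abc import Mapping

# Hypothetical brand constants; replace the values with your own brand rules.
BRAND_CONSTANTS = {
    "Color palette": "warm whites, oak wood tones, subtle blue-grey shadows",
    "Visual style": "photorealistic, documentary feel, not staged",
    "Constraints": "no logos, no HDR",
}


def with_brand_constants(
    subject_prompt: str,
    constants: Mapping[str, str] = BRAND_CONSTANTS,
) -> str:
    """Append the same brand block to any subject-specific prompt."""
    brand_block = "\n".join(f"- {key}: {value}" for key, value in constants.items())
    return (
        f"{subject_prompt.strip()}\n\n"
        f"Brand constants (apply to every image):\n{brand_block}"
    )


print(with_brand_constants("Scene: an empty workshop room at midday."))
```

Keeping the constants in one place means an iteration changes only the subject text, so consecutive outputs stay comparable.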
The OpenAI GPT Image Generation Models Prompting Guide is excellent reference material. But it reads like documentation. This repo turns the same content into an operational toolkit — the kind you actually paste into a prompt at 11 PM when you need a slide image done by tomorrow.
Two insights shaped the structure:
- The 9-block prompt order matters more than vocabulary. Same model, same words, different order — output quality changes visibly. Most failures are order failures, not word failures.
- Anti-vocabulary is as important as vocabulary. "Epic", "cinematic", "stunning" actively hurt output quality. The constraint block is where most prompts fail.
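A quick pre-flight check can catch both failure modes before a prompt is sent. The following is a minimal sketch under stated assumptions: `ANTI_VOCABULARY` contains only the three words named above, `lint_prompt` is a hypothetical helper, and the check for a `Constraints:` line assumes prompts label their blocks the way the quick-start example does.

```python
import re

# Starter list taken from the words named above; extend it from your own notes.
ANTI_VOCABULARY = ("epic", "cinematic", "stunning")


def lint_prompt(prompt: str) -> list[str]:
    """Return human-readable warnings; an empty list means no findings."""
    warnings = []
    for word in ANTI_VOCABULARY:
        # Whole-word, case-insensitive match so "epicenter" is not flagged.
        if re.search(rf"\b{re.escape(word)}\b", prompt, flags=re.IGNORECASE):
            warnings.append(f"anti-vocabulary: '{word}'")
    # Assumes the constraint block starts a line with the label "Constraints:".
    if not re.search(r"^\s*constraints\s*:", prompt, flags=re.IGNORECASE | re.MULTILINE):
        warnings.append("missing 'Constraints:' block")
    return warnings


for warning in lint_prompt("An epic, stunning skyline at dusk."):
    print(warning)
```

The linter only flags wording; whether a flagged word actually hurts a given image is still a judgment call made by comparing outputs.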
- Solo operators building marketing visuals, slide decks, landing pages
- Independent practitioners who need image generation in their workflow but don't have a designer on call
- Anyone whose output looks too generic, too stock-photo-ish, too AI-glossy
If you're a working illustrator or art director, you already know most of this. The audience here is the operator who needs reliable defaults.
- 🇬🇧 English version — full 12 sections, all templates
- 🇩🇪 German version: complete, all templates
.
├── README.md ← you are here
├── guide-en.md ← full English playbook (12 sections)
├── guide-de.md ← complete German version
├── LICENSE ← CC BY 4.0 (docs) + MIT (code snippets)
└── .gitignore
- Example image gallery (real outputs from each template)
- Additional use-case templates: packaging, book covers, UI mockups
- Spanish + French translations (if community contributes)
- Comparison guide: gpt-image-2 vs other models for specific cases
- Cookbook-style notebook (Jupyter/Colab) for quick experimentation
Issues and pull requests welcome — see Contributing below.
Pull requests welcome — especially:
- Better anti-pattern examples with before/after output proof
- Use-case templates not yet covered (UI mockups, packaging, cover art, technical diagrams, infographic variants)
- Translations to other languages
- Real-world prompt iteration logs showing the 4-step refinement discipline
For larger changes, please open an issue first to discuss the direction.
- Documentation (all `.md` files): CC BY 4.0. Share, adapt, use commercially with attribution.
- Code snippets (Python examples): MIT. Use, modify, redistribute freely.
- Structure follows the OpenAI GPT Image Generation Models Prompting Guide — credit where due
- Power-modifier patterns synthesized through real production use across editorial, B2B, and conceptual work
Built and maintained by Dirk Häger — independent learning architect working at the intersection of training, coaching, and AI integration.
More about my work: focusinstitute.io · LinkedIn
If this saves you time, ⭐ star the repo or share with someone who'd benefit.