Chinese documentation: README.zh-CN.md
clawvisual AI is an agent-skill pipeline that converts long-form text into short-form carousel/infographic content.
Default output constraints (fast mode):
- `post_title`: one-sentence hook.
- `post_caption`: concise body, normalized to 100-300 characters.
- `hashtags`: 1-5 tags.
- `slides`: generated visual slides are required (not text-only output).
  - each slide should include `image_url` and `visual_prompt`
  - cover slide (`slide_id: 1`) must prioritize first-glance clarity and hook strength
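
The constraints above can be checked mechanically before a result is published. The following is a minimal sketch, not the project's own validator: only the field names (`post_title`, `post_caption`, `hashtags`, `slides`, `slide_id`, `image_url`, `visual_prompt`) come from this README, while the TypeScript types and the function name `checkFastModeConstraints` are assumptions made for illustration.

```typescript
// Hypothetical shapes: the field names come from the README, the types are assumed.
interface Slide {
  slide_id: number;
  image_url: string;
  visual_prompt: string;
}

interface ConversionResult {
  post_title: string;
  post_caption: string;
  hashtags: string[];
  slides: Slide[];
}

// Returns human-readable violations; an empty array means the result passes.
function checkFastModeConstraints(result: ConversionResult): string[] {
  const problems: string[] = [];
  if (result.post_title.trim().length === 0) {
    problems.push("post_title is empty");
  }
  // Note: .length counts UTF-16 code units, which is only an approximation of "characters".
  const captionLength = result.post_caption.length;
  if (captionLength < 100 || captionLength > 300) {
    problems.push(`post_caption has ${captionLength} characters, expected 100-300`);
  }
  if (result.hashtags.length < 1 || result.hashtags.length > 5) {
    problems.push(`hashtags has ${result.hashtags.length} tags, expected 1-5`);
  }
  if (result.slides.length === 0) {
    problems.push("slides is empty, but visual slides are required");
  }
  for (const slide of result.slides) {
    if (!slide.image_url || !slide.visual_prompt) {
      problems.push(`slide ${slide.slide_id} is missing image_url or visual_prompt`);
    }
  }
  return problems;
}
```

Returning a list of problems rather than throwing on the first one makes it easier to report every violation in a single pass.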
- Install dependencies: `npm install`
- Create local env file: `cp .env.local.template .env.local`
- Fill required keys in `.env.local`, at least:
  - `LLM_API_URL`
  - `LLM_API_KEY`
  - `LLM_MODEL`
- Start dev server: `npm run dev`
- Open in browser: http://localhost:3000
clawvisual can be integrated into OpenClaw as a workspace/local skill via MCP.
- Run clawvisual service:
  - `npm install`
  - `cp .env.local.template .env.local`
  - `npm run dev`
- Install this skill into OpenClaw:
  - copy `skills/clawvisual-mcp` to either:
    - `<openclaw-workspace>/skills/clawvisual-mcp` (workspace scope), or
    - `~/.openclaw/skills/clawvisual-mcp` (shared local scope)
- Configure skill runtime env:
  - `CLAWVISUAL_MCP_URL=http://localhost:3000/api/mcp`
  - `CLAWVISUAL_API_KEY=<your_clawvisual_api_key_if_enabled>`
- Test the skill client locally: `npm run skill:clawvisual -- tools`

- Framework: Next.js App Router + TypeScript
- API:
  - `POST /api/v1/convert` starts a 16-skill chain and returns `job_id`
  - `GET /api/v1/jobs/:id` returns status/progress/result
  - `POST /api/mcp` JSON-RPC MCP endpoint (`initialize`, `tools/list`, `tools/call`)
  - `GET /api/openapi.json` exports OpenAPI schema
- Skill system: `src/lib/skills` contains 16 atomic async skills
- Prompt templates: `src/lib/prompts/index.ts`
- Orchestration: `src/lib/orchestrator.ts`
- Queue: local in-memory job queue for immediate development
- API key validation: `src/lib/auth/api-key.ts`

- `src/app/page.tsx`: clawvisual dashboard UI
- `src/app/api/v1/convert/route.ts`: conversion entrypoint
- `src/app/api/v1/jobs/[id]/route.ts`: job status endpoint
- `src/app/api/openapi.json/route.ts`: OpenAPI export
- `src/lib/types`: standard interfaces and context object
- `src/lib/skills`: 16 atomic skill modules
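
Because `POST /api/v1/convert` returns a `job_id` and `GET /api/v1/jobs/:id` reports status, a client typically starts a job and then polls. The sketch below illustrates that flow under stated assumptions: the request field `input_text`, the status values `completed` and `failed`, and the helper names `HttpJson`, `isTerminal`, and `convertAndWait` are all hypothetical. The real schema is whatever `GET /api/openapi.json` exports.

```typescript
// Minimal HTTP abstraction so the sketch does not depend on a specific fetch implementation.
type HttpJson = (method: "GET" | "POST", path: string, body?: unknown) => Promise<unknown>;

// Assumed job shape: `job_id` is from the README, the rest is hypothetical.
interface JobSnapshot {
  job_id: string;
  status: string;
  progress?: number;
  result?: unknown;
}

// Assumption: these two status strings mark a finished job.
const TERMINAL_STATUSES = ["completed", "failed"];

function isTerminal(job: JobSnapshot): boolean {
  return TERMINAL_STATUSES.indexOf(job.status) !== -1;
}

// Starts a conversion, then polls the job endpoint until it reaches a terminal status.
async function convertAndWait(
  http: HttpJson,
  inputText: string,
  sleep: (ms: number) => Promise<void>,
  maxPolls: number = 60,
): Promise<JobSnapshot> {
  // `input_text` is a hypothetical field name, not confirmed by the README.
  const started = (await http("POST", "/api/v1/convert", { input_text: inputText })) as {
    job_id: string;
  };
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const job = (await http("GET", `/api/v1/jobs/${started.job_id}`)) as JobSnapshot;
    if (isTerminal(job)) {
      return job;
    }
    await sleep(2000); // fixed 2 s delay between polls; tune as needed
  }
  throw new Error(`job ${started.job_id} did not finish after ${maxPolls} polls`);
}
```

In practice `http` could be a thin wrapper around `fetch` pointed at `http://localhost:3000`, and `sleep` a promise around `setTimeout`. Injecting both keeps the polling logic testable without a running server.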
Existing keys are reusable. Current scaffold reads:
- `LLM_API_URL`
- `LLM_API_KEY`
- `LLM_MODEL`
- `LLM_TIMEOUT_MS` (optional, default `25000`)
- `LLM_COPY_FALLBACK_MODEL` (optional, default `google/gemini-2.5-flash`)
- `LLM_COPY_POLISH_MODEL` (optional, default `openai/gpt-5.1-mini`)
- `GEMINI_API_KEY`
- `NANO_BANANA_MODEL`
- `NANO_BANANA_TIMEOUT_MS` (optional, default `60000`)
- `NANO_BANANA_TRANSIENT_RETRY_MAX` (optional, default `2`)
- `NANO_BANANA_RETRY_BASE_DELAY_MS` (optional, default `450`)
- `QUALITY_LOOP_ENABLED` (optional, default `true`)
- `QUALITY_AUDIT_THRESHOLD` (optional, default `78`)
- `QUALITY_IMAGE_COVER_THRESHOLD` (optional, default `85`)
- `QUALITY_IMAGE_INNER_THRESHOLD` (optional, default `78`)
- `QUALITY_COVER_FIRST_GLANCE_THRESHOLD` (optional, default `82`)
- `QUALITY_COVER_NOVELTY_THRESHOLD` (optional, default `80`)
- `QUALITY_COVER_CANDIDATE_COUNT` (optional, default `1`)
- `QUALITY_MAX_COPY_ROUNDS` (optional, default `1`)
- `QUALITY_MAX_IMAGE_ROUNDS` (optional, default `0`)
- `QUALITY_MAX_EXTRA_IMAGES` (optional, default `1`)
- `QUALITY_IMAGE_LOOP_MAX_MS` (optional, default `120000`)
- `QUALITY_IMAGE_AUDIT_SCOPE` (optional, `cover` or `all`, default `cover`)
- `PIPELINE_MODE` (optional, `fast` or `full`, default `fast`)
- `PIPELINE_MAX_DURATION_MS` (optional, default `300000`)
- `PIPELINE_ENABLE_SOURCE_INTEL` (optional, default `false` in fast mode)
- `PIPELINE_ENABLE_STORYBOARD_QUALITY` (optional, default `false` in fast mode)
- `PIPELINE_ENABLE_STYLE_RECOMMENDER` (optional, default `false` in fast mode)
- `PIPELINE_ENABLE_ATTENTION_FIXER` (optional, default `false` in fast mode)
- `PIPELINE_ENABLE_POST_COPY_QUALITY` (optional, default `false` in fast mode)
- `PIPELINE_ENABLE_FINAL_AUDIT` (optional, default `false` in fast mode)
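
To illustrate how the required keys and documented defaults fit together, here is a minimal sketch of a config loader covering a handful of the keys above. It is not the scaffold's actual loader: the names `PipelineConfig`, `loadConfig`, `requireKey`, `numberOr`, and `parseMode` are hypothetical, and treating only the literal string `true` as boolean true is an assumption. The default values are the ones listed in this README.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical config shape covering a subset of the documented keys.
interface PipelineConfig {
  llmApiUrl: string;
  llmApiKey: string;
  llmModel: string;
  llmTimeoutMs: number;
  pipelineMode: "fast" | "full";
  pipelineMaxDurationMs: number;
  qualityLoopEnabled: boolean;
  qualityAuditThreshold: number;
}

// Fails loudly when a required key is absent or blank.
function requireKey(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required env key: ${name}`);
  }
  return value;
}

// Reads a numeric key, falling back to the documented default when unset.
function numberOr(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") {
    return fallback;
  }
  const parsed = Number(raw);
  if (!Number.isFinite(parsed)) {
    throw new Error(`${name} must be numeric, got "${raw}"`);
  }
  return parsed;
}

function parseMode(raw: string | undefined): "fast" | "full" {
  if (raw === undefined || raw === "fast") return "fast";
  if (raw === "full") return "full";
  throw new Error(`PIPELINE_MODE must be "fast" or "full", got "${raw}"`);
}

function loadConfig(env: Env): PipelineConfig {
  return {
    llmApiUrl: requireKey(env, "LLM_API_URL"),
    llmApiKey: requireKey(env, "LLM_API_KEY"),
    llmModel: requireKey(env, "LLM_MODEL"),
    llmTimeoutMs: numberOr(env, "LLM_TIMEOUT_MS", 25000),
    pipelineMode: parseMode(env["PIPELINE_MODE"]),
    pipelineMaxDurationMs: numberOr(env, "PIPELINE_MAX_DURATION_MS", 300000),
    // Assumption: only the literal string "true" enables the loop.
    qualityLoopEnabled: (env["QUALITY_LOOP_ENABLED"] ?? "true") === "true",
    qualityAuditThreshold: numberOr(env, "QUALITY_AUDIT_THRESHOLD", 78),
  };
}
```

In a Node.js process this would be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to exercise with a plain object.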
Runtime observability:
- Thinking & Actions event timeline now includes per-step token usage deltas (`in/out/total`) when provider `usage` is returned.
- Final `skill_logs` includes `llm_usage_summary` for total request-level token aggregation.
- `OPENROUTER_API_KEY`
- `TAVILY_API_KEY`
- `SERPER_API_KEY`
- `JINA_API_KEY`
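
The request-level aggregation mentioned under runtime observability amounts to summing the per-step deltas. The sketch below shows one way to do that. The `in`/`out`/`total` field names follow this README, but the `StepEvent` shape and the `summarizeUsage` function are hypothetical, and the real structure of `llm_usage_summary` may differ.

```typescript
// Per-step token delta; field names follow the README's in/out/total.
interface TokenUsage {
  in: number;
  out: number;
  total: number;
}

// Hypothetical timeline event: usage is absent when the provider returns none.
interface StepEvent {
  step: string;
  usage?: TokenUsage;
}

// Sums all available per-step deltas into a single request-level total.
function summarizeUsage(events: StepEvent[]): TokenUsage {
  const summary: TokenUsage = { in: 0, out: 0, total: 0 };
  for (const event of events) {
    if (!event.usage) {
      continue; // steps without provider usage contribute nothing
    }
    summary.in += event.usage.in;
    summary.out += event.usage.out;
    summary.total += event.usage.total;
  }
  return summary;
}
```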
API security controls:
- `CLAWVISUAL_API_KEYS`: comma-separated accepted keys
- `CLAWVISUAL_ALLOW_NO_KEY`: default `true` in local development
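
The two controls above suggest a simple policy: parse the comma-separated list, then decide what to do when a request carries no key. The following is a sketch of that idea only. It does not reproduce `src/lib/auth/api-key.ts`; the names `ApiKeyPolicy`, `readApiKeyPolicy`, and `isRequestAllowed` are hypothetical, and the rule that a presented key must always match the accepted list is an assumption.

```typescript
// Hypothetical policy object derived from the two documented env keys.
interface ApiKeyPolicy {
  acceptedKeys: string[];
  allowNoKey: boolean;
}

function readApiKeyPolicy(env: Record<string, string | undefined>): ApiKeyPolicy {
  // Split the comma-separated list, trimming whitespace and dropping empty entries.
  const acceptedKeys = (env["CLAWVISUAL_API_KEYS"] ?? "")
    .split(",")
    .map((key) => key.trim())
    .filter((key) => key.length > 0);
  // The README documents a default of true in local development;
  // this sketch assumes the same default whenever the key is unset.
  const allowNoKey = (env["CLAWVISUAL_ALLOW_NO_KEY"] ?? "true") === "true";
  return { acceptedKeys, allowNoKey };
}

// Assumed rule: keyless requests pass only if allowed; a presented key must match.
function isRequestAllowed(policy: ApiKeyPolicy, presentedKey: string | undefined): boolean {
  if (presentedKey === undefined || presentedKey === "") {
    return policy.allowNoKey;
  }
  return policy.acceptedKeys.indexOf(presentedKey) !== -1;
}
```

For a deployed instance, setting `CLAWVISUAL_ALLOW_NO_KEY=false` together with a non-empty `CLAWVISUAL_API_KEYS` would make this sketch reject every keyless request.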
- This project includes an async conversion pipeline, a revision engine, and an MCP-compatible JSON-RPC endpoint.
- Real integrations (Flux/Midjourney, Redis/BullMQ worker process, PostgreSQL persistence, satori rendering) are left as plug-in points.
POST /api/mcp supports:
- `convert`: create conversion job
- `job_status`: fetch current job status/result
- `revise`: create revision job for copy/image changes
- `regenerate_cover`: regenerate cover via job revision or direct prompt image call
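
Since the endpoint speaks JSON-RPC, each of the tools above is invoked through a `tools/call` request whose `params` carry the tool name and its arguments. Here is a minimal sketch of building those request bodies. The envelope follows JSON-RPC 2.0 and the MCP `tools/call` convention, while the helper names and the `job_id` argument are hypothetical. The actual argument schema for each tool is what `tools/list` reports.

```typescript
// JSON-RPC 2.0 request envelope.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Tool names as listed in the README.
type ClawvisualTool = "convert" | "job_status" | "revise" | "regenerate_cover";

// Builds a tools/list request, used to discover each tool's input schema.
function toolsListRequest(id: number): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/list" };
}

// Builds a tools/call request; `args` must match the schema the server advertises.
function toolsCallRequest(
  id: number,
  name: ClawvisualTool,
  args: Record<string, unknown>,
): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Example: `job_id` as an argument name is an assumption, not confirmed by the README.
const statusRequest = toolsCallRequest(2, "job_status", { job_id: "example-job-id" });
const statusRequestBody = JSON.stringify(statusRequest);
```

The resulting `statusRequestBody` string would be sent as the body of `POST /api/mcp` with a JSON content type, after an `initialize` exchange.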
Reusable external skill package:
Convenience command:
npm run skill:clawvisual -- tools



