diff --git a/.claude/commands/yt2pdf.md b/.claude/commands/yt2pdf.md
index c0c5740..3a0eb20 100644
--- a/.claude/commands/yt2pdf.md
+++ b/.claude/commands/yt2pdf.md
@@ -39,22 +39,26 @@ If `hqdefault.jpg` fails, try `mqdefault.jpg`. If both fail, continue without th
 
 ## Step 3: Fetch Metadata & Extract Transcript
 
-First, fetch video metadata (title, publish date, channel name) via yt-dlp:
+First, fetch video metadata (title, publish date, channel name, language) via yt-dlp:
 
 ```bash
 yt-dlp --dump-json --skip-download "https://youtube.com/watch?v=VIDEO_ID" 2>/dev/null | python3 -c "
 import json,sys; d=json.load(sys.stdin)
-print(json.dumps({'title':d.get('title',''),'uploader':d.get('uploader',''),'upload_date':d.get('upload_date',''),'duration':d.get('duration',0)}))"
+print(json.dumps({'title':d.get('title',''),'uploader':d.get('uploader',''),'upload_date':d.get('upload_date',''),'duration':d.get('duration',0),'language':d.get('language','en')}))"
 ```
 
-Parse the JSON to get: title, uploader, upload_date (YYYYMMDD → YYYY-MM-DD).
+Parse the JSON to get: title, uploader, upload_date (YYYYMMDD → YYYY-MM-DD), language.
 
-Then extract transcript:
+Then extract transcript with timestamps and the video's original language:
 
 ```bash
-python3 scripts/yt/get_transcript.py VIDEO_ID
+python3 scripts/yt/get_transcript.py VIDEO_ID --lang LANGUAGE --timestamps
 ```
 
+- Use the `language` field from metadata (e.g. `zh-tw`, `en`, `ja`). If missing, default to `en`.
+- The `--timestamps` flag preserves `[MM:SS]` markers every ~30 seconds for timestamp links in the summary.
+- If the transcript starts with `[NO_TIMESTAMPS]`, timestamps are unavailable (Whisper fallback) — skip timestamp links in Step 4.
+
 Capture the stdout output as the transcript text. If it fails, reply with the error and stop.
 
 Update progress: "Transcript extracted (N chars). Generating summary..."
@@ -66,9 +70,15 @@ Using the transcript, generate markdown summary file(s) in `output/youtube/YYYY-
 
 **Important**:
 - Each summary MUST include the thumbnail image reference (`thumb.jpg` — embedded as base64 in PDF automatically)
-- Include metadata: title, YouTube link, published date, uploader/publisher
+- ALL metadata fields are **required**: title, YouTube link, published date, uploader/publisher, tags
 - Include 3-5 **tags**: lowercase English topic tags covering companies (e.g. nvidia, openai), technologies (e.g. inference, rag), categories (e.g. policy, research, product, open-source)
 
+**Timestamp links**: The transcript contains `[MM:SS]` markers. Convert them to YouTube timestamp links in the summary:
+
+- `[MM:SS]` → `[[MM:SS](https://youtube.com/watch?v=VIDEO_ID&t=TOTAL_SECONDS)]` where TOTAL_SECONDS = minutes × 60 + seconds
+- Place timestamps at natural topic boundaries — aim for 5-15 per summary, not every marker
+- If the transcript starts with `[NO_TIMESTAMPS]`, omit all timestamp links
+
 ### If lang includes `en` → write `output/youtube/YYYY-MM-DD/VIDEO_ID/summary_en.md`
 
 Each metadata field MUST be on its own line. Use HTML `
<br>` line breaks to ensure they render separately in the PDF:
@@ -80,17 +90,52 @@ Each metadata field MUST be on its own line. Use HTML `<br>` line breaks to ensu
 
 **Publisher**: Uploader Name<br>
 **Source**: [Watch on YouTube](https://youtube.com/watch?v=VIDEO_ID)<br>
-**Published**: 2026-04-04<br>
+**Published**: YYYY-MM-DD<br>
 **Tags**: tag1, tag2, tag3, tag4, tag5
 
 ## Summary
 
-3-5 paragraph English summary covering:
-- Key points and main arguments
-- Notable quotes or data points
-- Conclusions and takeaways
+### 🔍 Section Title [[MM:SS](url&t=N)]
+
+- Key concept or argument
+- Supporting detail, quote, or data point
+
+### 📌 Another Section [[MM:SS](url&t=N)]
+
+#### Sub-topic (when section has multiple distinct points)
+
+1. Enumerated item one
+2. Enumerated item two
+
+- Practical takeaway or example
+
+### 💡 Conclusion / Key Takeaways
+
+- Takeaway 1
+- Takeaway 2
+
+🔗 [Watch the full video](https://youtube.com/watch?v=VIDEO_ID)
 ```
 
+**Structure requirements:**
+
+- Generate a detailed, structured summary (**800-1200 words**)
+- Use `##` for the main "Summary" heading
+- Use `###` for each major topic/section, prefixed with a relevant emoji
+- Use `####` for sub-topics when a section has multiple distinct points
+- Use bullet points (`-`) for key concepts, practical takeaways, notable quotes
+- Use numbered lists (`1.`) for sequential or enumerated content (e.g. "3 steps", "4 failure modes")
+- Include YouTube timestamp links at key section headings and significant points
+- End with a `### 💡 Conclusion` or `### 💡 Key Takeaways` section
+- Add a video link at the bottom
+
+**Content requirements:**
+
+- Cover ALL major topics discussed, not just the first few
+- Include specific examples, data points, and direct quotes when present
+- Preserve technical terms and proper nouns accurately
+- Each `###` section should have 2-4 bullet points minimum
+
 ### If lang includes `zh-tw` → write `output/youtube/YYYY-MM-DD/VIDEO_ID/summary_zh-tw.md`
 
 ```markdown
@@ -100,18 +145,53 @@ Each metadata field MUST be on its own line. Use HTML `
<br>` line breaks to ensu
 
 **頻道**: Uploader Name<br>
 **來源**: [在 YouTube 上觀看](https://youtube.com/watch?v=VIDEO_ID)<br>
-**發布日期**: 2026-04-04<br>
+**發布日期**: YYYY-MM-DD<br>
 **標籤**: tag1, tag2, tag3, tag4, tag5
 
 ## 摘要
 
-3-5 段繁體中文摘要,涵蓋:
-- 重點與主要論點
-- 值得注意的引用或數據
-- 結論與要點
+### 🔍 段落標題 [[MM:SS](url&t=N)]
+
+- 核心概念或論點
+- 具體範例、數據或引述
+
+### 📌 另一段落 [[MM:SS](url&t=N)]
+
+#### 子主題(當段落有多個重點時)
+
+1. 列舉項目一
+2. 列舉項目二
+
+- 實踐方式或關鍵心得
+
+### 💡 總結 / 重點整理
+
+- 要點一
+- 要點二
+
+🔗 [觀看完整影片](https://youtube.com/watch?v=VIDEO_ID)
 ```
 
-**Important**: Use Traditional Chinese (繁體中文) only. Never use Simplified Chinese.
+**結構要求:**
+
+- 生成詳細的結構化摘要(**800-1200 字**)
+- 使用 `##` 作為「摘要」主標題
+- 使用 `###` 標示每個主要主題,前方加上相關 emoji
+- 使用 `####` 標示子主題(當一個段落有多個重點時)
+- 使用項目符號(`-`)列出核心概念、實踐方式、關鍵引述
+- 使用編號列表(`1.`)呈現有先後順序或列舉性質的內容
+- 在關鍵段落標題與重要論點旁加入 YouTube 時間戳連結
+- 以 `### 💡 總結` 或 `### 💡 重點整理` 作為結尾段落
+- 底部加上影片連結
+
+**內容要求:**
+
+- 涵蓋影片中討論的**所有**主要主題,不只前幾個
+- 保留具體範例、數據與直接引述
+- 專有名詞保持原文(可加中文說明)
+- 每個 `###` 段落至少包含 2-4 個項目符號
+
+**務必使用繁體中文,嚴禁簡體中文。所有元資料欄位皆為必填。**
 
 Write file(s) using the Write tool.
 
diff --git a/docs/screenshots/telegram/yt2pdf_reply_zh-tw.png b/docs/screenshots/telegram/yt2pdf_reply_zh-tw.png
new file mode 100644
index 0000000..04cfe3e
Binary files /dev/null and b/docs/screenshots/telegram/yt2pdf_reply_zh-tw.png differ
diff --git a/external_plugins/telegram-channel/server.ts b/external_plugins/telegram-channel/server.ts
index 0e8c2ac..ca4d781 100644
--- a/external_plugins/telegram-channel/server.ts
+++ b/external_plugins/telegram-channel/server.ts
@@ -15,6 +15,9 @@
  * Telegram's Bot API has no history or search. Reply-only tools.
  */
 
+// Fork identifier — shown in startup log to distinguish from official plugin.
+const FORK_TAG = 'claude-code-channels'
+
 import { Server } from '@modelcontextprotocol/sdk/server/index.js'
 import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js'
 import {
@@ -625,6 +628,7 @@ mcp.setRequestHandler(CallToolRequestSchema, async req => {
           sentIds.length === 1 ?
`sent (id: ${sentIds[0]})`
             : `sent ${sentIds.length} parts (ids: ${sentIds.join(', ')})`
+
         return { content: [{ type: 'text', text: result }] }
       }
       case 'react': {
@@ -740,7 +744,7 @@ bot.command('status', async ctx => {
 
   if (access.allowFrom.includes(senderId)) {
     const name = from.username ? `@${from.username}` : senderId
-    await ctx.reply(`Paired as ${name}.`)
+    await ctx.reply(`Paired as ${name}.\nFork: ${FORK_TAG}`)
     return
   }
 
@@ -1121,7 +1125,7 @@ void (async () => {
       await bot.start({
         onStart: info => {
           botUsername = info.username
-          process.stderr.write(`telegram channel: polling as @${info.username}\n`)
+          process.stderr.write(`telegram channel: polling as @${info.username} [${FORK_TAG}]\n`)
           void bot.api.setMyCommands(
             [
               { command: 'start', description: 'Welcome and setup guide' },
diff --git a/lib/sessions/commands.ts b/lib/sessions/commands.ts
index 6454603..e4942a2 100644
--- a/lib/sessions/commands.ts
+++ b/lib/sessions/commands.ts
@@ -1,7 +1,7 @@
 /**
  * Session bot commands — parsed and executed at the broker layer.
  *
- * Commands are prefixed with `/session` and handled without LLM invocation.
+ * Commands are prefixed with `/memory` (legacy `/session` also accepted) and handled without LLM invocation.
  */
 
 import { existsSync } from 'fs'
@@ -19,12 +19,15 @@ export interface SessionCommand {
   args: string[]
 }
 
-/** Parse a /session command from message text. Returns null if not a session command. */
+/** Parse a /memory (or legacy /session) command from message text. Returns null if not a match. */
 export function parseSessionCommand(text: string): SessionCommand | null {
   const trimmed = text.trim()
-  if (!trimmed.startsWith('/session')) return null
+  let prefix: string
+  if (trimmed.startsWith('/memory')) prefix = '/memory'
+  else if (trimmed.startsWith('/session')) prefix = '/session'
+  else return null
 
-  const parts = trimmed.slice('/session'.length).trim().split(/\s+/)
+  const parts = trimmed.slice(prefix.length).trim().split(/\s+/)
   const action = parts[0] ??
'help'
   const args = parts.slice(1)
 
@@ -97,7 +100,7 @@ function handleProfile(stateDir: string, userId: string): string {
 
 function handleForget(stateDir: string, args: string[]): string {
   const slug = args[0]
-  if (!slug) return 'Usage: /session forget '
+  if (!slug) return 'Usage: /memory forget '
   deleteTopic(stateDir, slug)
   return `Topic "${slug}" has been deleted.`
 }
@@ -116,12 +119,12 @@ function handleHelp(): string {
   return [
     'Session Commands',
     '────────────────',
-    '/session status — Show session stats',
-    '/session clear — Clear your short-term memory',
-    '/session clear all — Clear all your data (STM + LTM)',
-    '/session profile — Show your stored profile',
-    '/session forget — Delete a topic note',
-    '/session export — Export your data',
-    '/session help — Show this help',
+    '/memory status — Show session stats',
+    '/memory clear — Clear your short-term memory',
+    '/memory clear all — Clear all your data (STM + LTM)',
+    '/memory profile — Show your stored profile',
+    '/memory forget — Delete a topic note',
+    '/memory export — Export your data',
+    '/memory help — Show this help',
   ].join('\n')
 }
diff --git a/scripts/yt/get_transcript.py b/scripts/yt/get_transcript.py
index af330b4..cb4425b 100644
--- a/scripts/yt/get_transcript.py
+++ b/scripts/yt/get_transcript.py
@@ -52,6 +52,10 @@ def download_subtitles(video_url: str, out_dir: Path, lang: str = "en") -> Path
     )
     if srt_path.exists() and srt_path.stat().st_size > 0:
         return srt_path
+    # Glob fallback — yt-dlp may normalize lang codes differently
+    for candidate in out_dir.glob("video.*.srt"):
+        if candidate.stat().st_size > 0:
+            return candidate
 
     # Try auto-generated subs
     subprocess.run(
@@ -73,6 +77,9 @@ def download_subtitles(video_url: str, out_dir: Path, lang: str = "en") -> Path
     )
     if srt_path.exists() and srt_path.stat().st_size > 0:
         return srt_path
+    for candidate in out_dir.glob("video.*.srt"):
+        if candidate.stat().st_size > 0:
+            return candidate
 
     return None
 
@@ -93,6 +100,34 @@ def
srt_to_text(srt_path: Path) -> str:
     return "\n".join(lines)
 
 
+def srt_to_timestamped_text(srt_path: Path, interval: int = 30) -> str:
+    """Convert SRT to text with [MM:SS] markers every ~interval seconds."""
+    content = srt_path.read_text(encoding="utf-8")
+    result = []
+    last_emitted = -interval  # emit on first entry
+    current_ts_seconds = 0
+
+    for line in content.splitlines():
+        line = line.strip()
+        if not line or re.match(r"^\d+$", line):
+            continue
+        ts_match = re.match(r"(\d{2}):(\d{2}):(\d{2}),\d{3}\s*-->", line)
+        if ts_match:
+            h, m, s = int(ts_match.group(1)), int(ts_match.group(2)), int(ts_match.group(3))
+            current_ts_seconds = h * 3600 + m * 60 + s
+            continue
+        # Text line — prepend timestamp marker if interval elapsed
+        if current_ts_seconds - last_emitted >= interval:
+            mm = current_ts_seconds // 60
+            ss = current_ts_seconds % 60
+            result.append(f"[{mm:02d}:{ss:02d}] {line}")
+            last_emitted = current_ts_seconds
+        else:
+            result.append(line)
+
+    return "\n".join(result)
+
+
 def whisper_transcribe_hf(video_url: str, out_dir: Path) -> str | None:
     """Download audio and transcribe via HuggingFace Inference API."""
     hf_token = os.environ.get("HF_TOKEN")
@@ -163,28 +198,43 @@ def whisper_transcribe_hf(video_url: str, out_dir: Path) -> str | None:
     return None
 
 
-def get_transcript(video_id: str) -> str | None:
+LANG_FALLBACKS = {
+    "zh-tw": ["zh-Hant", "zh-TW", "zh", "en"],
+    "zh": ["zh-Hans", "zh-CN", "zh", "en"],
+    "ja": ["ja", "en"],
+    "ko": ["ko", "en"],
+}
+
+
+def get_transcript(
+    video_id: str, lang: str = "en", timestamps: bool = False
+) -> str | None:
     """Get transcript for a video using subtitle download with Whisper fallback."""
     video_url = f"https://www.youtube.com/watch?v={video_id}"
+    convert = srt_to_timestamped_text if timestamps else srt_to_text
+    langs_to_try = LANG_FALLBACKS.get(lang, [lang, "en"]) if lang != "en" else ["en"]
 
     with tempfile.TemporaryDirectory(prefix="yt_transcript_") as tmpdir:
         tmp = Path(tmpdir)
 
-        # Try subtitles first
(English)
-        print(f"Trying EN subtitles for {video_id}...", file=sys.stderr)
-        srt = download_subtitles(video_url, tmp, lang="en")
-        if srt:
-            text = srt_to_text(srt)
-            if len(text) > 100:
-                print(f"Got subtitle transcript ({len(text)} chars)", file=sys.stderr)
-                return text
+        for try_lang in langs_to_try:
+            print(f"Trying {try_lang} subtitles for {video_id}...", file=sys.stderr)
+            srt = download_subtitles(video_url, tmp, lang=try_lang)
+            if srt:
+                text = convert(srt)
+                if len(text) > 100:
+                    print(
+                        f"Got subtitle transcript in {try_lang} ({len(text)} chars)",
+                        file=sys.stderr,
+                    )
+                    return text
 
-        # Whisper fallback
+        # Whisper fallback (no timestamps available)
         print(f"Falling back to Whisper for {video_id}...", file=sys.stderr)
         text = whisper_transcribe_hf(video_url, tmp)
         if text and len(text) > 100:
             print(f"Got Whisper transcript ({len(text)} chars)", file=sys.stderr)
-            return text
+            return "[NO_TIMESTAMPS]\n" + text if timestamps else text
 
         print(f"No transcript available for {video_id}", file=sys.stderr)
         return None
@@ -193,9 +243,17 @@ def get_transcript(video_id: str) -> str | None:
 def main():
     parser = argparse.ArgumentParser(description="Extract YouTube video transcript")
     parser.add_argument("video_id", help="YouTube video ID")
+    parser.add_argument(
+        "--lang", default="en", help="Subtitle language (default: en)"
+    )
+    parser.add_argument(
+        "--timestamps",
+        action="store_true",
+        help="Preserve [MM:SS] timestamp markers (~30s intervals)",
+    )
     args = parser.parse_args()
 
-    transcript = get_transcript(args.video_id)
+    transcript = get_transcript(args.video_id, lang=args.lang, timestamps=args.timestamps)
     if transcript:
         print(transcript)
     else:
diff --git a/start.sh b/start.sh
index c8d3b4b..60344f6 100644
--- a/start.sh
+++ b/start.sh
@@ -38,8 +38,11 @@ resolve_cache_base() {
   echo "$HOME/.claude/plugins/cache/$plugin_org/$plugin_name"
 }
 
-# Patch the .mcp.json in a plugin cache version dir so it runs our local
-# fork code with the project-local state dir.
+# Patch plugin cache: copy our fork's server.ts into the cache directory and
+# update .mcp.json with the project-local state dir env var.
+# Claude Code runs server.ts from cache regardless of --cwd, so we must
+# overwrite the file directly. The FORK_TAG constant in server.ts identifies
+# our fork in startup logs.
 # Args: $1=cache_version_dir $2=local_abs_path $3=channel_name
 patch_cache_mcp() {
   local ver_dir="$1" local_abs="$2" ch_name="$3"
@@ -47,12 +50,20 @@ patch_cache_mcp() {
   local state_dir="$PROJECT_DIR/.claude/channels/$ch_name"
   local env_key="${ch_name^^}_STATE_DIR"
 
+  # Copy fork server.ts into cache (Claude Code runs from cache, not --cwd)
+  cp "$local_abs/server.ts" "$ver_dir/server.ts"
+
+  # Fix import paths: our fork uses relative __dirname paths to lib/,
+  # but cache dir has a different parent. Replace with absolute paths.
+  sed -i "s|resolve(__dirname, '..', '..', 'lib', 'sessions')|'$PROJECT_DIR/lib/sessions'|" "$ver_dir/server.ts"
+  sed -i "s|resolve(__dirname, '..', '..', 'lib', 'safety')|'$PROJECT_DIR/lib/safety'|" "$ver_dir/server.ts"
+
   cat > "$mcp_file" <