feat: add pdf-podcast-agent - PDF-to-podcast with live debate and Stripe payments#28

Open
stripathy1999 wants to merge 13 commits into fetchai:main from stripathy1999:feat/pdf-podcast-agent

Contributor

stripathy1999 commented Apr 10, 2026

Summary

This PR adds the PDF-to-Podcast Agent example.

The pdf-podcast-agent is a multi-agent ASI:One workflow that converts uploaded research PDFs into a debate-style podcast between two AI hosts (Skeptic + Expert), then supports interactive post-show Q&A and paid live debates.

Core flow:

  • Extractor agent: parses long PDF text into key insights (thesis, metrics, controversy)
  • Scriptwriter agent: generates a structured multi-turn debate script
  • Voice Studio agent: synthesizes host dialogue into MP3
  • Orchestrator agent: manages end-to-end pipeline, chat UX, artifacts, and payment/debate orchestration
  • Host A / Host B agents: handle follow-up Q&A and live debate turns

Also includes:

  • Personality customization (multiple host style combinations)
  • Stripe-gated live debate via Agent Payment Protocol
  • Transcript + downloadable output artifacts
  • Follow-up robustness fix from review: guard empty ResourceContent lists in chat attachment handling to avoid handler crashes

Type of Change

  • New agent example
  • Bug fix
  • Documentation update
  • Refactor / cleanup
  • Other

Checklist

  • I have starred this repository.
  • I ran ruff check .
  • I ran ruff format .
  • I added/updated README.md for changed example(s).
  • I added .env.example if environment variables are required.
  • I added demo image/GIF (if applicable).
  • I added agent profile link (if applicable).
  • I updated CHANGELOG.md (required for non-doc changes).
  • I verified paths/commands used in docs.

Related Issue

N.A.

Notes for Reviewers

  • Main implementation is under pdf-podcast-agent/.
  • Review commit includes a defensive guard in orchestrator.py for empty attachment resources (ResourceContent.resource=[]) to prevent IndexError during message handling.

6-agent PDF-to-podcast pipeline that converts research papers into debate
podcasts with live Q&A, turn-by-turn debate, host personality customization,
and Stripe payment gating via AgentPaymentProtocol.

Tech: uAgents, ASI:One LLM, OpenAI TTS, pdfplumber, pydub, Stripe.
Includes Dockerfile and docker-compose.yml for containerised deployment.

Made-with: Cursor
Comment thread pdf-podcast-agent/orchestrator.py Outdated
Comment thread pdf-podcast-agent/run.py

await asyncio.sleep(8)

# Build a plain-text debate history from all accumulated lines

Bug: An await asyncio.sleep(8) call in the handle_debate_response message handler will block the agent's sequential message processing loop, causing significant performance degradation.
Severity: MEDIUM

Suggested Fix

Instead of using await asyncio.sleep() directly in the message handler, refactor the logic to use a non-blocking mechanism. For example, schedule a background task to handle the next step after the delay, allowing the message handler to return immediately.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: pdf-podcast-agent/orchestrator.py#L1030

Potential issue: The `handle_debate_response` message handler contains an `await
asyncio.sleep(8)` call. The orchestrator agent is configured to process messages
sequentially, not concurrently. This means the 8-second sleep will block the agent's
message processing loop. During a multi-turn debate, this will introduce significant,
cumulative delays, degrading the user experience of the live debate feature.

Auto-resolve missing sub-agent addresses from deterministic seeds at container startup so orchestrator wiring works without manual address injection. Document the behavior so Docker deployments stay aligned with the existing multi-agent workflow.

Made-with: Cursor
Drop Docker files and remove Docker setup references from the example docs so the PR no longer includes Docker build or installation guidance.

Made-with: Cursor
Comment on lines +886 to +891

ASI:One sends CommitPayment with ``transaction_id`` set to the Stripe
Checkout Session ID. We verify with Stripe, send CompletePayment,
mark the session as paid, and notify the user.
"""
if msg.funds.payment_method != "stripe" or not msg.transaction_id:

Bug: The code accesses item.resource[0] without checking if the list is empty, leading to an unhandled IndexError that will crash the message handler.
Severity: HIGH

Suggested Fix

Add a check to ensure item.resource is not an empty list before attempting to access its first element. For example: if isinstance(item.resource, list) and item.resource:.
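
The guard could look roughly like the sketch below. The Resource and ResourceContent dataclasses here are simplified, hypothetical stand-ins for the chat protocol types, and first_resource is an illustrative helper, not a function in orchestrator.py.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    # Hypothetical stand-in for an attachment descriptor.
    uri: str


@dataclass
class ResourceContent:
    # Hypothetical stand-in: resource may be a single item or a list of items.
    resource: object = field(default_factory=list)


def first_resource(item):
    """Return the first attached resource, or None when the list is empty."""
    res = item.resource
    if isinstance(res, list):
        # An empty list (e.g. a failed upload) yields None instead of IndexError.
        return res[0] if res else None
    return res
```

A caller would then skip the item when first_resource(item) is None rather than indexing into the list directly.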

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: pdf-podcast-agent/orchestrator.py#L886-L891

Potential issue: In the `handle_chat_message` handler, if an incoming `ResourceContent`
message has an empty list for the `item.resource` attribute, the code attempts to access
`item.resource[0]`. This access occurs before the `try...except` block, causing an
unhandled `IndexError` if the list is empty. This will crash the handler, preventing any
response from being sent back to the user for what could be a realistic edge case, such
as a failed file upload.

sakshitripathy and others added 4 commits April 13, 2026 09:45
Skip empty attachment resource lists in chat handling so malformed or failed uploads do not raise IndexError and crash the message handler.

Made-with: Cursor
Normalize import ordering and formatting across the example, add requests stubs for mypy, and fix the orchestrator variable typing edge so Ruff and mypy checks pass in CI.

Made-with: Cursor
Suppress import-untyped on requests where CI skips stub installation, and update Stripe Checkout ui_mode to the typed embedded value so changed-file mypy checks pass.

Made-with: Cursor
@stripathy1999 stripathy1999 changed the title feat: add pdf-podcast-agent — 6-agent PDF-to-podcast with live debate and Stripe payments feat: add pdf-podcast-agent - PDF-to-podcast with live debate and Stripe payments Apr 13, 2026
Replace the single demo placeholder with real chat flow screenshots covering podcast generation, live Q&A/paywall, Stripe confirmation, and host personality customization.

Made-with: Cursor
f" Host A {HOST_A_ADDRESS or '(not set)'}\n"
f" Host B {HOST_B_ADDRESS or '(not set)'}"
)
ctx.storage.set(_PENDING_PAYMENTS_KEY, "{}")

Bug: Concurrent payment attempts overwrite a shared _PENDING_PAYMENTS_KEY, causing the system to credit the wrong user's session and denying access to the paying user.
Severity: HIGH

Suggested Fix

Use a unique key for each pending transaction instead of the shared _PENDING_PAYMENTS_KEY. For example, the Stripe checkout_session_id could be used as part of the storage key to ensure each user's payment information is stored and retrieved separately, preventing data overwrites during concurrent sessions.
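
One way to sketch the per-transaction key, assuming a key-value store with get, set, and remove methods. FakeStorage, PENDING_PREFIX, save_pending, and pop_pending are hypothetical names used only for illustration; the real handler would use the agent's own storage object.

```python
import json


class FakeStorage:
    # Hypothetical in-memory stand-in for the agent's key-value storage.
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def remove(self, key):
        self._data.pop(key, None)


PENDING_PREFIX = "pending_payment:"  # assumed key prefix, not from the PR


def save_pending(storage, checkout_session_id, chat_session_id, sender):
    # One record per Stripe Checkout Session, so users cannot overwrite each other.
    record = {"chat_session_id": chat_session_id, "sender": sender}
    storage.set(PENDING_PREFIX + checkout_session_id, json.dumps(record))


def pop_pending(storage, checkout_session_id):
    # Look up the record for exactly this checkout session and consume it.
    key = PENDING_PREFIX + checkout_session_id
    raw = storage.get(key)
    if raw is None:
        return None
    storage.remove(key)
    return json.loads(raw)
```

Because the lookup is keyed by the ID that arrives in CommitPayment, two users paying at the same time resolve to their own chat sessions.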

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: pdf-podcast-agent/orchestrator.py#L951

Potential issue: A race condition exists in the payment flow due to the use of a single
shared storage key, `_PENDING_PAYMENTS_KEY`, for all concurrent transactions. When
multiple users initiate payment, the data for the second user overwrites the first. When
the first user completes their payment, the `on_commit_payment` function retrieves the
second user's session ID, marking the wrong session as paid. This prevents the first
user from accessing the feature they paid for. The fallback mechanism is not triggered
because the session ID is incorrect, not empty.

Store pending Stripe payment context per checkout_session_id instead of a shared singleton record so concurrent payment attempts cannot overwrite each other and mis-credit sessions.

Made-with: Cursor
Comment thread pdf-podcast-agent/host_a_agent.py Outdated
Collaborator

gautammanak1 left a comment

Code review (pdf-podcast-agent):

Must fix / clarify

  • README says "6 agents" and points to python run.py, but run.py only starts 4 processes and does not start host_a / host_b; update the docs or extend the launcher.
  • run.py should parse and export HOST_A_ADDRESS and HOST_B_ADDRESS from get_addresses.py output into child_env, same as the other addresses.
  • host_a_agent.py / host_b_agent.py docstrings say to run hosts "after orchestrator"; README says orchestrator last. Fix the docstrings to match (hosts must be running before the orchestrator starts receiving messages).

Consider

  • Document or refactor the await asyncio.sleep(8) in the live debate handler (blocks sequential processing).
  • Prefer subprocess over exec(open("get_addresses.py").read()) for maintainability.
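
A rough sketch of the subprocess approach, which would also cover exporting HOST_A_ADDRESS and HOST_B_ADDRESS. It assumes get_addresses.py prints one NAME=value pair per line; that output format is an assumption and is not confirmed by this PR. parse_addresses and build_child_env are hypothetical helper names.

```python
import os
import subprocess
import sys


def parse_addresses(output):
    """Collect *_ADDRESS=value pairs from the script's stdout (assumed format)."""
    addresses = {}
    for line in output.splitlines():
        name, sep, value = line.partition("=")
        name, value = name.strip(), value.strip()
        # Ignore lines without "=", unrelated names, and empty values.
        if sep and name.endswith("_ADDRESS") and value:
            addresses[name] = value
    return addresses


def build_child_env(script="get_addresses.py"):
    # Run the helper as a separate process instead of exec()-ing its source.
    result = subprocess.run(
        [sys.executable, script],
        capture_output=True,
        text=True,
        check=True,
    )
    child_env = dict(os.environ)
    child_env.update(parse_addresses(result.stdout))
    return child_env
```

Matching on the _ADDRESS suffix means host addresses are picked up the same way as the others, without a per-agent special case in the launcher.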

Looks good

  • Empty ResourceContent guard, Stripe pending map keyed by checkout session, ruff clean on the example folder.

Comment thread pdf-podcast-agent/host_b_agent.py Outdated
Comment thread pdf-podcast-agent/requirements.txt Outdated
Comment thread pdf-podcast-agent/requirements.txt Outdated
Comment thread pdf-podcast-agent/run.py
Comment thread pdf-podcast-agent/run.py Outdated
Update host agent mailbox instantiation, launch all six agents from run.py with HOST_A/HOST_B addresses, align startup-order docs, and make live debate pacing configurable/documented.

Made-with: Cursor
Update pdf-podcast-agent requirements to uagents>=0.24.1 and uagents-core>=0.4.4.

Made-with: Cursor
Add --explicit-package-bases to the pull_request_ci typecheck step so same-named modules in different directories are resolved by path and no longer collide as top-level module names.

Made-with: Cursor
Run mypy once per changed Python file (with existing flags) instead of passing all files in one command, which prevents same-named modules in different directories from colliding.

Made-with: Cursor
Comment on lines +79 to +82
combined = segments[0]
for seg in segments[1:]:
if gap:
combined = combined + gap + seg

Bug: The _stitch_audio function lacks input validation, which could lead to creating a 0-byte MP3 file if it receives an empty list of audio chunks.
Severity: LOW

Suggested Fix

Add a guard at the beginning of the _stitch_audio function or its caller to check if the msg.lines or the resulting audio chunks list is empty. If it is, raise an error or handle it gracefully instead of proceeding to stitch the audio.
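
A minimal sketch of such a guard. The stitch function below is a hypothetical, library-agnostic version of the loop quoted above: it works on any values that support +, whereas the real function operates on audio segments. It checks gap against None rather than truthiness, which is a small deviation from the quoted code.

```python
def stitch(segments, gap=None):
    """Concatenate segments in order, optionally inserting gap between them."""
    if not segments:
        # Fail loudly instead of letting an empty result be written to disk.
        raise ValueError("no audio segments to stitch")
    combined = segments[0]
    for seg in segments[1:]:
        if gap is not None:
            combined = combined + gap + seg
        else:
            combined = combined + seg
    return combined
```

The caller can catch the ValueError and report a synthesis failure instead of producing a 0-byte MP3.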

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: pdf-podcast-agent/voice_studio_agent.py#L79-L82

Potential issue: The `_stitch_audio` function will raise an `IndexError` if its `chunks`
argument is an empty list, as it attempts to access `segments[0]`. This exception is
caught, but the function then returns an empty byte string `b""`, which is written to
disk as a silent, corrupt 0-byte MP3 file. While upstream agents currently prevent this
by ensuring non-empty inputs, the `voice_studio_agent` lacks its own defensive
validation, making it fragile to changes or malformed inputs from other sources.

4 participants