Skip to content

fix(providers/anthropic): forward FilePart.Filename as document title and warn on unsupported media types#38

Merged
ethanndickson merged 3 commits into
coder_2_33from
anthropic-pdf-filename-context
Jun 4, 2026
Merged

fix(providers/anthropic): forward FilePart.Filename as document title and warn on unsupported media types#38
ethanndickson merged 3 commits into
coder_2_33from
anthropic-pdf-filename-context

Conversation

@ethanndickson

@ethanndickson ethanndickson commented Jun 3, 2026

Copy link
Copy Markdown
Member

Summary

The Anthropic provider currently ignores FilePart.Filename for both PDFs and text documents, and it silently drops any other media type without surfacing a warning. Claude therefore has no handle the model can use to refer back to an attachment, and unsupported attachments leave no trace for the caller.

This PR makes three small changes to providers/anthropic:

  1. PDF filename → document title. Forward file.Filename into DocumentBlockParam.Title on the application/pdf branch. The filename is sanitized first (see below).
  2. Text filename → document title. Same forwarding on the text/* branch, which also produces a DocumentBlockParam (via PlainTextSourceParam).
  3. Default branch with CallWarning. Match the other providers in this repo by emitting a fantasy.CallWarning when a FilePart media type is not handled, instead of silently dropping it.
docBlock.OfDocument.Title = anthropic.String(sanitizeAnthropicDocumentTitle(file.Filename))
default:
    warnings = append(warnings, fantasy.CallWarning{
        Type:    fantasy.CallWarningTypeOther,
        Message: fmt.Sprintf("file part media type %s not supported", file.MediaType),
    })

Why title and the sanitizer

Anthropic restricts document titles to alphanumerics, whitespace, hyphens, parentheses, and square brackets. A title that contains other runes (the . and _ characters that occur in almost every real filename, for example) is rejected with:

The document file name can only contain alphanumeric characters, whitespace characters, hyphens, parentheses, and square brackets.

The new sanitizeAnthropicDocumentTitle helper replaces disallowed runes with spaces, collapses consecutive whitespace, and trims. Empty or fully disallowed input falls back to "Document" so every attached document has a stable handle the model can refer back to.

Why the image branch is untouched

ImageBlockParam does not have a free-form filename or title slot, so there is no equivalent place to forward FilePart.Filename. The two branches touched here (application/pdf, text/*) are the full set of Anthropic content blocks that can carry a document title. The new default warning still covers everything else, including image MIME types that the existing image/* case does not handle (for example, image/heic).

Upstream

The same gaps exist in charmbracelet/fantasy. Tracked in charmbracelet#267.

Closes

CODAGT-540 (follow-up to #37)

@ethanndickson ethanndickson changed the title fix(providers/anthropic): include PDF filename in document context fix(providers/anthropic): forward FilePart.Filename to document context and warn on unsupported media types Jun 3, 2026
Anthropic's DocumentBlockParam exposes a Title field that the model
uses when it refers back to an attached document. Forward FilePart.Filename
into that field so users can ask the model about a document by name.

The title is sanitized first: Anthropic restricts titles to alphanumerics,
whitespace, hyphens, parentheses, and square brackets, and returns
'The document file name can only contain alphanumeric characters,
whitespace characters, hyphens, parentheses, and square brackets.' for
any title containing other runes. Disallowed runes are replaced with
spaces, runs of whitespace are collapsed, and the result is trimmed.
Empty or fully disallowed input falls back to 'Document' so every
attached document has a stable handle, matching the invariant the
OpenAI provider already enforces with its part-N.pdf synthetic name.

The sanitizer is a Go port of the implementation in coder/mux
(src/node/utils/messages/sanitizeAnthropicDocumentFilename.ts); prior
art for sending filename as title also includes vercel/ai's
@ai-sdk/anthropic, which sets document.title from part.filename when
no provider-options title is supplied.
…ed media types

Mirror the PDF document-title handling on the text/* document branch
so text attachments also reach Anthropic with a stable handle the model
can refer back to. The filename runs through the same sanitizer; an
empty or fully disallowed filename falls back to 'Document'.

Also add a default case to the file MediaType switch that emits a
CallWarning when a FilePart's media type is not handled. Previously
the Anthropic provider silently dropped any file with a media type
other than image/*, application/pdf, or text/*, so unsupported
attachments left no trace for the caller. The new behavior matches
the openai, openaicompat, openrouter, and vercel providers, which
already warn on unsupported FilePart media types.
@ethanndickson ethanndickson force-pushed the anthropic-pdf-filename-context branch from 4331a3f to 492b6b0 Compare June 3, 2026 07:07
@ethanndickson ethanndickson changed the title fix(providers/anthropic): forward FilePart.Filename to document context and warn on unsupported media types fix(providers/anthropic): forward FilePart.Filename as document title and warn on unsupported media types Jun 3, 2026
revive's exported rule requires doc comments on exported consts to
begin with the const name. Two consts in computer_use.go are flagged
on coder_2_33 and on this PR's lint job. Rewrite the comments to
satisfy revive without changing behavior.
@ethanndickson ethanndickson merged commit b2b2fc6 into coder_2_33 Jun 4, 2026
8 of 16 checks passed
ethanndickson added a commit to coder/coder that referenced this pull request Jun 9, 2026
Previously, fantasy's Anthropic provider adapter accepted PDF and text
FileParts but dropped the filename on the floor, so Claude (direct or
via Bedrock) saw the document bytes without any handle and could not
answer questions like "what's in foo.pdf". Other providers (OpenAI,
Gemini, OpenRouter, Vercel) already forwarded filenames.

Bumps coder/fantasy past coder/fantasy#38, which sanitizes
FilePart.Filename and sets it as the Anthropic DocumentBlockParam.Title
for both application/pdf and text/* attachments, and emits a CallWarning
for unsupported FilePart media types instead of silently dropping them.

On this side, plumbs the resolved filename through partsToMessageParts
so the FilePart literal carries it into the provider. The
TestModelFromConfig_AnthropicPDFFilePartReachesProvider regression test
now asserts the outbound Anthropic request includes the sanitized title
("quarterly_report.v1.pdf" becomes "quarterly report v1 pdf").

Closes CODAGT-545
ethanndickson added a commit to coder/coder that referenced this pull request Jun 9, 2026
Previously, fantasy's Anthropic provider adapter accepted PDF and text
FileParts but dropped the filename on the floor, so Claude (direct or
via Bedrock) saw the document bytes without any handle and could not
answer questions like "what's in foo.pdf". Other providers (OpenAI,
Gemini, OpenRouter, Vercel) already forwarded filenames.

Bumps coder/fantasy past coder/fantasy#38, which sanitizes
FilePart.Filename and sets it as the Anthropic DocumentBlockParam.Title
for both application/pdf and text/* attachments, and emits a CallWarning
for unsupported FilePart media types instead of silently dropping them.

On this side, plumbs the resolved filename through partsToMessageParts
so the FilePart literal carries it into the provider. The
TestModelFromConfig_AnthropicPDFFilePartReachesProvider regression test
now asserts the outbound Anthropic request includes the sanitized title
("quarterly_report.v1.pdf" becomes "quarterly report v1 pdf").

Closes CODAGT-545
ethanndickson added a commit to coder/coder that referenced this pull request Jun 10, 2026
Previously, fantasy's Anthropic provider adapter accepted PDF and text
FileParts but dropped the filename on the floor, so Claude (direct or
via Bedrock) saw the document bytes without any handle and could not
answer questions like "what's in foo.pdf". Other providers (OpenAI,
Gemini, OpenRouter, Vercel) already forwarded filenames.

Bumps `coder/fantasy` past
[coder/fantasy#38](coder/fantasy#38), which
sanitizes `FilePart.Filename` and sets it as the Anthropic
`DocumentBlockParam.Title` for both `application/pdf` and `text/*`
attachments, and emits a `CallWarning` for unsupported `FilePart` media
types instead of silently dropping them.

On this side, plumbs the resolved filename through `partsToMessageParts`
so the `FilePart` literal carries it into the provider. The existing
`TestModelFromConfig_AnthropicPDFFilePartReachesProvider` regression
test is extended to assert the outbound Anthropic request includes the
sanitized title (`quarterly_report.v1.pdf` becomes `quarterly report v1
pdf`).

Closes CODAGT-545
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant