[cli] std/zip: expose raw zlib inflate/deflate primitives #223

Description

@sunholo-voight-kampff

Need raw deflate and zlib inflate primitives in std/zip (or a new std/deflate) so AILANG modules can decompress zlib (RFC 1950) and raw deflate (RFC 1951) streams without going through a ZIP archive container.

Context

While building PDF annotation extraction in ailang-parse (extracting highlights/comments without invoking the AI multimodal path), we hit this limitation: PDF object streams (/ObjStm) and other compressed streams use the FlateDecode filter, which is zlib-wrapped deflate (RFC 1950: a 2-byte header, the deflate data, and a 4-byte Adler-32 trailer). Modern PDFs (PDF 1.5+, anything 'optimized for web') bundle small objects, including annotations, into ObjStm, so any AILANG module that wants to read PDF metadata without shelling out needs an inflate primitive.
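To make the framing concrete, here is a minimal Python sketch (Python's standard `zlib` module stands in for the requested primitive; the payload is made up) that splits a zlib stream into its header, raw deflate body, and Adler-32 trailer:

```python
import struct
import zlib

# Hypothetical payload standing in for the contents of a FlateDecode stream.
payload = b"<< /Type /Annot /Subtype /Highlight >>" * 4

# zlib.compress emits the RFC 1950 container around the deflate data.
stream = zlib.compress(payload)

# 2-byte header: CMF (compression method in the low nibble, 8 = deflate)
# and FLG, chosen so the 16-bit value is a multiple of 31.
cmf, flg = stream[0], stream[1]
header_ok = (cmf & 0x0F) == 8 and ((cmf << 8) | flg) % 31 == 0

# Everything between header and trailer is raw deflate (RFC 1951).
body = stream[2:-4]

# 4-byte trailer: big-endian Adler-32 of the uncompressed data.
(trailer,) = struct.unpack(">I", stream[-4:])

# A negative wbits value tells zlib to expect raw deflate with no wrapper.
raw = zlib.decompress(body, wbits=-15)
checksum_ok = trailer == zlib.adler32(raw)
```

A raw `inflate` would consume only `body`, while an `inflateZlib` would consume the whole `stream` and verify the trailer.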

Current stdlib gap

  • std/zip exposes ZIP-entry-level APIs only (listEntries, readEntry, readEntryBytes); no way to feed it a raw deflate or zlib byte string
  • std/gzip.decompress requires the gzip wrapper (RFC 1952), so it cannot decode zlib or raw deflate streams
  • No std/deflate module exists
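The incompatibility between the containers can be illustrated with a short Python sketch. The helper name `sniff_container` is hypothetical, and Python's `gzip` and `zlib` modules are used only as stand-ins for the AILANG stdlib behaviour described above:

```python
import gzip
import zlib


def sniff_container(data: bytes) -> str:
    """Best-effort guess at the wrapper around a deflate stream."""
    if len(data) >= 2 and data[0] == 0x1F and data[1] == 0x8B:
        return "gzip"  # RFC 1952 magic bytes
    if (
        len(data) >= 2
        and (data[0] & 0x0F) == 8                 # CM = 8 (deflate)
        and ((data[0] << 8) | data[1]) % 31 == 0  # FCHECK constraint
    ):
        return "zlib"  # RFC 1950 header looks valid
    # Raw deflate (RFC 1951) has no magic bytes, so it cannot be
    # positively identified; callers have to know what they hold.
    return "unknown"


sample = b"hello " * 10
zlib_stream = zlib.compress(sample)
gzip_stream = gzip.compress(sample)

# A gzip-only decoder refuses a zlib stream because the magic bytes differ.
try:
    gzip.decompress(zlib_stream)
    gzip_accepts_zlib = True
except (OSError, zlib.error):
    gzip_accepts_zlib = False
```

The zlib check is a heuristic on two bytes, which is one reason separate entry points per format are easier to reason about than auto-detection.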

Use cases beyond PDF

  • PDF FlateDecode (annotations, metadata, content streams)
  • Protocols and formats built on zlib or raw deflate (HTTP Content-Encoding: deflate and PNG IDAT chunks use zlib; WebSocket permessage-deflate uses raw deflate)
  • Custom binary formats with embedded zlib payloads

Proposed API

Either extend std/zip:
inflate(input: string) -> Result[string, string] -- raw deflate (RFC 1951), no header
inflateZlib(input: string) -> Result[string, string] -- zlib-wrapped (RFC 1950)
deflate(input: string, level: int) -> Result[string, string] -- raw deflate, no header
deflateZlib(input: string, level: int) -> Result[string, string] -- zlib-wrapped (RFC 1950)

Or new std/deflate module with the same shape. Same base64 string convention as std/gzip and std/zip readEntryBytes.
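As a rough reference for the intended semantics, the four functions could behave like the following Python sketch. This is not AILANG code. It assumes that both input and output are base64 strings, that `Result` can be modelled as an `("ok", value)` or `("err", message)` tuple, and that `level` follows zlib's 0 to 9 range; none of these details are confirmed by the stdlib.

```python
import base64
import binascii
import zlib
from typing import Tuple

# Assumed stand-in for Result[string, string]: ("ok", b64) or ("err", message).
Result = Tuple[str, str]


def _inflate(input_b64: str, wbits: int) -> Result:
    try:
        data = base64.b64decode(input_b64, validate=True)
        out = zlib.decompress(data, wbits=wbits)
        return ("ok", base64.b64encode(out).decode("ascii"))
    except (binascii.Error, ValueError, zlib.error) as exc:
        return ("err", str(exc))


def _deflate(input_b64: str, level: int, wbits: int) -> Result:
    try:
        data = base64.b64decode(input_b64, validate=True)
        comp = zlib.compressobj(level, zlib.DEFLATED, wbits)
        out = comp.compress(data) + comp.flush()
        return ("ok", base64.b64encode(out).decode("ascii"))
    except (binascii.Error, ValueError, zlib.error) as exc:
        return ("err", str(exc))


# wbits = -15 selects raw deflate; wbits = 15 selects the zlib wrapper.
def inflate(input_b64: str) -> Result:
    return _inflate(input_b64, -15)


def inflate_zlib(input_b64: str) -> Result:
    return _inflate(input_b64, 15)


def deflate(input_b64: str, level: int) -> Result:
    return _deflate(input_b64, level, -15)


def deflate_zlib(input_b64: str, level: int) -> Result:
    return _deflate(input_b64, level, 15)
```

Returning an error value instead of raising keeps malformed base64, corrupt streams, and out-of-range levels on the same path, which matches the `Result[string, string]` shape in the proposal.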

Workaround for ailang-parse today

For Arwin's specific PDFs (Quartz PDFContext output, no ObjStm), we can ship an AILANG-only annotation extractor with pure string scanning — it works because annotations are inline. But it will silently miss annotations on any PDF that compresses objects, so the feature is fragile without this primitive.
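The failure mode is easy to reproduce. The Python sketch below builds two simplified, made-up PDF fragments (the object-stream one omits the /N and /First entries and the offset table a real ObjStm carries) and shows that a byte scan finds the annotation only when it is stored inline:

```python
import zlib

# Hypothetical annotation dictionary and the marker a string scanner looks for.
annotation = b"<< /Type /Annot /Subtype /Highlight /Contents (note) >>\n"
needle = b"/Subtype /Highlight"

# Case 1: the annotation is written inline as its own object.
inline_pdf = b"1 0 obj\n" + annotation + b"endobj\n"

# Case 2: annotations are packed into a FlateDecode-compressed stream.
packed = zlib.compress(annotation * 3)
objstm_pdf = (
    b"2 0 obj\n<< /Type /ObjStm /Filter /FlateDecode /Length "
    + str(len(packed)).encode("ascii")
    + b" >>\nstream\n"
    + packed
    + b"\nendstream\nendobj\n"
)

found_inline = needle in inline_pdf                        # scanner succeeds
found_packed = needle in objstm_pdf                        # scanner sees only compressed bytes
found_after_inflate = needle in zlib.decompress(packed)    # visible once inflated
```

The scan over `objstm_pdf` raises no error and simply returns nothing, which is the silent miss described above.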


Reported by: cli via ailang messages

Labels: ailang-message (Message from AILANG messaging system), feature (Feature request), from:cli (Message from cli agent)
