---
name: tokencut
description: Compress text using AgentReady's TokenCut API to reduce token usage by 40-60% with minimal accuracy loss. Use when you need to reduce prompt length or compress text before sending to LLMs.
category: Data & Integrations
metadata:
author: curtastrophe.zo.computer
emojis: ["✂️", "📉", "⚡"]
tags:
- compression
- token-optimization
- ai-costs
- agentready
---

# TokenCut - Text Compression Skill

This skill compresses text using AgentReady's TokenCut API to reduce token usage by 40-60% with minimal accuracy loss (~0.4% on standard benchmarks).

> **Data Disclosure:** All input text is transmitted to AgentReady's external servers (`agentready.cloud`) for compression processing. Your text leaves your environment. Do not compress text containing passwords, API keys, PII, or confidential information without first reviewing [AgentReady's data retention policy](https://agentready.cloud). Your LLM API keys are never sent to AgentReady, only the text content itself.

---

## ⚠️ CRITICAL: When TokenCut Works vs. Doesn't Work

### ✅ TokenCut WORKS (Saves Tokens)

| Scenario | Why It Works |
|----------|--------------|
| **Script calling external LLM** | Script fetches data → compresses → sends to OpenAI/Claude API. The LLM only sees compressed text. |
| **zo.space API routes** | Route receives request → compresses → calls LLM. The LLM only sees compressed text. |
| **Multi-agent pipeline with file handoff** | Agent A compresses → writes to file → Agent B reads compressed file. Agent B sees less content. |
| **Pre-processing data for later** | Compress now, store. Future LLM calls use compressed data. |

**The key:** The LLM that processes the data must see the COMPRESSED version, not the original.

### ❌ TokenCut DOESN'T WORK (No Savings)

| Scenario | Why It Fails |
|----------|--------------|
| **Single agent fetching and processing** | Agent reads data into its context (tokens consumed), then compresses. Too late. |
| **Agent calling TokenCut on its own fetched data** | The agent already "saw" the uncompressed data. |
| **Compression after LLM processing** | Tokens already consumed. |

**The problem:** If an LLM agent fetches data via API, it immediately reads that data into its context. Compressing afterward doesn't undo the token usage.

---

## Architecture Comparison

### ❌ Wrong Way (Single Agent)

```
┌─────────────────────────────────────────────────────────────────┐
│ SCHEDULED AGENT (GLM 5) │
│ │
│ Step 1: "Fetch emails from Gmail" │
│ └── API returns 500 emails │
│ └── Agent READS them ← 🔴 TOKENS CONSUMED HERE │
│ │
│ Step 2: "Compress with TokenCut" │
│ └── Returns compressed text │
│ └── Agent READS result ← 🔴 MORE TOKENS │
│ │
│ Step 3: "Summarize and send digest" │
│ └── No savings possible - already consumed tokens │
│ │
└─────────────────────────────────────────────────────────────────┘
```

### ✅ Right Way (Script + External LLM)

```
┌─────────────────────────────────────────────────────────────────┐
│ SCRIPT (No LLM context) │
│ │
│ Step 1: Fetch emails from Gmail API │
│ └── Returns raw data (no token cost - just API call) │
│ │
│ Step 2: Compress with TokenCut │
│ └── Returns compressed text │
│ │
│ Step 3: Send to EXTERNAL LLM (OpenAI, Claude, etc.) │
│ └── 🟢 LLM only sees COMPRESSED text │
│ └── 🟢 Token savings: 40-60% │
│ │
│ Step 4: Save result to file │
│ └── Agent reads tiny summary (minimal tokens) │
│ │
└─────────────────────────────────────────────────────────────────┘
```

---

## Prerequisites

- AgentReady API key saved in Zo Secrets as `AGENTREADY_API_KEY`

---

## Usage

### Option 1: In Scripts (Recommended for Token Savings)

Use TokenCut in scripts that call external LLMs:

```typescript
import { compressText } from "/home/workspace/Skills/tokencut/scripts/compress.ts";

// Fetch data (no LLM yet)
const rawData = await fetchDataFromAPI();

// Compress BEFORE sending to LLM
const compressed = await compressText(rawData, "standard");

// Send to external LLM (only sees compressed version)
// Send to external LLM (only sees compressed version).
// Assumes your OpenAI key is available as OPENAI_API_KEY.
const response = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-4o",
    messages: [{ role: "user", content: compressed }],
  }),
});
```

### Option 2: Direct Script Usage

Run the compression script directly:

```bash
bun /home/workspace/Skills/tokencut/scripts/compress.ts --text "your long text here" --level standard
```

**Arguments:**
- `--text` or `-t`: The text to compress (required)
- `--level` or `-l`: Compression level - `light`, `standard` (default), or `aggressive`
- `--output` or `-o`: Optional output file path

**Examples:**
```bash
# Standard compression (recommended)
bun /home/workspace/Skills/tokencut/scripts/compress.ts --text "your prompt here" --level standard

# Light compression (preserves more, saves 20-30%)
bun /home/workspace/Skills/tokencut/scripts/compress.ts -t "your text" -l light

# Aggressive compression (maximum savings, ~60%)
bun /home/workspace/Skills/tokencut/scripts/compress.ts -t "your text" -l aggressive -o result.txt
```

### Option 3: REST API Direct

Call the AgentReady API directly:

```bash
curl -X POST https://agentready.cloud/v1/compress \
-H "Authorization: Bearer $AGENTREADY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"text": "your text to compress", "level": "standard"}'
```

---

## Compression Levels

| Level | Token Savings | Best For |
|-------|---------------|----------|
| Light | 20-30% | Code, math, technical content |
| Standard | 40-50% | General prompts, web content (recommended) |
| Aggressive | 55-60% | Long documents, articles |

---

## Integration Patterns

### Pattern 1: Script-Based Workflow (Best for Scheduled Tasks)

Instead of having an agent do everything, use a script:

```yaml
# Agent instruction
Run the script `bun /home/workspace/Scripts/your-batch-script.ts --output /tmp/result.md`
Then read the output file and send a summary via Telegram.
```

**Why this works:** The script handles the heavy processing (fetch + compress + LLM call), the agent only sees a tiny summary.

### Pattern 2: zo.space API Route

Create an API route that compresses before calling an LLM:

```typescript
// /api/compress-and-process route
import { compressText } from "/home/workspace/Skills/tokencut/scripts/compress.ts";

export default async (c) => {
const { text } = await c.req.json();

// Compress first
const compressed = await compressText(text, "standard");

// Then process with LLM (only sees compressed); callLLM is a placeholder
const result = await callLLM(compressed);

return c.json({ result });
};
```

### Pattern 3: Multi-Agent Pipeline with File Handoff

```
Agent A (Fetcher):
1. Fetch data from API
2. Compress with TokenCut
3. Write to file: /tmp/compressed_data.txt

Agent B (Processor):
1. Read file: /tmp/compressed_data.txt
2. Process the compressed data
3. Output final result
```

**Note:** Agent B saves tokens because it reads the compressed file. Agent A still consumes tokens reading the original data, but its job is minimal (just fetch and compress).
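Agent A's three steps can be sketched as a short function. The data source is a placeholder, the inline `compressText` stub stands in for the skill's bundled helper, and `node:fs` is used here so the sketch runs under any runtime (the bundled script itself uses `Bun.write`):

```typescript
import { writeFile } from "node:fs/promises";

// Stub; in a real pipeline, import compressText from the skill's compress.ts.
async function compressText(text: string, _level: string): Promise<string> {
  return text;
}

// Agent A: fetch raw data, compress it, write the handoff file Agent B reads.
export async function runAgentA(
  fetchFromAPI: () => Promise<string>,
  handoffPath = "/tmp/compressed_data.txt"
): Promise<string> {
  const raw = await fetchFromAPI();                       // 1. fetch
  const compressed = await compressText(raw, "standard"); // 2. compress
  await writeFile(handoffPath, compressed);               // 3. hand off
  return handoffPath;
}
```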

---

## What's New (Feb 2026)

- **`/v1/compress` endpoint** - Now live and stable
- **Simplified auth** - No `X-Upstream-API-Key` needed, only your AgentReady API key
- **Optimized compression** - More efficient token reduction
- **Token usage tracking** - Better visibility into savings
- **Self-hosted version** - Coming soon for entirely local compression

---

## Tips

- **Best for**: Web content, articles, documentation, long prompts
- **Preserves**: Code blocks, URLs, numbers, proper nouns
- **Avoid for**: code (use the `light` level if you must), math equations, very short prompts (<100 tokens)

---

## When to Trigger This Skill

### Suggested Use Cases

Consider offering TokenCut compression when the user is building:

- Scripts that process multiple items through an external LLM in a batch/loop
- Multi-agent pipelines with file handoffs between stages
- zo.space API routes that process text before calling an LLM
- Pre-processing pipelines that prepare data for later LLM consumption

**Important:** Because this skill sends text to an external API, always confirm with the user before adding compression to their workflow. Never inject TokenCut silently into scripts or pipelines.

**Opt-in tip:** On Zo, users can create a conditional rule to be prompted about TokenCut when writing batch LLM scripts.

### When NOT to Trigger

This skill should NOT be triggered when:
- A single agent is fetching and processing data in one go (tokens already consumed in context)
- The LLM has already consumed the tokens
- You're just trying to reduce an agent's output size
- The input is very short (<100 tokens / ~400 chars)
- The input is structured data (JSON schemas, configs) where exact format matters
- The input is executable code (compress docs ABOUT code, not the code itself)
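The size and structure rules above can be enforced with a small pre-flight guard; a minimal sketch, where the 400-character cutoff mirrors the ~100-token floor and the JSON check is an illustrative heuristic:

```typescript
// Returns true only when compression is likely worthwhile and safe.
export function shouldCompress(text: string): boolean {
  // Very short prompts (<100 tokens, roughly 400 chars): not worth a network call.
  if (text.length < 400) return false;
  // Structured data where the exact format matters: leave untouched.
  try {
    JSON.parse(text);
    return false;
  } catch {
    return true;
  }
}
```

Call this before `compressText` and fall back to the original text when it returns false.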
---

## Bundled script: `scripts/compress.ts`

```typescript
#!/usr/bin/env bun

interface CompressResponse {
  success: boolean;
  data?: {
    original_length: number;
    compressed_length: number;
    savings_percent: number;
    compressed_text: string;
  };
  error?: string;
}

const API_URL = "https://agentready.cloud/v1/compress";

async function getApiKey(): Promise<string> {
  const apiKey = process.env.AGENTREADY_API_KEY;
  if (!apiKey) {
    throw new Error(
      "AGENTREADY_API_KEY not found in environment. Please add it to Zo Secrets."
    );
  }
  return apiKey;
}

export async function compressText(
  text: string,
  level: "light" | "standard" | "aggressive" = "standard"
): Promise<string> {
  const apiKey = await getApiKey();

  const response = await fetch(API_URL, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ text, level }),
  });

  if (!response.ok) {
    const error = await response.text();
    throw new Error(`Compression failed: ${response.status} ${error}`);
  }

  const result: CompressResponse = await response.json();

  if (!result.success || !result.data) {
    throw new Error(result.error || "Compression failed");
  }

  return result.data.compressed_text;
}

async function main() {
  const args = process.argv.slice(2);
  let text = "";
  let level: "light" | "standard" | "aggressive" = "standard";
  let outputFile = "";

  // Parse arguments
  for (let i = 0; i < args.length; i++) {
    const arg = args[i];
    if (arg === "--text" || arg === "-t") {
      text = args[++i];
    } else if (arg === "--level" || arg === "-l") {
      const lvl = args[++i].toLowerCase();
      if (["light", "standard", "aggressive"].includes(lvl)) {
        level = lvl as "light" | "standard" | "aggressive";
      } else {
        console.warn(`Unknown level "${lvl}", falling back to "standard"`);
      }
    } else if (arg === "--output" || arg === "-o") {
      outputFile = args[++i];
    } else if (arg === "--help" || arg === "-h") {
      console.log(`
TokenCut - Text Compression CLI

Usage: bun compress.ts [options]

Options:
  -t, --text <text>     Text to compress (required)
  -l, --level <level>   Compression level: light, standard, aggressive (default: standard)
  -o, --output <file>   Output file path (optional)
  -h, --help            Show this help message

Examples:
  bun compress.ts -t "your text here"
  bun compress.ts --text "your text" --level aggressive
  bun compress.ts -t "your text" -o compressed.txt
`);
      process.exit(0);
    }
  }

  if (!text) {
    console.error("Error: --text or -t is required");
    console.log("Run with --help for usage information");
    process.exit(1);
  }

  console.log(`Compressing text (${text.length} chars) with ${level} level...`);

  try {
    const compressed = await compressText(text, level);

    if (outputFile) {
      await Bun.write(outputFile, compressed);
      console.log(`Compressed text saved to: ${outputFile}`);
    } else {
      console.log("\n--- Compressed Text ---\n");
      console.log(compressed);
    }

    // Character-based estimate; actual token savings will differ slightly.
    const savings = ((1 - compressed.length / text.length) * 100).toFixed(1);
    console.log(`\nOriginal: ${text.length} chars | Compressed: ${compressed.length} chars | Savings: ${savings}%`);
  } catch (error) {
    console.error("Error:", error instanceof Error ? error.message : error);
    process.exit(1);
  }
}

if (import.meta.main) {
  main();
}
```