A CLI tool for collecting tweets from X/Twitter API v2, designed for OpenClaw integration.
- Get User Tweets: Fetch recent tweets from any public user
- Get Single Tweet: Retrieve a specific tweet by ID
- Search Tweets: Search with X query operators
- Collect All Tweets: Full timeline collection with pagination
- Adaptive Rate Limiting: Automatically adjusts request speed based on API quota
- Progress Persistence: Resume interrupted collections
- Multiple Output Formats: JSON, JSONL, or Markdown
# Clone the repository
cd /path/to/x-collector
# Install with pip (development mode)
pip install -e .
# Or install dependencies manually
pip install httpx pyyaml click rich
Create the config file:
x-collector config init
This creates ~/.openclaw/x-collector.yaml. Edit it with your X Bearer Token:
x:
  bearer_token: "YOUR_BEARER_TOKEN_HERE"
- Go to X Developer Portal
- Create a project and app (or use existing)
- Navigate to "Keys and tokens"
- Generate and copy the Bearer Token
# Get recent tweets from a user
x-collector get-tweets elonmusk --limit 50
# Get a specific tweet
x-collector get-tweet 1234567890
# Search tweets
x-collector search "bitcoin" --limit 100
# Collect ALL tweets from a user
x-collector collect-all elonmusk --output ./elon_data
Fetch recent tweets from a user.
x-collector get-tweets <username> [OPTIONS]
Options:
-l, --limit INTEGER Maximum tweets to fetch (default: 100)
--since-id TEXT Only fetch tweets after this ID
--until-id TEXT Only fetch tweets before this ID
-o, --output TEXT Output file path
-f, --format [json|jsonl|markdown] Output format (default: json)
Examples:
# Get 50 recent tweets
x-collector get-tweets elonmusk --limit 50
# Save to file as markdown
x-collector get-tweets elonmusk --output tweets.md --format markdown
# Get tweets since a specific ID
x-collector get-tweets elonmusk --since-id 1234567890
Get a single tweet by ID.
x-collector get-tweet <tweet_id> [OPTIONS]
Options:
-o, --output TEXT Output file path
-f, --format [json|markdown] Output format (default: json)
Examples:
x-collector get-tweet 1234567890
x-collector get-tweet 1234567890 --format markdown
Search for tweets matching a query.
x-collector search <query> [OPTIONS]
Options:
-l, --limit INTEGER Maximum tweets to fetch (default: 100)
--since-id TEXT Only fetch tweets after this ID
--until-id TEXT Only fetch tweets before this ID
-o, --output TEXT Output file path
-f, --format [json|jsonl|markdown] Output format (default: json)
X search operators:
| Operator | Description | Example |
|---|---|---|
| from: | Tweets from user | from:elonmusk |
| to: | Replies to user | to:elonmusk |
| #hashtag | Contains hashtag | #bitcoin |
| "phrase" | Exact phrase | "to the moon" |
| lang: | Language | lang:en |
| -word | Exclude word | -retweet |
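The operators in the table combine by simple concatenation, separated by spaces. As a rough illustration, the following sketch assembles a query string from its parts before passing it to the search command. The build_query helper and its parameter names are hypothetical and are not part of x-collector.

```python
def build_query(keywords=(), from_user=None, to_user=None, hashtags=(),
                phrase=None, lang=None, exclude=()):
    """Join search operators into one space-separated query string.

    Hypothetical helper for illustration; x-collector itself only takes
    the finished query string.
    """
    parts = list(keywords)
    if from_user:
        parts.append(f"from:{from_user.lstrip('@')}")
    if to_user:
        parts.append(f"to:{to_user.lstrip('@')}")
    # Accept hashtags with or without the leading '#'.
    parts.extend(f"#{tag.lstrip('#')}" for tag in hashtags)
    if phrase:
        parts.append(f'"{phrase}"')
    if lang:
        parts.append(f"lang:{lang}")
    # A leading '-' excludes tweets containing the word.
    parts.extend(f"-{word}" for word in exclude)
    return " ".join(parts)


query = build_query(keywords=["crypto"], from_user="@elonmusk",
                    lang="en", exclude=["giveaway"])
print(query)  # crypto from:elonmusk lang:en -giveaway
```

The resulting string can then be quoted and passed as the query argument, for example x-collector search "crypto from:elonmusk lang:en -giveaway".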
Examples:
# Search for bitcoin tweets
x-collector search "bitcoin" --limit 100
# Search tweets from a specific user about crypto
x-collector search "from:elonmusk crypto"
# Search with multiple operators
x-collector search "#AI lang:en -is:retweet"
Collect complete tweet history from a user.
x-collector collect-all <username> [OPTIONS]
Options:
-o, --output TEXT Output directory (default: ./x_data)
-f, --format [json|jsonl|markdown] Output format (default: json)
-m, --max-tweets INTEGER Maximum tweets to collect
--since-id TEXT Only collect tweets after this ID
--until-id TEXT Only collect tweets before this ID
--resume/--no-resume Resume from last progress (default: resume)
Features:
- Automatic pagination: Handles X's pagination automatically
- Rate limit handling: Waits when approaching API limits
- Progress saving: Saves progress to resume if interrupted
- Batch files: Saves tweets in batches plus a combined file
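The pagination and progress-saving behavior listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the tool's actual implementation: the fetch_page callable, the next_token field, and the layout of the progress file are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def collect_pages(fetch_page, progress_path):
    """Yield pages of tweets, persisting the pagination token between pages.

    fetch_page(token) is assumed to return (tweets, next_token), with
    next_token set to None on the last page.
    """
    progress_path = Path(progress_path)
    token = None
    if progress_path.exists():
        # Resume from the token saved by an interrupted run.
        token = json.loads(progress_path.read_text())["next_token"]
    while True:
        tweets, token = fetch_page(token)
        yield tweets
        if token is None:
            break
        # Saved only after the caller has handled the page, so an interrupted
        # page is fetched again on resume rather than lost.
        progress_path.write_text(json.dumps({"next_token": token}))
    # Finished: remove the progress file so the next run starts fresh.
    progress_path.unlink(missing_ok=True)


# Stand-in for the real API: two pages linked by a token.
PAGES = {None: (["t1", "t2"], "p2"), "p2": (["t3"], None)}


def fake_fetch(token):
    return PAGES[token]


with tempfile.TemporaryDirectory() as tmp:
    progress = Path(tmp) / ".progress.json"
    collected = [t for page in collect_pages(fake_fetch, progress) for t in page]
    print(collected)          # ['t1', 't2', 't3']
    print(progress.exists())  # False
```

Writing the token after each page is handled gives at-least-once behavior: a resumed run may repeat one page, so deduplicating by tweet ID is a reasonable safeguard.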
Examples:
# Collect all tweets (may take hours for prolific accounts)
x-collector collect-all elonmusk --output ./elon_data
# Collect with limit
x-collector collect-all elonmusk --max-tweets 1000
# Collect in JSONL format (streaming-friendly)
x-collector collect-all elonmusk --format jsonl
Output structure:
./x_data/
├── batch_0001.json # First batch
├── batch_0002.json # Second batch
├── ...
├── all_tweets.json # Combined file
└── .progress.json # Progress file (deleted on completion)
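If a run is interrupted before the combined file is written, the batch files can be merged by hand. The sketch below assumes each batch file holds a JSON array of tweet objects; that layout is an assumption, so check a real batch file before relying on it. The merge_batches helper is hypothetical.

```python
import json
import tempfile
from pathlib import Path


def merge_batches(data_dir):
    """Concatenate every batch_*.json file in data_dir, in file-name order.

    Assumes each batch file contains a JSON array of tweets.
    """
    tweets = []
    # Zero-padded names (batch_0001, batch_0002, ...) sort correctly as strings.
    for batch_file in sorted(Path(data_dir).glob("batch_*.json")):
        tweets.extend(json.loads(batch_file.read_text(encoding="utf-8")))
    return tweets


# Demonstration with two small batch files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "batch_0002.json").write_text(
        json.dumps([{"id": "3"}]), encoding="utf-8")
    (Path(tmp) / "batch_0001.json").write_text(
        json.dumps([{"id": "1"}, {"id": "2"}]), encoding="utf-8")
    print([t["id"] for t in merge_batches(tmp)])  # ['1', '2', '3']
```

The glob pattern deliberately skips all_tweets.json and .progress.json.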
Get all tweets in a thread/conversation.
x-collector get-thread <conversation_id> [OPTIONS]
Options:
-o, --output TEXT Output file path
-f, --format [json|markdown] Output format (default: json)
Manage configuration.
# Create default config file
x-collector config init
# Show current configuration
x-collector config show
# Validate configuration
x-collector config validate
Configuration file location: ~/.openclaw/x-collector.yaml
Full configuration options:
x:
  # Required: X API v2 Bearer Token
  bearer_token: "AAAA..."
  rate_limit:
    # Seconds between requests (normal mode)
    safe_delay: 0.7
    # Seconds between requests (approaching limit)
    slow_delay: 2.0
    # Slow down when remaining requests < this
    safe_threshold: 10
    # Wait for reset when remaining < this
    critical_threshold: 2
  collection:
    # Max tweets per API request (max 100)
    max_results_per_page: 100
    # HTTP timeout in seconds
    timeout: 30
    # User agent string
    user_agent: "XCollector/0.1.0"
  output:
    # Default format: json, jsonl, markdown
    format: "json"
    # Include referenced tweets
    include_referenced: true
Environment variables can override config file values:
| Variable | Description |
|---|---|
| X_BEARER_TOKEN | X API Bearer Token (overrides config file) |
| X_COLLECTOR_CONFIG | Path to config file (overrides default path) |
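The override order described by the table can be sketched as below. This is a minimal illustration, not the tool's actual loading code: the two resolve_* helpers are hypothetical, and the parsed config is represented as a plain dictionary.

```python
import os

DEFAULT_CONFIG_PATH = "~/.openclaw/x-collector.yaml"


def resolve_config_path(env=os.environ):
    """X_COLLECTOR_CONFIG, when set, replaces the default config path."""
    return os.path.expanduser(env.get("X_COLLECTOR_CONFIG", DEFAULT_CONFIG_PATH))


def resolve_bearer_token(file_config, env=os.environ):
    """X_BEARER_TOKEN, when set and non-empty, wins over the config file."""
    return env.get("X_BEARER_TOKEN") or file_config.get("x", {}).get("bearer_token")


# file_config stands in for the parsed YAML file.
file_config = {"x": {"bearer_token": "token-from-file"}}

print(resolve_bearer_token(file_config, env={}))
# token-from-file
print(resolve_bearer_token(file_config, env={"X_BEARER_TOKEN": "token-from-env"}))
# token-from-env
print(resolve_config_path(env={"X_COLLECTOR_CONFIG": "/tmp/custom.yaml"}))
# /tmp/custom.yaml
```

Setting the token through the environment keeps it out of the YAML file, which is convenient in CI or container setups.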
X API v2 has strict rate limits. This tool implements adaptive rate limiting:
- Normal Mode (remaining > safe_threshold)
  - Wait safe_delay seconds between requests
  - Default: 0.7 seconds
- Slow Mode (remaining < safe_threshold)
  - Wait slow_delay seconds between requests
  - Default: 2.0 seconds
- Critical Mode (remaining < critical_threshold)
  - Wait for rate limit window to reset
  - Automatically resumes after reset
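The three modes amount to a small decision function. The sketch below uses the default values from the configuration; pick_delay is a hypothetical name, and treating remaining equal to safe_threshold as normal mode is an assumption, since the description leaves that boundary unspecified.

```python
def pick_delay(remaining, seconds_until_reset, *, safe_delay=0.7, slow_delay=2.0,
               safe_threshold=10, critical_threshold=2):
    """Return how many seconds to wait before the next request."""
    if remaining < critical_threshold:
        # Critical mode: sit out the rest of the rate limit window.
        return max(seconds_until_reset, 0.0)
    if remaining < safe_threshold:
        # Slow mode: quota is running low.
        return slow_delay
    # Normal mode (assumed to include remaining == safe_threshold).
    return safe_delay


print(pick_delay(50, 300.0))  # 0.7
print(pick_delay(5, 300.0))   # 2.0
print(pick_delay(1, 300.0))   # 300.0
```

The critical check comes first because any value below critical_threshold is also below safe_threshold.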
| Tier | User Tweets | Search |
|---|---|---|
| Basic | ~15/15min | ~450/15min |
| Pro | 300/15min | 450/15min |
| Enterprise | Higher | Higher |
The collector reads the x-rate-limit-remaining header from API responses to adapt in real-time.
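As a rough sketch of reading that header, the function below extracts the remaining quota and the time until the window resets. It also reads x-rate-limit-reset, which the X API returns as a Unix timestamp in seconds. Whether the collector uses that second header is an assumption, and parse_rate_limit is a hypothetical name.

```python
import time


def parse_rate_limit(headers, now=None):
    """Return (remaining, seconds_until_reset) from response headers.

    headers is a plain dict with lowercase keys here; an HTTP client's
    header object is usually case-insensitive. remaining is None when the
    header is missing.
    """
    now = time.time() if now is None else now
    raw_remaining = headers.get("x-rate-limit-remaining")
    remaining = int(raw_remaining) if raw_remaining is not None else None
    # The reset header is an epoch timestamp; fall back to "now" if absent.
    reset_at = float(headers.get("x-rate-limit-reset", now))
    return remaining, max(reset_at - now, 0.0)


headers = {"x-rate-limit-remaining": "7", "x-rate-limit-reset": "1700000060"}
print(parse_rate_limit(headers, now=1700000000))  # (7, 60.0)
```

Passing now explicitly keeps the function deterministic, which makes it easy to test.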
{
"username": "elonmusk",
"collected_at": "2024-01-15T10:30:00",
"total_count": 100,
"tweets": [
{
"id": "1234567890",
"text": "Tweet content here",
"created_at": "2024-01-15T09:00:00",
"metrics": {
"like_count": 1000,
"retweet_count": 500
}
}
]
}
{"id": "1234567890", "text": "Tweet 1", ...}
{"id": "1234567891", "text": "Tweet 2", ...}
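JSONL output can be processed one line at a time without loading the whole file. The sketch below assumes each line carries the same fields as a tweet object in the JSON format, including metrics; the sample data and the iter_jsonl helper are made up for the example.

```python
import io
import json


def iter_jsonl(stream):
    """Yield one parsed tweet per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# In-memory stand-in for a JSONL file written by the collector.
sample = io.StringIO(
    '{"id": "1234567890", "text": "Tweet 1", '
    '"metrics": {"like_count": 1000, "retweet_count": 500}}\n'
    '{"id": "1234567891", "text": "Tweet 2", '
    '"metrics": {"like_count": 10, "retweet_count": 2}}\n'
)

total_likes = sum(t["metrics"]["like_count"] for t in iter_jsonl(sample))
print(total_likes)  # 1010
```

For a real file, replace the StringIO object with open(path, encoding="utf-8").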
# @elonmusk X Archive
**Collected:** 2024-01-15 10:30
**Total Tweets:** 100
---
### 2024-01-15 09:00
Tweet content here
*1000 likes, 500 RTs*
[View on X](https://x.com/elonmusk/status/1234567890)
---
You can also use the collector as a Python library:
import asyncio
from x_collector import XCollector, XConfig

async def main():
    # Load config
    config = XConfig.load()
    collector = XCollector(config)
    # Get recent tweets
    tweets = await collector.get_user_tweets("elonmusk", limit=50)
    for tweet in tweets:
        print(f"{tweet.created_at}: {tweet.text[:100]}")
    # Get a single tweet
    tweet = await collector.get_tweet("1234567890")
    print(tweet.to_json())
    # Search
    results = await collector.search_tweets("bitcoin", limit=100)
    print(f"Found {len(results)} tweets")
    # Collect all (generator)
    async for batch in collector.collect_all("elonmusk", max_tweets=1000):
        print(f"Got batch of {len(batch)} tweets")

asyncio.run(main())
This skill is designed to work with OpenClaw. To use it:
- Install the skill:
  pip install -e /path/to/x-collector
- Configure credentials:
  x-collector config init
  # Edit ~/.openclaw/x-collector.yaml
- Use in OpenClaw:
  > Collect the last 100 tweets from @elonmusk
  OpenClaw will invoke: x-collector get-tweets elonmusk --limit 100
Make sure you've created the config file and added your token:
x-collector config init
# Edit ~/.openclaw/x-collector.yaml with your token
x-collector config validate
The collector will automatically wait and retry. If you're hitting limits frequently:
- Increase safe_delay in config
- Raise safe_threshold for earlier slowdown
- Consider upgrading your API tier
- Check the username is correct (without @)
- Ensure the account is public
- Verify your Bearer Token has correct permissions
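The first check in the list is easy to automate. The sketch below strips a leading @ and rejects strings that cannot be valid handles, on the understanding that X usernames use only letters, digits, and underscores and are at most 15 characters long. normalize_username is a hypothetical helper, not something x-collector exposes.

```python
import re

# Letters, digits, and underscores, at most 15 characters.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{1,15}")


def normalize_username(raw):
    """Strip whitespace and a leading '@', then validate the handle."""
    name = raw.strip().lstrip("@")
    if not USERNAME_RE.fullmatch(name):
        raise ValueError(f"not a valid X username: {raw!r}")
    return name


print(normalize_username(" @elonmusk "))  # elonmusk
```

A handle that passes this check can still belong to a private, suspended, or nonexistent account, so the remaining items in the list still apply.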
Just run the same command again. Progress is automatically saved and resumed.
# Will resume from where it left off
x-collector collect-all elonmusk --output ./data
MIT License