Hex Machina is a free, AI-driven newsletter service that automatically monitors AI research, blogs, and announcements, summarizes key insights, and delivers high-quality, concise newsletters.
AI News, Compiled by the Machine.

The newsletter is published at https://hexmachina.beehiiv.com/. Every issue there was generated automatically with this project.
Keeping up with the fast-moving AI landscape is time-consuming. Traditional manual curation can't scale.
Hex Machina solves this with automated intelligence.
- Ingestion → Ingests articles from AI-related websites.
- Article Enrichment Flow → Adds tags, summaries, etc.
- Selection → Selects the most relevant items in an unsupervised way.
- Newsletter Generator → Compiles and formats weekly updates.
- Orchestration Script → Runs the full pipeline automatically.
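
As a rough illustration of how the stages chain together, the sketch below builds the commands for the three flows that this README shows how to run individually and executes them in pipeline order. The helper names (`build_flow_commands`, `run_pipeline`) are hypothetical, and the orchestration script shipped in the repository may work differently.

```python
import subprocess
import sys

# Module paths copied from the per-flow commands in this README.
FLOW_MODULES = [
    "hex.flows.article_ingestion.flow",
    "hex.flows.article_enrichment.flow",
    "hex.flows.article_selection.flow",
]


def build_flow_commands(python=sys.executable):
    """Return one argv list per flow, in pipeline order (hypothetical helper)."""
    return [[python, "-m", module, "run", "--with", "card"] for module in FLOW_MODULES]


def run_pipeline(dry_run=True):
    """Print each flow command and, unless dry_run is set, execute it.

    check=True stops the pipeline at the first flow that fails.
    """
    for command in build_flow_commands():
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
```

Calling `run_pipeline()` only prints the commands; pass `dry_run=False` to execute them.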
| Component | Technology |
|---|---|
| Scraping | Scrapy |
| Database | TinyDB |
| LLMs | OpenRouter / OpenAI |
| Tagging & NLP | Hugging Face / OpenAI |
| Workflow | Metaflow |
| Hosting | Beehiiv |
| Orchestration | Cron job |
- Python 3.9+ installed

- Install dependencies:

      pip install -r requirements.txt

- Set up your environment variables in the existing `.env` file:

      # API keys (OpenAI and OpenRouter)
      OPENAI_API_KEY=your_openai_api_key
      OPENROUTER_API_KEY=your_openrouter_api_key

  Note: You need a paid, working API key for both the OpenAI and OpenRouter platforms.
  - Get your OpenAI API key here: https://platform.openai.com/api-keys
  - Get your OpenRouter API key here: https://openrouter.ai/docs/api-reference/api-keys/create-api-key

  Don't worry: processing thousands of articles typically costs only a couple of cents.
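
Before starting a long run, it can help to confirm that both keys are actually set. The sketch below is a hypothetical pre-flight check, not part of the project: `missing_keys` is an illustrative name, and the check assumes the variables are visible in the process environment (how the project itself loads `.env` is not described here).

```python
import os

REQUIRED_KEYS = ("OPENAI_API_KEY", "OPENROUTER_API_KEY")
# Placeholder values from the .env template; treated as "not set".
PLACEHOLDERS = {"your_openai_api_key", "your_openrouter_api_key"}


def missing_keys(env=None):
    """Return the names of required keys that are absent, empty, or placeholders."""
    env = os.environ if env is None else env
    missing = []
    for name in REQUIRED_KEYS:
        value = env.get(name, "").strip()
        if not value or value in PLACEHOLDERS:
            missing.append(name)
    return missing
```

Calling `missing_keys()` with no argument inspects the current environment; an empty list means both keys look usable.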
You can customize which RSS feeds are ingested by editing the following files:
- `data/rss_feeds.txt`: Add or remove URLs (one per line) to control which RSS feeds are scraped by the standard scraper.
- `data/rss_feeds_stealth.txt`: Add URLs here to use the stealth mode scraper, which is designed to bypass security checks and scrape sites that block regular scrapers.
Simply add the desired feed URLs to these files before running the pipeline to expand or modify your sources.
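
Since these files hold one URL per line, a quick sanity check can catch typos before a run. The following is a minimal sketch under that assumption; `parse_feed_lines` is a hypothetical helper and does not reflect how the project's scrapers actually read the files.

```python
from urllib.parse import urlparse


def parse_feed_lines(text):
    """Return the unique http(s) URLs found in `text`, one per line, in order.

    Blank lines are skipped; anything that is not an http(s) URL raises ValueError.
    """
    feeds = []
    seen = set()
    for line in text.splitlines():
        url = line.strip()
        if not url:
            continue
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not a valid feed URL: {url!r}")
        if url not in seen:
            seen.add(url)
            feeds.append(url)
    return feeds
```

For example, `parse_feed_lines(open("data/rss_feeds.txt").read())` would return the deduplicated feed list or raise on the first malformed line.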
Option 1: Using the shell script (recommended)
Note:
The `--articles-limit 100` flag caps how many articles are processed; a run at this limit takes about 10 minutes.
Remove or raise the limit to build the newsletter from a larger pool of articles.

    chmod +x run_generate_newsletter.sh
    ./run_generate_newsletter.sh

Option 2: Direct Python execution

    export PYTHONPATH="./hex:$PYTHONPATH"
    # `date -v-7d` is BSD/macOS syntax; with GNU date (Linux), use `date -u -d '7 days ago'` instead.
    python generate_newsletter.py \
      --ingestion-articles-table 'articles' \
      --replicates-table 'replicates' \
      --articles-limit 100 \
      --date-threshold "$(date -u -v-7d +"%a, %d %b %Y %H:%M:%S +0000")" \
      --selection-articles-limit 6 \
      --selected-articles-table 'selected_articles_dummy_table'

You can also run each flow separately:

    # Article Ingestion Flow
    python -m hex.flows.article_ingestion.flow run --with card
    # Article Enrichment Flow
    python -m hex.flows.article_enrichment.flow run --with card
    # Article Selection Flow
    python -m hex.flows.article_selection.flow run --with card

The output of each pipeline run is organized in a timestamped folder inside `./generated_newsletters/`. Here is an example of the directory structure you will find after a run:

    generated_newsletters/2025-06-26_18-37-55
    ├── articleenrichmentflow_flow.log
    ├── articleenrichmentflow_report.html
    ├── articleingestionflow_flow.log
    ├── articleselectionflow_flow.log
    ├── articleselectionflow_report.html
    ├── images
    │   ├── edito_image.png
    │   └── hexmachina_wordcloud.png
    └── newsletter_report.txt

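
Because each run lands in its own timestamped folder, finding the latest output comes down to parsing the folder names. The sketch below assumes the `YYYY-MM-DD_HH-MM-SS` naming seen in the example above; `latest_run` is a hypothetical helper, not something the project provides.

```python
from datetime import datetime

# Format inferred from the example folder name "2025-06-26_18-37-55".
RUN_FOLDER_FORMAT = "%Y-%m-%d_%H-%M-%S"


def latest_run(folder_names):
    """Return the most recent timestamped folder name, or None if there is none."""
    runs = []
    for name in folder_names:
        try:
            runs.append((datetime.strptime(name, RUN_FOLDER_FORMAT), name))
        except ValueError:
            continue  # skip entries that are not run folders
    return max(runs)[1] if runs else None
```

Passing it the entries of the output directory, for instance `latest_run(os.listdir("generated_newsletters"))` after `import os`, would give the folder holding the newest `newsletter_report.txt`.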
For questions or contributions, contact Mathieu Crilout at mathieu.crilout@gmail.com.
If you find this useful, give it a ⭐ on GitHub! 😊
The code is public and you are welcome to read it, but this software is proprietary and owned by Mathieu Crilout.
Unauthorized use, distribution, or modification is prohibited.