GitHub - justinhuang0208/brain_viewer: Desktop GUI and CLI toolkit for WorldQuant Brain API workflows, including datasets, operators, alpha generation, simulations, backtests, evolution, and Telegram worker automation.

WorldQuant Brain Toolbox

An integrated desktop toolbox that includes:

Datasets: Browse and search dataset fields (including Gemini AI semantic search) and import selected fields
Backtests: View backtest results and import data
Strategy Generator: Generate strategy code using fields and templates, then export or send to Simulation
Simulation: Batch simulations interacting with the WorldQuant Brain API, with progress tracking and result export

This app is built with PySide6 for the GUI and supports macOS.

Project Structure Overview

brain_viewer/
  app.py                    # Integrated entry (four tabs)
  dataset_viewer.py         # Dataset browsing and search (with Gemini)
  backtest_viewer.py        # Backtest data viewer (pairs with data/)
  generator.py              # Strategy templates/parameters and export
  simulation.py             # Simulation widget (integrated into Simulation tab in app.py)
  datasets/                 # Datasets containing *_fields_formatted.csv
  data/                     # Backtest or simulation outputs (CSV/LOG)
  alphas/                   # Strategy files generated by the strategy generator
  templates/                # Strategy templates (.json)

Prerequisites

Python 3.10+ (3.11 or later recommended)
OS: macOS (Windows/Linux also possible; install corresponding dependencies yourself)

Required packages (install with pip):

pip install PySide6 pandas matplotlib python-dotenv google-generativeai requests

Quick Start

Create a virtual environment and install dependencies

python3 -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install PySide6 pandas matplotlib python-dotenv google-generativeai requests

Configure Gemini API (optional but recommended; used for AI Search)

Create a .env file at the project root:

cat > .env << 'EOF'
GEMINI_API_KEY=your_api_key
EOF

dataset_viewer.py automatically reads .env and initializes Gemini:
- If successful, you can switch the search mode at the top-right to "AI Search"
- If the key is missing/invalid, it will show "AI Search (API key required)" and disable the feature

Launch the integrated application

python app.py

Modules and Operations

Datasets
- The left panel lists *_fields_formatted.csv files under the datasets/ directory
- The right panel shows a table to browse fields, filter, sort, and visualize distributions
- After selecting fields, click "Import Selected Fields to Generator" to send them to the Strategy Generator
- Search modes: Normal (keyword) / AI (Gemini semantic search)
- CLI API refresh uses WQ Brain OPTIONS /simulations to discover region-specific universes, then queries data-sets / data-fields per universe and caches field metadata in datasets/datasets.sqlite
- CLI and GUI dataset caches are separate: the GUI continues to browse *_fields_formatted.csv; the CLI reads datasets/datasets.sqlite after datasets refresh
- CLI dataset refresh waits for a user-triggered WQ session refresh and retries the same API request when a saved session expires mid-refresh
- WQ rate limits are handled by respecting Retry-After on HTTP 429 and retrying transient 500/502/503/504 responses
- Simulation workers read x-ratelimit-limit, x-ratelimit-remaining, and x-ratelimit-reset from POST /simulations; when the daily remaining count reaches 0, the next submit waits until reset before continuing. Telegram /status shows the latest simulation quota.
Operators
- CLI refresh uses WQ Brain operators to cache FASTEXPR operator metadata under operators/operators.json
- When an operator has a documentation path, refresh also caches detailed documentation under operators/docs/<operator>.json
- level: null is valid metadata and should not be interpreted as unavailable; some access tiers are shown in the website UI rather than this API field
Backtests
- Reads backtest data from the CSV files under data/ (for example, the timestamped outputs written by Simulation)
- Can import results to Simulation or Generator (follow the UI prompts)
Strategy Generator
- After importing fields from Datasets, combine with templates to build a list of strategies
- Use "Preview" to review output; or generate files into alphas/
- "Import to Simulation" sends the selected strategies to the Simulation tab
Simulation
- Run batch simulations for strategies imported from the Generator, with progress display and row-by-row highlight
- Outputs CSV and LOG files under the data/ directory (filenames include timestamps)
- Requires WorldQuant Brain login credentials (see the next section)
Background Worker
- A persistent worker can monitor Telegram commands and continuously drain pending simulation jobs from .brain_cli/jobs/
- Launching the GUI will also auto-start this worker in the background

Gemini API Setup (AI Search)

Obtain an API Key:
- Request a Gemini API Key in Google AI Studio
Save the key into .env:

echo "GEMINI_API_KEY=your_api_key" >> .env

Ensure packages are installed:

pip install google-generativeai python-dotenv

After launching the app, switch to "AI Search" in the top-right of the Datasets tab. Enter a natural-language description of what you are looking for (e.g., "indicators related to earnings momentum"). The system will perform semantic matching to suggest relevant fields.

Common issues:

If the status bar indicates an invalid API key or the feature is disabled, ensure .env exists, the content is correct, and the terminal process has permission to read it.

Add a New Dataset

Location: datasets/

GUI CSV filename rule: Must match *_fields_formatted.csv, for example: custom_demo_fields_formatted.csv

Required columns (case-sensitive):

Field (string, field name)
Description (text description)
Type (recommended categories such as Vector/Matrix/Scalar)
Coverage (coverage; recommended as a percentage string like 95%. If numeric, the app will append % automatically)
Users (integer)
Alphas (integer)

CLI refresh writes to datasets/datasets.sqlite and also exposes these metadata columns through datasets show, datasets search, and datasets export-fields:

Region (simulation region, for example USA)
Delay (simulation delay, for example 1)
Universe (simulation universe, for example TOP3000)

Use datasets scopes to see which region/delay combinations are cached. For live smoke tests, refresh a small slice first:

python brain_cli.py datasets refresh --dataset-id analyst10 --universes TOP3000,TOP1000
python brain_cli.py datasets scopes
python brain_cli.py datasets show analyst10 --region USA --delay 1

Minimal viable example (CSV content):

Field,Description,Type,Region,Delay,Universe,Coverage,Users,Alphas
demo_close,Daily close price,Vector,USA,1,TOP3000,95%,120,300
demo_volume,Daily volume,Vector,USA,1,TOP3000,92%,110,280
demo_return_5d,5-day return,Vector,USA,1,TOP1000,88%,90,250
demo_beta,Market beta estimate,Scalar,USA,1,TOP500,80%,60,180
demo_inst_density,Institutional density,Matrix,USA,1,TOPSP500,75%,40,120

Save the above as datasets/custom_demo_fields_formatted.csv, then return to the app and click it in the left list to load. The app will automatically create a corresponding .db (SQLite) in the same directory to accelerate browsing and sorting.
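The column rules above can be checked before a file is dropped into datasets/. The following is a minimal sketch, not the app's own loader: the helper name validate_fields_csv is hypothetical, and it assumes that extra columns such as Region, Delay, and Universe are allowed alongside the required ones.

```python
import csv
import io

# Required columns as listed in this README (case-sensitive).
REQUIRED_COLUMNS = ["Field", "Description", "Type", "Coverage", "Users", "Alphas"]


def validate_fields_csv(text: str) -> list[dict]:
    """Parse CSV text and return its rows, raising ValueError on a rule violation.

    Hypothetical helper for illustration; the toolbox may validate differently.
    """
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    rows = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        # Users and Alphas are documented as integers.
        for col in ("Users", "Alphas"):
            try:
                row[col] = int(row[col])
            except (TypeError, ValueError):
                raise ValueError(f"line {line_no}: {col} must be an integer") from None
        # Mirror the documented behavior: a numeric coverage gets a trailing %.
        coverage = (row["Coverage"] or "").strip()
        if coverage and not coverage.endswith("%"):
            try:
                float(coverage)
            except ValueError:
                raise ValueError(
                    f"line {line_no}: Coverage must be numeric or end with %"
                ) from None
            coverage += "%"
        row["Coverage"] = coverage
        rows.append(row)
    return rows
```

Calling validate_fields_csv(open(path, encoding="utf-8").read()) on a candidate file returns the parsed rows or raises with the offending line number.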

WorldQuant Brain Credentials (for Simulation)

The Simulation tab interacts with the WorldQuant Brain API. Create a credentials.json in the project root:

{
  "email": "your_email@example.com",
  "password": "your_password"
}

Notes:

Use Check Login in the Simulation tab before running simulations.
When biometric verification (Persona) is required, the app tries to open the Persona page automatically in your browser.
After finishing the scan in the browser, return to the app and click 我已完成驗證 ("I have completed verification") to confirm the login.
Successful GUI or CLI login persists WQ cookies into session.pkl / login_time.pkl, so later Simulation and Dataset refresh flows can reuse the same login state.
CLI login stores a pending Persona session in pending_session.pkl / pending_persona.json while waiting for verification. This avoids creating a new Persona inquiry every time a different CLI command is run.
auth login-status is a passive check: it validates saved cookies with OPTIONS /simulations and does not start a new Persona flow. Use auth login or Telegram /refresh when a new login is actually needed.
If login is not completed, Run Simulation will be blocked and ask you to finish Check Login first.
If credentials are expired/invalid, the UI will notify you and stop subsequent simulations.
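A script that wants to fail early on a malformed credentials.json could check the file before invoking the toolbox. This is a rough sketch under the assumption that the file holds exactly the two keys shown above; parse_credentials and load_credentials are hypothetical names, not functions from this project.

```python
import json
from pathlib import Path


def parse_credentials(text: str) -> dict:
    """Validate the JSON text of credentials.json and return email/password."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("credentials.json must contain a JSON object")
    # Both keys from the README example must be present and non-empty.
    missing = [key for key in ("email", "password") if not data.get(key)]
    if missing:
        raise ValueError(f"credentials.json is missing: {', '.join(missing)}")
    return {"email": data["email"], "password": data["password"]}


def load_credentials(path: str = "credentials.json") -> dict:
    """Read the file at the project root (default path is an assumption)."""
    return parse_credentials(Path(path).read_text(encoding="utf-8"))
```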

CLI Interface (`brain_cli.py`)

A fully-featured headless CLI is available for AI-agent and scripting use. It has no Qt dependency and can be used independently of the desktop app.

Quick overview

python brain_cli.py <group> <command> [options]

Groups:
  auth       Login status, login, persona completion
  datasets   List, scopes, refresh, show, search, export-fields
  operators  List, refresh, show, search WQ Brain operators
  template   List, show, save, delete, placeholders
  generate   Preview strategies, generate file
  simulate   Enqueue, run, status, stop, results, reconcile, list
  alpha      List, show, history, pnl, correlation, promote, reject registry entries
  backtest   List, show, filter, score, diversity, export
  evolution  Run, from-backtest, auto-run, status, stop, results, list
  telegram   Run Telegram bot polling and send status notifications
  worker     Run the persistent worker loop

Global flags (usable anywhere in the command line)

Flag                   Description
`--json`               Output a machine-readable envelope: `ok`, `status`, `data`, `warnings`, `errors`
`--credentials FILE`   Path to `credentials.json` (default: project root)
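For scripting, the `--json` envelope can be checked before its `data` is used. The sketch below assumes that all five keys are always present and that `ok` is a boolean; neither detail is confirmed here, and parse_envelope is a hypothetical helper name.

```python
import json

# Envelope keys named in the global flags description.
ENVELOPE_KEYS = ("ok", "status", "data", "warnings", "errors")


def parse_envelope(stdout_text: str) -> dict:
    """Parse CLI --json output and raise if the command reported failure."""
    envelope = json.loads(stdout_text)
    if not isinstance(envelope, dict):
        raise ValueError("expected a JSON object envelope")
    missing = [key for key in ENVELOPE_KEYS if key not in envelope]
    if missing:
        raise ValueError(f"envelope is missing keys: {missing}")
    # Assumption: ok is truthy on success and falsy on failure.
    if not envelope["ok"]:
        raise RuntimeError(f"command failed: {envelope['errors']}")
    return envelope
```

The input would typically be the captured stdout of a command such as python brain_cli.py template list --json, for example via subprocess.run(..., capture_output=True, text=True).stdout.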

Examples

# Check templates
python brain_cli.py template list --json

# Refresh and inspect WQ Brain operators
python brain_cli.py operators refresh --json
python brain_cli.py operators list
python brain_cli.py operators show ts_rank --json
python brain_cli.py operators search group --category Group --json

# Show a specific template
python brain_cli.py template show "[Default] Basic ts_rank"

# Save a new template
python brain_cli.py template save my_alpha --code "rank(close / open)"

# Preview generated strategies from a template + pools
python brain_cli.py generate preview \
  --template-name "[Default] Basic ts_rank" \
  --pool "field=close,open,volume" \
  --pool "window=5,10,20"

# List backtest files
python brain_cli.py backtest list

# Score and show top rows from a backtest CSV
python brain_cli.py backtest score 20250417_231939.csv --top 10 --json

# Diversity-filtered top candidates
python brain_cli.py backtest diversity 20250417_231939.csv --top 20 --min-hamming 0.5

# Run a single simulation (inline code)
python brain_cli.py simulate run \
  --code "rank(ts_mean(close, 20) / close)" \
  --universe TOP3000 --region USA

# Enqueue a batch from a generated .py/.csv/.json strategy file, then run
python brain_cli.py simulate enqueue \
  --params-file alphas/my_strategies.py \
  --decay 4 --truncation 0.08 \
  --json
python brain_cli.py simulate run --job-id <job_id>

# Check job status and get results
python brain_cli.py simulate status <job_id>
python brain_cli.py simulate results <job_id> --json

# Fetch official WQ Brain simulation parameter options
python brain_cli.py simulate options
python brain_cli.py simulate options --region USA
python brain_cli.py simulate options --raw --json

# Reconcile failed items whose WQ simulation URL later completed
python brain_cli.py simulate reconcile <job_id> --json

# Inspect the local alpha registry
python brain_cli.py alpha list --json
python brain_cli.py alpha show <alpha_hash_or_platform_alpha_id> --json
python brain_cli.py alpha history <alpha_hash_or_platform_alpha_id> --json
python brain_cli.py alpha pnl <alpha_hash_or_platform_alpha_id> --format csv --json
python brain_cli.py alpha correlation <base_alpha_id> <target_alpha_id> [more_target_alpha_ids...] --json
python brain_cli.py alpha promote <alpha_hash_or_platform_alpha_id> --reason "good simulation metrics" --json
python brain_cli.py alpha reject <alpha_hash_or_platform_alpha_id> --reason "turnover too high" --json

# Run evolution to generate diverse candidates
python brain_cli.py evolution run \
  --template-name "[Default] Basic ts_rank" \
  --pool "field=close,open,volume,returns" \
  --pool "window=5,10,20,40" \
  --generations 20 --pop-size 50 --top-k 10

# Seed evolution from an existing backtest CSV
python brain_cli.py evolution from-backtest \
  --template-name "[Default] Basic ts_rank" \
  --pool "field=close,open" \
  20250417_231939.csv --top-seed 5

# Closed-loop auto evolution: evolve -> simulate -> feed results back
python brain_cli.py evolution auto-run \
  --template-name "[Default] Basic ts_rank" \
  --pool "field=close,volume" \
  --rounds 3 --generations 10 \
  --universe TOP3000 --region USA --json

# Auth check
python brain_cli.py auth login-status --json
python brain_cli.py auth login
python brain_cli.py auth persona-complete

# Start Telegram bot polling
python brain_cli.py telegram run
python brain_cli.py telegram run --log-level DEBUG

# Send running simulation job progress to Telegram
python brain_cli.py telegram progress
python brain_cli.py telegram progress --job-id <job_id>

# Discover chat IDs from recent bot updates
python brain_cli.py telegram chat-id --json

# Discover and write the latest chat ID into .env
python brain_cli.py telegram chat-id --write-env

# Start the persistent worker
python brain_cli.py worker run
python brain_cli.py worker run --log-level DEBUG

# Check worker status
python brain_cli.py worker status --json
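The repeated --pool options in the generate and evolution examples above each name a placeholder and its candidate values. How the toolbox combines them is not spelled out here; the sketch below assumes a full Cartesian product and a {name} placeholder syntax, both of which are illustrative assumptions, and the function names are hypothetical.

```python
from itertools import product


def parse_pool(spec: str) -> tuple[str, list[str]]:
    """Split 'name=a,b,c' into ('name', ['a', 'b', 'c'])."""
    name, _, values = spec.partition("=")
    items = [value.strip() for value in values.split(",") if value.strip()]
    if not name.strip() or not items:
        raise ValueError(f"invalid pool spec: {spec!r}")
    return name.strip(), items


def expand_template(code: str, pool_specs: list[str]) -> list[str]:
    """Return one strategy string per combination of pool values.

    Assumes placeholders look like {field}; the real template syntax may differ.
    """
    pools = dict(parse_pool(spec) for spec in pool_specs)
    names = list(pools)
    results = []
    for combo in product(*(pools[name] for name in names)):
        rendered = code
        for name, value in zip(names, combo):
            rendered = rendered.replace("{" + name + "}", value)
        results.append(rendered)
    return results
```

With three fields and three windows, as in the generate preview example, this would yield nine candidate expressions.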

CLI job state

CLI job state for simulate and evolution is stored under .brain_cli/jobs/<job_id>.json. Use simulate list / evolution list to view all jobs. Stop a running job from another terminal with simulate stop <job_id> or evolution stop <job_id>.

Simulation jobs keep completed_count, failed_count, and recovered_count in the job summary. status=done means the worker has finished processing the queued items; inspect the summary counts to distinguish full success from completed jobs with failed items. During polling, each simulation item preserves simulation_url, last_poll_status, last_progress, last_poll_at, and alpha_id when available. Polling retries transient 500, 502, 503, and 504 responses on the same simulation URL using Retry-After when present, otherwise capped exponential backoff.
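The retry rule described above (Retry-After when present, otherwise capped exponential backoff) can be sketched as a small pure function. The base delay, the cap, and the function name retry_delay are assumptions chosen for illustration; the worker's actual values are not documented here. HTTP 429 is included because the dataset refresh notes mention it.

```python
from typing import Optional

# Transient statuses named in this README.
TRANSIENT_STATUSES = {500, 502, 503, 504}


def retry_delay(status: int, headers: dict, attempt: int,
                base: float = 1.0, cap: float = 60.0) -> Optional[float]:
    """Return seconds to wait before retrying, or None if not retryable.

    attempt is zero-based. base and cap are illustrative defaults.
    """
    if status != 429 and status not in TRANSIENT_STATUSES:
        return None
    # Plain dict lookup is case-sensitive; requests' headers are not.
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # the HTTP-date form is not handled in this sketch
    # Fall back to exponential backoff, capped so waits stay bounded.
    return min(cap, base * (2 ** attempt))
```

Keeping the decision separate from the HTTP call makes the policy easy to unit test without touching the network.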

When alpha detail fetch succeeds, the raw /alphas/<alpha_id> JSON payload is saved under data/alpha_details/<alpha_id>.json; completed job items include alpha_details_file when available.

Use alpha pnl <alpha_hash_or_platform_alpha_id> to fetch the official daily PnL recordset for one completed alpha from /alphas/<alpha_id>/recordsets/pnl. If the identifier is a platform alpha ID, that exact ID is used; if the identifier is a formula hash, the registry uses the formula's canonical_alpha_id. The command saves the full payload under data/alpha_pnl/<alpha_id>.json by default, or data/alpha_pnl/<alpha_id>.csv with --format csv; CLI output returns a summary unless --include-records is used.

Use alpha correlation <base_alpha_id> <target_alpha_id> [more_target_alpha_ids...] to calculate Pearson correlation between the base alpha's daily PnL and one or more target alphas. Targets can also be supplied with --targets-file as newline text or a JSON array. The command uses cached data/alpha_pnl/<alpha_id>.json files when available, fetches missing PnL from WQ Brain, and supports --field <pnl_field>, --min-overlap <N>, and --refresh-pnl.
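As a rough illustration of the calculation, the sketch below computes Pearson correlation over the dates two PnL series share and returns None when the overlap is too small. It assumes each series has already been reduced to a date-to-value mapping; the actual recordset layout, the field selected by --field, and the name pearson_on_overlap are not taken from the project.

```python
import math
from typing import Optional


def pearson_on_overlap(base: dict[str, float], target: dict[str, float],
                       min_overlap: int = 2) -> Optional[float]:
    """Pearson correlation over dates present in both series.

    Returns None if fewer than min_overlap shared dates exist or if either
    series is constant on the overlap (correlation is undefined then).
    """
    dates = sorted(set(base) & set(target))
    if len(dates) < max(min_overlap, 2):
        return None
    xs = [base[d] for d in dates]
    ys = [target[d] for d in dates]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)
```

Restricting the calculation to shared dates is what makes a minimum-overlap threshold meaningful: two alphas with only a handful of common days can show a high correlation by chance.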

When using simulate enqueue or simulate run with --params-file / --params-json, command-level simulation settings such as --decay, --delay, --neutralization, --region, --truncation, and --universe are written into each queued params item. This keeps job JSON aligned with the settings that will actually be submitted.
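The effect of writing command-level settings into each queued item can be pictured as a dictionary merge. This sketch assumes each params item is a flat dictionary keyed by setting name, which is an assumption about the job JSON layout, and apply_command_settings is a hypothetical name.

```python
def apply_command_settings(params_items: list[dict], overrides: dict) -> list[dict]:
    """Return copies of params items with command-level settings written in.

    Only settings the user actually passed (non-None values) are applied,
    and they take precedence over values already present in an item.
    """
    active = {key: value for key, value in overrides.items() if value is not None}
    return [{**item, **active} for item in params_items]
```

For example, --decay 4 --truncation 0.08 would produce items that all carry decay 4 and truncation 0.08, while the input list is left unmodified.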

Use simulate options to fetch the official OPTIONS /simulations schema from WQ Brain using the saved session. The command summarizes allowed settings such as region, universe, delay, decay, neutralization, truncation, pasteurization, nanHandling, lookback, and testPeriod; add --region <REGION> to show dependent choices for one region, or --raw --json for the unmodified API schema.

Simulation jobs send one Telegram message after the job finishes, regardless of how many simulation items it contains. Telegram must be configured with TELEGRAM_BOT_API_TOKEN or TELEGRAM_BOT_TOKEN, plus TELEGRAM_CHAT_ID; if it is not configured, the completion notification is skipped and logged.

If a previous item failed after WQ accepted the simulation, run simulate reconcile <job_id> --json. Reconcile checks failed items with simulation_url; when WQ now returns COMPLETE or WARNING with an alpha ID, it fetches /alphas/<alpha_id>, appends the result CSV row if missing, updates the alpha registry, moves the item to completed, and increments recovered_count.

Alpha registry state is stored in .brain_cli/alphas.sqlite. This registry is an index over alpha code, WQ platform alpha IDs, simulation attempts, and lifecycle events; it does not replace job JSON or result CSV files. simulate enqueue records candidate alphas, and completed/failed simulations update the registry with metrics, links, errors, and history events.

The registry is formula-family oriented: alphas.alpha_hash identifies normalized formula code, while alpha_platform_ids.alpha_id tracks every WQ platform alpha generated for that formula. A formula may have many platform alpha IDs because reruns and settings sweeps can create duplicate IDs. canonical_alpha_id is the stable representative used for formula-hash operations such as alpha pnl <alpha_hash>; latest_alpha_id and latest_result_link point to the newest observed platform result. alpha show/history/promote/reject/pnl/correlation accept either a formula hash or any known platform alpha ID.
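The formula-family idea can be illustrated with an in-memory stand-in for the registry. Everything specific in this sketch is an assumption: the normalization (stripping all whitespace), the hash function (SHA-256), the dictionary layout, and the rule that the first observed platform ID becomes canonical are chosen for illustration and may not match alpha_registry.py.

```python
import hashlib
import re


def normalize_formula(code: str) -> str:
    """Remove whitespace so cosmetic differences do not change the hash.

    Assumed normalization; the registry's real rules may differ.
    """
    return re.sub(r"\s+", "", code)


def alpha_hash(code: str) -> str:
    """Hash the normalized formula (SHA-256 is an illustrative choice)."""
    return hashlib.sha256(normalize_formula(code).encode("utf-8")).hexdigest()


def record_platform_id(registry: dict, code: str, platform_alpha_id: str) -> str:
    """Attach a platform alpha ID to its formula family and return the hash."""
    key = alpha_hash(code)
    entry = registry.setdefault(
        key,
        # Assumption: the first observed ID serves as the canonical one.
        {"canonical_alpha_id": platform_alpha_id, "alpha_ids": []},
    )
    if platform_alpha_id not in entry["alpha_ids"]:
        entry["alpha_ids"].append(platform_alpha_id)
    entry["latest_alpha_id"] = platform_alpha_id
    return key
```

Two submissions of the same formula with different spacing land in one family, which is why a formula hash can map to several platform alpha IDs.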

CLI authentication reuses the same persisted WQ cookie files as the GUI (session.pkl / login_time.pkl), matching the open_machine-style login flow.

Important authentication behavior:

auth login-status only checks the current saved session. It does not call POST /authentication and therefore does not consume a new Persona inquiry.
auth login starts or resumes a Persona flow. If a pending Persona URL already exists, it reuses that URL/session instead of generating a new one.
auth persona-complete is equivalent to resuming the pending Persona flow from the CLI.
When login succeeds, saved cookies are written to session.pkl and login_time.pkl; pending Persona files are cleared.

Telegram /status counts jobs directly from the JSON files in .brain_cli/jobs/. If an old job remains pending, it will be counted as pending even if no process is running. For abandoned simulation jobs with "pid": null, mark them stopped rather than deleting the file if you want to preserve history.
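Counting jobs straight from the files explains why a stale pending job keeps showing up. A minimal sketch of that kind of count follows; it assumes each job file is a JSON object with a top-level "status" key, and count_jobs_by_status is a hypothetical name rather than the function used by the Telegram handler.

```python
import json
from collections import Counter
from pathlib import Path


def count_jobs_by_status(jobs_dir: str = ".brain_cli/jobs") -> Counter:
    """Count job files by their status field, without checking any process."""
    counts = Counter()
    for path in sorted(Path(jobs_dir).glob("*.json")):
        try:
            job = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            counts["unreadable"] += 1
            continue
        # Assumption: status lives at the top level of the job JSON.
        status = job.get("status", "unknown") if isinstance(job, dict) else "unknown"
        counts[status] += 1
    return counts
```

Because nothing here inspects the pid field or running processes, an abandoned job is indistinguishable from a queued one until its status is changed.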

Telegram integration

Telegram support is optional and uses direct Bot API HTTP calls (no extra SDK required). Add these variables to .env:

TELEGRAM_BOT_API_TOKEN=your_bot_token
TELEGRAM_CHAT_ID=your_chat_id

After that, start the polling loop:

python brain_cli.py telegram run

Supported Telegram commands:

/refresh / /refresh_session: refresh the saved WQ session, including Persona verification handoff with an inline confirmation button
/status / /stat: send the current session state plus simulation/evolution job counts
/progress / /sim_progress [job_id]: send progress for running simulation jobs, including processed count and active simulation item polling progress
/help / /start: show available commands

Recommended Telegram login flow:

Start the bot polling loop with python brain_cli.py telegram run.
Send /refresh to the bot in Telegram.
Open the Persona URL sent by the bot and complete verification.
Press the inline 我已完成驗證 ("I have completed verification") button in Telegram.
The same long-running bot process submits the Persona completion request and saves the WQ session.

This is the preferred flow when Persona quota is limited because the Persona URL and pending HTTP session stay in the same process, matching the open_machine pattern.

To discover TELEGRAM_CHAT_ID, first send a message like /start to your bot, then run:

python brain_cli.py telegram chat-id --json

If you want the tool to automatically write the newest discovered chat ID into .env, run:

python brain_cli.py telegram chat-id --write-env

If multiple chats have interacted with the bot, the tool will list them and use the most recent one when --write-env is set.
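Chat ID discovery of this kind usually works from the Bot API getUpdates response, where each update carries the chat under its message object. The sketch below extracts distinct chat IDs, most recent first, from an already-fetched payload. It is not the tool's implementation, and chat_ids_from_updates is a hypothetical name.

```python
def chat_ids_from_updates(payload: dict) -> list[int]:
    """Return distinct chat IDs from a getUpdates payload, newest first."""
    updates = sorted(
        payload.get("result", []),
        key=lambda update: update.get("update_id", 0),
        reverse=True,  # higher update_id means a more recent update
    )
    seen: list[int] = []
    for update in updates:
        # Different update types carry the chat under different keys;
        # only a few common ones are checked in this sketch.
        for key in ("message", "edited_message", "channel_post"):
            chat = (update.get(key) or {}).get("chat")
            if chat and chat["id"] not in seen:
                seen.append(chat["id"])
    return seen
```

The first element of the returned list corresponds to the most recent chat, which is the one a --write-env style option would pick.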

When GUI/CLI dataset refresh or simulation flows detect invalid login or expired session state, the app also sends Telegram notifications to the configured chat.

Persistent worker behavior

brain_cli.py worker run starts a long-lived process that does two things at the same time:

Starts Telegram monitoring (if Telegram is configured)
Repeatedly scans .brain_cli/jobs/ for pending simulation jobs and runs them

Foreground worker run enables console logging by default. You should see startup lines for the worker state, Telegram monitoring, and simulation job scans, for example:

Starting persistent brain worker…
2026-05-02 10:00:00 INFO [MainThread] Brain worker state written: pid=12345 poll_interval=3s state_file=...
2026-05-02 10:00:00 INFO [MainThread] Telegram monitoring configured: chat_id_configured=yes poll_timeout=60s
2026-05-02 10:00:00 INFO [MainThread] Worker scan: total=0 pending=0 running=0 done=0 failed=0 stopped=0 next_pending=-
2026-05-02 10:01:01 INFO [brain-telegram-worker] Telegram polling active; no updates.

Use --log-level DEBUG for lower-level Telegram polling details, or --log-level WARNING to quiet normal heartbeat logs.

This mirrors the open_machine-style worker pattern: one always-on process watches commands and pending work together.

The desktop GUI auto-starts this worker on launch, so opening app.py also brings up the same background processing model.

The Simulation tab now follows the same architecture: clicking Run Simulation enqueues a simulation job into .brain_cli/jobs/, and the persistent worker executes it. The GUI no longer runs the simulation request loop directly; instead, it polls the job state and updates the table/result flow from worker-owned job progress.

How to Run

Integrated app (recommended):

python app.py

Headless CLI (no Qt required):

python brain_cli.py --help

Run individually (for debugging or split work only):
- Datasets: python dataset_viewer.py
- Strategy Generator: python generator.py
- Backtests: python backtest_viewer.py (requires readable data under data/)

simulation.py is integrated as a widget within app.py; running it standalone is not recommended.

FAQ

macOS display issues?
- This project enforces a light color palette. If the appearance looks abnormal, please update PySide6 and matplotlib.

Versions and Outputs

Generated strategy files: alphas/alpha_custom_YYYYmmdd_HHMMSS.py
Simulation outputs: data/YYYYmmdd_HHMMSS.csv and the corresponding .log
