2533757653/Freqai_RL

Freqai_RL

Reinforcement learning trading strategies for Freqtrade FreqAI. The combined portfolio returned ~20% over 15 days of simulated trading.


Quick Start (5 Minutes)

1. Install Dependencies

pip install freqtrade gym pandas pandas-ta TA-Lib

2. Copy Files to Freqtrade Directory

# Copy model files
cp freqaimodels/*.py ~/.freqtrade/user_data/freqaimodels/

# Copy strategy files
cp strategies/*.py ~/.freqtrade/strategies/

# Copy strategy configs
cp strategies/*.json ~/.freqtrade/strategies/

3. Configure config.json

{
  "freqai": {
    "enabled": true,
    "period": 7,
    "backtest_period_days": 45,
    "train_period_days": 15,
    "identifier": "my_rl_model",
    "save_models": true
  },
  "pairlists": [
    {
      "method": "VolumePairList",
      "number_assets": 10
    }
  ]
}

4. Train & Run

# Dry-run mode is enabled with "dry_run": true in config.json; freqtrade trade has no --dry-run flag

# Regression Strategy (stable returns, 3x leverage)
freqtrade trade --config config.json --strategy RLStrategy_Regression --freqaimodel RL_Model_Regression

# Trend Strategy (excess returns, 1x leverage)
freqtrade trade --config config.json --strategy RLStrategy_Trend --freqaimodel RL_Model_Trend

Tip: Models are trained automatically on the first run. FreqAI saves them under user_data/models/, in a subdirectory named after the configured identifier.


Live Performance

Simulated Trading (15 Days)

[Image: picture/simulate_run.png, simulated portfolio run]

~20% returns with Trend + Regression combined portfolio over 15 days


Regression Strategy

| Metric | Value |
| --- | --- |
| Backtest Annual Return | ~70% |
| Target Return | 2% |
| Leverage | 3x |
| Philosophy | High-frequency micro-profit + strict risk control |

[Image: picture/regression.png, Regression strategy live trading]

[Image: picture/regression_log.png, Regression training log]


Trend Strategy

| Metric | Value |
| --- | --- |
| Backtest Annual Return | ~50% |
| Target Return | 10% |
| Leverage | 1x |
| Philosophy | Trend following + generous volatility tolerance |

[Image: picture/trend.png, Trend strategy live trading]

[Image: picture/trend_log.png, Trend training log]


Strategy Overview

| Dimension | Trend Strategy | Regression Strategy |
| --- | --- | --- |
| Design Philosophy | Trend Following | Mean Reversion |
| Timeframe | 2h | 1h |
| Leverage | 1x | 3x |
| Target Return | 10% | 2% |
| Stoploss Threshold | -15% | -20% |
| Backtest Annual Return | ~50% | ~70% |
| Core Mechanism | Potential Shaping | Multi-factor Reward Decomposition |
| Normalization | Online Normalization | Extended Observation Space |

Regression Strategy Deep Dive

Philosophy: High-frequency Micro-profit + Strict Risk Control

The Regression strategy applies mean reversion thinking — capturing short-term price reversions and accumulating gains through high-frequency micro-profits.

Core Reward Mechanism

Total Reward = Delta Reward + Reversion Reward + Quick Profit Bonus + Time Penalty + Stoploss Penalty

1. Delta PNL Reward

reward = delta_p * 2.5

Rewards or penalizes the change in unrealized P&L at each tick (delta_p), scaled by 2.5.

2. Reversion Reward

if unreal_pnl > 0:
    return unreal_pnl * 0.4
return 0.0

Extra reward only when profitable — encourages holding winning positions.

3. Quick Profit Bonus

if unreal_pnl > 0.008 and hold_time <= 5:
    return 0.015
return 0.0  # No bonus otherwise

Rewards fast profits (0.8%+ in 5 ticks) — encourages quick entries and exits.

4. Time Penalty

if hold_time <= min_holding_time:
    return 0.003          # Positive reward during min holding
elif hold_time <= max_holding_time:
    return -0.006 * (hold_time - min_holding_time)  # Linear penalty
else:
    return -0.10           # Heavy penalty for exceeding max holding

Linear time penalty ensures the model doesn't over-hold.
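
The four reward terms above can be combined into one function. This is a minimal sketch rather than the repository's implementation: the function names and the `min_holding_time=3` default are assumptions, `max_holding_time=30` follows the 30-tick limit given in the design summary, and the stoploss penalty is left out because this README does not state its value.

```python
def delta_reward(delta_p: float) -> float:
    # Per-tick change in unrealized P&L, scaled by 2.5
    return delta_p * 2.5


def reversion_reward(unreal_pnl: float) -> float:
    # Extra reward only while the position is in profit
    return unreal_pnl * 0.4 if unreal_pnl > 0 else 0.0


def quick_profit_bonus(unreal_pnl: float, hold_time: int) -> float:
    # Flat bonus for more than 0.8% profit within 5 ticks
    return 0.015 if (unreal_pnl > 0.008 and hold_time <= 5) else 0.0


def time_penalty(hold_time: int,
                 min_holding_time: int = 3,    # assumed value
                 max_holding_time: int = 30) -> float:
    if hold_time <= min_holding_time:
        return 0.003
    if hold_time <= max_holding_time:
        return -0.006 * (hold_time - min_holding_time)
    return -0.10


def total_reward(delta_p: float, unreal_pnl: float, hold_time: int) -> float:
    # Stoploss penalty omitted: its value is not given in this README
    return (delta_reward(delta_p)
            + reversion_reward(unreal_pnl)
            + quick_profit_bonus(unreal_pnl, hold_time)
            + time_penalty(hold_time))
```

For example, `total_reward(0.004, 0.01, 2)` sums 0.010 (delta), 0.004 (reversion), 0.015 (quick profit) and 0.003 (holding reward). Note that with the assumed bounds the linear term reaches -0.162 at tick 30, which is larger in magnitude than the -0.10 applied afterwards, so the actual bounds in the repository may be chosen differently.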

5. Extended Observation Space (+7 Dimensions)

| Feature | Range | Purpose |
| --- | --- | --- |
| hold_ratio | [0, 1] | Current holding time ratio |
| unreal_pnl_norm | [-1, 1] | Normalized unrealized P&L |
| drawdown_ratio | [0, 1] | Drawdown from peak |
| position_val | {-1, 0, 1} | Current position direction |
| win_rate | [0, 1] | Episode win rate |
| profit_loss_delta | [-1, 1] | Consecutive profit/loss state |
| pnl_from_entry_norm | [-1, 1] | Return relative to entry price |
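
To illustrate how seven such features could be assembled into the stated ranges, here is a hedged sketch. The function name, its arguments, and the scaling constants `pnl_scale` and `streak_scale` are all hypothetical; the README only specifies the feature names and ranges, not how they are computed.

```python
def clip(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def extra_observation(hold_time, max_holding_time, unreal_pnl, peak_pnl,
                      position, wins, trades, streak, entry_price, price,
                      pnl_scale=0.10,      # assumed: 10% P&L maps to 1.0
                      streak_scale=5):     # assumed: 5-trade streak maps to 1.0
    # Signed return since entry; 0 when there is no open position
    price_return = position * (price / entry_price - 1.0) if entry_price else 0.0
    return [
        clip(hold_time / max_holding_time, 0.0, 1.0),         # hold_ratio
        clip(unreal_pnl / pnl_scale, -1.0, 1.0),              # unreal_pnl_norm
        clip((peak_pnl - unreal_pnl) / pnl_scale, 0.0, 1.0),  # drawdown_ratio
        float(position),                                      # position_val
        wins / trades if trades else 0.0,                     # win_rate
        clip(streak / streak_scale, -1.0, 1.0),               # profit_loss_delta
        clip(price_return / pnl_scale, -1.0, 1.0),            # pnl_from_entry_norm
    ]
```

Clipping every feature keeps the extra dimensions on a comparable scale to each other, which is the usual reason for bounding hand-built observation features.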

Risk Control Design

  1. Trailing Stoploss: Dynamic adjustment based on profit level

    if current_profit > 0.08: return -0.008
    elif current_profit > 0.05: return -0.015
    elif current_profit > 0.015: return -0.04
  2. Action Cooldown: action_cooldown = 3 prevents overtrading

  3. Whipsaw Penalty: Extra -0.025 penalty after 3 consecutive losses
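
The three risk controls above can be sketched as follows. This is an illustration under assumptions, not the repository's code: `TradeGate` and its method names are hypothetical, the fallback of -0.20 is taken from the Regression stoploss threshold in the overview table, and the whipsaw penalty is assumed to apply from the third consecutive loss onward.

```python
def trailing_stoploss(current_profit: float, base_stoploss: float = -0.20) -> float:
    # Tighter stop as profit grows; tiers follow the README
    if current_profit > 0.08:
        return -0.008
    if current_profit > 0.05:
        return -0.015
    if current_profit > 0.015:
        return -0.04
    return base_stoploss  # assumed fallback below the first tier


class TradeGate:
    """Hypothetical helper tracking action cooldown and loss streaks."""

    def __init__(self, action_cooldown=3, loss_streak_limit=3, whipsaw_penalty=-0.025):
        self.action_cooldown = action_cooldown
        self.loss_streak_limit = loss_streak_limit
        self.whipsaw_penalty = whipsaw_penalty
        self.ticks_since_action = action_cooldown  # first action is allowed
        self.consecutive_losses = 0

    def tick(self) -> None:
        self.ticks_since_action += 1

    def can_act(self) -> bool:
        return self.ticks_since_action >= self.action_cooldown

    def record_action(self) -> None:
        self.ticks_since_action = 0

    def record_trade(self, pnl: float) -> float:
        # Returns the extra penalty (0.0 when no whipsaw is detected)
        self.consecutive_losses = self.consecutive_losses + 1 if pnl < 0 else 0
        if self.consecutive_losses >= self.loss_streak_limit:
            return self.whipsaw_penalty
        return 0.0
```

A caller would check `can_act()` before letting the agent open or close a position, call `record_action()` when it does, and add the value returned by `record_trade()` to the step reward.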


Trend Strategy Deep Dive

Philosophy: Chase Big Trends + Allow Volatility

The Trend strategy applies trend following thinking — using Potential Shaping to guide the model toward big trends, with relaxed volatility tolerance.

Core Mechanism: Potential Shaping

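Potential shaping is a reward-shaping technique: it defines a potential Φ over states and adds γ·Φ(s') − Φ(s) to the reward at every step. Here the potential is proportional to the unrealized profit: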

def _compute_potential(self) -> float:
    unreal = float(self.get_unrealized_profit() or 0.0)
    return self.potential_coef * unreal  # 0.01 * unrealized profit

def calculate_reward(self, action):
    new_potential = self._compute_potential()
    shaped = self.potential_gamma * new_potential - self.prev_potential  # γ=0.90
    self.prev_potential = new_potential
    return original_reward + shaped

Why Potential Shaping?

  • Raw RL struggles with sparse rewards in long-term dependencies
  • Potential Shaping converts unrealized P&L changes into immediate rewards
  • Guides the agent to focus on the trend of P&L changes, not absolute values
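
A small numeric sketch may make the shaping term concrete. It applies the same update as `calculate_reward` above (γ·new potential minus previous potential) to a sequence of unrealized-profit values; the function name `shaping_terms` and the starting potential of 0.0 are assumptions for illustration.

```python
def shaping_terms(unrealized, potential_coef=0.01, gamma=0.90):
    """Return the shaping term added at each tick for a P&L path."""
    prev_potential = 0.0  # assumed initial potential (no open profit)
    terms = []
    for unreal in unrealized:
        new_potential = potential_coef * unreal
        terms.append(gamma * new_potential - prev_potential)
        prev_potential = new_potential
    return terms


# Rising profit gives positive terms
rising = shaping_terms([0.0, 0.05, 0.10])

# Flat profit gives a small negative term: (gamma - 1) * potential
flat = shaping_terms([0.10, 0.10])

# With gamma = 1 the terms telescope to the final potential
telescoped = sum(shaping_terms([0.02, 0.05, 0.10], gamma=1.0))
```

With γ = 0.90, holding a constant profit produces a slightly negative term each tick, so the shaping rewards improving P&L rather than a static position. With γ = 1 the terms sum to the final potential, which is the property that keeps potential-based shaping from changing which policies are optimal.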

Step-based Reward System

Trend uses a dynamic target line mechanism that divides the profit target into multiple steps:

# Step reward: triggered when profit breaks through current target line
if unreal >= self._current_target:
    reward += unreal * 0.1           # One-time bonus
    self._current_target += self.profit_target  # Elevate target line

# Immediate reward/penalty
if unreal > base_line:
    reward += excess_reward_coef * (unreal - base_line)
else:
    reward -= excess_penalty_coef * (base_line - unreal)

Core Idea: The model must "unlock" each step to earn step-based rewards — encourages holding through big trends.
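
The target-line mechanism can be sketched as a small stateful class. The class name `StepReward`, the `base_line` of 0.0, and both coefficient values are hypothetical placeholders; only the 10% `profit_target`, the 0.1 one-time bonus factor, and the control flow come from the snippet above.

```python
class StepReward:
    """Hypothetical wrapper around the step-based reward logic."""

    def __init__(self, profit_target=0.10, base_line=0.0,
                 excess_reward_coef=0.05, excess_penalty_coef=0.05):
        self.profit_target = profit_target
        self.base_line = base_line                        # assumed value
        self.excess_reward_coef = excess_reward_coef      # assumed value
        self.excess_penalty_coef = excess_penalty_coef    # assumed value
        self.current_target = profit_target

    def reset(self) -> None:
        # Target line returns to the first step when a new trade starts
        self.current_target = self.profit_target

    def __call__(self, unreal: float) -> float:
        reward = 0.0
        # One-time bonus when profit crosses the current target line
        if unreal >= self.current_target:
            reward += unreal * 0.1
            self.current_target += self.profit_target
        # Immediate reward or penalty relative to the base line
        if unreal > self.base_line:
            reward += self.excess_reward_coef * (unreal - self.base_line)
        else:
            reward -= self.excess_penalty_coef * (self.base_line - unreal)
        return reward
```

Because the check is an `if` rather than a loop, at most one step is unlocked per tick, even if profit jumps past several target lines at once.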

Online Normalization

Unlike Regression, Trend uses cross-episode accumulated statistics for normalization:

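import numpy as np

def _normalize_obs(self, obs):
    # Running (cumulative) average of mean and variance: alpha = 1/n shrinks as samples accumulate
    self.obs_count += 1  # assumed: the counter is advanced here, once per observation
    alpha = 1.0 / self.obs_count
    self.obs_mean = (1 - alpha) * self.obs_mean + alpha * obs
    self.obs_var = (1 - alpha) * self.obs_var + alpha * ((obs - self.obs_mean) ** 2)

    # Standardize (the 1e-8 epsilon is an assumed guard against division by zero)
    normalized_obs = (obs - self.obs_mean) / np.sqrt(self.obs_var + 1e-8)

    # Inject the current step number as an extra dimension
    stage_num = self._current_target / self.profit_target
    return np.append(normalized_obs, stage_num)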

Key Points:

  • Normalization statistics are not cleared on reset (cross-episode accumulation)
  • Extra dimension injects current step number, letting the model sense "progress unlocked"
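
The two key points can be illustrated with a standalone normalizer that uses only the standard library. It follows the same running-average update rule described in this section; the class name `RunningNormalizer`, the `eps` guard, and the `reset_episode` method are assumptions made for the sketch.

```python
import math


class RunningNormalizer:
    """Hypothetical per-dimension normalizer with cross-episode statistics."""

    def __init__(self, dim: int, eps: float = 1e-8):
        self.count = 0
        self.mean = [0.0] * dim
        self.var = [0.0] * dim
        self.eps = eps  # assumed guard against division by zero

    def update(self, obs) -> None:
        self.count += 1
        alpha = 1.0 / self.count
        for i, x in enumerate(obs):
            self.mean[i] = (1 - alpha) * self.mean[i] + alpha * x
            self.var[i] = (1 - alpha) * self.var[i] + alpha * (x - self.mean[i]) ** 2

    def normalize(self, obs, stage_num: float):
        self.update(obs)
        scaled = [(x - m) / math.sqrt(v + self.eps)
                  for x, m, v in zip(obs, self.mean, self.var)]
        return scaled + [stage_num]  # extra dimension: current step number

    def reset_episode(self) -> None:
        # Deliberately a no-op: statistics persist across episodes
        pass
```

Keeping the statistics across resets means early observations in a new episode are scaled with everything seen so far, instead of with a near-empty estimate.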

Relaxed Stoploss Design

Trend allows greater volatility:

if current_profit > 0.15: return -0.02
elif current_profit > 0.12: return -0.03
# ... max allows -15% stoploss

Relaxed Entry Conditions

Entries are gated by trend strength measured with ADX. RSI is used only to exclude overbought/oversold extremes, not as an entry signal:

long_condition = (
    (adx > buy_adx) &          # ADX > 20 indicates trend formation
    (plus_di > minus_di) &     # +DI > -DI indicates uptrend
    (close > sma_20) &         # Price above moving average
    (rsi < 70) & (rsi > 30)    # RSI not in extreme zone
)
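
For readers who want to check the rule on a single candle, the same condition can be written as a scalar predicate. This is a sketch: the function name `is_long_entry`, the dict-based candle, and the `buy_adx` default of 20 (taken from the comment above) are assumptions, whereas the expression above operates on whole indicator columns.

```python
def is_long_entry(candle: dict, buy_adx: float = 20.0) -> bool:
    """Evaluate the long-entry rule for one candle of indicator values."""
    return (
        candle["adx"] > buy_adx                     # trend is forming
        and candle["plus_di"] > candle["minus_di"]  # direction is up
        and candle["close"] > candle["sma_20"]      # price above moving average
        and 30 < candle["rsi"] < 70                 # RSI not in an extreme zone
    )


trending = {"adx": 28.0, "plus_di": 25.0, "minus_di": 12.0,
            "close": 105.0, "sma_20": 100.0, "rsi": 58.0}
overbought = dict(trending, rsi=76.0)
weak_trend = dict(trending, adx=15.0)
```

Here `is_long_entry(trending)` is true, while the `overbought` and `weak_trend` candles are rejected by the RSI filter and the ADX threshold respectively.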

Using Both Strategies Together

Combined Logic

Trend (Excess Return Engine) + Regression (Stable Growth Engine)
  • Trend: Captures big moves in trending markets, 10% target, 1x leverage
  • Regression: Mean reversion in volatile markets, 2% target, 3x leverage

Live Performance

  • Duration: ~15 days
  • Combined portfolio returns: ~20%
  • Regression strategy: steady consistent profits
  • Trend strategy: excess returns during trending markets

Project Structure

Freqai_RL/
├── freqaimodels/               # RL Model definitions
│   ├── RL_Model.py            # Base RL model (reference implementation)
│   ├── RL_Model_Regression.py # Regression strategy environment
│   └── RL_Model_Trend.py      # Trend strategy environment
├── strategies/                 # Freqtrade strategies
│   ├── RLStrategy.py           # Base strategy (reference implementation)
│   ├── RLStrategy_Regression.py
│   └── RLStrategy_Trend.py
├── picture/                    # Live trading screenshots
│   ├── trend.png              # Trend strategy live trading
│   ├── regression.png         # Regression strategy live trading
│   ├── simulate_run.png        # Simulated portfolio run
│   ├── trend_log.png          # Trend training log
│   └── regression_log.png     # Regression training log
└── README.md

Advanced Usage

Backtesting

freqtrade backtesting --config config.json --strategy RLStrategy_Regression --freqaimodel RL_Model_Regression --timerange=20230101-20231231

Training Models Separately

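# Freqtrade has no standalone train subcommand; FreqAI trains each model as part of backtesting or a trade run
freqtrade backtesting --config config.json --strategy RLStrategy_Regression --freqaimodel RL_Model_Regression --timerange=20230101-20231231
freqtrade backtesting --config config.json --strategy RLStrategy_Trend --freqaimodel RL_Model_Trend --timerange=20230101-20231231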

Parameter Tuning

Adjust parameters in the strategy .json config files:

{
  "strategy_name": "RLStrategy_Regression",
  "parameters": {
    "buy_rsi": {"value": 30},
    "sell_rsi": {"value": 70}
  }
}
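
Parameter files in this shape can also be edited from a script. The helper below is a sketch using only the standard library; `set_strategy_param` and `demo` are hypothetical names, and the file layout is assumed to match the example above exactly.

```python
import json
import tempfile
from pathlib import Path


def set_strategy_param(path, name: str, value) -> dict:
    """Set parameters[name] = {"value": value} in a strategy JSON file."""
    file = Path(path)
    config = json.loads(file.read_text())
    config.setdefault("parameters", {})[name] = {"value": value}
    file.write_text(json.dumps(config, indent=2))
    return config


def demo() -> dict:
    # Write the README's example to a temporary file, then lower buy_rsi
    example = {
        "strategy_name": "RLStrategy_Regression",
        "parameters": {"buy_rsi": {"value": 30}, "sell_rsi": {"value": 70}},
    }
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "RLStrategy_Regression.json"
        path.write_text(json.dumps(example))
        set_strategy_param(path, "buy_rsi", 25)
        return json.loads(path.read_text())
```

Calling `demo()` returns the updated configuration with `buy_rsi` set to 25 and `sell_rsi` unchanged.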

Design Philosophy Summary

Regression: High-frequency + Strict Risk Control

| Design | Implementation |
| --- | --- |
| Goal | Micro high-frequency profits |
| Leverage | 3x |
| Holding Time | Short (max 30 ticks) |
| Stoploss | Trailing dynamic stoploss |
| Reward | Multi-factor decomposition, real-time feedback |

Trend: Trend Following + Generous Volatility

| Design | Implementation |
| --- | --- |
| Goal | Capture big trends |
| Leverage | 1x |
| Holding Time | Long (hold until trend ends) |
| Stoploss | Relaxed (max -15%) |
| Reward | Potential Shaping + Step rewards |

License

MIT License — see LICENSE.
