
ab-glm-abtest

A production-ready Python package for A/B testing analysis using Binomial Generalized Linear Models (GLMs) with logit and probit link functions.

Python 3.10+ · 97% test coverage · MIT License

🎯 Why Use This Package?

Most A/B testing tools make critical statistical errors:

  • ❌ Ignore clustering when users have multiple sessions
  • ❌ Report coefficients instead of business metrics
  • ❌ Use naive standard errors that underestimate uncertainty
  • ❌ Waste power by not adjusting for covariates

This package does it right:

  • Cluster-robust standard errors for valid inference with repeated measures
  • Marginal effects to convert coefficients into business metrics (ATE, Risk Ratio)
  • Covariate adjustment to increase statistical power
  • Model diagnostics including Brier scores for calibration
  • Production-ready with 97% test coverage and comprehensive documentation

📊 Key Features

  • Binomial GLM with logit and probit link functions
  • Automatic handling of clustered data (multiple sessions per user)
  • Business metrics: Absolute Treatment Effect (ATE) and Risk Ratio (RR)
  • Covariate adjustment for improved precision
  • Model diagnostics including calibration metrics
  • Simulation tools for power analysis and testing
  • Comprehensive examples with Jupyter notebooks
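The power-analysis idea behind the simulation tools can be sketched with a plain Monte Carlo loop: simulate many experiments at a given sample size and effect, and count how often a two-proportion z-test rejects. This is an illustrative sketch only, not the package's simulation API.

```python
import numpy as np
from scipy import stats

def simulated_power(n_per_arm, p_control, lift, n_sims=2000, alpha=0.05, seed=0):
    """Monte Carlo power of a pooled two-proportion z-test (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        c = rng.binomial(n_per_arm, p_control)            # control conversions
        t = rng.binomial(n_per_arm, p_control + lift)     # treatment conversions
        p_pool = (c + t) / (2 * n_per_arm)                # pooled conversion rate
        se = np.sqrt(2 * p_pool * (1 - p_pool) / n_per_arm)
        z = ((t - c) / n_per_arm) / se
        if 2 * stats.norm.sf(abs(z)) < alpha:             # two-sided test
            hits += 1
    return hits / n_sims

print(simulated_power(2000, 0.12, 0.03))
```

With no true lift the rejection rate should hover near the alpha level, which is a useful sanity check before trusting the power estimate.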

🚀 Quick Start

Installation

```bash
# Using Poetry (recommended)
poetry add ab-glm-abtest

# Or using pip
pip install ab-glm-abtest
```

Basic Usage

```python
import pandas as pd
from ab_glm import fit_binomial_glm, marginal_effects_ate_and_rr

# Load your A/B test data
df = pd.read_csv('your_experiment_data.csv')
# Required columns: user_id, T, country_EU, device_mobile, prior_views, y

# Fit GLM with cluster-robust standard errors
glm, _, df_model, results = fit_binomial_glm(
    df,
    link="logit",
    cluster_col="user_id"
)

# Get business metrics
ate, risk_ratio, p_treatment, p_control = marginal_effects_ate_and_rr(
    results, df_model
)

print(f"Control conversion: {p_control:.1%}")
print(f"Treatment conversion: {p_treatment:.1%}")
print(f"Absolute lift: {ate*100:.2f} percentage points")
print(f"Relative lift: {(risk_ratio-1)*100:.1f}%")
```

Command-Line Analysis

```bash
# Analyze your experiment data
python examples/analyze_experiment.py --data your_data.csv

# Use probit link function
python examples/analyze_experiment.py --data your_data.csv --link probit

# Save results to file
python examples/analyze_experiment.py --data your_data.csv --output results.txt
```

📈 Example Output

```text
============================================================
RUNNING LOGIT GLM ANALYSIS
============================================================

Sample Size:
  Users: 5,000
  Observations: 15,234
  Avg Sessions/User: 3.05

Covariate-Adjusted Results:
  Control Rate: 0.129 (12.9%)
  Treatment Rate: 0.160 (16.0%)
  ATE (Risk Diff): 0.031 (3.1 pp)
  Risk Ratio: 1.240
  Relative Lift: 24.0%
  P-value: 0.002
  Significant: Yes

Model Diagnostics:
  Brier Score: 0.105 (Good calibration)
  Link Function: logit
```
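The Brier score reported in the diagnostics is the mean squared error between predicted probabilities and binary outcomes (lower is better; an uninformative 0.5 prediction scores 0.25). A minimal sketch of the metric, not the package's internal implementation:

```python
import numpy as np

def brier_score(y_true, p_hat):
    """Mean squared error between binary outcomes and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    return float(np.mean((p_hat - y_true) ** 2))

# A constant 0.5 prediction on a 50/50 outcome scores exactly 0.25
print(brier_score([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5]))  # 0.25
```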

📚 Documentation

  • Comprehensive guides
  • Interactive notebooks
  • Example scripts

🔧 Data Requirements

Your data must include these columns:

| Column | Type | Description |
|---|---|---|
| `user_id` | int/str | Unique user identifier |
| `T` | binary | Treatment assignment (0=control, 1=treatment) |
| `country_EU` | binary | User location (0=non-EU, 1=EU) |
| `device_mobile` | binary | Device type (0=desktop, 1=mobile) |
| `prior_views` | int | Prior engagement metric |
| `y` | binary | Outcome (0=no conversion, 1=conversion) |

Important: Treatment must be assigned at the user level, not session level.
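The user-level assignment requirement is easy to verify before fitting: every `user_id` should map to exactly one treatment value. A quick sanity-check sketch (column names follow the table above; this helper is not part of the package):

```python
import pandas as pd

def check_user_level_assignment(df: pd.DataFrame) -> bool:
    """True if every user_id maps to exactly one treatment value."""
    return bool((df.groupby("user_id")["T"].nunique() == 1).all())

ok = pd.DataFrame({"user_id": [1, 1, 2, 2], "T": [0, 0, 1, 1]})
bad = pd.DataFrame({"user_id": [1, 1, 2], "T": [0, 1, 1]})  # user 1 switches arms
print(check_user_level_assignment(ok), check_user_level_assignment(bad))  # True False
```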

🧮 Statistical Methods

Model Specification

The package fits the following model:

y ~ Binomial(p)
g(p) = β₀ + β₁T + β₂country_EU + β₃device_mobile + β₄prior_views

where g() is the link function (logit or probit).

Marginal Effects

Business metrics are calculated using the G-computation formula:

  • ATE = E_X[ E[Y|T=1, X] − E[Y|T=0, X] ]
  • Risk Ratio = E_X[ E[Y|T=1, X] ] / E_X[ E[Y|T=0, X] ]

Cluster-Robust Standard Errors

When users have multiple sessions, observations are correlated. The package automatically computes cluster-robust standard errors to account for this.

🎯 When to Use This Package

Perfect for:

  • E-commerce conversion optimization
  • SaaS feature experiments
  • Marketing campaign testing
  • UX/UI improvements
  • Any binary outcome with user-level randomization

Not suitable for:

  • Continuous outcomes (use linear models)
  • Count data (use Poisson/Negative Binomial)
  • Time-to-event data (use survival analysis)
  • Cluster-randomized trials (use mixed effects models)

📊 Performance & Scalability

  • Memory: ~100 bytes per observation
  • Speed: Handles 1M observations in <10 seconds
  • Limits: Tested up to 10M observations
  • Coverage: 97% test coverage with 55+ tests

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Development Setup

```bash
# Clone the repository
git clone https://github.com/DiogoRibeiro7/ab-glm-abtest.git
cd ab-glm-abtest

# Install with development dependencies
poetry install

# Run tests with coverage
poetry run pytest --cov=ab_glm --cov-report=term-missing

# Run linting
poetry run ruff check src tests

# Run type checking
poetry run mypy src
```

📖 Citation

If you use this package in your research, please cite:

```bibtex
@software{ab_glm_abtest,
  title = {ab-glm-abtest: Production-ready A/B testing with Binomial GLMs},
  author = {Diogo Ribeiro},
  year = {2025},
  url = {https://github.com/DiogoRibeiro7/ab-glm-abtest}
}
```

📝 License

MIT License - see LICENSE file for details.

Questions? Open an issue or check the documentation.
