Phil

Phil is a representation-guided imputation library for missing tabular data.

It generates multiple imputations using a configurable strategy grid, computes Euler Characteristic Transform (ECT) descriptors over each imputed dataset, and selects the most representative imputation from the candidate set.

Installation

pip install philler

Phil requires the trailed backend for ECT computation. Install it from the KRV research index or provide a compatible local build.

What Phil Does

Impute: runs a grid of imputation strategies (scikit-learn estimators or custom ones) over the input dataframe, producing a set of candidate datasets
Describe: computes an ECT descriptor for each candidate via the trailed backend
Select: picks the candidate whose descriptor is closest to the mean descriptor across all candidates (the most representative imputation)
Transform: exposes the fitted pipeline for inference on new data
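
The Select step can be illustrated with a small standalone sketch. It is not Phil's implementation: the function name select_representative is hypothetical, descriptors are treated as flat lists of floats, and Euclidean distance is assumed as the metric.

```python
import math
from typing import Sequence


def select_representative(descriptors: Sequence[Sequence[float]]) -> int:
    """Return the index of the descriptor closest to the element-wise mean.

    Hypothetical helper for illustration; Euclidean distance is an assumption.
    """
    if not descriptors:
        raise ValueError("need at least one descriptor")
    n = len(descriptors)
    dim = len(descriptors[0])
    # Element-wise mean over all candidate descriptors.
    mean = [sum(d[j] for d in descriptors) / n for j in range(dim)]
    # Distance from each candidate descriptor to the mean descriptor.
    distances = [math.dist(d, mean) for d in descriptors]
    # Index of the smallest distance (first one wins on ties).
    return min(range(n), key=distances.__getitem__)


# Three toy descriptors; the mean is [2.0, 2.0], so the middle one is nearest.
candidates = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
best = select_representative(candidates)
```

Choosing the candidate nearest the mean, rather than averaging the imputed datasets, means the result is always one of the imputations that was actually generated.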

Quick Start

import pandas as pd
from phil import Phil

df = pd.read_csv("data_with_missing.csv")

phil = Phil(samples=30, random_state=42)
imputed_df = phil.fit(df)

# Apply the same fitted pipeline to new data
# (new_data: a DataFrame with the same columns as df)
new_df = phil.transform(new_data)

scikit-learn Pipeline Integration

from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from phil import PhilTransformer

pipe = Pipeline([
    ("imputer", PhilTransformer(samples=20, random_state=0)),
    ("model", RandomForestClassifier()),
])
pipe.fit(X_train, y_train)

MCP Server

Phil ships a FastMCP-based MCP server that lets Claude, Cursor, Gemini CLI, and other MCP-capable agents run imputation sweeps on your pandas or polars dataframes without writing Python.

Install the mcp extra and launch the server, either from a persistent install or ephemerally with uv tool run:

pip install "philler[mcp]"
phil-mcp                                  # persistent install
# or, ephemeral via uv:
uv tool run --from "philler[mcp]" phil-mcp

Example Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "phil": {
      "command": "uv",
      "args": ["tool", "run", "--from", "philler[mcp]", "phil-mcp"]
    }
  }
}
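
If the config file already lists other servers, the entry can be merged in programmatically instead of edited by hand. The sketch below is an illustration only: the helper names phil_server_entry and add_phil_server are hypothetical, and it assumes the client expects the executable (uv) in command and everything else in args.

```python
import json


def phil_server_entry() -> dict:
    """Build the server entry (hypothetical helper, not part of Phil)."""
    # Assumption: the executable goes in "command", its arguments in "args".
    return {
        "command": "uv",
        "args": ["tool", "run", "--from", "philler[mcp]", "phil-mcp"],
    }


def add_phil_server(config: dict) -> dict:
    """Return a copy of config with a "phil" entry under "mcpServers"."""
    merged = dict(config)
    # Copy the existing server table so the caller's dict is not mutated.
    servers = dict(merged.get("mcpServers", {}))
    servers["phil"] = phil_server_entry()
    merged["mcpServers"] = servers
    return merged


# A config that already declares another (made-up) server.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
updated = add_phil_server(existing)
print(json.dumps(updated, indent=2))
```

To apply it to a real file, load the JSON with json.load, pass the result through add_phil_server, and write it back with json.dump.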

The server exposes tools for the full sweep workflow, including ingest_dataset, characterize_dataset, list_grids, create_config, validate_config, run_imputation_sweep, diagnose_sweep, and export_imputed_data. Polars users write their dataframe to Parquet and ingest the file path. See the MCP guide for setup tabs, the full tool table, and an example dialog.

For local end-to-end testing with medical missing-data examples, use demos/medical.

Configuration

Imputation grids

Phil ships with named grids accessible via GridGallery:

Name	Methods
`default`	BayesianRidge, DecisionTree, RandomForest, GradientBoosting
`sampling`	DistributionImputer (empirical sampling)
`finance`	IterativeImputer, KNNImputer, SimpleImputer
`healthcare`	KNNImputer, SimpleImputer, IterativeImputer
`marketing`	SimpleImputer, KNNImputer, IterativeImputer
`engineering`	SimpleImputer, KNNImputer, IterativeImputer

Pass a grid name or an ImputationConfig directly:

from phil import Phil, ImputationConfig
from sklearn.model_selection import ParameterGrid

config = ImputationConfig(
    methods=["KNNImputer"],
    modules=["sklearn.impute"],
    grids=[ParameterGrid({"n_neighbors": [3, 5, 7]})],
)
phil = Phil(param_grid=config)
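
To see how many candidate settings a grid produces, it helps to expand it by hand. The sketch below uses itertools.product as a standard-library stand-in for ParameterGrid; the helper expand_grid is hypothetical, and the weights parameter is added only to show a two-parameter grid.

```python
from itertools import product
from typing import Any, Dict, List


def expand_grid(grid: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Expand a dict of parameter lists into one dict per combination."""
    # Sort the keys so the expansion order is deterministic.
    keys = sorted(grid)
    return [
        dict(zip(keys, values))
        for values in product(*(grid[k] for k in keys))
    ]


# 3 values of n_neighbors x 2 values of weights = 6 settings.
settings = expand_grid(
    {"n_neighbors": [3, 5, 7], "weights": ["uniform", "distance"]}
)
for s in settings:
    print(s)
```

Each resulting dict corresponds to one parameter setting for the imputer, so the size of the grid grows multiplicatively with every parameter added.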

ECT descriptor

ECT is configured via ECTConfig:

from phil import Phil, ECTConfig

ect_config = ECTConfig(
    num_thetas=64,
    radius=1.0,
    resolution=100,
    scale=500,
    normalize=True,
    seed=42,
)
phil = Phil(config=ect_config)
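
For intuition about what an ECT descriptor contains, here is a heavily simplified sketch for a 2D point cloud with vertices only, where the Euler characteristic of a sublevel set is just the number of points in it. It is not the trailed backend's implementation. It assumes that num_thetas is the number of directions, resolution the number of thresholds, and radius the threshold range; scale, normalize, and seed are not modeled, and the function name point_cloud_ect is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def point_cloud_ect(
    points: Sequence[Tuple[float, float]],
    num_thetas: int = 4,
    radius: float = 1.0,
    resolution: int = 5,
) -> List[List[int]]:
    """Vertex-only ECT: entry [i][j] counts the points whose projection
    onto direction i is at most threshold j."""
    if resolution < 2:
        raise ValueError("resolution must be at least 2")
    # Evenly spaced thresholds covering [-radius, radius].
    thresholds = [
        -radius + 2.0 * radius * j / (resolution - 1) for j in range(resolution)
    ]
    ect = []
    for i in range(num_thetas):
        # Evenly spaced directions on the unit circle.
        theta = 2.0 * math.pi * i / num_thetas
        dx, dy = math.cos(theta), math.sin(theta)
        # Height of each point along this direction.
        heights = [x * dx + y * dy for x, y in points]
        # With no edges or faces, the Euler characteristic is the point count.
        ect.append([sum(h <= t for h in heights) for t in thresholds])
    return ect


points = [(0.3, 0.0), (-0.6, 0.0)]
ect = point_cloud_ect(points, num_thetas=4, radius=1.0, resolution=5)
```

Each row is a non-decreasing curve for one direction, and flattening the rows gives a fixed-length vector that can be compared across imputed datasets.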

Development

uv sync --all-extras
uv run pytest -v
uv run black phil/ tests/

Documentation

Project documentation lives under docs/source with unified API and guide pages. Build locally with uv run sphinx-build -M html docs/source docs/build.
