Merged
37 changes: 37 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,37 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

`email-normalize` is a Python 3.11+ library that normalizes email addresses by stripping mailbox-provider-specific behaviors (plus addressing, period stripping, etc.). It uses async DNS (aiodns) to resolve MX records and match them against known providers, with a synchronous wrapper for non-async callers.

## Commands

```bash
# Install dependencies (uses uv)
uv sync --all-extras

# Run tests
uv run coverage run
uv run coverage report

# Run a single test
uv run python -m unittest tests.test_normalize.MailboxProviderTestCase.test_google

# Lint (ruff format + ruff check via pre-commit)
uv run pre-commit run --all-files
```

## Architecture

Two-module library under `email_normalize/`:

- **`__init__.py`** — Core logic: `Normalizer` (async class with LFRU-cached MX lookups), `Result` dataclass, and `normalize()` sync wrapper. The `Normalizer` resolves MX records, matches them to providers, then applies provider-specific normalization rules. `skip_dns=True` mode bypasses MX lookups and uses a static `DomainMap` instead.

- **`providers.py`** — Provider definitions: `Rules` flag enum (`PLUS_ADDRESSING`, `STRIP_PERIODS`, `LOCAL_PART_AS_HOSTNAME`), `MailboxProvider` base class, concrete provider classes (Apple, Fastmail, Google, etc.), `Providers` list (for MX matching), and `DomainMap` dict (for skip_dns mode).
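
The MX-matching step described above can be pictured with a minimal, self-contained sketch. It is not the library's code: `MX_SUFFIXES`, `match_provider`, and the suffix-comparison strategy are hypothetical names and assumptions chosen for illustration, and the real `Providers` list may match MX hosts differently.

```python
from typing import Optional

# Hypothetical table of MX host suffixes per provider (illustration only;
# not the contents of the library's `Providers` list).
MX_SUFFIXES = {
    'Google': ('google.com', 'googlemail.com'),
    'Fastmail': ('messagingengine.com',),
}


def match_provider(mx_records: list[tuple[int, str]]) -> Optional[str]:
    """Return the first provider whose suffix matches an MX host."""
    # Check hosts in priority order (lowest priority value first).
    for _priority, host in sorted(mx_records):
        host = host.rstrip('.').lower()  # tolerate a trailing root dot
        for name, suffixes in MX_SUFFIXES.items():
            # Match the suffix on a label boundary so that
            # 'mx.notgoogle.com' does not match 'google.com'.
            if any(host == s or host.endswith('.' + s) for s in suffixes):
                return name
    return None  # unknown provider: no provider-specific rules apply
```

Matching on a label boundary rather than a bare string suffix is a design choice in this sketch; it avoids attributing a look-alike domain to the wrong provider.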

## Code Style

- Ruff with 79-char line length, single quotes
- See `pyproject.toml` `[tool.ruff.lint]` for the full rule selection
85 changes: 67 additions & 18 deletions README.md
@@ -1,35 +1,84 @@
# email-normalize

`email-normalize` is a Python 3.11+ library for returning a normalized email-address
stripping mailbox-provider-specific behaviors such as "Plus addressing"
(foo+bar@gmail.com).
A Python 3.11+ library for normalizing email addresses by stripping
mailbox-provider-specific behaviors such as plus addressing
(`foo+bar@gmail.com`) and period ignoring (`f.o.o@gmail.com`).

![Version](https://img.shields.io/pypi/v/email-normalize.svg?)
![Status](https://github.com/gmr/email-normalize/workflows/Testing/badge.svg?)
![Coverage](https://img.shields.io/codecov/c/github/gmr/email-normalize.svg?)
![License](https://img.shields.io/pypi/l/email-normalize.svg?)

## Example
## Installation

```bash
pip install email-normalize
```

## Usage

### Synchronous

```python
import email_normalize

result = email_normalize.normalize('f.o.o+bar@gmail.com')
print(result.normalized_address) # foo@gmail.com
print(result.mailbox_provider) # Google
print(result.mx_records) # [(5, 'gmail-smtp-in.l.google.com'), ...]
```

### Async

For use within an asyncio application, use the `Normalizer` class directly:

```python
import asyncio

import email_normalize

# Returns ``foo@gmail.com``
normalized = email_normalize.normalize('f.o.o+bar@gmail.com')

async def main():
    normalizer = email_normalize.Normalizer()
    result = await normalizer.normalize('f.o.o+bar@gmail.com')
    print(result.normalized_address)

asyncio.run(main())
```

## Currently Supported Mailbox Providers
The `Normalizer` maintains an LFRU cache of MX lookups, making it efficient
for batch processing.

### Without DNS Lookups

Use `skip_dns=True` to normalize against well-known domains without
performing MX record lookups:

```python
result = email_normalize.normalize('user+tag@gmail.com', skip_dns=True)
```

This mode uses a static domain map and will not detect providers for
custom domains.
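
As a rough illustration of why custom domains go undetected in this mode, here is a minimal sketch of a static domain lookup. `DOMAIN_MAP` and `lookup_provider` are hypothetical names, and the entries are a small illustrative sample rather than the library's actual map.

```python
from typing import Optional

# Hypothetical static map of well-known domains to provider names
# (a small sample for illustration, not the library's real data).
DOMAIN_MAP = {
    'gmail.com': 'Google',
    'googlemail.com': 'Google',
    'icloud.com': 'Apple',
    'fastmail.com': 'Fastmail',
}


def lookup_provider(address: str) -> Optional[str]:
    """Look up the provider by the address's domain alone, with no DNS."""
    domain = address.rpartition('@')[2].lower()
    # A custom domain hosted at a provider is absent from the map,
    # so it resolves to None and no provider rules would be applied.
    return DOMAIN_MAP.get(domain)
```

Under these assumptions, `lookup_provider('user@example.org')` returns `None` even if `example.org` has its mail hosted by one of the listed providers, because only an MX lookup could reveal that.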

## Normalization Rules

| Provider | Plus Addressing | Strip Periods | Local Part as Hostname |
|------------|:---------------:|:-------------:|:----------------------:|
| Apple | x | | |
| Fastmail | x | | x |
| Google | x | x | |
| Microsoft | x | | |
| ProtonMail | x | | |
| Rackspace | x | | |
| Yahoo | | | |
| Yandex | x | | |
| Zoho | x | | |

- Apple
- Fastmail
- Google
- Microsoft
- ProtonMail
- Rackspace
- Yahoo
- Yandex
- Zoho
- **Plus Addressing**: Strips everything after `+` in the local part
- **Strip Periods**: Removes `.` from the local part
- **Local Part as Hostname**: Extracts the subdomain as the local part (Fastmail custom domains)
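
The three rules above can be sketched as a small standalone function. This is an assumption-laden illustration, not the library's implementation: the `Rules` enum is re-declared locally (its member names follow the flag names mentioned in the project's `providers.py` description), `apply_rules` is a hypothetical helper, and the "more than two labels" test for Local Part as Hostname is a guess at how a subdomain might be detected.

```python
import enum


class Rules(enum.Flag):
    """Local stand-in for the three rule flags in the table."""
    PLUS_ADDRESSING = enum.auto()
    STRIP_PERIODS = enum.auto()
    LOCAL_PART_AS_HOSTNAME = enum.auto()


def apply_rules(address: str, rules: Rules) -> str:
    """Apply the selected rules to one address (hypothetical helper)."""
    local_part, _, domain = address.rpartition('@')
    if Rules.LOCAL_PART_AS_HOSTNAME in rules:
        labels = domain.split('.')
        # Assumption: a domain with more than two labels carries the
        # mailbox name as its first label, e.g. user.fastmail.com.
        if len(labels) > 2:
            local_part, domain = labels[0], '.'.join(labels[1:])
    if Rules.PLUS_ADDRESSING in rules:
        # Drop the '+' and everything after it.
        local_part = local_part.split('+', 1)[0]
    if Rules.STRIP_PERIODS in rules:
        local_part = local_part.replace('.', '')
    return f'{local_part}@{domain}'
```

For example, combining the two flags the table lists for Google, `apply_rules('f.o.o+bar@gmail.com', Rules.PLUS_ADDRESSING | Rules.STRIP_PERIODS)` yields `foo@gmail.com`, while passing `Rules(0)` (no flags, as for Yahoo in the table) leaves the address unchanged.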

## Python Versions Supported
## Documentation

3.11+
Full documentation is available at [gmr.github.io/email-normalize](https://gmr.github.io/email-normalize/).