diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..99396c0
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,37 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+`email-normalize` is a Python 3.11+ library that normalizes email addresses by stripping mailbox-provider-specific behaviors (plus addressing, period stripping, etc.). It uses async DNS (aiodns) to resolve MX records and match them against known providers, with a synchronous wrapper for non-async callers.
+
+## Commands
+
+```bash
+# Install dependencies (uses uv)
+uv sync --all-extras
+
+# Run tests
+uv run coverage run
+uv run coverage report
+
+# Run a single test
+uv run python -m unittest tests.test_normalize.MailboxProviderTestCase.test_google
+
+# Lint (ruff format + ruff check via pre-commit)
+uv run pre-commit run --all-files
+```
+
+## Architecture
+
+Two-module library under `email_normalize/`:
+
+- **`__init__.py`** — Core logic: `Normalizer` (async class with LFRU-cached MX lookups), `Result` dataclass, and `normalize()` sync wrapper. The `Normalizer` resolves MX records, matches them to providers, then applies provider-specific normalization rules. `skip_dns=True` mode bypasses MX lookups and uses a static `DomainMap` instead.
+
+- **`providers.py`** — Provider definitions: `Rules` flag enum (`PLUS_ADDRESSING`, `STRIP_PERIODS`, `LOCAL_PART_AS_HOSTNAME`), `MailboxProvider` base class, concrete provider classes (Apple, Fastmail, Google, etc.), `Providers` list (for MX matching), and `DomainMap` dict (for skip_dns mode).
+
+## Code Style
+
+- Ruff with 79-char line length, single quotes
+- See `pyproject.toml` `[tool.ruff.lint]` for the full rule selection
diff --git a/README.md b/README.md
index 3adfa09..88343be 100644
--- a/README.md
+++ b/README.md
@@ -1,35 +1,84 @@
 # email-normalize
 
-`email-normalize` is a Python 3.11+ library for returning a normalized email-address
-stripping mailbox-provider-specific behaviors such as "Plus addressing"
-(foo+bar@gmail.com).
+A Python 3.11+ library for normalizing email addresses by stripping
+mailbox-provider-specific behaviors such as plus addressing
+(`foo+bar@gmail.com`) and period ignoring (`f.o.o@gmail.com`).
 
 ![Version](https://img.shields.io/pypi/v/email-normalize.svg?)
 ![Status](https://github.com/gmr/email-normalize/workflows/Testing/badge.svg?)
 ![Coverage](https://img.shields.io/codecov/c/github/gmr/email-normalize.svg?)
 ![License](https://img.shields.io/pypi/l/email-normalize.svg?)
 
-## Example
+## Installation
+
+```bash
+pip install email-normalize
+```
+
+## Usage
+
+### Synchronous
+
+```python
+import email_normalize
+
+result = email_normalize.normalize('f.o.o+bar@gmail.com')
+print(result.normalized_address)  # foo@gmail.com
+print(result.mailbox_provider)  # Google
+print(result.mx_records)  # [(5, 'gmail-smtp-in.l.google.com'), ...]
+```
+
+### Async
+
+For use within an asyncio application, use the `Normalizer` class directly:
 
 ```python
+import asyncio
+
 import email_normalize
 
-# Returns ``foo@gmail.com``
-normalized = email_normalize.normalize('f.o.o+bar@gmail.com')
+
+async def main():
+    normalizer = email_normalize.Normalizer()
+    result = await normalizer.normalize('f.o.o+bar@gmail.com')
+    print(result.normalized_address)
+
+asyncio.run(main())
 ```
 
-## Currently Supported Mailbox Providers
+The `Normalizer` maintains a LFRU cache of MX lookups, making it efficient
+for batch processing.
+
+### Without DNS Lookups
+
+Use `skip_dns=True` to normalize against well-known domains without
+performing MX record lookups:
+
+```python
+result = email_normalize.normalize('user+tag@gmail.com', skip_dns=True)
+```
+
+This mode uses a static domain map and will not detect providers for
+custom domains.
+
+## Normalization Rules
+
+| Provider   | Plus Addressing | Strip Periods | Local Part as Hostname |
+|------------|:---------------:|:-------------:|:----------------------:|
+| Apple      |        x        |               |                        |
+| Fastmail   |        x        |               |           x            |
+| Google     |        x        |       x       |                        |
+| Microsoft  |        x        |               |                        |
+| ProtonMail |        x        |               |                        |
+| Rackspace  |        x        |               |                        |
+| Yahoo      |                 |               |                        |
+| Yandex     |        x        |               |                        |
+| Zoho       |        x        |               |                        |
 
-- Apple
-- Fastmail
-- Google
-- Microsoft
-- ProtonMail
-- Rackspace
-- Yahoo
-- Yandex
-- Zoho
+- **Plus Addressing**: Strips everything after `+` in the local part
+- **Strip Periods**: Removes `.` from the local part
+- **Local Part as Hostname**: Extracts the subdomain as the local part (Fastmail custom domains)
 
-## Python Versions Supported
+## Documentation
 
-3.11+
+Full documentation is available at [gmr.github.io/email-normalize](https://gmr.github.io/email-normalize/).
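
The three rules listed under "Normalization Rules" in the README hunk can be read as plain string operations on the local part and domain. The sketch below illustrates that reading only; it is not the library's implementation. `Rule` and `apply_rules` are hypothetical names that mirror the flag names mentioned in CLAUDE.md, and the order in which the rules run, as well as the "more than two labels" test for a subdomain, are assumptions.

```python
import enum


class Rule(enum.Flag):
    """Hypothetical stand-in for the rule flags named in the patch."""

    PLUS_ADDRESSING = enum.auto()
    STRIP_PERIODS = enum.auto()
    LOCAL_PART_AS_HOSTNAME = enum.auto()


def apply_rules(address: str, rules: Rule) -> str:
    """Return the address with the selected rules applied."""
    local_part, _, domain = address.rpartition('@')
    if Rule.LOCAL_PART_AS_HOSTNAME in rules:
        labels = domain.split('.')
        # Assumption: a domain counts as having a subdomain when it has
        # more than two labels; the first label becomes the local part.
        if len(labels) > 2:
            local_part, domain = labels[0], '.'.join(labels[1:])
    if Rule.PLUS_ADDRESSING in rules:
        # Keep only the text before the first '+'.
        local_part = local_part.split('+', 1)[0]
    if Rule.STRIP_PERIODS in rules:
        local_part = local_part.replace('.', '')
    return f'{local_part}@{domain}'


google = Rule.PLUS_ADDRESSING | Rule.STRIP_PERIODS
print(apply_rules('f.o.o+bar@gmail.com', google))
```

Under these assumptions a provider row in the table maps to one combination of flags: Google is `PLUS_ADDRESSING | STRIP_PERIODS`, Yahoo is the empty flag `Rule(0)`, and Fastmail is `PLUS_ADDRESSING | LOCAL_PART_AS_HOSTNAME`.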