Python pipeline that takes raw CRM exports and produces segmented, channel-ready contact lists for a B2B outbound sales team. Built for a team of 4 BDRs covering multiple territories (LATAM, EMEA, Spain, Italy).
It takes a unified contact database and distributes it into actionable weekly lists per sales identity, applying:
- Territory-based assignment — each company goes to exactly one identity based on country/region
- Channel routing — contacts are routed to the right outreach channel based on buyer presence, contact count, and language callability
- Contact caps — prevents over-prospecting a single company (max 4 buyers / 6 influencers / 6 referrers per company)
- Deduplication — by domain (company level) and LinkedIn URL (contact level)
- Weekly prioritization — selects the top 15 companies per identity for the current week's email campaign
- 3-month backup reserves — generates a second batch from leftover capped contacts for future re-prospecting cycles
| Condition | Channel |
|---|---|
| Has buyer + language match | Call list this week → email if no close |
| No buyer + ≥5 contacts + language match | Email campaign now |
| No buyer + <5 contacts + language match | Cold call list |
| Language not callable by BDR + ≥5 contacts | Email campaign (all contacts together) |
| Language not callable by BDR + <5 contacts | Non-prospectable |
| No contacts at all | Cold call (complement) |
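
The routing table above can be read as a small decision function. The following is a minimal sketch, not the pipeline's actual code: the function name `route_channel`, its parameters, and the returned labels are hypothetical, and checking "no contacts" first is an assumption, since the table does not state an order of precedence.

```python
EMAIL_MIN_CONTACTS = 5  # threshold taken from the routing table


def route_channel(has_buyer: bool, contact_count: int, language_callable: bool) -> str:
    """Return a channel label for one company (labels are illustrative)."""
    # Assumption: a company with no contacts is handled before any other rule.
    if contact_count == 0:
        return "cold_call_complement"

    # Language not callable by the BDR: email if there are enough contacts.
    if not language_callable:
        if contact_count >= EMAIL_MIN_CONTACTS:
            return "email_campaign"
        return "non_prospectable"

    # Language match with a buyer: call this week, then email if no close.
    if has_buyer:
        return "call_then_email"

    # Language match without a buyer: email or cold call by contact count.
    if contact_count >= EMAIL_MIN_CONTACTS:
        return "email_campaign"
    return "cold_call"


print(route_channel(has_buyer=False, contact_count=4, language_callable=True))
```

Keeping the rules in one pure function makes each row of the table easy to check in isolation.
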
Caps are applied to companies with extreme contact counts (≥10 buyers or ≥15 influencers/referrers):
| TOC | Cap | Priority |
|---|---|---|
| BUYER | 4 | VP/EVP/Chief > Director Sales/Mktg > Manager |
| INFLUENCER | 6 | GM/CEO/COO > Director (non-sales) > VP/Head of |
| REFERRER | 6 | Sales/Mktg/Revenue/Reservations > others (exclude juniors) |
| LOW QUALITY | 3 | No scoring, first N kept |
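
As an illustration of how the caps in the table might be applied to one company's contacts, here is a hedged sketch. The names `CAPS`, `buyer_rank`, and `apply_caps`, the dictionary keys `toc` and `job_title`, and the title-matching patterns are all assumptions. Only the BUYER priority is modelled; other types keep their input order, and the check for "extreme" companies is left out.

```python
import re
from collections import defaultdict

# Caps per type of contact, taken from the table above.
CAPS = {"BUYER": 4, "INFLUENCER": 6, "REFERRER": 6, "LOW QUALITY": 3}

# Hypothetical patterns for the BUYER priority: VP/EVP/Chief > Director > Manager.
BUYER_TIERS = [
    re.compile(r"\b(vp|evp|chief)\b", re.IGNORECASE),
    re.compile(r"\bdirector\b", re.IGNORECASE),
    re.compile(r"\bmanager\b", re.IGNORECASE),
]


def buyer_rank(job_title: str) -> int:
    """Lower rank means higher priority; unmatched titles rank last."""
    for rank, pattern in enumerate(BUYER_TIERS):
        if pattern.search(job_title):
            return rank
    return len(BUYER_TIERS)


def apply_caps(contacts):
    """Split one company's contacts into (kept, leftover) according to CAPS."""
    by_toc = defaultdict(list)
    for contact in contacts:
        by_toc[contact["toc"]].append(contact)

    kept, leftover = [], []
    for toc, group in by_toc.items():
        if toc == "BUYER":
            # sorted() is stable, so ties keep their original order.
            group = sorted(group, key=lambda c: buyer_rank(c["job_title"]))
        cap = CAPS.get(toc, len(group))  # unknown types are not capped
        kept.extend(group[:cap])
        leftover.extend(group[cap:])
    return kept, leftover
```

Returning the leftover contacts separately fits the 3-month backup idea described earlier, where capped-out contacts feed a later batch.
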
| Script | Description |
|---|---|
| distribuir_bbdd.py | Initial distribution of contacts by territory and identity |
| reorganizar_todo.py | Rebuilds all output files after territory changes |
| generar_exclusiones_y_reservadas.py | Manages exclusions (replied contacts, active campaigns) |
| generar_llamadas.py | Generates weekly call lists per BDR with phone enrichment |
| generar_campana_semana.py | Selects top 15 companies per identity for this week's email campaign |
| limpiar_extremos.py | Applies 4/6/6 caps to extreme companies in campaign lists |
| generar_reserva_3meses.py | Generates 3-month backup contacts from capped-out companies |
| generar_tam_aerolineas.py | Dedicated pipeline for airline vertical (geo-based split) |
| generar_tam_empresas.py | Company-level TAM summary with TOC breakdown |
| generar_xlsx_semana.py | Formatted Excel summary for weekly BDR briefing |
| generar_resumen_ejecutivo.py | Executive Excel dashboard with metrics per BDR and identity |
Before any contact reaches an output file, three exclusion layers are applied:
| Exclusion type | Scope | Source |
|---|---|---|
| Companies with existing meetings | Entire company excluded globally | exclusion por reunion pactada/ — CRM export of accounts with a booked meeting |
| Contacts in active campaigns | Individual contact excluded (company stays) | exclusion en campana/ — one file per identity with current campaign enrollments |
| Contacts who already replied | Individual contact excluded across all identities | output/CONTACTOS_YA_RESPONDIERON.csv — built from campaign reply data |
The distinction matters: a company with one replied contact is not fully excluded — only that specific person is removed. The rest of the company's contacts remain available for prospecting.
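
The difference between company-level and contact-level exclusion can be sketched as follows. This is an assumption-laden example: the function name `apply_exclusions`, the dictionary keys `domain` and `linkedin_url`, and the choice of those fields as matching keys are hypothetical, and `campaign_urls` stands for the active-campaign file of the identity being processed.

```python
def apply_exclusions(contacts, meeting_domains, campaign_urls, replied_urls):
    """Return the contacts that survive all three exclusion layers."""
    remaining = []
    for contact in contacts:
        # Layer 1: a booked meeting removes the entire company.
        if contact["domain"] in meeting_domains:
            continue
        # Layers 2 and 3: only the individual contact is removed.
        url = contact["linkedin_url"]
        if url in campaign_urls or url in replied_urls:
            continue
        remaining.append(contact)
    return remaining


contacts = [
    {"domain": "hotel-a.example", "linkedin_url": "in/ana"},
    {"domain": "hotel-a.example", "linkedin_url": "in/luis"},
    {"domain": "hotel-b.example", "linkedin_url": "in/marta"},
]
survivors = apply_exclusions(
    contacts,
    meeting_domains={"hotel-b.example"},
    campaign_urls=set(),
    replied_urls={"in/ana"},
)
print([c["linkedin_url"] for c in survivors])
```

In this example the contact who replied is dropped while a colleague at the same company stays, and the company with a booked meeting disappears entirely.
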
project/
├── bases originales/  # Raw CRM exports (one CSV per region)
├── exclusion por reunion pactada/  # Companies with existing meetings (excluded globally)
├── exclusion en campana/  # Active campaign matches per identity
│   └── match con bbdd/  # Pre-matched: companies in active campaigns already in our database
└── output/  # Working files (generated by scripts)
All input files are standard CRM exports (HubSpot format) with columns including:
First Name, Last Name, Job Title, LinkedIn URL, Company, Domain, Country, TOC (Type of Contact)
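
Because every script depends on these columns, it can help to fail early when an export is missing one. The sketch below assumes the header names match the list above exactly, with the type-of-contact column called `TOC`; the function name `read_contacts` is hypothetical.

```python
import csv
import io

# Assumed header names, based on the column list above.
REQUIRED_COLUMNS = [
    "First Name", "Last Name", "Job Title", "LinkedIn URL",
    "Company", "Domain", "Country", "TOC",
]


def read_contacts(handle):
    """Read a CSV export into a list of dicts, checking required columns."""
    reader = csv.DictReader(handle)
    present = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in present]
    if missing:
        raise ValueError("Missing columns: " + ", ".join(missing))
    return list(reader)


# In-memory sample standing in for a real export file.
sample = io.StringIO(
    "First Name,Last Name,Job Title,LinkedIn URL,Company,Domain,Country,TOC\n"
    "Ana,Diaz,VP Sales,in/ana,Hotel A,hotel-a.example,Spain,BUYER\n"
)
rows = read_contacts(sample)
print(rows[0]["Company"], rows[0]["TOC"])
```

Extra columns are accepted, since the list above is described as a subset of what the exports contain.
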
ENTREGABLE/
├── 1_TAM_POR_IDENTIDAD/  # Full contact base per identity
├── 2_LLAMAR_ESTA_SEMANA/  # Weekly call lists per BDR (with phone numbers)
├── CAMPAÑA MAIL/
│   ├── CAMPAÑA_SEMANA/  # Top 15 companies for email this week (capped)
│   ├── CAMPAÑA_OLAF/  # Remaining pipeline (capped, for future weeks)
│   ├── BACKUP_CAMPAÑA_SEMANA/  # 3-month reserve from this week's companies
│   └── BACKUP_OLAF/  # 3-month reserve from OLAF pipeline
├── 5_COLD_CALL_HOTELES/  # Hotels without buyer for cold outreach
├── 6_NO_PROSPECTABLE/  # Contacts that don't fit any channel
└── AEROLINEAS/  # Airline vertical (split by contact geo region)
    ├── DIEGO_AEROLINEAS_GEO.csv  # EMEA + APAC contacts
    └── RODRIGO_AEROLINEAS_GEO.csv  # LATAM + USA contacts
pip install -r requirements.txt

Scripts use absolute paths configured at the top of each file. Update the OUTPUT, ENTREGABLE, and BASES variables to match your local directory structure before running.
Domain normalization — all company matching is done via normalized domain (strips https://, www., paths and query strings) to handle inconsistent CRM data.
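
A minimal sketch of such a normalization, using only the standard library. The function name `normalize_domain` is hypothetical, and lowercasing and dropping a port number are assumptions that go beyond what is described above.

```python
from urllib.parse import urlsplit


def normalize_domain(raw: str) -> str:
    """Reduce a URL or bare domain to a comparable host name."""
    value = raw.strip().lower()  # assumption: matching is case-insensitive
    if not value:
        return ""
    # urlsplit only recognises the host when a scheme or '//' prefix is present.
    if "://" not in value:
        value = "//" + value
    host = urlsplit(value).hostname or ""  # drops path, query string and port
    if host.startswith("www."):
        host = host[len("www."):]
    return host


for raw in ("https://www.Example.com/about?x=1", "www.example.com", "example.com/path"):
    print(raw, "->", normalize_domain(raw))
```

All three sample inputs reduce to the same key, which is what allows company-level matching across inconsistent CRM rows.
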
Airline geo split — unlike hotels (split by company country), airline contacts are split by the contact's own geo region (Geo region field from LinkedIn enrichment). A single airline company can appear in both lists with different contacts, enabling parallel outreach by timezone.
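
The split can be pictured as a lookup from the contact's geo region to a list owner. This is a sketch under assumptions: the key `geo_region`, the region labels, and the `UNASSIGNED` bucket are hypothetical, while the owner mapping follows the two output files listed in the deliverable tree.

```python
# Region to list owner, following DIEGO_AEROLINEAS_GEO.csv (EMEA + APAC)
# and RODRIGO_AEROLINEAS_GEO.csv (LATAM + USA). Label spellings are assumed.
GEO_TO_OWNER = {
    "EMEA": "DIEGO",
    "APAC": "DIEGO",
    "LATAM": "RODRIGO",
    "USA": "RODRIGO",
}


def split_airline_contacts(contacts):
    """Group airline contacts by the owner of their own geo region."""
    lists = {"DIEGO": [], "RODRIGO": [], "UNASSIGNED": []}
    for contact in contacts:
        region = (contact.get("geo_region") or "").strip().upper()
        # Assumption: unknown or empty regions are set aside for review.
        owner = GEO_TO_OWNER.get(region, "UNASSIGNED")
        lists[owner].append(contact)
    return lists


airline = [
    {"name": "A", "company": "Air X", "geo_region": "EMEA"},
    {"name": "B", "company": "Air X", "geo_region": "latam"},
]
result = split_airline_contacts(airline)
print(len(result["DIEGO"]), len(result["RODRIGO"]))
```

Here the same airline lands in both lists, each with a different contact.
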
Soft deduplication — contacts are deduplicated by LinkedIn URL within and across identities, and companies are deduplicated by domain. No contact appears in more than one identity's hotel base. An airline company may appear in both Diego-Aero and Rodrigo-Aero, but only with different contacts in each list.
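
Contact-level deduplication across identities might look like the sketch below. The function name `dedupe_contacts`, the key `linkedin_url`, the URL clean-up, the first-seen-wins policy, and keeping contacts that have no URL are all assumptions; company-level deduplication by domain is not shown.

```python
def dedupe_contacts(contacts_by_identity):
    """Keep the first occurrence of each LinkedIn URL across all identities."""
    seen = set()
    result = {}
    # Identities are processed in dict order, so earlier ones win ties.
    for identity, contacts in contacts_by_identity.items():
        unique = []
        for contact in contacts:
            # Assumed clean-up: ignore case and a trailing slash.
            key = (contact.get("linkedin_url") or "").strip().lower().rstrip("/")
            if key and key in seen:
                continue
            if key:
                seen.add(key)
            unique.append(contact)  # contacts without a URL are kept as-is
        result[identity] = unique
    return result


data = {
    "A": [{"linkedin_url": "https://linkedin.com/in/ana/"},
          {"linkedin_url": "https://linkedin.com/in/ana"}],
    "B": [{"linkedin_url": "https://LinkedIn.com/in/ana"},
          {"linkedin_url": "https://linkedin.com/in/luis"}],
}
deduped = dedupe_contacts(data)
print({identity: len(items) for identity, items in deduped.items()})
```
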