F3 National Administrators: Please refer to the National Administrator Guide for security and coordination best practices.
This project contains a suite of modular Python scripts designed to harvest, clean, and consolidate local F3 region data (like WordPress backblasts or legacy Google Sheets) into a clean, standardized format suitable for the F3 National Database.
Because different F3 regions use different tools, these scripts are built modularly. You don't have to use the entire pipeline.
- If your region only has a WordPress XML export and nothing else, you can run just convert.py.
- If your region only manages data via Google Sheets, you can primarily rely on generate_user_reports.py and extract_missing_qs.py.
Use the tools that fit your region's historical tracking methods.
This is where you place your raw data.
- user_master.csv: The authoritative master list of users from the F3 National Database. Auto-generated by running fetch_master_users.py.
- locations.csv: Your region's AOs (Area of Operations).
- F3region.wordpress.com...xml (or similar): Your WordPress XML export containing backblasts.
- legacy_pax_directory.csv / legacy_master_directory.csv: Exports of your old Google Sheets user data.
- legacy_q_schedule.csv: Export of your old Q schedule spreadsheet.
- manual_aliases.json: (Optional) A manually created file to override logic for difficult name matches. See samples/ for an example.
- aliases.json & display_aliases.json: (Auto-generated by build_alias_map.py) Mapping dictionaries to link old aliases to canonical F3 National IDs.
This is where the scripts will drop the cleaned, formatted data ready for national integration.
- {REGION_NAME}_wordpress_backblasts.csv: The master backblast repository linking Dates, AOs, Qs, and PAX attendees from WordPress exports.
- {REGION_NAME}_missing_users.csv: Users assigned a TMP_ID_X because they could not be found directly.
- {REGION_NAME}_qschedule_nobackblast.csv: Q schedule events that have no corresponding backblast in the National DB or WordPress extracts.
- {REGION_NAME}_qschedule_nonworkoutevents.csv: Social events, Q-Sources, or other non-workout events separated from the main deficiency list.
- event_overview.md: An automated summary report of your migration data health.
- my_users.csv: A unified master roster of every single PAX extracted from legacy files and WordPress.
- my_users_output.csv: The official output received from the National BulkUserCreate script containing definitive database IDs.
- users_alias.csv & users_downrange.csv: Audit logs documenting exactly how the scripts intelligently matched your old regional aliases to actual F3 National accounts.
This folder contains structurally accurate but anonymized examples of every possible input and output file.
- samples/input/: Review these files if you need to know exactly how to format your region's raw data for the scripts to successfully parse them.
- samples/output/: An empty structure showing where the final merged data files will be placed.
To run the full suite, execute these processes in the following exact order:
Purpose: Connects directly to the F3 National PostgreSQL Database to download the latest global roster into import/user_master.csv, so you do not have to download the CSV file manually. Note: This assumes you have access to a .env file with database credentials.
Important
This script connects to the production environment and requires specific database credentials. Most regions will not have direct access to run this script. If you are a Regional Admin, you should work with a National Admin to either have them run this script for you or provide you with an updated user_master.csv export from the National Database.
Purpose: Scrapes all legacy files and WordPress XML data to find unrecognized user names. It uses intelligent algorithms (exact email, first/last name matches, and heuristic Regex scrubbing) to map these stray aliases back to authoritative users in user_master.csv.
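The matching cascade described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual logic: the field names (`email`, `first_name`, `last_name`, `f3_name`) and the scrubbing rules are assumptions, and the real column names in user_master.csv may differ.

```python
import re
from typing import Optional


def normalize(name: str) -> str:
    """Scrub an alias: drop parenthetical notes, then keep only letters and digits."""
    without_notes = re.sub(r"\(.*?\)", "", name.lower())
    return re.sub(r"[^a-z0-9]", "", without_notes)


def match_user(candidate: dict, master: list) -> Optional[dict]:
    """Return the first master record that matches the candidate, or None.

    Signals are tried from most to least reliable. All field names here are
    hypothetical placeholders for whatever the real files contain.
    """
    # 1. Exact email match (case-insensitive).
    email = (candidate.get("email") or "").strip().lower()
    if email:
        for user in master:
            if (user.get("email") or "").strip().lower() == email:
                return user

    # 2. First and last name both match.
    first = (candidate.get("first_name") or "").strip().lower()
    last = (candidate.get("last_name") or "").strip().lower()
    if first and last:
        for user in master:
            same_first = (user.get("first_name") or "").strip().lower() == first
            same_last = (user.get("last_name") or "").strip().lower() == last
            if same_first and same_last:
                return user

    # 3. Heuristic fallback: compare F3 names after regex scrubbing.
    alias = normalize(candidate.get("f3_name") or "")
    if alias:
        for user in master:
            if normalize(user.get("f3_name") or "") == alias:
                return user

    return None
```

A candidate that returns `None` here would be the kind of record that ends up needing a manual_aliases.json entry or a temporary ID.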
Purpose: Cross-references your legacy user directories, WordPress Authors, and PAXminer users to produce a single unified user list that follows the National Guidelines bulk import schema.
Outputs:
- output/my_users.csv
Purpose: Run the official F3-Nation/database-helpers user creation script against your newly generated my_users.csv to officially insert your region's pax into the national database.
Command: python import_users.py my_users.csv
Action Needed: Move the resulting my_users_output.csv straight into your output/ directory so our data conversion scripts can read the brand new database IDs!
Purpose: Set up your specific region name so that the output files are accurately prefixed.
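As an illustration of what the prefixing amounts to, the sketch below builds output paths from a region name. The `REGION_NAME` value and the `output_path` helper are hypothetical; how the scripts actually store and read the region name is not shown here.

```python
from pathlib import Path

# Hypothetical region name; replace with your own region's value.
REGION_NAME = "F3Example"


def output_path(suffix: str, region: str = REGION_NAME, out_dir: str = "output") -> Path:
    """Build an output file path of the form output/{REGION_NAME}_{suffix}."""
    # Replace spaces so the region name is safe to use in a file name.
    safe_region = region.strip().replace(" ", "_")
    return Path(out_dir) / f"{safe_region}_{suffix}"
```

For example, `output_path("wordpress_backblasts.csv")` yields a path ending in `F3Example_wordpress_backblasts.csv` inside the output directory.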
Purpose: Parses the WordPress XML feed and the legacy Q schedule. convert.py translates the WordPress XML backblasts into output/{REGION_NAME}_wordpress_backblasts.csv. extract_missing_qs.py parses the legacy Q schedule, cross-references it against the National DB, and writes scheduled events that have no backblast to output/{REGION_NAME}_qschedule_nobackblast.csv.
Outputs:
- output/{REGION_NAME}_wordpress_backblasts.csv
- output/{REGION_NAME}_qschedule_nobackblast.csv
- output/{REGION_NAME}_qschedule_nonworkoutevents.csv
- output/{REGION_NAME}_missing_users.csv
- output/event_overview.md
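The two halves of this step can be sketched with the standard library alone. This is not the code in convert.py or extract_missing_qs.py: the function names, the dictionary keys, and the idea that the AO is stored as the first post category are all assumptions made for illustration. The namespace URIs are the ones declared by WordPress WXR 1.2 exports; an older export may declare a different version.

```python
import xml.etree.ElementTree as ET

# Namespaces declared by WordPress WXR 1.2 export files.
NS = {
    "wp": "http://wordpress.org/export/1.2/",
    "content": "http://purl.org/rss/1.0/modules/content/",
}


def parse_backblasts(xml_text: str) -> list:
    """Extract published posts from a WordPress XML export as plain dicts."""
    root = ET.fromstring(xml_text)
    posts = []
    for item in root.iter("item"):
        # Skip pages, attachments, and other non-post items.
        if item.findtext("wp:post_type", default="", namespaces=NS) != "post":
            continue
        categories = [c.text or "" for c in item.findall("category")]
        posts.append({
            "title": item.findtext("title", default=""),
            # post_date looks like "2021-03-04 05:30:00"; keep the date part.
            "date": item.findtext("wp:post_date", default="", namespaces=NS)[:10],
            # Assumption: the AO is recorded as the first category of the post.
            "ao": categories[0] if categories else "",
            "body": item.findtext("content:encoded", default="", namespaces=NS),
        })
    return posts


def schedule_without_backblast(schedule: list, backblasts: list) -> list:
    """Return schedule rows whose (date, AO) pair has no matching backblast."""
    seen = {(b["date"], b["ao"].strip().lower()) for b in backblasts}
    return [
        row for row in schedule
        if (row["date"], row["ao"].strip().lower()) not in seen
    ]
```

Matching on a (date, AO) pair is one simple choice of key; a region with several workouts at the same AO on the same day would need a finer key, such as the start time.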