Adding a New Bank Parser

This guide walks through adding support for a new bank's CSV format to spendstory.

Overview

Each parser is a function that reads a CSV file and returns a list of transaction dicts matching the transaction schema. Detection happens by inspecting CSV headers — filenames don't matter.

Steps

1. Create the parser file

Create spendstory/parsers/bankname.py:

"""Parser for BankName CSV exports."""
from __future__ import annotations

import csv
from pathlib import Path

from spendstory.schema import make_txn


def parse_bankname(filepath: Path) -> list[dict]:
    """Parse a BankName CSV export into normalized transactions."""
    transactions = []
    # utf-8-sig strips the byte-order mark that some bank exports prepend
    with open(filepath, newline="", encoding="utf-8-sig") as f:
        reader = csv.DictReader(f)
        for row in reader:
            # Adapt these field names to match the bank's CSV headers
            date_str = row["Date"]
            description = row["Description"]
            amount_str = row["Amount"]

            # Determine debit vs. credit.
            # This example assumes a signed Amount column where
            # negative = debit and positive = credit.
            # Some banks reverse the signs (flip the comparison below);
            # others have separate Debit/Credit columns.
            amount = float(amount_str.replace(",", "").replace("$", ""))
            txn_type = "debit" if amount < 0 else "credit"

            txn = make_txn(
                date=date_str,        # Will be normalized to YYYY-MM-DD
                description=description,
                amount=abs(amount),   # Always positive
                txn_type=txn_type,
                source_account="bankname_account",
                account_type="personal",  # or "business"
            )
            transactions.append(txn)

    return transactions

Key points:

  • Use make_txn() from schema.py. It normalizes dates, ensures positive amounts, and sets all required keys
  • Handle the bank's specific date format (see Date Formats below)
  • source_account should be a unique snake_case identifier
  • account_type is "personal" or "business"
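For banks that export separate Debit and Credit columns instead of one signed Amount column, the amount logic might look like the sketch below. The header names "Debit" and "Credit" and both helper names are assumptions for illustration, not part of spendstory.

```python
from __future__ import annotations


def parse_amount(text: str) -> float:
    """Convert a string such as "$1,234.56" or "(45.00)" to a float."""
    cleaned = text.strip().replace(",", "").replace("$", "")
    # Accounting-style parentheses mean a negative number.
    if cleaned.startswith("(") and cleaned.endswith(")"):
        cleaned = "-" + cleaned[1:-1]
    # An empty cell counts as zero.
    return float(cleaned) if cleaned else 0.0


def split_columns_to_amount(row: dict[str, str]) -> tuple[float, str]:
    """Return (positive amount, txn_type) for a row with Debit/Credit columns.

    "Debit" and "Credit" are hypothetical header names; use the bank's own.
    """
    debit = parse_amount(row.get("Debit") or "")
    credit = parse_amount(row.get("Credit") or "")
    if debit:
        return abs(debit), "debit"
    return abs(credit), "credit"
```

The returned pair can be passed straight to make_txn() as amount and txn_type.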

2. Add detection logic

In spendstory/parsers/__init__.py, add a detection block to detect_account_type():

# BankName:
#   Headers contain "UniqueHeader1" AND "UniqueHeader2"
if {"UniqueHeader1", "UniqueHeader2"} <= header_set:
    return "bankname_account"

Detection rules:

  • Match on header names, not filenames
  • Use unique header combinations that won't collide with other banks
  • If the bank has a non-standard format (like Venmo), handle it before the CSV header parsing
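As a rough illustration of header-based detection, the sketch below reads the first CSV row into a set and applies the subset test. Both function names are hypothetical stand-ins; the real detect_account_type() may be structured differently.

```python
from __future__ import annotations

import csv
from typing import Iterable, Optional


def read_header_set(lines: Iterable[str]) -> set[str]:
    """Return the column names from the first CSV row, whitespace-stripped.

    Accepts an open text file or any iterable of lines.
    """
    header = next(csv.reader(lines), [])
    return {name.strip() for name in header}


def detect_from_headers(header_set: set[str]) -> Optional[str]:
    """Hypothetical stand-in for one detection block; returns an account key."""
    # Require a combination of headers so that a single shared column
    # name such as "Date" cannot trigger a false match.
    if {"UniqueHeader1", "UniqueHeader2"} <= header_set:
        return "bankname_account"
    return None
```

When reading from disk, opening the file with encoding="utf-8-sig" keeps a byte-order mark from being glued onto the first header name.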

3. Register the parser

In spendstory/parsers/__init__.py, add the import and registry entry:

from spendstory.parsers.bankname import parse_bankname

PARSERS = {
    # ... existing parsers ...
    "bankname_account": parse_bankname,
}
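To show why the registry key must match the string returned by detection, here is a minimal, self-contained sketch of a registry lookup. parse_stub and parse_file are hypothetical names, and the PARSERS dict here is a local stand-in; how spendstory actually dispatches is not shown in this guide.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable, Dict, List

Parser = Callable[[Path], List[dict]]


def parse_stub(filepath: Path) -> List[dict]:
    """Hypothetical placeholder standing in for parse_bankname."""
    return [{"source_account": "bankname_account", "file": filepath.name}]


# Local stand-in for the real registry in spendstory/parsers/__init__.py.
PARSERS: Dict[str, Parser] = {"bankname_account": parse_stub}


def parse_file(account_key: str, filepath: Path) -> List[dict]:
    """Look up the parser registered for account_key and run it."""
    try:
        parser = PARSERS[account_key]
    except KeyError:
        raise ValueError(f"No parser registered for {account_key!r}") from None
    return parser(filepath)
```

If detection returns a key that is missing from the registry, a lookup like this fails, so keep the two strings identical.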

4. Add to EXPECTED_ACCOUNTS

In spendstory/config.py, add the account to the expected accounts dict:

EXPECTED_ACCOUNTS = {
    # ... existing accounts ...
    "bankname_account": "BankName Checking",
}

This makes the Account Status panel show the account even when no CSV is present.

Transaction Schema

Every transaction must have all 11 keys:

| Key | Type | Description |
| --- | --- | --- |
| date | str | YYYY-MM-DD format |
| month | str | YYYY-MM (derived from date) |
| description | str | Cleaned transaction description |
| amount | float | Always positive |
| type | str | "debit" or "credit" |
| source_account | str | Account identifier (snake_case) |
| account_type | str | "personal" or "business" |
| card_member | str/None | Cardholder name (multi-card accounts) |
| original_description | str | Raw description before cleaning |
| category | str | Set later by categorization (use "Other" initially) |
| merchant_name | str | Set later by merchant normalization (use description initially) |

make_txn() from schema.py handles most of this — you mainly need to provide date, description, amount, txn_type, source_account, and account_type.
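While developing a parser, a small check like the sketch below can confirm that each transaction carries all 11 keys with plausible values. schema_problems is a hypothetical helper written for this guide, not a function in schema.py, and the rules it enforces are only those listed in the schema table.

```python
from __future__ import annotations

REQUIRED_KEYS = {
    "date", "month", "description", "amount", "type", "source_account",
    "account_type", "card_member", "original_description", "category",
    "merchant_name",
}


def schema_problems(txn: dict) -> list[str]:
    """Return a list of human-readable schema violations (empty if none)."""
    problems = []
    missing = REQUIRED_KEYS - set(txn)
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if "amount" in txn and txn["amount"] < 0:
        problems.append("amount must be positive")
    if "type" in txn and txn["type"] not in ("debit", "credit"):
        problems.append('type must be "debit" or "credit"')
    if "account_type" in txn and txn["account_type"] not in ("personal", "business"):
        problems.append('account_type must be "personal" or "business"')
    # month is the YYYY-MM prefix of the YYYY-MM-DD date.
    if "date" in txn and "month" in txn and txn["month"] != txn["date"][:7]:
        problems.append("month does not match date")
    return problems


example = {
    "date": "2026-01-15",
    "month": "2026-01",
    "description": "Coffee Shop",
    "amount": 4.5,
    "type": "debit",
    "source_account": "bankname_account",
    "account_type": "personal",
    "card_member": None,
    "original_description": "COFFEE SHOP #123",
    "category": "Other",
    "merchant_name": "Coffee Shop",
}
```

Calling schema_problems(example) returns an empty list; dropping a key or using a negative amount produces a message for each violation.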

Date Formats by Bank

| Bank | Format | Example |
| --- | --- | --- |
| Capital One | MM/DD/YY | 01/15/26 |
| Amex | MM/DD/YYYY | 01/15/2026 |
| Citi / Mercury | MM-DD-YYYY | 01-15-2026 |
| Apple Card | MM/DD/YYYY | 01/15/2026 |
| Venmo | ISO | 2026-01-15T14:30:00 |

make_txn() handles common date formats automatically. If your bank uses an unusual format, parse it to YYYY-MM-DD before passing to make_txn().
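For a format that make_txn() might not recognize, one option is to convert it yourself with the standard library before calling make_txn(). The helper name to_iso_date and the DD.MM.YYYY example format are assumptions for illustration.

```python
from datetime import datetime


def to_iso_date(raw: str, fmt: str) -> str:
    """Parse raw using a strptime format string and return YYYY-MM-DD."""
    return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")


# Hypothetical bank that exports day-first dates with dots.
print(to_iso_date("15.01.2026", "%d.%m.%Y"))  # 2026-01-15

# The Capital One style from the table above (two-digit year).
print(to_iso_date("01/15/26", "%m/%d/%y"))    # 2026-01-15
```

datetime.strptime raises ValueError when the text does not match the format, which surfaces a wrong format guess immediately instead of silently producing bad dates.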

Generic CSV Fallback

For banks with standard CSV layouts, check spendstory/parsers/base.py — it provides a generic parser that auto-detects date, description, and amount columns. You may be able to use it directly instead of writing a custom parser.
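To give a feel for what column auto-detection can involve, here is a minimal sketch that matches headers against alias lists. It is not the code in base.py: the function name, the alias lists, and the matching rule are all assumptions, so check base.py for the actual behavior.

```python
from __future__ import annotations

from typing import Optional

# Hypothetical alias lists; the real base.py may use different names or rules.
COLUMN_ALIASES = {
    "date": ("date", "transaction date", "posted date"),
    "description": ("description", "memo", "payee"),
    "amount": ("amount", "transaction amount"),
}


def guess_columns(headers: list[str]) -> dict[str, Optional[str]]:
    """Map each role (date, description, amount) to a matching header or None."""
    # Compare case-insensitively but return the header as it appears in the file.
    normalized = {h.strip().lower(): h for h in headers}
    return {
        role: next((normalized[a] for a in aliases if a in normalized), None)
        for role, aliases in COLUMN_ALIASES.items()
    }
```

A role mapped to None is a sign that the layout is not standard enough for a generic approach and that a custom parser is the better route.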

Testing

Drop a sample CSV into data/raw/ and run:

python -m spendstory

Phase 1 should detect the new account type. Phase 3 should categorize transactions (some may appear as "Other" — add keywords to categorize.py as needed).