Support custom null markers in CSV loading #435

@ohaibbq

Overview

Add support for custom NULL marker configuration to match BigQuery's --null_marker parameter functionality.

Current Behavior

The emulator hardcodes NULL detection to only recognize empty strings ("") and the string "null" (case-insensitive).

Expected Behavior

Users should be able to specify a custom string that represents NULL values in their CSV data, matching BigQuery's behavior.

Common Null Markers

  • "N/A"
  • "\\N" (MySQL convention)
  • "NULL"
  • "-"
  • "NA"

Implementation Requirements

  1. Add a NullMarker field to the JobConfigurationLoad structure
  2. Compare each raw CSV value against the custom marker before type conversion
  3. Default NullMarker to the empty string when it is not specified
  4. Maintain backward compatibility with the existing case-insensitive "null" detection
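
The requirements above could be sketched roughly as follows. This is a minimal illustration, not the emulator's actual code: `loadConfig` is a hypothetical stand-in for the JobConfigurationLoad structure, `isNullValue` is an invented helper name, and exact (case-sensitive) matching of the custom marker is an assumption.

```go
package main

import "strings"

// loadConfig is a hypothetical stand-in for the load job configuration;
// only the proposed NullMarker field is modeled here.
type loadConfig struct {
	// NullMarker is the custom string that represents NULL.
	// The zero value ("") means no custom marker was specified.
	NullMarker string
}

// isNullValue reports whether a raw CSV value should be loaded as NULL.
// It is meant to run before any type conversion of the value.
func isNullValue(raw string, cfg loadConfig) bool {
	// The empty string always represents NULL (default behavior).
	if raw == "" {
		return true
	}
	// Custom marker: exact match is assumed here.
	if cfg.NullMarker != "" && raw == cfg.NullMarker {
		return true
	}
	// Backward compatibility: keep the existing case-insensitive "null" check.
	return strings.EqualFold(raw, "null")
}
```

With this shape, `isNullValue("N/A", loadConfig{NullMarker: "N/A"})` is true, while the same value with a zero-valued `loadConfig` is false and would be loaded as the string "N/A".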

Test Cases

  • CSV with "N/A" and nullMarker="N/A" → should be NULL
  • CSV with "\\N" and nullMarker="\\N" → should be NULL
  • CSV with "N/A" and no nullMarker set → should be string "N/A"
  • Ensure empty string always represents NULL (default behavior)
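
One way to exercise these cases end to end is sketched below, using Go's standard `encoding/csv` reader. The `parseRows` helper and its use of a nil pointer to represent NULL are assumptions made for illustration only; the emulator's real loading path may differ.

```go
package main

import (
	"encoding/csv"
	"strings"
)

// parseRows is a hypothetical helper: it reads CSV text and returns every
// cell as a *string, using nil to represent NULL.
func parseRows(data, nullMarker string) ([][]*string, error) {
	records, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	rows := make([][]*string, 0, len(records))
	for _, rec := range records {
		row := make([]*string, len(rec))
		for i, cell := range rec {
			// NULL if empty, equal to the custom marker (when one is set),
			// or the legacy case-insensitive "null".
			isNull := cell == "" ||
				(nullMarker != "" && cell == nullMarker) ||
				strings.EqualFold(cell, "null")
			if !isNull {
				v := cell // copy so each pointer refers to its own value
				row[i] = &v
			}
		}
		rows = append(rows, row)
	}
	return rows, nil
}
```

Calling `parseRows("1,N/A\n", "N/A")` yields a nil second cell, whereas `parseRows("1,N/A\n", "")` yields a pointer to the string "N/A".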

Documentation Reference
