[Bug] introspect_postgres() mangles virtual URI path with backslashes on Windows #1672

Description

@raman118

Version: graphifyy 0.9.6, Windows.

In graphify/pg_introspect.py, the introspect_postgres() function connects to a PostgreSQL database, retrieves the schema metadata, and formats it as in-memory SQL DDL, which is then parsed by extract_sql().

To serve as a source_file identifier, introspect_postgres() instantiates a virtual URI path using Path:

virtual_path = Path(f"postgresql://{host}/{dbname}")

On Windows, Path parses the URI string as a filesystem path: it collapses the double slash after the scheme and normalizes the separators to backslashes, producing postgresql:\myhost\mydb.
In graphify/extract.py, extract_sql(path: Path) converts this path to a string with str(path), which retains the backslashes, and populates the source_file attribute on nodes and edges with the mangled string.
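
The parsing behavior described above can be reproduced on any operating system with the pure path classes from pathlib. This is only a sketch of the mechanism, not code from graphify; the host and database names are the ones from the failing test.

```python
from pathlib import PurePosixPath, PureWindowsPath

uri = "postgresql://myhost/mydb"

# PureWindowsPath applies Windows parsing rules regardless of the host OS,
# which mirrors what Path(uri) does when run on Windows.
windows_view = PureWindowsPath(uri)

# PurePosixPath mirrors what Path(uri) does on Linux and macOS.
posix_view = PurePosixPath(uri)

print(str(windows_view))        # postgresql:\myhost\mydb
print(str(posix_view))          # postgresql:/myhost/mydb
print(windows_view.as_posix())  # postgresql:/myhost/mydb
```

Note that both flavors collapse the `//` after the scheme, so neither result is a valid URI; only the separator differs between platforms.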

This causes:

  1. Inconsistent and invalid source_file values (e.g., postgresql:\myhost\mydb instead of postgresql:/myhost/mydb).
  2. Test failures in tests/test_pg_introspect.py::test_pg_introspect_success on Windows due to the backslashes.

Reproduction

On Windows, run the test suite for pg_introspect:

python -m pytest tests/test_pg_introspect.py

This fails with:

E           AssertionError: assert 'postgresql:\\myhost\\mydb' == 'postgresql:/myhost/mydb'
E             
E             - postgresql:/myhost/mydb
E             ?            ^      ^
E             + postgresql:\myhost\mydb
E             ?            ^      ^

Expected Behavior

The URI path postgresql://host/dbname should always use forward slashes (/), even on Windows.

Root Cause & Location

  1. In graphify/pg_introspect.py:
virtual_path = Path(f"postgresql://{host}/{dbname}")

Path models filesystem paths, so instantiating it on a URI string is incorrect: on Windows it rewrites the separators as backslashes.

  2. In graphify/extract.py:
str_path = str(path)

This should use path.as_posix() when path is a Path instance, and otherwise replace backslashes with forward slashes, so that source_file is always in URI/POSIX form.
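
A minimal sketch of that normalization, assuming a small helper is acceptable in extract.py. The name normalize_source_file is hypothetical and does not exist in graphify; the real fix may look different.

```python
from pathlib import PurePath, PureWindowsPath


def normalize_source_file(path) -> str:
    """Hypothetical helper: return a forward-slash identifier for source_file."""
    if isinstance(path, PurePath):
        # as_posix() renders the path with "/" separators on every platform.
        return path.as_posix()
    # Plain strings (or other objects) fall back to a textual replacement.
    return str(path).replace("\\", "/")


# Simulate the Windows behavior on any OS with PureWindowsPath.
mangled = PureWindowsPath("postgresql://myhost/mydb")
print(normalize_source_file(mangled))                     # postgresql:/myhost/mydb
print(normalize_source_file("postgresql:\\myhost\\mydb"))  # postgresql:/myhost/mydb
```

This only makes the value consistent across platforms. The double slash after the scheme is still lost once the string has passed through Path, so keeping the identifier as a plain string in introspect_postgres() may be the cleaner long-term fix.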
