Skip to content

Latest commit

 

History

History
1207 lines (894 loc) · 32.8 KB

File metadata and controls

1207 lines (894 loc) · 32.8 KB

From GraphQL to pydantic-resolve: How I Improved Frontend-Backend API Architecture

中文版

Introduction

GraphQL was once promised as the savior of REST APIs — it allows clients to precisely declare their data requirements, avoiding over-fetching and under-fetching issues. However, in real-world projects, many teams have discovered that GraphQL is not a silver bullet, especially when building business-oriented, highly customized BFF (Backend For Frontend) layers.

This article starts from GraphQL's core concepts, analyzes its pain points in practical applications, and explains why pydantic-resolve is a better solution than GraphQL for view data construction.

Note: For easier comparison, GraphQL code examples in this article use strawberry-graphql from the Python ecosystem. This makes the comparison more direct since both solutions use Python syntax and type systems.


GraphQL's Core Concepts

1. Declarative Data Querying

GraphQL's greatest appeal lies in its declarative query syntax:

query GetTeamWithTasks {
  teams {
    id
    name
    sprints {
      id
      name
      stories {
        id
        name
        tasks {
          id
          name
          owner {
            id
            name
          }
        }
      }
    }
  }
}

Advantages:

  • ✅ Clients have full control over required fields
  • ✅ Avoids over-fetching
  • ✅ Type safety (strongly typed Schema)
  • ✅ Self-describing (Introspection)

2. DataLoader Batch Loading

To avoid N+1 query problems, GraphQL introduced the DataLoader pattern:

from aiodataloader import DataLoader

class UserLoader(DataLoader):
    async def batch_load_fn(self, user_ids: list[int]):
        # Batch query: SELECT * FROM users WHERE id IN (...)
        return await batch_get_users(user_ids)

This design is elegant — it optimizes O(N) queries down to O(1) by batching requests.


GraphQL's Pain Points in Real Projects

Despite GraphQL's elegant design philosophy, it exposes several issues in actual projects:

Pain Point 1: Limitations of Top-Down Data Fetching

Problem: GraphQL's data flow is unidirectional — from top-level Query downward, layer by layer. This means you cannot modify or recalculate upper-level fields after lower-level data is loaded.

Example: Calculate total task count for a team

query {
  team(id: 1) {
    id
    name
    # You want to calculate total tasks for this team
    totalTasks # ❌ But this requires recursively traversing all sprints -> stories -> tasks
  }
}

GraphQL's dilemma (using strawberry-graphql):

import strawberry

@strawberry.type
class Team:
    id: int
    name: str

    @strawberry.field
    async def total_tasks(self) -> int:
        # ❌ Problem: sprints data hasn't loaded yet
        # You cannot access child node data
        # You can only query the database again here
        return await count_tasks_for_team(self.id)

Result: You either write complex pre-computation logic or expose multiple query endpoints for the frontend to compose.


Pain Point 2: Query Language Flexibility Becomes a Burden

Problem: GraphQL allows clients to freely compose query fields, which sounds great in theory but creates issues in real business scenarios:

  1. Frontend teams don't want this flexibility

    • Most pages have fixed data structure requirements
    • Frontend prefers calling specifically optimized APIs
    • GraphQL query strings have high maintenance costs
  2. Backend teams struggle to optimize performance

    • Infinite query combinations, impossible to optimize specifically
    • Deep nested queries easily cause performance issues
    • Query complexity limits are hard to enforce
  3. Business logic scattered

    • Generic interfaces cannot satisfy specific business requirements
    • Need to add more and more "special fields"
    • Schema becomes bloated

Reality:

# Theoretically GraphQL should be like this
{
  teams { sprints { stories { tasks { owner } } } }
}

# But in practice, you need different queries for different pages
query TeamDashboardPage { ... }
query TaskListPage { ... }
query SprintReportPage { ... }

# This slowly devolves into multiple REST endpoints

Pain Point 3: Difficulty in Post-Processing Data

Problem: In GraphQL, it's hard to perform secondary processing after data fetching.

Example: Format full path name for tasks

query {
  story(id: 1) {
    tasks {
      name # "mvp tech design"
      # You want: "Team A / Sprint W1 / MVP / mvp tech design"
      fullPath # ❌ Cannot access ancestor node data
    }
  }
}

GraphQL's dilemma:

  • Resolvers cannot access parent or ancestor node context
  • You either compose on the frontend or add multiple computed fields on the backend
  • Business logic gets scattered across frontend and backend

Pain Point 4: High Adoption Cost

Problem: Adopting GraphQL requires significant investment:

  1. Steep learning curve

    • Schema Definition Language (SDL)
    • Resolver writing conventions
    • DataLoader best practices
    • Query complexity analysis and limits
  2. Complex toolchain

    • GraphQL server (Apollo Server, Graphene, etc.)
    • Query parser and validator
    • Dev tools (Apollo Sandbox, GraphiQL)
    • Monitoring and profiling tools
  3. Difficult integration with existing architecture

    • Need to refactor existing REST APIs
    • Authentication and authorization need redesign
    • Caching strategy shifts from HTTP cache to GraphQL layer cache

Pain Point 5: Redundant Work for Internal Projects

Problem: For internal projects (like enterprise management systems, mobile app backends), GraphQL's "flexibility" becomes a burden:

  1. Frontend and backend belong to same team

    • No need to expose generic interfaces externally
    • Can iterate API structure quickly
    • GraphQL's flexibility isn't utilized
  2. High query duplication

    • Different pages need similar but not identical data structures
    • Results in many duplicate query strings
    • Maintenance costs actually increase
  3. Redundant type definitions

    • GraphQL Schema + TypeScript types = double maintenance
    • Even with code generation tools, additional configuration is needed

Enter pydantic-resolve

pydantic-resolve is a declarative data assembly tool designed specifically for the Python ecosystem. It retains GraphQL's core philosophy (declarative descriptions, DataLoader pattern) but is optimized for view data construction scenarios.

Core Design Philosophy

"Remove GraphQL's query part, keep its core idea of declarative description, and focus on building stable, maintainable BFF layers."


Concept Mapping: From GraphQL to pydantic-resolve

For GraphQL developers, migrating to pydantic-resolve feels natural because core concepts are nearly identical:

GraphQL Concept pydantic-resolve Equivalent Similarity
GraphQL Type Pydantic BaseModel 🟢 95%
GraphQL Resolver resolve_{field} method 🟢 95%
DataLoader DataLoader (identical) 🟢 100%
Nested queries Nested Pydantic models 🟢 90%
Query Schema API endpoint + Response Model 🟡 70%

Code Comparison: GraphQL vs pydantic-resolve

GraphQL Approach (using strawberry-graphql)

import strawberry

@strawberry.type
class User:
    id: int
    name: str
    email: str

@strawberry.type
class Task:
    id: int
    name: str
    owner_id: int

    @strawberry.field
    async def owner(self) -> User:
        return await user_loader.load(self.owner_id)

@strawberry.type
class Story:
    id: int
    name: str

    @strawberry.field
    async def tasks(self) -> list[Task]:
        return await task_loader.load(self.id)

pydantic-resolve Approach

from pydantic import BaseModel
from pydantic_resolve import Resolver, Loader

class User(BaseModel):
    id: int
    name: str
    email: str

class Task(BaseModel):
    id: int
    name: str
    owner_id: int

    # Similar to GraphQL Resolver
    owner: Optional[User] = None
    def resolve_owner(self, loader=Loader(user_batch_loader)):
        return loader.load(self.owner_id)

class Story(BaseModel):
    id: int
    name: str

    # Nested query
    tasks: list[Task] = []
    def resolve_tasks(self, loader=Loader(story_to_task_loader)):
        return loader.load(self.id)

Comparison result: Nearly identical! Only the syntax changes from strawberry-graphql to pydantic-resolve.


Why pydantic-resolve is Better for View Data Construction?

1. Bidirectional Data Flow: Forward Fetch + Backward Change

GraphQL is top-down only, pydantic-resolve supports bidirectional data flow:

class Team(BaseModel):
    name: Annotated[str, ExposeAs('team_name')]  # Expose downward
    sprints: list[Sprint] = []

    task_count: int = 0
    def post_task_count(self):
        # Collect upward: calculate total from all child nodes
        return sum(s.task_count for s in self.sprints)

class Sprint(BaseModel):
    name: Annotated[str, ExposeAs('sprint_name')]
    stories: list[Story] = []

    task_count: int = 0
    def post_task_count(self):
        return sum(s.task_count for s in self.stories)

class Story(BaseModel):
    name: Annotated[str, ExposeAs('story_name')]
    tasks: list[Task] = []

    task_count: int = 0
    def post_task_count(self):
        return len(self.tasks)

class Task(BaseModel):
    full_path: str = ""
    def post_full_path(self, ancestor_context):
        # Access ancestor node data
        team = ancestor_context['team_name']
        sprint = ancestor_context['sprint_name']
        story = ancestor_context['story_name']
        return f"{team}/{sprint}/{story}/{self.name}"

Advantages:

  • resolve_ methods: Fetch related data downward (Forward Fetch)
  • post_ methods: Calculate and transform after data loads (Backward Change)
  • ExposeAs: Parent nodes expose data to child nodes
  • ancestor_context: Child nodes access ancestor node data

GraphQL cannot do this.


2. Dedicated Endpoints vs Generic Query

GraphQL approach:

# One endpoint, all queries
POST /graphql

# But you need different queries for each page
query TeamDashboard { ... }
query TaskList { ... }
query SprintReport { ... }

pydantic-resolve approach:

# Each endpoint is specifically optimized
@app.get("/teams-dashboard", response_model=TeamDashboard)
async def get_teams_dashboard():
    return await Resolver().resolve(teams)

@app.get("/tasks-list", response_model=TaskList)
async def get_tasks_list():
    return await Resolver().resolve(tasks)

@app.get("/sprint-report", response_model=SprintReport)
async def get_sprint_report():
    return await Resolver().resolve(sprints)

Advantages:

  • RESTful style: Follows HTTP semantics, simpler caching
  • Fine-grained permission control: Each endpoint has independent auth
  • More direct performance optimization: Optimize queries per endpoint
  • More stable API contracts: Unlike GraphQL queries that change arbitrarily

3. Type Safety and Auto-Generation

GraphQL:

# Need to maintain GraphQL Schema
type Task { ... }

# Frontend also needs TypeScript types
interface Task { ... }

pydantic-resolve:

# Only need to maintain Pydantic models
class Task(BaseModel):
    id: int
    name: str

# FastAPI auto-generates OpenAPI docs
# Frontend tools auto-generate TypeScript SDK

Advantages:

  • Single source of truth: Pydantic models are the only truth
  • Auto OpenAPI: FastAPI auto-generates documentation
  • TypeScript generation: Tools like openapi-typescript generate frontend types in one click
  • IDE support: Complete type hints and autocomplete

4. Low Adoption Cost

GraphQL:

  • Need dedicated GraphQL server
  • Need to learn SDL and Resolver conventions
  • Need to configure dev tools and monitoring
  • Difficult to integrate with existing architecture

pydantic-resolve:

# Only 3 steps needed

# 1. Install dependency
pip install pydantic-resolve

# 2. Define models (you already know how)
class Task(BaseModel):
    owner: Optional[User] = None
    def resolve_owner(self, loader=Loader(user_loader)):
        return loader.load(self.owner_id)

# 3. Use Resolver
result = await Resolver().resolve(tasks)

Advantages:

  • Pure Python: No new language or DSL to learn
  • Seamless FastAPI integration: Just 3 lines of code
  • Familiar toolchain: pytest, black, mypy all work
  • Gradual adoption: Can incrementally adopt in existing projects

5. Better Suited for BFF Layers

GraphQL's positioning dilemma:

  • It's a universal query language, suitable for external APIs
  • But internal BFF layers need stable, view-specific interfaces

pydantic-resolve's positioning:

  • Designed specifically for BFF layers
  • Each endpoint targets a specific view
  • Stable API contracts, easy to maintain

Comparison:

Dimension GraphQL pydantic-resolve
Use case External generic APIs Internal BFF layers
Flexibility High (client decides) Low (server decides)
Stability Low (queries change arbitrarily) High (API contracts fixed)
Performance optimization Difficult (many query combinations) Simple (fixed endpoints)
Permission control Complex (field-level) Simple (endpoint-level)
Caching strategy Difficult (POST requests) Simple (HTTP cache)

Real-World Comparison: Building a Team Management API

Let's see the differences through a complete example.

Requirements

Build an API that returns teams and all their tasks, including:

  • Team information
  • All Sprints in the team
  • All Stories in each Sprint
  • All Tasks in each Story
  • Owner of each Task
  • Task count statistics at each level

GraphQL Implementation

1. Define Schema

type User {
  id: ID!
  name: String!
}

type Task {
  id: ID!
  name: String!
  owner: User!
}

type Story {
  id: ID!
  name: String!
  tasks: [Task!]!
  taskCount: Int!
}

type Sprint {
  id: ID!
  name: String!
  stories: [Story!]!
  taskCount: Int!
}

type Team {
  id: ID!
  name: String!
  sprints: [Sprint!]!
  taskCount: Int!
}

type Query {
  teams: [Team!]!
}

2. Implement Schema and Resolvers (using strawberry-graphql)

import strawberry

@strawberry.type
class User:
    id: int
    name: str

@strawberry.type
class Task:
    id: int
    name: str
    owner_id: int

    @strawberry.field
    async def owner(self) -> User:
        return await user_loader.load(self.owner_id)

@strawberry.type
class Story:
    id: int
    name: str

    @strawberry.field
    async def tasks(self) -> list[Task]:
        return await task_loader.load(self.id)

    @strawberry.field
    def task_count(self) -> int:
        # ✅ Can calculate here, but cannot pass upward to parent nodes
        return len(self.tasks)

@strawberry.type
class Sprint:
    id: int
    name: str

    @strawberry.field
    async def stories(self) -> list[Story]:
        return await story_loader.load(self.id)

    @strawberry.field
    async def task_count(self) -> int:
        # ❌ Problem: stories data hasn't finished loading
        # You can only query database again here
        return await count_tasks_for_sprint(self.id)

@strawberry.type
class Team:
    id: int
    name: str

    @strawberry.field
    async def sprints(self) -> list[Sprint]:
        return await sprint_loader.load(self.id)

    @strawberry.field
    async def task_count(self) -> int:
        # ❌ Same problem: need to query database again
        return await count_tasks_for_team(self.id)

@strawberry.type
class Query:
    @strawberry.field
    async def teams(self) -> list[Team]:
        return await get_teams_from_db()

schema = strawberry.Schema(query=Query)

3. Query

query GetTeamsWithTaskCount {
  teams {
    id
    name
    sprints {
      id
      name
      stories {
        id
        name
        taskCount
        tasks {
          id
          name
          owner {
            id
            name
          }
        }
      }
      taskCount
    }
    taskCount
  }
}

Problems:

  • taskCount requires separate database queries at each level
  • ❌ Story's taskCount cannot be passed to Sprint and Team
  • ❌ Generates additional database queries
  • ❌ Logic scattered, hard to maintain

pydantic-resolve Implementation

1. Define Models

from pydantic import BaseModel
from typing import Optional, List
from pydantic_resolve import Resolver, Loader, ExposeAs

class User(BaseModel):
    id: int
    name: str

class Task(BaseModel):
    id: int
    name: str
    owner_id: int

    owner: Optional[User] = None
    def resolve_owner(self, loader=Loader(user_batch_loader)):
        return loader.load(self.owner_id)

class Story(BaseModel):
    id: int
    name: str

    tasks: List[Task] = []
    def resolve_tasks(self, loader=Loader(story_to_task_loader)):
        return loader.load(self.id)

    task_count: int = 0
    def post_task_count(self):
        # ✅ Calculate after data loads
        return len(self.tasks)

class Sprint(BaseModel):
    id: int
    name: str

    stories: List[Story] = []
    def resolve_stories(self, loader=Loader(sprint_to_story_loader)):
        return loader.load(self.id)

    task_count: int = 0
    def post_task_count(self):
        # ✅ Aggregate from child nodes
        return sum(s.task_count for s in self.stories)

class Team(BaseModel):
    id: int
    name: str

    sprints: List[Sprint] = []
    def resolve_sprints(self, loader=Loader(team_to_sprint_loader)):
        return loader.load(self.id)

    task_count: int = 0
    def post_task_count(self):
        # ✅ Aggregate from child nodes
        return sum(s.task_count for s in self.sprints)

2. API Endpoint

from fastapi import FastAPI, Depends
from sqlalchemy.ext.asyncio import AsyncSession

app = FastAPI()

@app.get("/teams", response_model=List[Team])
async def get_teams(session: AsyncSession = Depends(get_session)):
    # 1. Get root data
    teams = await get_teams_from_db(session)

    # 2. Resolver automatically resolves all related data
    teams = await Resolver().resolve(teams)

    return teams

Advantages:

  • task_count automatically calculated at each level
  • ✅ Data aggregates bottom-up, no extra queries needed
  • ✅ Clear logic, easy to maintain
  • ✅ Automatic batch loading, avoids N+1 problems

Migration Path: From GraphQL to pydantic-resolve

If you're already familiar with GraphQL, migrating to pydantic-resolve is straightforward:

Step 1: Map GraphQL Type to Pydantic Model

GraphQL:

type User {
  id: ID!
  name: String!
}

Pydantic:

class User(BaseModel):
    id: int
    name: str

Step 2: Migrate Resolver

GraphQL (using strawberry-graphql):

@strawberry.type
class User:
    id: int
    name: str

    @strawberry.field
    async def tasks(self) -> list[Task]:
        return await task_loader.load(self.id)

pydantic-resolve:

class User(BaseModel):
    tasks: list[Task] = []
    def resolve_tasks(self, loader=Loader(task_loader)):
        return loader.load(self.id)

Step 3: Reuse DataLoader

Good news: DataLoaders don't need to change at all!

from aiodataloader import DataLoader

class UserLoader(DataLoader):
    async def batch_load_fn(self, keys):
        return await batch_get_users(keys)

Step 4: Split GraphQL Query into Multiple REST Endpoints

GraphQL:

query TeamDashboard { ... }
query TaskList { ... }
query SprintReport { ... }

pydantic-resolve:

@app.get("/teams/dashboard", response_model=TeamDashboard)
async def get_teams_dashboard():
    return await Resolver().resolve(teams)

@app.get("/tasks/list", response_model=TaskList)
async def get_tasks_list():
    return await Resolver().resolve(tasks)

@app.get("/sprints/report", response_model=SprintReport)
async def get_sprint_report():
    return await Resolver().resolve(sprints)

Learning Time Estimate

Background Learning Time
GraphQL + Python 2-3 days
GraphQL + other languages 4-6 days
No experience 11-17 days

Conclusion: When to Choose pydantic-resolve?

Scenarios for Choosing GraphQL

External APIs that need flexible queries for third-party developers ✅ Stable data structures, suitable for building generic data query interfaces ✅ Multi-client integration, different clients need different data structures ✅ Mature GraphQL ecosystem, team already has relevant experience

Typical examples: GitHub API, Shopify API

Scenarios for Choosing pydantic-resolve

Internal BFF layers, building data for specific frontend views ✅ Fast-iterating business, API structure needs frequent adjustments ✅ Python tech stack, using FastAPI/Pydantic ✅ Need data post-processing, calculating derived fields or aggregating data ✅ Small to medium teams, want to reduce adoption cost

Typical examples:

  • Backend for enterprise management systems
  • BFF layer for mobile apps
  • Backend API for data dashboards
  • Aggregation layer for microservices

Final Recommendation

GraphQL is not a silver bullet, and neither is pydantic-resolve.

But if you're building an internal, business-oriented BFF layer, the FastAPI + pydantic-resolve combination will likely suit you better than GraphQL:

  1. Retains GraphQL's core advantages: declarative descriptions, DataLoader, strong typing
  2. Solves GraphQL's pain points: bidirectional data flow, data post-processing, dedicated endpoints
  3. Lowers adoption cost: pure Python, no additional server needed, 3 steps to get started
  4. Better developer experience: auto OpenAPI, type hints, IDE support

Migrating from GraphQL to pydantic-resolve isn't learning a new technology for GraphQL developers — it's learning a more elegant syntax.


Bonus Advantage: ERD Brings Better Business Model Maintainability

Beyond the core advantages mentioned above, pydantic-resolve provides a unique feature: Entity Relationship Diagram (ERD), which GraphQL lacks. It brings revolutionary improvements to business model maintainability.

What is ERD?

ERD (Entity Relationship Diagram) allows you to define data relationships at the entity level, rather than repeating definitions in every view model:

from pydantic_resolve import base_entity, Relationship, config_global_resolver

# 1. Define base entity
BaseEntity = base_entity()

# 2. Define relationships on entities (define once, reuse everywhere)
class TaskEntity(BaseModel, BaseEntity):
    __relationships__ = [
        Relationship(
            field='owner_id',
            target_kls=UserEntity,
            loader=user_batch_loader
        )
    ]
    id: int
    name: str
    owner_id: int

class StoryEntity(BaseModel, BaseEntity):
    __relationships__ = [
        Relationship(
            field='id',
            target_kls=list[TaskEntity],
            loader=story_to_task_loader
        )
    ]
    id: int
    name: str

# 3. Register to global Resolver
diagram = BaseEntity.get_diagram()
config_global_resolver(diagram)

Maintainability Advantages of ERD

1. Single Source of Truth

Without ERD (traditional approach or GraphQL):

# Every view needs to repeat relationship definitions
class TaskResponse1(BaseModel):
    owner: Optional[User] = None
    def resolve_owner(self, loader=Loader(user_loader)):
        return loader.load(self.owner_id)

class TaskResponse2(BaseModel):
    owner: Optional[User] = None
    def resolve_owner(self, loader=Loader(user_loader)):
        return loader.load(self.owner_id)

class TaskResponse3(BaseModel):
    owner: Optional[User] = None
    def resolve_owner(self, loader=Loader(user_loader)):
        return loader.load(self.owner_id)

# Problem: relationship definitions scattered across multiple places
# - Hard to maintain
# - Easy to become inconsistent
# - Changes require searching everywhere

With ERD:

# Define relationship once
class TaskEntity(BaseModel, BaseEntity):
    __relationships__ = [
        Relationship(field='owner_id', target_kls=UserEntity, loader=user_loader)
    ]

# All views automatically inherit, no need to repeat definitions
class TaskResponse1(DefineSubset):
    __subset__ = (TaskEntity, ('id', 'name'))
    # LoadBy automatically finds relationship in ERD
    owner: Annotated[Optional[User], LoadBy('owner_id')] = None

class TaskResponse2(DefineSubset):
    __subset__ = (TaskEntity, ('id', 'name'))
    owner: Annotated[Optional[User], LoadBy('owner_id')] = None

# Advantage: change relationship in one place, takes effect globally

2. Visualizable Dependency Relationships

With fastapi-voyager, ERD can automatically generate visualized dependency diagrams:

from fastapi_voyager import create_voyager

app.mount('/voyager', create_voyager(
    app,
    er_diagram=diagram,  # Show entity relationships
    enable_pydantic_resolve_meta=True
))

Value:

  • 📊 At a glance: See all entities and their relationships
  • 🔍 Quick navigation: Click entities to jump to definitions
  • 🎨 Color coding: Distinguish resolve, post, expose operations
  • 📈 Dependency analysis: View data flow and dependency chains

3. Simplified Declaration Syntax

With ERD, LoadBy simplifies loading related data:

class TaskResponse(BaseModel):
    # LoadBy automatically finds owner_id relationship in ERD and calls the corresponding loader
    # Developer still needs to explicitly declare the type (should match target_kls in ERD)
    owner: Annotated[Optional[User], LoadBy('owner_id')] = None

Advantages:

  • ✅ No need to write resolve methods manually
  • ✅ Relationship definition reuse, avoid repeating loader calls
  • ✅ Cleaner code

4. Forces Data Relationship Modeling

ERD forces you to think about entity relationships at the business level, not ad-hoc composition at the view layer:

# Wrong way: temporarily define relationships in views
class SomeResponse(BaseModel):
    # Is this relationship business-reasonable?
    related_items: list[Item] = []
    def resolve_related_items(self):
        # Temporary logic, no unified management
        return get_related_items_somehow(self.id)
# Right way: model at entity level
class ItemEntity(BaseModel, BaseEntity):
    __relationships__ = [
        # Clear business relationship: item related to parent via some_field
        Relationship(
            field='some_field',
            target_kls=ParentEntity,
            loader=parent_loader
        )
    ]

Value:

  • 🎯 Clear business modeling: Relationship definitions align with business model
  • 📐 Better architecture: Forces thinking about business relationships between entities
  • 🔒 Avoid ad-hoc composition: Reduces "hardcoding relationships for specific endpoints"

5. Safer Refactoring

When business relationships change, ERD makes refactoring safer:

# Scenario: Task and User relationship changes from owner to assignees

# Without ERD: need to find and modify all Task-related views
# - TaskResponse1.resolve_owner
# - TaskResponse2.resolve_owner
# - TaskResponse3.resolve_owner
# ... everywhere, easy to miss

# With ERD: unified relationship management at entity level
class TaskEntity(BaseModel, BaseEntity):
    # Add new relationship
    __relationships__ = [
        Relationship(
            field='id',
            target_kls=list[UserEntity],  # 1:N relationship
            loader=task_to_assignees_loader
        )
    ]

# Old views will error at runtime, reminding you to fix
class TaskResponse(BaseModel):
    # Runtime error: LoadBy('owner_id') cannot find corresponding relationship in ERD
    owner: Annotated[Optional[User], LoadBy('owner_id')] = None

# New views with explicit type declaration
class TaskResponse(BaseModel):
    # Must explicitly declare type as list[User] (should match target_kls in ERD)
    assignees: Annotated[list[User], LoadBy('id')] = []

Value:

  • Clear change scope: relationship definitions centralized, impact scope clear
  • Runtime checking: using non-existent relationships immediately errors, won't silently fail
  • Easier regression testing: test coverage exposes issues during development

GraphQL vs pydantic-resolve ERD

Dimension GraphQL pydantic-resolve ERD
Relationship definition location Scattered in each Resolver Centralized in entity definitions
Relationship reuse Re-declared for each query Define once, reuse everywhere
Visualization Need additional tools fastapi-voyager auto-generates
Declaration syntax Manual Resolver LoadBy simplifies declaration
Refactoring safety Easy to miss Centralized + Runtime checking
Business modeling Query-driven Model-driven

Real-World Case: Refactoring a Team Management System

Suppose you need to refactor a team management system, changing "task owner" from a single user to multiple users:

Using GraphQL:

# 1. Modify Schema
type Task {
  assignees: [User!]  # Change from owner to assignees
}

# 2. Modify all related Resolvers
@strawberry.type
class Task:
    @strawberry.field
    async def assignees(self) -> list[User]:
        # Need to manually find and modify everywhere
        return await assignee_loader.load(self.id)

# 3. Check all queries
# - Which queries still use owner?
# - Which queries need updating?
# - Easy to miss, causing runtime errors

Using pydantic-resolve ERD:

# 1. Modify entity relationship
class TaskEntity(BaseModel, BaseEntity):
    __relationships__ = [
        Relationship(
            field='id',  # Change to query assignees by id
            target_kls=list[UserEntity],  # Change to list
            loader=task_to_assignees_loader  # New loader
        )
    ]

# 2. Update views
# Old views will error at runtime
class TaskResponse(BaseModel):
    # Runtime error: LoadBy('owner_id') cannot find corresponding relationship in ERD
    owner: Annotated[Optional[User], LoadBy('owner_id')] = None

# New views with correct types
class TaskResponse(BaseModel):
    # Must explicitly declare type as list[User]
    assignees: Annotated[list[User], LoadBy('id')] = []

# 3. Runtime checks
# - If any view still uses old relationship, runtime will immediately error
# - Centralized relationship definition makes impact scope clearer
# - Combined with testing, ensures refactoring safety and completeness

Long-Term Value of ERD

As the project evolves, ERD's value becomes more apparent:

Early project (1-3 months):

  • Entity relationships are simple, ERD advantages not obvious
  • Might feel like "writing definitions twice"

Mid project (3-12 months):

  • Entity count increases, relationships become complex
  • ERD's unified management starts to show value
  • When adding new views, directly reuse ERD definitions

Long-term project (12+ months):

  • ERD provides safety net during business refactoring
  • New members quickly understand business model through ERD
  • fastapi-voyager visualization becomes architecture documentation
  • Maintenance costs significantly reduced

Summary

ERD is pydantic-resolve's hidden advantage over GraphQL:

  1. Centralized management: Define relationships once, reuse globally
  2. Visualization: Auto-generate dependency diagrams
  3. Simplified declaration: LoadBy avoids repeating resolve methods
  4. Refactoring-friendly: Centralized definition + runtime checking
  5. Business modeling: Forces thinking at entity level

This makes pydantic-resolve not just a data assembly tool, but a business modeling framework, providing a solid data model foundation for long-term maintenance projects.


References