Skip to content

Latest commit

 

History

History
82 lines (54 loc) · 8.9 KB

File metadata and controls

82 lines (54 loc) · 8.9 KB

OpenFaith's Schema-Driven Design with Effect Schema

A cornerstone of the OpenFaith platform is its Schema-Driven Design. This means that the structure, validation, transformation, and even interaction patterns for data are primarily defined and governed by formal schemas. OpenFaith leverages the powerful [effect/Schema](https ://effect.website/docs/schema/introduction) library to implement this principle, providing a robust, type-safe, and highly expressive way to manage data throughout the system.

Why Schema-Driven?

Adopting a schema-driven approach offers numerous advantages:

  1. Clarity and Single Source of Truth: Schemas serve as a clear, explicit contract for data structures. The OpenFaith Canonical Data Model (CDM), defined with effect/Schema, becomes the single source of truth for what data looks like and how it behaves.
  2. Validation and Data Integrity: Schemas automatically enforce data validation rules, ensuring data quality and integrity at every stage—from external system synchronization to internal API interactions and AI tool usage. effect/Schema provides rich validation capabilities, from basic type checks to complex business rule enforcement via filters and refinements.
  3. Robust Transformations: Integrating diverse systems requires transforming data between different shapes. effect/Schema's transformation APIs (Schema.transform, Schema.transformOrFail, Schema.compose) provide a powerful and type-safe way to define these mappings between external ChMS formats and the OpenFaith CDM.
  4. Type Safety: By using effect/Schema, OpenFaith benefits from TypeScript's static type checking. Inferred types (Schema.Schema.Type<S>, Schema.Schema.Encoded<S>) ensure that data operations are type-safe at compile time, reducing runtime errors.
  5. Automatic Tooling & Derivations: Schemas can be "compiled" or interpreted into various artifacts:
    • Database Schemas: Annotations within effect/Schema can guide the generation of Drizzle ORM / ZeroSQL database table definitions.
    • API Documentation: Tools like JSON Schema (JSONSchema.make) can be generated from effect/Schema for API documentation.
    • Arbitrary Data Generation: For testing, Arbitrary.make can produce valid test data conforming to the schema.
    • Pretty Printers: Pretty.make can create human-readable string representations of schema-defined data.
    • AI Tool Definitions: Schemas define the input and output for AI tools, enabling LLMs to interact with OpenFaith data predictably.
  6. Extensibility and Maintainability: As the platform evolves, schemas provide a structured way to manage changes to the data model. New fields, entities, or validation rules can be added systematically.
  7. Declarative Power: Many aspects of data handling (validation, basic transformations, default values, optionality) can be declared directly in the schema, reducing boilerplate code.

Key effect/Schema Features Leveraged by OpenFaith

OpenFaith's schema-driven architecture makes extensive use of effect/Schema's capabilities:

  1. Defining Core Entities (Schema.Struct, Schema.Class):

    • The OpenFaith CDM's modules (Management, Directory, Domain, etc.) and their constituent entities (Person, Group, Event, Folder) are defined as Schema.Struct or Schema.Class instances.
    • Schema.Class is particularly useful for entities that might have associated business logic or methods, providing an "active schema" pattern where the schema instance itself can carry behavior.
  2. Primitive and Literal Types:

    • Standard types like Schema.String, Schema.Number, Schema.Boolean, and Schema.Date form the building blocks.
    • Schema.Literal is used for discriminant fields in tagged unions (e.g., _tag: Schema.Literal("person")) and for defining enums or fixed-value properties.
  3. Handling Optionality and Defaults:

    • Schema.optional, Schema.optionalWith (with options like exact, nullable, default, as: "Option") are used extensively to define optional fields, handle nulls gracefully, and provide default values, simplifying data input and entity construction. This is critical for accommodating diverse data sources where some fields may not always be present.
  4. Transformations (Schema.transformOrFail, Schema.compose):

    • This is central to the ChMS Adapters. Adapters use transformations to map data between an external ChMS's native schema and the OpenFaith CDM.
    • Schema.NumberFromString, Schema.parseJson, and custom transformations handle type conversions (e.g., string dates to Date objects, string numbers to number).
    • The ability to chain transformations with Schema.compose allows for complex, multi-step mapping logic.
    • Effectful transformations (returning an Effect) are used when mapping involves asynchronous operations or external service calls (e.g., fetching related data during a transformation).
  5. Filters and Refinements (Schema.filter, Schema.pattern, Schema.minLength, etc.):

    • Beyond basic type checks, effect/Schema filters are used to enforce business rules and data integrity constraints (e.g., ensuring an email is in a valid format, a password meets complexity requirements, or a string is non-empty).
    • These refinements ensure that data not only has the correct type but also meets semantic validity criteria.
  6. Recursive and Mutually Recursive Schemas (Schema.suspend):

    • Essential for modeling hierarchical structures like Folder (which can contain sub-folders) or complex relational data like Edges that might indirectly lead to cycles, or organizational hierarchies.
  7. Branded Types (Schema.brand, Schema.fromBrand):

    • Used to create semantically distinct types that are structurally similar (e.g., UserId vs. PersonId, both might be strings or numbers but represent different concepts). This enhances type safety and prevents accidental misuse of IDs.
  8. Annotations (.annotations({...})):

    • Self-Documentation: identifier, title, description, examples make schemas more understandable and can be used to generate documentation.
    • Error Customization: message, parseIssueTitle allow for user-friendly and context-specific error messages during validation failures.
    • Compiler Guidance: Annotations like jsonSchema, arbitrary, pretty guide how effect/Schema compilers generate other artifacts. OpenFaith will leverage this for database schema hints (_indexes array in our previous discussions) and AI tool definitions.
    • Custom Annotations: OpenFaith can define its own custom annotations to embed platform-specific metadata within schemas (e.g., for permission hints, AI tool descriptions, or sync strategy flags).
  9. Error Handling & Formatting (ParseError, TreeFormatter, ArrayFormatter):

    • When data fails validation against a schema (during decoding by an Adapter or an internal API call), effect/Schema provides detailed ParseError objects.
    • OpenFaith can use formatters like TreeFormatter or ArrayFormatter to present these errors to users or in logs in a structured and understandable way.

Schema-Driven AI Interaction

The schema-driven approach is particularly powerful for OpenFaith's AI-First vision:

  • Tool Definition: Each core OpenFaith entity schema, along with its effect/Schema definition, implicitly or explicitly defines an "AI tool." The schema's structure dictates the input parameters and output shape for the tool. Annotations (like description and title) on the schema and its fields provide the necessary information for an LLM to understand what the tool does and how to use it.
  • System Prompts: The collection of OpenFaith CDM schemas (and their annotations) can be serialized or summarized to form part of the system prompt given to an LLM. This "teaches" the LLM about the available data structures and tools.
  • Structured Input/Output: When an LLM decides to use an OpenFaith tool, the input it provides can be validated against the tool's input schema (derived from the relevant effect/Schema). Similarly, the output from the tool, before being sent back to the LLM, is guaranteed to conform to its output schema.

Conclusion

By deeply integrating effect/Schema, OpenFaith establishes a robust, type-safe, and declarative foundation for its Canonical Data Model and data operations. This Schema-Driven Design is not just about validation; it's a comprehensive approach to defining, transforming, interacting with, and understanding data. It simplifies integration with diverse external systems, enhances data integrity, improves developer experience, and critically, provides the structured underpinning required for an advanced AI-First data platform. The rich feature set of effect/Schema—from basic structs and transformations to advanced annotations and branded types—provides all the necessary building blocks for OpenFaith's ambitious goals.