
[Feature]: Embedded Sample dbt+SST Project for Integration Testing #95

@mluizzi-whoop

Description


Problem Statement

SST requires a dbt project to function, but our current integration tests rely heavily on:

  1. Programmatic temp fixtures - Creating minimal dbt project structures on-the-fly using tmp_path
  2. Heavy mocking - Mocking Snowflake connections and service responses
  3. Fragmented YAML fixtures - Isolated validation scenarios without full project context

This approach has several limitations:

  • Incomplete coverage: Cannot test full end-to-end workflows (init → enrich → validate → extract → generate)
  • Maintenance burden: Each test recreates project structures differently, leading to inconsistent test scenarios
  • Divergence from reality: Programmatic fixtures may not accurately reflect real-world dbt+SST project structures
  • Difficult debugging: When tests fail, it's hard to inspect the "project" state because it's ephemeral
  • Missing edge cases: Hard to systematically test complex interactions (multiple models, relationships, metrics referencing each other)

Currently, the only "real" reference project is sst-jaffle-shop, which exists as an external tutorial repository and cannot be used directly in CI/CD without network dependencies.

Proposed Solution

Create an embedded minimal sample dbt+SST project within the test fixtures directory that serves as a realistic, self-contained test harness:

tests/
├── fixtures/
│   ├── sample_project/                    # NEW: Complete dbt+SST project
│   │   ├── dbt_project.yml
│   │   ├── profiles.yml.example           # Template (not real credentials)
│   │   ├── models/
│   │   │   ├── staging/
│   │   │   │   ├── stg_customers.sql
│   │   │   │   ├── stg_customers.yml
│   │   │   │   ├── stg_orders.sql
│   │   │   │   └── stg_orders.yml
│   │   │   └── marts/
│   │   │       ├── customers.sql
│   │   │       ├── customers.yml
│   │   │       ├── orders.sql
│   │   │       └── orders.yml
│   │   ├── target/
│   │   │   └── manifest.json              # Pre-generated for testing
│   │   ├── snowflake_semantic_models/
│   │   │   ├── semantic_views.yml
│   │   │   ├── metrics/
│   │   │   │   └── core_metrics.yml
│   │   │   ├── relationships/
│   │   │   │   └── core_relationships.yml
│   │   │   ├── filters/
│   │   │   ├── verified_queries/
│   │   │   └── custom_instructions/
│   │   └── sst_config.yaml
│   ├── project_factory.py                 # NEW: Factory for creating test variants
│   └── ... (existing fixtures)

Implementation Components

1. Base Sample Project

A minimal but complete dbt+SST project with:

  • 2-3 interconnected dbt models (customers, orders)
  • Pre-enriched YAML with SST metadata
  • Valid relationships, metrics, and a semantic view
  • Pre-compiled manifest.json (no dbt execution required)
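
The project's dbt_project.yml can stay tiny. A minimal sketch (project and profile names here are illustrative, not prescribed by this issue):

```yaml
# tests/fixtures/sample_project/dbt_project.yml (illustrative names)
name: sst_sample_project
version: "1.0.0"
profile: sst_sample_project   # matches profiles.yml.example

model-paths: ["models"]

models:
  sst_sample_project:
    staging:
      +materialized: view
    marts:
      +materialized: table
```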

2. Project Factory

A utility class to create project variants for specific test scenarios:

# tests/fixtures/project_factory.py
from pathlib import Path
import shutil

class SampleProjectFactory:
    """Factory for creating test project variants."""
    
    BASE_PROJECT = Path(__file__).parent / "sample_project"
    
    @classmethod
    def create(cls, tmp_path: Path, name: str = "test_project") -> Path:
        """Create a clean copy of the sample project."""
        project_dir = tmp_path / name
        shutil.copytree(cls.BASE_PROJECT, project_dir)
        return project_dir
    
    @classmethod
    def create_without_sst(cls, tmp_path: Path) -> Path:
        """Create project without SST initialization (for init wizard tests)."""
        project_dir = cls.create(tmp_path)
        shutil.rmtree(project_dir / "snowflake_semantic_models")
        (project_dir / "sst_config.yaml").unlink()
        return project_dir
    
    @classmethod
    def create_with_invalid_metrics(cls, tmp_path: Path) -> Path:
        """Create project with intentionally invalid metrics."""
        project_dir = cls.create(tmp_path)
        invalid_metrics = project_dir / "snowflake_semantic_models/metrics/invalid.yml"
        invalid_metrics.write_text("""
snowflake_metrics:
  - name: broken_metric
    # Missing required fields
    expr: SELECT * FROM  # Invalid SQL
""")
        return project_dir
    
    @classmethod
    def create_with_circular_dependencies(cls, tmp_path: Path) -> Path:
        """Create project with circular metric dependencies (metric_a <-> metric_b)."""
        project_dir = cls.create(tmp_path)
        circular = project_dir / "snowflake_semantic_models/metrics/circular.yml"
        circular.write_text("""
snowflake_metrics:
  - name: metric_a
    expr: metric_b + 1
  - name: metric_b
    expr: metric_a + 1
""")
        return project_dir

3. Integration Test Base Class

# tests/integration/base.py
import pytest
from pathlib import Path
from tests.fixtures.project_factory import SampleProjectFactory

class IntegrationTestBase:
    """Base class for integration tests with sample project support."""
    
    @pytest.fixture
    def sample_project(self, tmp_path) -> Path:
        """Provide a clean sample project for each test."""
        return SampleProjectFactory.create(tmp_path)
    
    @pytest.fixture
    def uninitialized_project(self, tmp_path) -> Path:
        """Provide a dbt project without SST setup."""
        return SampleProjectFactory.create_without_sst(tmp_path)

Alternatives Considered

| Alternative | Pros | Cons |
| --- | --- | --- |
| Git submodule to sst-jaffle-shop | Real-world project, shared with tutorials | Network dependency in CI, version drift, tutorial-focused not test-focused |
| Continue with programmatic fixtures | Flexible, already in place | Inconsistent, hard to maintain, incomplete coverage |
| External test repo with CI triggers | Isolation, real Snowflake testing | Complex CI setup, slow feedback loop, maintenance overhead |
| Docker-based test environment | Complete isolation, reproducible | Heavy setup, overkill for parsing/validation tests |

The embedded sample project approach provides the best balance of realism, maintainability, and CI/CD compatibility.

Priority

High - Would significantly improve workflow

Specifically, it would improve:

  • Developer confidence when making changes
  • Test coverage for complex workflows
  • Debugging experience when tests fail
  • Onboarding for new contributors

Impact

This feature would benefit:

  • SST maintainers - More comprehensive test coverage, easier debugging
  • Contributors - Clear examples of expected project structure, easier to add tests
  • CI/CD reliability - Self-contained tests that don't depend on external resources
  • Documentation - The sample project itself serves as reference documentation

Technical Considerations

Files to Create/Modify

  1. New directory: tests/fixtures/sample_project/ with full project structure
  2. New file: tests/fixtures/project_factory.py for project variants
  3. New file: tests/integration/base.py for shared test infrastructure
  4. Update: tests/README.md to document the sample project approach
  5. Update: Existing integration tests to use the sample project

Manifest Generation

The manifest.json should be pre-generated and committed to avoid requiring dbt in the test environment. It can be regenerated periodically with:

cd tests/fixtures/sample_project
dbt compile --target test

Considerations

  • Keep the sample project minimal (2-3 models) to reduce maintenance burden
  • Include commented examples of advanced features (complex metrics, multi-table relationships)
  • Add deliberate edge cases (special characters in descriptions, long expressions)
  • Maintain parity with sst-jaffle-shop structure for consistency
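
As an illustration of the "deliberate edge cases" point, one column description in the sample project could exercise quoting and unicode (the content below is illustrative, not part of the actual fixtures):

```yaml
# models/marts/customers.yml (illustrative edge-case content)
models:
  - name: customers
    columns:
      - name: customer_name
        description: >
          Customer's "display" name; may contain unicode (Ümlauts, 你好)
          and YAML-special characters: [brackets], {braces}, #hash, 'quotes'
```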

Example Usage

Running Full Workflow Tests

# tests/integration/test_full_workflow.py
from click.testing import CliRunner
from snowflake_semantic_tools.interfaces.cli.main import cli
from tests.integration.base import IntegrationTestBase

class TestFullWorkflow(IntegrationTestBase):

    def test_validate_succeeds_on_valid_project(self, sample_project, monkeypatch):
        """Test that validate passes on a properly configured project."""
        monkeypatch.chdir(sample_project)  # run the CLI from inside the project
        runner = CliRunner()
        result = runner.invoke(cli, ['validate'], catch_exceptions=False)
        assert result.exit_code == 0
        assert "Validation passed" in result.output

    def test_init_wizard_on_fresh_dbt_project(self, uninitialized_project, monkeypatch):
        """Test init wizard creates correct structure."""
        monkeypatch.chdir(uninitialized_project)
        runner = CliRunner()
        result = runner.invoke(cli, ['init', '--skip-prompts'], catch_exceptions=False)
        assert result.exit_code == 0
        assert (uninitialized_project / "sst_config.yaml").exists()
        assert (uninitialized_project / "snowflake_semantic_models").is_dir()

Testing Specific Edge Cases

def test_validation_catches_invalid_metrics(tmp_path, monkeypatch):
    """Test that validation fails on invalid metrics."""
    project = SampleProjectFactory.create_with_invalid_metrics(tmp_path)
    monkeypatch.chdir(project)  # setting PWD in env does not change the cwd
    runner = CliRunner()
    result = runner.invoke(cli, ['validate'])

    assert result.exit_code != 0
    assert "broken_metric" in result.output
    assert "Missing required field" in result.output

Additional Context

Related Work

  • The sst-jaffle-shop tutorial provides a real-world example that this sample project should mirror (in structure, not size)
  • Current integration tests in tests/integration/ already create temp project structures - this proposal standardizes that approach

Success Criteria

  1. All existing integration tests can be refactored to use the sample project
  2. At least one new end-to-end test covering init → enrich → validate → extract → generate workflow
  3. CI passes without any external dependencies (no network calls for test fixtures)
  4. Test execution time does not significantly increase (< 10% slower)
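
The end-to-end criterion (item 2) could be driven by a small helper that runs the steps in order and stops at the first failure. A sketch; `run_workflow` is a hypothetical name, and the exact CLI flags are assumptions:

```python
# Hypothetical helper for the init → enrich → validate → extract → generate test.
# `invoke` is any callable taking an argv list and returning an object with an
# `exit_code` attribute, e.g. functools.partial(CliRunner().invoke, cli).
from typing import Callable, List, Sequence

WORKFLOW: Sequence[List[str]] = (
    ["init", "--skip-prompts"],  # flag taken from the init example above
    ["enrich"],
    ["validate"],
    ["extract"],
    ["generate"],
)

def run_workflow(invoke: Callable, steps: Sequence[List[str]] = WORKFLOW) -> List[int]:
    """Invoke each step in order; return exit codes, stopping at the first failure."""
    codes: List[int] = []
    for args in steps:
        result = invoke(args)
        codes.append(result.exit_code)
        if result.exit_code != 0:
            break
    return codes
```

The test then reduces to `assert run_workflow(...) == [0, 0, 0, 0, 0]`, and a failure report shows exactly which step broke.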

Suggested Milestones

  1. Phase 1: Create base sample project structure with minimal models
  2. Phase 2: Implement SampleProjectFactory with common variants
  3. Phase 3: Refactor existing integration tests to use new fixtures
  4. Phase 4: Add new end-to-end workflow tests

Pre-submission Checklist

  • I have searched existing issues to avoid duplicates
  • I have described a clear problem and solution
  • I have considered alternatives and workarounds

Metadata

Labels: enhancement (New feature or request)