48 changes: 0 additions & 48 deletions CLAUDE.md

This file was deleted.

125 changes: 125 additions & 0 deletions examples/auto_trace/README.md
@@ -0,0 +1,125 @@
# Gentrace Auto-Tracing Example

This example demonstrates Gentrace's automatic AST-based tracing feature with pipeline ID support, which instruments all functions in specified modules without requiring manual decoration.

## Prerequisites

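Before running the example, set the following environment variables. The script exits with an error if `GENTRACE_API_KEY`, `GENTRACE_BASE_URL`, or `GENTRACE_PIPELINE_ID` is missing; `OPENAI_API_KEY` is optional.

```bash
export GENTRACE_API_KEY="your-gentrace-api-key"
export GENTRACE_BASE_URL="your-gentrace-base-url"  # e.g., http://localhost:3000/api
export GENTRACE_PIPELINE_ID="your-pipeline-uuid"
export OPENAI_API_KEY="your-openai-api-key"
```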

## Overview

The example simulates a data processing pipeline with multiple layers of function calls, creating a deep tree of spans to showcase the automatic tracing capabilities with pipeline association.

## Structure

```
auto_trace/
├── README.md                          # This file
├── auto_trace_pipeline_example.py     # Main example script
└── pipeline_example_app/              # Example application package
    ├── __init__.py
    └── workflow.py                    # Pipeline workflow functions
```

## Features Demonstrated

1. **Automatic Function Tracing**: All functions in the `pipeline_example_app` module are automatically traced without decorators
2. **Pipeline ID Support**: All traces are associated with a specific Gentrace pipeline
3. **Deep Span Trees**: The pipeline creates a multi-level hierarchy of spans showing the call flow
4. **OpenTelemetry Integration**: Proper span context propagation with parent-child relationships
5. **Root Span Attribution**: Only the root span includes the `gentrace.pipeline_id` attribute
6. **Decorator Compatibility**: The `@interaction` decorator can be used alongside auto-tracing, with decorator settings taking precedence

## Running the Example

```bash
cd examples/auto_trace
python auto_trace_pipeline_example.py
```

## Expected Output

The example will:

1. Initialize OpenTelemetry with Gentrace configuration
2. Install auto-tracing for the `pipeline_example_app` module
3. Execute a multi-step data processing pipeline
4. Export spans to the Gentrace backend via OTLP

## Span Tree Structure

The automatic tracing creates a span tree like this:

```
run_data_processing_pipeline [root - has pipeline_id from auto-tracing]
├── extract_data
│   └── process_record (multiple calls)
├── transform_data
│   ├── apply_transformation (multiple calls)
│   └── validate_transformation (multiple calls)
└── load_results
    ├── generate_summary [auto-traced]
    └── Manual Summary Enrichment [@traced decorator - custom attributes]
```

## Key Concepts

### Installing Auto-Tracing with Pipeline ID

```python
gentrace.install_auto_tracing(
    ['pipeline_example_app'],
    min_duration=0,
    pipeline_id=pipeline_id
)
```

This must be called BEFORE importing the modules you want to trace.
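
The ordering requirement follows from how AST-based instrumentation works in general: the rewrite happens when a module's source is compiled, so a module that is already imported has already been compiled without it. The sketch below is a toy illustration using only the standard library, not Gentrace's implementation; `record_call`, `AddTracing`, and the inline `SOURCE` module are hypothetical names introduced for this example.

```python
import ast

calls = []  # names of instrumented functions, in call order


def record_call(func):
    """Hypothetical stand-in for a wrapper that would open a span."""
    def wrapper(*args, **kwargs):
        calls.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper


class AddTracing(ast.NodeTransformer):
    """Attach @record_call to every function definition in a module."""

    def visit_FunctionDef(self, node):
        self.generic_visit(node)  # also instrument nested functions
        node.decorator_list.insert(0, ast.Name(id="record_call", ctx=ast.Load()))
        return node


# Toy module source; no decorators are written by hand.
SOURCE = '''
def extract_data():
    return [1, 2, 3]

def transform_data(records):
    return [r * 2 for r in records]
'''

# Rewrite the AST before compiling, which is the only point where
# the functions can be wrapped without touching the source file.
tree = AddTracing().visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)
namespace = {"record_call": record_call}
exec(compile(tree, "<toy_module>", "exec"), namespace)

result = namespace["transform_data"](namespace["extract_data"]())
print(result, calls)
```

Both functions end up wrapped even though the source contains no decorators, which mirrors the "no manual decoration" behavior described above.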

### How Pipeline ID Works

When a pipeline_id is provided:

- The root span (first span in the trace) includes the `gentrace.pipeline_id` attribute
- Child spans do NOT include the pipeline_id attribute
- This allows the Gentrace backend to associate the entire trace tree with the pipeline
- All spans in the trace are connected via the same `traceId`
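
The rules above can be modeled with a small standard-library sketch. It is a toy model under stated assumptions, not Gentrace or OpenTelemetry code: `ToySpan`, `finished`, and the placeholder `PIPELINE_ID` are hypothetical. It shows that tagging only the root span is enough when every span shares one trace ID.

```python
import contextvars
import uuid

_current = contextvars.ContextVar("current_span", default=None)
finished = []  # spans in the order they end


class ToySpan:
    """Minimal span: inherits the parent's trace ID, tags only the root."""

    def __init__(self, name, pipeline_id=None):
        self.name = name
        self.parent = _current.get()
        # Children reuse the parent's trace ID; a root span starts a new one.
        self.trace_id = self.parent.trace_id if self.parent else uuid.uuid4().hex
        self.attributes = {}
        if self.parent is None and pipeline_id:
            self.attributes["gentrace.pipeline_id"] = pipeline_id

    def __enter__(self):
        self._token = _current.set(self)
        return self

    def __exit__(self, *exc):
        _current.reset(self._token)
        finished.append(self)
        return False


PIPELINE_ID = "00000000-0000-0000-0000-000000000000"  # placeholder value

with ToySpan("run_data_processing_pipeline", PIPELINE_ID):
    with ToySpan("extract_data", PIPELINE_ID):
        with ToySpan("process_record", PIPELINE_ID):
            pass
    with ToySpan("load_results", PIPELINE_ID):
        pass

with_pipeline = [s.name for s in finished if "gentrace.pipeline_id" in s.attributes]
trace_ids = {s.trace_id for s in finished}
print(with_pipeline, len(trace_ids))
```

A backend that receives these spans can look up the pipeline from the root span and apply it to every other span carrying the same trace ID.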

### Mixing Auto-Tracing with Manual Tracing

The example demonstrates that manual tracing decorators can be used within auto-traced code:

- Functions marked with `@no_auto_trace` are excluded from automatic tracing
- The `@traced` decorator can be used within auto-traced code for fine-grained control
- Manual traces can add custom attributes and names
- Both auto-traced and manually traced spans maintain proper parent-child relationships
- This allows selective manual instrumentation while maintaining automatic tracing for the rest of the code

### OpenTelemetry Configuration

The example shows proper OpenTelemetry setup with:

- Resource configuration with service name
- Gentrace sampler for sampling decisions
- Gentrace span processor for baggage propagation
- OTLP exporter configured for the Gentrace backend

### Minimum Duration

The `min_duration` parameter (in seconds) can be used to only trace functions that take longer than the specified duration. This helps reduce noise from very fast functions.

## Troubleshooting

If spans don't appear in the Gentrace UI:

1. Check that all environment variables are set correctly
2. Verify the Gentrace backend is running and accessible
3. Ensure the ClickHouse replication task is running
4. Check the console output for any error messages
5. Verify the pipeline ID exists in your Gentrace instance
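
For the first step, a quick check like the following can report which variables are unset or empty. This is a minimal sketch; `missing_env` is a hypothetical helper, and the list of required names is taken from the checks in `auto_trace_pipeline_example.py`.

```python
import os

# Variables the example script refuses to start without.
REQUIRED = ("GENTRACE_API_KEY", "GENTRACE_BASE_URL", "GENTRACE_PIPELINE_ID")


def missing_env(names=REQUIRED, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


print("Missing variables:", missing_env())
```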
107 changes: 107 additions & 0 deletions examples/auto_trace/auto_trace_pipeline_example.py
@@ -0,0 +1,107 @@
"""Example demonstrating Gentrace's automatic AST-based tracing with pipeline_id support.

This example shows how auto-traced functions can be associated with a specific
Gentrace pipeline and exported to the Gentrace backend.

To run this example, ensure the following environment variables are set:
GENTRACE_API_KEY: Your Gentrace API token for authentication.
GENTRACE_BASE_URL: The base URL for your Gentrace instance (e.g., http://localhost:3000/api).
GENTRACE_PIPELINE_ID: The UUID of the pipeline to associate traces with.
OPENAI_API_KEY: Your OpenAI API key (optional, for OpenAI examples).
"""

import os
import atexit
from typing import Dict

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

import gentrace
from gentrace import GentraceSampler, GentraceSpanProcessor

# Get configuration from environment
api_key = os.getenv("GENTRACE_API_KEY", "")
gentrace_base_url = os.getenv("GENTRACE_BASE_URL", "")
pipeline_id = os.getenv("GENTRACE_PIPELINE_ID", "")

if not api_key:
    raise ValueError("GENTRACE_API_KEY environment variable not set.")
if not gentrace_base_url:
    raise ValueError("GENTRACE_BASE_URL environment variable not set.")
if not pipeline_id:
    raise ValueError("GENTRACE_PIPELINE_ID environment variable not set.")

print(f"Gentrace Base URL: {gentrace_base_url}")
print(f"Pipeline ID: {pipeline_id}")
print("=" * 60)

# Setup OpenTelemetry with Gentrace
resource = Resource(attributes={"service.name": "auto-trace-pipeline-example"})
tracer_provider = TracerProvider(resource=resource, sampler=GentraceSampler())

# Configure OTLP exporter to send traces to Gentrace
otlp_headers: Dict[str, str] = {}
if api_key:
    otlp_headers["Authorization"] = f"Bearer {api_key}"

span_exporter = OTLPSpanExporter(
    endpoint=f"{gentrace_base_url}/otel/v1/traces",
    headers=otlp_headers,
)

# Add Gentrace span processor for enrichment
gentrace_baggage_processor = GentraceSpanProcessor()
tracer_provider.add_span_processor(gentrace_baggage_processor)

# Add the export processor
simple_export_processor = SimpleSpanProcessor(span_exporter)
tracer_provider.add_span_processor(simple_export_processor)

# Set the global tracer provider
trace.set_tracer_provider(tracer_provider)

# Install auto-tracing with pipeline_id for our example modules
# All functions in these modules will be automatically associated with this pipeline
print("Installing auto-tracing with pipeline_id...")
gentrace.install_auto_tracing(
    ['pipeline_example_app'],
    min_duration=0,
    pipeline_id=pipeline_id
)

# Now import the modules - they will be automatically instrumented with pipeline_id
from pipeline_example_app import workflow

if __name__ == "__main__":
    # Register shutdown handler
    atexit.register(tracer_provider.shutdown)

    print("\nStarting Gentrace Auto-Trace Pipeline Example")
    print("=" * 60)

    # Run the workflow - the root span carries the pipeline_id for the whole trace
    try:
        result = workflow.run_data_processing_pipeline()

        print("\n" + "=" * 60)
        print("Pipeline completed successfully!")
        print(f"Result: {result}")
        print(f"\nThe root span includes 'gentrace.pipeline_id': '{pipeline_id}'")
        print("Note: The 'enrich_summary_manually' function uses @traced decorator")
        print("to demonstrate manual tracing within auto-traced code.")
        print("\nCheck the Gentrace dashboard to see the mixed trace!")
    except Exception as e:
        print(f"\nError running pipeline: {e}")
        raise

    print("\nShutting down...")
    # Force flush to ensure all spans are sent
    tracer_provider.force_flush()

    # Add a small delay to ensure spans are processed
    import time
    time.sleep(1)
1 change: 1 addition & 0 deletions examples/auto_trace/pipeline_example_app/__init__.py
@@ -0,0 +1 @@
"""Example application demonstrating pipeline-aware auto-tracing."""