Structured behavioral labeling and eval-ready export packaging for replay-faithful AI conversation logs.
DriftTag is a private/local pipeline for converting multi-turn AI conversation transcripts into structured behavioral records for behavioral analysis, trust/posture trajectory review, golden-dataset creation, regression testing, and downstream evaluation workflows.
It is designed for researchers, red-teamers, evaluators, and AI product teams who need to turn messy conversational logs into reviewable, replay-aware behavioral data.
This repository documents the DriftTag system, methodology, example records, public reports, and service-facing output structure. It does not imply a public open-source release of the core implementation.
This repository is a public overview and example-facing companion for DriftTag. The DriftTag source pipeline is private, but this repo documents the methodology, input expectations, output structure, example records, and public validation reports.
DriftTag is designed for teams or researchers working with replay-faithful multi-turn AI conversation logs who want structured behavioral records for offline evaluation, golden-dataset creation, regression testing, and long-horizon conversation analysis.
To understand the project:
- Read this README to understand what DriftTag does and what it is not.
- Review the input requirements to see what kind of conversation logs are suitable for DriftTag processing.
- Inspect the example records in the examples/ directory.
- Review the output structure to understand the kinds of behavioral fields DriftTag produces.
- Review the MultiWOZ run report for a larger-scale processing example.
- Contact DriftForge Systems if you have authorized multi-turn AI logs you want evaluated or converted into structured behavioral/eval-ready outputs.
The current public examples are meant to show the shape and purpose of DriftTag records. Newer export-layer examples are being prepared separately and should not be treated as published validation until they are added to the repo.
DriftTag does not fabricate missing metadata. When external identifiers such as session IDs, trace IDs, span IDs, model versions, timestamps, run IDs, tool-call IDs, or retrieval IDs are present, the export layer is designed to preserve those identifiers so outputs can be joined back to the originating system.
DriftTag processes replay-faithful conversation transcripts and produces structured records containing fields such as:
- turn role
- turn index
- original text
- normalized text
- strategy label
- phase label
- score
- emotion label
- emotion polarity
- emotion intensity
- trust delta
- trust curve
- short/long trust smoothing
- inflection markers
- tagging/refinement failure status
- behavioral posture signals
- conversation-level summaries
- provenance and record hash
- preserved external metadata when present
The goal is not to replace human review.
The goal is to make large conversation logs easier to inspect, compare, filter, replay, audit, and prepare for downstream analysis or evaluation.
Most conversational AI logs are difficult to analyze at scale.
Raw transcripts often hide the structure that matters most:
- who said what
- in what order
- whether the interaction is progressing, stalling, recovering, or degrading
- whether trust is increasing, decreasing, or remaining flat
- whether the model is becoming more cooperative, evasive, unstable, or over-accommodating
- which turns may be useful for training, evaluation, or red-team review
- whether the record can be replayed faithfully enough to support serious analysis
DriftTag provides a structured preprocessing and export layer for that problem.
It is built around the principle that multi-turn AI behavior should be analyzed from replay-faithful records, not flattened summaries.
DriftTag currently uses a layered processing pipeline.
transcript input
→ Pass 0 spine extraction, audit, and quarantine
→ Pass 1 strict single-turn tagging
→ Pass 2 conservative refinement
→ Pass 3 deterministic trust derivation
→ Pass 4 behavioral posture / trajectory derivation
→ Export Layer v1.0
→ structured JSONL, CSV, summaries, and reports
Pass 0 extracts the replay spine from transcript files.
Pass 0 preserves:
- role
- turn order
- original raw text
- normalized text
- provenance
- record hash
It also produces audit data and quarantines malformed or non-replayable inputs.
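As an illustration of the spine fields listed above, here is a minimal sketch of building one spine record with a content hash. The field names, the whitespace-collapsing normalization, the `provenance` layout, and the choice of SHA-256 over a canonical JSON serialization are all assumptions made for this example; the private DriftTag implementation may differ.

```python
import hashlib
import json


def normalize_text(text: str) -> str:
    # Assumed normalization: collapse whitespace runs and trim both ends.
    return " ".join(text.split())


def make_spine_record(role: str, turn_index: int, original: str, source: str) -> dict:
    # Hypothetical record layout; the raw text is kept untouched next to
    # the normalized form so the turn can be replayed exactly.
    record = {
        "role": role,
        "turn_index": turn_index,
        "original": original,
        "normalized": normalize_text(original),
        "provenance": {"source": source},
    }
    # Hash a canonical serialization so identical content always
    # produces an identical hash.
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    record["record_hash"] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return record
```

Hashing the record before the hash field is added keeps the digest reproducible: re-running the function on the same turn gives the same value, which makes silent changes to a record detectable.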
Pass 1 uses a local model interface to generate structured labels for each turn.
Current fields include:
- phase
- strategy
- score
- emotion label
- emotion polarity
- emotion intensity
- emotion confidence
- trust delta
- metadata
Pass 2 runs a conservative patch-only refinement.
This pass is designed to improve malformed or weak records without rewriting valid outputs unnecessarily.
Pass 3 derives trajectory-level trust fields from the Pass 1 / Pass 2 records.
Current outputs include:
- trust curve
- short smoothing
- long smoothing
- trust inflection marker
- missing-trust accounting
Pass 3 trust derivation is deterministic and does not require a model call.
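To make the idea of a deterministic derivation concrete, here is a minimal sketch under stated assumptions: the trust curve is taken to be a running sum of per-turn trust deltas, and the short and long smoothings are exponential moving averages with hypothetical factors `alpha_short` and `alpha_long`. None of these formulas are confirmed by this document, and inflection detection is left out.

```python
from typing import Optional


def derive_trust(
    deltas: list[Optional[float]],
    alpha_short: float = 0.5,  # hypothetical smoothing factor
    alpha_long: float = 0.1,   # hypothetical smoothing factor
) -> dict:
    curve: list[float] = []
    short: list[float] = []
    long_: list[float] = []
    total = 0.0
    missing = 0
    for delta in deltas:
        if delta is None:
            # Count the gap and carry the curve forward unchanged
            # instead of inventing a value.
            missing += 1
        else:
            total += delta
        # Seed each moving average with the first curve value.
        prev_short = short[-1] if short else total
        prev_long = long_[-1] if long_ else total
        curve.append(total)
        short.append(alpha_short * total + (1 - alpha_short) * prev_short)
        long_.append(alpha_long * total + (1 - alpha_long) * prev_long)
    return {
        "trust_curve": curve,
        "trust_smoothed_short": short,
        "trust_smoothed_long": long_,
        "missing_count": missing,
    }
```

Because the function depends only on its inputs, repeated runs over the same deltas give identical output, which is the property that matters for regression testing.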
Pass 4 adds derived behavioral posture and trajectory signals over the structured record stream.
Current posture/signal concepts include:
- cooperative service flow
- stable task progression
- clarification loop
- task switching
- stalled or unresolved behavior
- trust disruption
- recovery after disruption
- cooperation signal
- softening signal
- hardening signal
- stall signal
Public examples for the newest Pass 4/export-layer output shape are still being prepared. Existing examples should be treated as public record-shape demonstrations, not as a complete showcase of every current internal output field.
DriftTag v1.8.10 adds an export layer on top of the frozen v1.8.9 trust-loss corrective baseline.
The export layer is focused on:
- golden-dataset creation
- offline eval support
- regression testing support
- dashboard/spreadsheet-friendly outputs
- trace-friendly behavioral records
- metadata passthrough
- preserving external identifiers when they exist
This release is trace-friendly, not trace-dependent.
It is not a full OpenTelemetry integration, a production monitoring system, or a LiteLLM/backend abstraction layer.
You said:
I am looking for a train departing from London Kings Cross.
ChatGPT said:
When would you like to depart London Kings Cross?
You said:
Yes, departing from there heading to Cambridge on Saturday.
{
"schema_version": "v2",
"phase": "analysis",
"strategy": "location_identification",
"score": 0.9,
"emotion_label": "location",
"emotion_polarity": 0.0,
"emotion_intensity": 0.19,
"emotion_confidence": 1.0,
"trust_delta": 0.1,
"trust_delta_missing": 0,
"role": "user",
"turn_index": 0,
"original": "I am looking for a train departing from London Kings Cross.",
"trust_curve": 0.1,
"trust_smoothed_short": 0.1,
"trust_smoothed_long": 0.1,
"trust_inflection": "none"
}
This example is intentionally small and simplified. It is meant to show the general record shape, not the full current export-layer output set.
DriftTag produces structured artifacts for review and downstream analysis.
Typical output categories include:
- spine records
- audit records
- strict tagging records
- refined records
- trust-curve records
- behavioral posture records
- quarantine records
- progress markers
- export summaries
- dashboard/spreadsheet-friendly outputs
- trace-friendly JSONL records
Core records are emitted as JSONL records suitable for review and downstream processing.
DriftTag v1.8.10 adds export-oriented outputs such as:
- golden_dataset.jsonl
- dataset_summary.json
- turn_signals.csv
- conversation_metrics.csv
- trust_curve.csv
- aggregate_counts.json
- trace_aligned.jsonl
These outputs are designed to make DriftTag results easier to review, filter, join, and use in offline evaluation workflows.
When available, DriftTag can preserve external identifiers such as:
- trace_id
- span_id
- parent_span_id
- session_id
- conversation_id
- timestamp
- tool_call_id
- retrieval_id
- model_version
- run_id
Unknown non-content metadata can be preserved under external_metadata.
Missing metadata is not fabricated.
The goal is to let teams join DriftTag outputs back to their own systems without making DriftTag dependent on any specific tracing, observability, or backend framework.
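The passthrough behavior described above could look roughly like the following sketch. The identifier names come from the list in this section; the `CONTENT_KEYS` set and the function name are hypothetical, and the real export layer may organize this differently.

```python
KNOWN_IDS = (
    "trace_id", "span_id", "parent_span_id", "session_id", "conversation_id",
    "timestamp", "tool_call_id", "retrieval_id", "model_version", "run_id",
)
# Hypothetical names for the content fields of an incoming row.
CONTENT_KEYS = {"role", "text"}


def passthrough_metadata(source_row: dict) -> dict:
    preserved: dict = {}
    extra: dict = {}
    for key, value in source_row.items():
        if key in CONTENT_KEYS:
            continue  # content is handled elsewhere, not as metadata
        if key in KNOWN_IDS:
            preserved[key] = value
        else:
            extra[key] = value
    if extra:
        # Unknown non-content metadata is kept, not dropped.
        preserved["external_metadata"] = extra
    # Identifiers absent from the source stay absent: nothing is fabricated.
    return preserved
```

A row that carries only `trace_id` and an unrecognized `region` field would come back with `trace_id` at the top level and `region` under `external_metadata`, with no placeholder values for the identifiers it never had.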
DriftTag has been tested on MultiWOZ dialogues after converting the raw dialogue JSON files into DriftTag’s transcript format.
Full-run result:
- Files processed: 8,437
- Records processed: 113,408
- Quarantines: 0
- Tagger failures: 14
- Missing trust deltas: 14
- Trust coverage: 99.988%
- Pass 4 failures: 0
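The trust coverage figure can be reproduced from the other numbers in this list, assuming coverage means the share of processed records that have a trust delta. The function name is illustrative.

```python
def trust_coverage(records: int, missing: int) -> float:
    # Percentage of records that carry a trust delta.
    return 100 * (records - missing) / records


print(f"{trust_coverage(113_408, 14):.3f}%")  # prints 99.988%
```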
This run suggests DriftTag is useful for structured service-dialogue analysis and conversational realism datasets.
The MultiWOZ run is not treated as boundary-erosion or adversarial-drift data. It is primarily useful for:
- service-flow structure
- clarification patterns
- routine task completion
- conversational realism
- cooperative trust movement
- stable task progression
- large-scale replay-faithful processing validation
DriftTag is intended for:
- conversational AI evaluation
- golden-dataset creation
- offline eval dataset preparation
- behavioral regression testing
- support-bot transcript analysis
- red-team log preprocessing
- dataset preparation
- trust/posture trajectory review
- multi-turn behavior review
- training-pool candidate extraction
- downstream DriftForge / DriftBrain workflows
- trace-friendly behavioral enrichment for teams with existing logging systems
Possible users include:
- AI safety researchers
- LLM red-teamers
- dialogue-system researchers
- AI evaluation teams
- support-bot product teams
- agent observability / eval teams
- small AI companies preparing conversational logs for review
DriftTag is not:
- an automated policy-violation judge
- a production monitoring platform
- a jailbreak tool
- a replacement for human review
- a guarantee of semantic correctness
- a compliance or certification system
- a redaction/anonymization authority
- a public self-service product at this stage
- a replacement for OpenTelemetry, MLflow, Grafana, or observability systems
- a LiteLLM/backend abstraction layer
The outputs should be treated as structured signals for review, filtering, dataset preparation, and analysis.
DriftTag is an offline behavioral labeling and export pipeline for authorized multi-turn AI logs.
DriftTag expects replay-faithful transcript structure.
At minimum, useful inputs should preserve:
- turn order
- speaker / role labels
- message boundaries
- original text
- conversation/session grouping when available
Additional metadata is useful but not required, including:
- timestamps
- model version
- run ID
- trace ID
- span ID
- tool-call ID
- retrieval ID
- source system identifiers
Current transcript markers are exact:
You said:
ChatGPT said:
Example:
You said:
Can you help me with my account?
ChatGPT said:
Sure — what issue are you having?
Inputs without recognizable turn markers or usable role/turn structure may be quarantined.
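A minimal sketch of parsing the exact markers shown above. The mapping of `ChatGPT said:` to an `assistant` role, the record keys, and the use of `ValueError` as a stand-in for quarantine are assumptions for illustration, not the actual parser.

```python
# Exact marker lines and the role each is assumed to introduce.
MARKERS = {"You said:": "user", "ChatGPT said:": "assistant"}


def parse_transcript(text: str) -> list[dict]:
    turns: list[dict] = []
    current = None
    for line in text.splitlines():
        role = MARKERS.get(line.strip())
        if role is not None:
            # A marker line opens a new turn.
            current = {"role": role, "turn_index": len(turns), "lines": []}
            turns.append(current)
        elif current is not None:
            current["lines"].append(line)
        elif line.strip():
            # Text before any marker has no known speaker.
            raise ValueError("content found before the first turn marker")
    if not turns:
        raise ValueError("no recognizable turn markers")
    return [
        {
            "role": t["role"],
            "turn_index": t["turn_index"],
            "original": "\n".join(t["lines"]).strip(),
        }
        for t in turns
    ]
```

Rejecting unattributed text instead of silently dropping it keeps the parsed turns faithful to the source, which is the point of the replay-fidelity requirement.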
DriftTag currently uses a local model backend for tagging and refinement.
The current internal development setup uses Ollama with Mistral.
Deterministic derivation layers do not require model calls.
Backend details may change as the system evolves.
DriftTag is currently maintained as a private/internal research and service tool.
The codebase is not currently published as an open-source package. Public materials may include methodology notes, sample outputs, run reports, schema examples, and selected documentation, but the core implementation remains private unless explicitly released later.
Current access model:
- private implementation
- supervised internal runs
- controlled service pilots
- selected sample outputs for demonstration
- no public self-service release yet
DriftTag can be used as a supervised processing service for teams or researchers with replay-faithful conversational logs.
A typical service workflow:
- The team provides authorized conversational logs.
- Logs are checked for replay-fidelity requirements.
- DriftTag processes the transcripts through the available pipeline.
- Outputs are reviewed for validity and obvious failure modes.
- The team receives structured records, summary statistics, and optional analysis notes.
Possible deliverables:
- structured JSONL records
- golden-dataset exports
- dashboard/spreadsheet-friendly CSV outputs
- pass-level audit summaries
- quarantine reports
- trust-curve outputs
- behavioral posture outputs
- inflection summaries
- dataset-quality notes
- trace-friendly records preserving external identifiers when present
- optional written findings memo
The practical service framing is simple:
Teams can send authorized multi-turn AI logs, and DriftTag can return enriched behavioral/eval-ready outputs that preserve existing identifiers so results can be joined back to the team’s own systems.
DriftTag is an active private research and tooling project.
Current strengths:
- working multi-pass architecture
- deterministic Pass 0 and Pass 3 layers
- structured JSONL outputs
- behavioral posture derivation
- export-layer outputs for eval/dataset workflows
- quarantine and audit behavior
- progress markers and resumable runs
- metadata passthrough for external identifiers
- full MultiWOZ-scale run completed
- suitable framing for supervised service pilots
Current limitations:
- public examples do not yet show every current Pass 4/export-layer output shape
- semantic labels may require downstream canonicalization
- strategy labels can contain variants such as information_retrieval and information retrieval
- local model output quality depends on the configured model backend
- current transcript parser expects exact markers
- outputs should be reviewed before being used for training, reporting, or decision-making
- this is not yet a public self-service product
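For the strategy-label variants noted in the limitations, downstream canonicalization can be as simple as the sketch below. The rule used here (lowercase, trim, and join words with underscores) is an assumption about what a consumer might want, not something DriftTag applies itself.

```python
import re


def canonicalize_label(label: str) -> str:
    # Lowercase, trim, and replace runs of spaces, hyphens, or
    # underscores with a single underscore.
    return re.sub(r"[\s\-_]+", "_", label.strip().lower())
```

With this rule, `information retrieval` and `information_retrieval` both map to `information_retrieval`, so counts and filters treat them as one label.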
The current system is suitable for experimentation, dataset preparation, internal evaluation workflows, golden-dataset creation, and supervised service pilots.
It should not be treated as a fully automated production safety system.
Planned improvements include:
Public sample outputs for the v1.8.10 export layer, including:
- golden_dataset.jsonl
- dataset_summary.json
- turn_signals.csv
- conversation_metrics.csv
- trust_curve.csv
- aggregate_counts.json
- trace_aligned.jsonl
These examples should be clearly labeled as sample/synthetic or validated run outputs depending on their source.
Planned or experimental adapters:
- MultiWOZ dialogue JSON normalizer
- exported ChatGPT transcript normalizer
- support-bot transcript normalizer
- generic role/turn JSONL normalizer
Future export tooling may produce filtered views for:
- conversational realism pools
- structured service-flow pools
- boundary-erosion research pools
- recovery / repair examples
- candidate training data
- eval/regression subsets
Future work may include better support for trace-adjacent workflows.
This does not mean DriftTag must become a production observability platform. The near-term goal is to preserve identifiers and produce outputs that can be joined back to external systems.
DriftTag is part of the broader DriftForge Systems research stack.
DriftTag prepares structured behavioral records and eval-ready exports from replay-faithful multi-turn AI logs.
DriftForge is a governed conversational red-team and evaluation engine that can use structured records, training pools, and behavioral summaries produced by DriftTag.
Overseer is the planned live or near-live monitoring layer. DriftTag remains focused on offline processing, dataset preparation, and export-ready behavioral analysis.
DriftTag can also be used independently as a transcript preprocessing, behavioral labeling, and eval-export system.
DriftTag is built for defensive, research, and evaluation workflows.
The project is based on a simple principle: conversational AI behavior should be analyzed with preserved context, clear provenance, and human review.
DriftTag is intended to help teams understand interaction patterns, improve evaluation quality, and prepare better datasets — not to automate enforcement decisions or make unsupported claims about users, models, or organizations.
DriftTag outputs should be treated as structured signals that support review, not as final judgments.
Recommended use:
- process only logs you are authorized to analyze
- preserve replay-faithful records wherever possible
- review outputs before using them for training, reporting, or decision-making
- handle private, sensitive, or regulated data with appropriate permission and safeguards
- avoid using labels as automated enforcement or compliance decisions
This repository is used to maintain DriftTag documentation, public examples, selected reports, methodology notes, and roadmap information.
Public release scope will be decided after cleanup, validation, packaging, and service-boundary review.
Future public materials may include selected methodology notes, schema examples, sample outputs, pilot reports, normalizers, packaged tooling, or selected code depending on what is appropriate to release.
Until then, DriftTag should be treated as an active private research and tooling project rather than a finished public release.