feat: persona inference-time techniques, pricing pipeline overhaul, design system refresh, and simulations #36
Open
jeremykamber wants to merge 58 commits into
added 30 commits
May 20, 2026 00:03
…2024b)
- Adds PersonaPromptCompiler with four-section architecture:
  - <<PERSONA IDENTITY>> (demographic anchoring)
  - <<PSYCHOGRAPHIC PROFILE>> (Big Five with behavior mappings)
  - <<EPISTEMIC BOUNDARIES>> (knowledge domain constraints)
  - <<BEHAVIORAL GUARDRAILS>> (response format + refusal patterns)
- Adds persona anchor generation (SyTTA-style 4-16 token descriptors)
- Adds compileInteractionPrompt for hybrid backstory + anchor + RAG assembly
- Adds persona anchor injection before each chat turn for multi-turn consistency
- Extends Persona entity with domainExpertise, epistemicBoundaries, responseConstraints, and refusalPatterns fields
- Updates ChatAdapter to use the compiled prompt architecture
- 7 tests passing for prompt compiler, anchors, sections, metadata
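The four-section layout described in this commit can be pictured with a minimal sketch. The `PersonaSketch` shape, the `section` helper, and the `compilePersonaPrompt` name are hypothetical and only illustrate the idea of delimited compartments; the actual PersonaPromptCompiler in the PR is not shown on this page and may differ, and the 0-100 trait scale is an assumption.

```typescript
// Hypothetical, trimmed-down persona shape; the real Persona entity has more fields.
interface PersonaSketch {
  name: string;
  age: number;
  occupation: string;
  bigFive: Record<string, number>; // trait name -> score (0-100 scale is an assumption)
  domainExpertise: string[];
  epistemicBoundaries: string[];
  responseConstraints: string[];
  refusalPatterns: string[];
}

// Render one compartment as a <<TITLE>> header followed by its lines.
function section(title: string, lines: string[]): string {
  return [`<<${title}>>`, ...lines].join("\n");
}

// Assemble the four compartments in a fixed order, separated by blank lines.
function compilePersonaPrompt(p: PersonaSketch): string {
  const traits = Object.keys(p.bigFive).map((trait) => `${trait}: ${p.bigFive[trait]}`);
  return [
    section("PERSONA IDENTITY", [`${p.name}, ${p.age}, ${p.occupation}`]),
    section("PSYCHOGRAPHIC PROFILE", traits),
    section("EPISTEMIC BOUNDARIES", [
      `Knows: ${p.domainExpertise.join(", ")}`,
      ...p.epistemicBoundaries.map((b) => `Does not know: ${b}`),
    ]),
    section("BEHAVIORAL GUARDRAILS", [...p.responseConstraints, ...p.refusalPatterns]),
  ].join("\n\n");
}
```

Keeping each compartment behind its own marker makes it possible to append further sections (for example a task block) without touching the persona definition.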
…ization (Joshi et al. 2025) Implements the PB&J (Psychology of Behavior and Judgments) framework: - Three parallel psychological scaffolds: - Big Five Personality Roots (trait-level cause-and-effect explanations) - Cognitive-Reflex Decision Style (System 1 vs System 2 manifestations) - Core Values & Risk Worldview (money, risk, efficiency, trust) - Generates post-hoc rationales explaining WHY a persona holds traits - Formats rationales as backstory appendix with <<PSYCHOLOGICAL RATIONALES>> section - Graceful degradation: partial results survive individual scaffold failures - Reference: Joshi et al. (2025) Findings of EMNLP 2025
…l. 2025) Two-tier identity retrieval-augmented generation: - IdRagStore: in-memory vector store with trigram-based semantic similarity - Backstory chunking with metadata (topic, emotional tone, relationship type) - Automatic topic tagging and tone detection per chunk - Related chunk linking (adjacency + same-topic cross-linking) - Top-K retrieval with relevance scoring - Formatted context string for prompt injection - 8 tests passing for chunking, retrieval, ranking, formatting, edge cases
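As a rough illustration of trigram-based similarity with top-K retrieval, the sketch below counts character trigrams and ranks chunks by cosine similarity. The function names (`trigramCounts`, `cosine`, `topK`) and the normalization choices are assumptions made for this example; the PR's IdRagStore also handles metadata, tagging, and chunk linking, none of which is modeled here.

```typescript
type TrigramVector = Record<string, number>;

// Count character trigrams in a lowercased, whitespace-normalized string.
function trigramCounts(text: string): TrigramVector {
  const s = text.toLowerCase().replace(/\s+/g, " ").trim();
  const counts: TrigramVector = {};
  for (let i = 0; i + 3 <= s.length; i++) {
    const g = s.slice(i, i + 3);
    counts[g] = (counts[g] ?? 0) + 1;
  }
  return counts;
}

// Cosine similarity between two sparse trigram count vectors.
function cosine(a: TrigramVector, b: TrigramVector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const g of Object.keys(a)) {
    const x = a[g] ?? 0;
    dot += x * (b[g] ?? 0);
    normA += x * x;
  }
  for (const g of Object.keys(b)) {
    const y = b[g] ?? 0;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

interface ScoredChunk {
  text: string;
  score: number;
}

// Score every chunk against the query and keep the k best.
function topK(query: string, chunks: string[], k: number): ScoredChunk[] {
  const q = trigramCounts(query);
  return chunks
    .map((text) => ({ text, score: cosine(q, trigramCounts(text)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Trigram overlap is a lexical proxy rather than true semantic similarity, so a query only matches chunks that share surface wording with it.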
…nterrogation
InCharacter-style evaluator (Wang et al. 2024a):
- Open-ended conversational interview across all Big Five dimensions
- Expert LLM evaluates transcript for trait evidence (blinded)
- Score parsing from expert analysis text
- 5 tests for interview protocol, expert analysis, score parsing
PICon-style evaluator (Kim et al. 2026):
- 8-turn logically-chained multi-turn interrogation
- Three consistency dimensions: internal, external, retest
- Expert LLM judges for each dimension with JSON scoring
- Structured result with total score and detailed breakdown
- 4 tests for interrogation, retest, internal consistency, full evaluation
Integration test covering the complete pipeline: - T1+T3: Compartmentalized prompts with persona anchors - T4: ID-RAG chunking, indexing, retrieval, and formatting - T1+T3+T4: Full hybrid prompt assembly (compartmentalized + anchor + RAG) - Realistic persona with detailed 5-part backstory - Verifies all components compose correctly into a unified interaction prompt - 6 integration tests passing
10 E2E tests covering the complete pipeline without LLM dependencies: - Compartmentalized 4-section prompt generation - Persona anchor injection per turn - ID-RAG chunking, retrieval, and formatting - Full hybrid prompt assembly (compartmentalized + RAG + anchor) - Multi-turn conversation simulation - Edge cases: empty backstory, unknown persona, different profiles All 10 tests passing.
Documents all 6 implemented techniques with research citations: - T1: Compartmentalized prompts (Wang et al. 2024b) - T2: PB&J psychological scaffolds (Joshi et al. 2025) - T3: Persona anchors / SyTTA (Xu et al. 2026, Atri et al. 2026) - T4: ID-RAG factual grounding (Tan et al. 2025) - T5: InCharacter interview evaluation (Wang et al. 2024a) - T6: PICon consistency interrogation (Kim et al. 2026) Covers architecture, files changed, test verification, and run instructions.
Removes non-research fields (cognitiveReflex, technicalFluency, economicSensitivity, designStyle, livingEnvironment). Adds psychographic specification from Wang et al. (2024b): values, fears, communicationStyle, decisionStyle. Keeps Big Five (OCEAN) as primary psychometric framework per Joshi et al. (2025). Updates prompts, mapper, UI cards, detail modal, and tests. 44/44 tests passing.
… spec
Root cause: the analysis prompt had four problems causing uniform scoring:
1. Referenced cognitiveReflex, which was removed from the Persona entity
2. Only guided on 2 of 5 Big Five traits (Conscientiousness, Neuroticism)
3. Ignored values, fears, communicationStyle, decisionStyle entirely
4. No prompt/result logging to inspect LLM behavior
Fix:
- Adds behavioral guidance for ALL Big Five: Conscientiousness, Neuroticism, Openness, Extraversion, Agreeableness
- Adds VALUES + FEARS DRIVE YOUR MOTIVATION section that injects the persona's actual values and fears directly into the prompt
- Adds decisionStyle + communicationStyle guidance
- Tells LLM explicitly: 'Different personas MUST give DIFFERENT scores based on their unique Big Five, values, and fears'
- Logs full analysis prompt + results for each persona
Previously analysis used flat stringifyPersona() while chat used the 4-compartment PersonaPromptCompiler. Now both use the same Wang et al. (2024b) architecture with <<ANALYSIS TASK>> appended.
…eline Fixes PbjScaffoldEnhancer to use current Persona fields (removed stale cognitiveReflex, economicSensitivity, technicalFluency refs). Scaffolds now use Big Five + values + fears + decisionStyle + communicationStyle. Integrated into GeneratePersonasUseCase as new ENHANCING_WITH_PBJ phase between backstory generation and insight generation. Per Joshi et al. (2025), this adds post-hoc rationales that causally connect Big Five to psychographics, improving alignment by 6-9%.
Adds ENHANCING_WITH_PBJ as step 2.
ChatAdapter: ingests persona backstory into IdRagStore on first interaction, retrieves top-3 relevant chunks per user message, injects as <<RETRIEVED MEMORY>> in system prompt. VisionAnalysisAdapter: ingests persona backstory, retrieves chunks relevant to the pricing page context, injects as <<RETRIEVED MEMORY>> in both streaming and audit analysis prompts. Also adds persona anchor before the task instruction. Per Tan et al. (2025) ID-RAG framework.
Adds periodic re-grounding (Atri et al., 2026b) to ChatAdapter: every 4th turn, a <<REGROUND>> instruction forces the model to re-access its persona definition (values, fears, goals) before responding. This counters the 30-40% persona drift documented in ChronoScope between turns 1-20.
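A minimal sketch of the every-4th-turn re-grounding rule follows. The `ChatMessage` shape, the 1-based turn counter, the wording of the re-ground instruction, and the `buildTurnMessages` helper are assumptions for illustration; only the <<REGROUND>> marker, the 4-turn cadence, and the [Frame: …] anchor format come from the commit messages on this page.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Cadence taken from the commit description: re-ground on every 4th turn.
const REGROUND_EVERY = 4;

// Turn numbers are assumed to be 1-based.
function shouldReground(turnNumber: number): boolean {
  return turnNumber > 0 && turnNumber % REGROUND_EVERY === 0;
}

// Build the messages appended for one user turn: an optional re-ground
// instruction, the persona anchor as a system message, then the user text.
function buildTurnMessages(turnNumber: number, anchor: string, userText: string): ChatMessage[] {
  const messages: ChatMessage[] = [];
  if (shouldReground(turnNumber)) {
    messages.push({
      role: "system",
      content: "<<REGROUND>> Re-read your persona definition (values, fears, goals) before responding.",
    });
  }
  messages.push({ role: "system", content: `[Frame: ${anchor}]` });
  messages.push({ role: "user", content: userText });
  return messages;
}
```

For example, `buildTurnMessages(4, "excited trend spotter", "Is this worth it?")` yields three messages, with the re-ground instruction first.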
The streaming generateInitialPersonasStream consistently fails at runtime. Replaced with direct non-streaming generateInitialPersonas call. Can re-add streaming later when the AI SDK streamObject issue is resolved.
P0 fixes addressing Linear PM feedback: 1. Domain calibration: generation prompt now tells LLM that -16/user is standard B2B SaaS pricing, Enterprise custom pricing is normal, beta features on paid tiers are common. Prevents personas from penalizing standard industry mechanics. 2. Scoring-sentiment alignment: analysis prompt now enforces that scores must match qualitative sentiment (positive gut = 6+, critical = 4 or below). 3. Actionable recommendations: added recommendations field to PricingAnalysis entity, schema, and validation. Analysis prompt asks for 2-3 specific actionable recommendations per persona.
…ricing context Removes the hardcoded B2B SaaS pricing norms that made the system rigid and non-generalizable. Instead, the LLM now generates pricingSensitivity (0-100), typicalBudget, and domainExpertise per-persona based on their role, industry, and Big Five profile. A bootstrapped founder gets different calibration than a well-funded VP. Also adds recommendations field to PricingAnalysis with prompt guidance. Updates entity, mapper, mock data, all prompts, and parsing.
Three features:
1. Openness Priming: <<OPENNESS PRIMING>> section in analysis prompt tells persona they're open to being convinced, preventing default-negative responses.
2. Intent Funnel: Replaced single likelihoodToBuy with explorationIntent >= analysisIntent >= buyIntent. Each has a 1-10 score and reason. FunnelStage UI component shows drop-off visualization.
3. Score Rationales: Every score now includes a 1-2 sentence LLM-generated reason explaining WHY. Schema: scores → reasons → narrative thoughts.
Also deleted dead PricingAnalysisMapper (unused in production). Updated all mock data, fallback objects, and validation.
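The funnel ordering explorationIntent >= analysisIntent >= buyIntent can be enforced after parsing with a small normalizer. This is a sketch under assumptions: the `IntentScore` and `IntentFunnel` shapes and the choice to cap later stages (rather than reject the result) are illustrative and not confirmed by the PR.

```typescript
interface IntentScore {
  score: number; // expected range 1-10
  reason: string;
}

interface IntentFunnel {
  explorationIntent: IntentScore;
  analysisIntent: IntentScore;
  buyIntent: IntentScore;
}

// Round to an integer and clamp into the 1-10 range.
function clampScore(n: number): number {
  return Math.min(10, Math.max(1, Math.round(n)));
}

// Enforce exploration >= analysis >= buy by capping each later stage
// at the value of the stage before it. Reasons are left untouched.
function normalizeFunnel(f: IntentFunnel): IntentFunnel {
  const exploration = clampScore(f.explorationIntent.score);
  const analysis = Math.min(exploration, clampScore(f.analysisIntent.score));
  const buy = Math.min(analysis, clampScore(f.buyIntent.score));
  return {
    explorationIntent: { ...f.explorationIntent, score: exploration },
    analysisIntent: { ...f.analysisIntent, score: analysis },
    buyIntent: { ...f.buyIntent, score: buy },
  };
}
```

Capping keeps a usable result when the model returns an inverted funnel; a stricter pipeline could instead treat the inversion as a validation error and retry.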
Model swap: qwen/qwen3.5-flash-02-23 and qwen/qwen3.5-9b replaced with deepseek/deepseek-v4-flash (same price, stronger reasoning). Vision models kept as qwen/qwen3-vl-30b-a3b-instruct (VLM needed).
Reasoning token capture:
- Streaming: yields <<REASONING>> markers alongside content
- Non-streaming: logs reasoning token content
- shouldDisableThinking updated (DeepSeek reasoning on by default)
UI: New ThinkingBlock component - collapsible gray thinking section with brain icon and character count. PersonaChat parses reasoning markers and renders them as expandable blocks.
Three build fixes: 1) Use new RegExp() instead of regex literal to avoid Turbopack parsing issues with << >> characters. 2) Update benchmark.ts mock personas to use new Persona fields (removed personalityTraits, cognitiveReflex, etc; added values, fears, pricingSensitivity, etc). 3) Add missing recommendations field to mock-2 and mock-3 in MockAnalyses.ts.
Fixes: 1. Reasoning tokens: adds comprehensive debug logging of raw delta keys from DeepSeek V4 Flash (reasoning_content, reasoning, reasoning_details) to determine the actual field name. Falls back through all possible formats. 2. 4th-wall anchor quoting: moved anchor from prepended user message text to a separate system message before the user turn. Format: [Frame: excited trend spotter]. Model won't quote it back as user speech since it's a system directive, not user input. Maintains primacy effect (right before generation).
…orted implementation
…pdate plan progress
added 17 commits
May 24, 2026 20:00
…code quality - Replace hardcoded white/black color references with theme tokens across all components (badge, card, tabs, dialog, results, upload) - Fix dialog overlay opacity to match spec: bg-black/80 -> bg-black/40 - Fix button radius: rounded-lg -> rounded-md (design spec: 6px) - Fix input border width: border-2 -> border (design spec: 1px) - Remove decorative shadows from FlowDialog and ResultsView (no-shadow rule) - Fix sidebar nav radius to match design system (rounded-md) - Fix dropzone radius for card consistency (rounded-xl -> rounded-lg) - Remove debug console.logs and unused framer-motion import - Fix as any type assertions with proper as const - Add DESIGN.md and PRODUCT.md design system documentation - Remove superseded DESIGN_SYSTEM.md and DESIGNER_PROMPT.md
…used download, chat offset - Remove unused FileDown download button (no onClick handler) - Move close button into header action row to prevent overlay with tabs - Switch DialogContent from grid to flex-col for proper height propagation - Wrap chat tab in flex-1 container so header is flush with top - Add min-h-0 to both tab containers for correct overflow handling
Strip full-height side-sheet positioning (fixed right-0 top-0 h-dvh) and revert to base dialog centering. Cap height at 85vh so content isn't full-screen. Override width to 600px/680px for comfortable reading.
- Add .DS_Store and logs/ to .gitignore - Remove .DS_Store, bun.lockb, next-env.d.ts from tracking (already gitignored or auto-generated)
…okens, sonner toasts, button hierarchy - Change accent from indigo-violet to cerulean blue across all CSS tokens - Update DESIGN.md to reflect new color palette and design decisions - Add button hierarchy documentation (primary/secondary/ghost) with ring-1 definition - Add selection state guidelines using tonal hierarchy instead of accent color - Integrate sonner toast system with dark theme tokens - Add sim-ring-fade animation for simulation notifications - Refine button component variants with ring-1 on primary, updated border styles on outline - Update CSS variable lightness values for better contrast in both light and dark modes - Replace hardcoded chat bubble colors with color-mix on primary token - Align accent and sidebar-accent with primary token via var() reference
…chat streams - Extract duplicate parseMessageContent function from PersonaChat and PersonaChatInline into shared module - Add error handling for non-string stream updates (step: ERROR objects) - Fix chatWithPersonaAction to use stream.done instead of stream.error for graceful error delivery - Change streamable value type from string to any to support structured error payloads
…logging, scouting improvements, and result persistence - Add AnalysisLogger — per-run JSONL logging to logs/analysis/ with auto-flush and structured metadata - Add SimulationResultStore — in-memory server-side store with 30-min TTL for simulation results surviving page reloads - Add screenshot and progress side-channel stores to bypass RSC stream size limits on base64 payloads - Add getSimulationResult, getProgress, getScreenshot server actions for client polling/reconnection - Overhaul ParsePricingPageUseCase with adaptive scouting: targeted strike via HTML locator, guided scroll, lazy-load triggering - Add parallel HTML analysis (isPricingVisibleInHtml + summarizeHtml) for faster pricing detection - Add structured logging throughout the entire pipeline — entry/exit, latency tracking, persona summaries - Add runId passthrough for per-request tracing across all pipeline components - Upgrade report API route with request timing, validation logging, and AnalysisLogger integration - Save error states to SimulationResultStore for robust failure recovery - Add unit tests for the pricing pipeline use case with mock browser and LLM services
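An in-memory store with a 30-minute TTL, as described for SimulationResultStore, might look roughly like the sketch below. The `TtlStore` class name, its method names, and the injectable clock are assumptions for illustration; expired entries are dropped lazily on read, which may not match the PR's actual eviction strategy.

```typescript
interface StoredEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

class TtlStore<T> {
  private entries = new Map<string, StoredEntry<T>>();

  // ttlMs defaults to 30 minutes; `now` is injectable so tests can control time.
  constructor(
    private ttlMs: number = 30 * 60 * 1000,
    private now: () => number = Date.now,
  ) {}

  set(id: string, value: T): void {
    this.entries.set(id, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Returns undefined for unknown ids and for entries past their TTL,
  // deleting the expired entry as a side effect.
  get(id: string): T | undefined {
    const entry = this.entries.get(id);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(id);
      return undefined;
    }
    return entry.value;
  }
}
```

Because the map lives in process memory, results survive a page reload only while the same server instance stays alive; the later commits on this page move long-running pipeline state to Netlify Blobs for cross-instance persistence.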
…-load handling, and vision analysis overhaul - Simplify RemotePlaywrightAdapter: remove redundant wait methods, streamline screenshot capture - Add DOM stability detection during page load for reliable screenshot timing - Add lazy image loading trigger via progressive scroll-down - Add framework rendering detection (React/Angular/Vue) for SPA compatibility - Overhaul VisionAnalysisAdapter with structured pricing detection via HTML locator + vision LLM - Add multi-strategy pricing location: targeted element strike (via selector/anchor text) vs full scroll scout - Add scouting state machine with viewport captures at each stage for progress UI - Add compact HTML summarization for LLM-based pricing element detection - Pass runId through all LLM calls for consistent tracing
… retry, reasoning extraction, persona backstory generation - Refactor LlmServiceImpl with configurable OpenRouter/Ollama provider selection via createFromEnv factory - Add p-limit concurrency limiter (max 20 parallel requests) to prevent API throttling - Add exponential backoff retry logic (5 retries, 429/5xx only) with jitter - Add reasoning token extraction from DeepSeek V4 Flash responses (both streaming and non-streaming) - Add per-request logging with request IDs, duration tracking, and response previews - Implement OpenRouterCriticAdapter for validation/critique of analysis quality - Implement OpenRouterChatAdapter for persona chat conversations - Add LlmMemoryAdapter for memory-enhanced chat interactions - Add structured persona backstory generation adapter with progress callbacks - Add PersonaPromptCompiler — centralized prompt templates for persona generation, backstories, and insights - Add batch backstory and insight generation for efficient persona processing - Extend LlmServicePort interface with new methods: isPricingVisibleInHtml, summarizeHtml, generateBackstory, etc. - Add variantOf field to Persona entity for tracking persona variation lineage - Add aiSuggestion field to PricingAnalysis entity
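The retry policy named in this commit (5 retries, 429/5xx only, exponential backoff with jitter) can be sketched as follows. The helper names, the 500 ms base, the 30 s cap, and the use of "full jitter" are assumptions; the commit does not state which jitter scheme or delays LlmServiceImpl uses, and the p-limit concurrency limiter is not modeled here.

```typescript
const MAX_RETRIES = 5; // from the commit description

// Only rate limits and server errors are worth retrying.
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Full jitter: pick a uniform delay in [0, min(cap, base * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  random: () => number = Math.random,
  baseMs: number = 500,
  capMs: number = 30000,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Read a numeric `status` property from an unknown thrown value, if present.
function statusOf(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null && "status" in err) {
    const status = (err as { status: unknown }).status;
    return typeof status === "number" ? status : undefined;
  }
  return undefined;
}

// Run `call`, retrying up to MAX_RETRIES times on retryable statuses.
async function withRetry<T>(
  call: () => Promise<T>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const status = statusOf(err);
      const retryable = status !== undefined && isRetryable(status);
      if (!retryable || attempt >= MAX_RETRIES) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Jitter spreads retries from parallel requests over time, which matters when up to 20 calls may hit the same rate limit at once.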
…liders, inline streaming, and management
- Add generateSimilarPersonasAction: server action for streaming variation generation from reference persona
- Add variationMapping utility: maps continuous Big Five values to 1-5 discrete scale for slider UI
- Add PersonaDetailSheet variant tab with Big Five sliders (1-5), creative freedom slider, count selector (1/3/5)
- Add randomize button for quick trait exploration
- Add PersonaSkeletonCard: shimmer placeholder shown while variations are generating
- Add DashboardClient variation flow: placeholder injection, streaming replacement, toast notifications
- Add persona deletion with confirmation dialog on PersonaProfilePanel
- Add variantOf display indicator on PersonaProfilePanel
- Extend personaStore with insertPersonasAfter (inline insertion after reference), updatePersona, removePersona
- Add unit tests for variationMapping and personaStore
- Add E2E tests for persona variation generation flow
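One way to map a continuous trait value onto a 1-5 slider is equal-width bucketing, sketched below. The 0-100 source range, the function names, and the bucket-midpoint inverse are all assumptions; the page does not say what range the Big Five values use or how the PR's variationMapping utility rounds.

```typescript
// Map a continuous value in [min, max] to a slider step 1..5 using
// five equal-width buckets. Out-of-range inputs are clamped.
function toSliderStep(value: number, min: number = 0, max: number = 100): number {
  const clamped = Math.min(max, Math.max(min, value));
  const fraction = (clamped - min) / (max - min);
  return Math.min(5, Math.floor(fraction * 5) + 1);
}

// Map a slider step back to the midpoint of its bucket.
function fromSliderStep(step: number, min: number = 0, max: number = 100): number {
  const s = Math.min(5, Math.max(1, Math.round(step)));
  return min + ((s - 0.5) / 5) * (max - min);
}
```

Returning the bucket midpoint means a value that round-trips through the slider stays in the same step, at the cost of losing its exact original position.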
…s polling and sidebar navigation - Add simulations list page (/dashboard/simulations) showing all completed/in-progress simulation runs - Add simulation detail page (/dashboard/simulations/[id]) with persona analyses, progress polling, and reconnection support - Add simulationStore (Zustand) — manages simulation state with localStorage persistence - Add SimulationToaster component for real-time simulation completion notifications - Add Simulation and PersonaProfile domain entities - Add Simulations link to sidebar navigation with active state - Support reconnection to in-progress simulations via polling getProgress/getScreenshot/getSimulationResult actions
…ove results-to-persona matching - Add AnalysisProvider context component for managing analysis state across dashboard views - Improve ResultsView persona matching — match by personaProfile.name or personaId instead of index - Update SetupView layout with refactored persona flow integration - Overhaul useAnalysisFlow hook with improved state machine for simulation lifecycle - Add computeBenchmarks utility for persona analysis comparison metrics - Add benchmark unit tests
…ntity graph retrieval - Generalize IdRagStore to support both backstory and interview signal chunks - Add chunkBackstory with topic detection (early-life, career, finance, setback, etc.) and emotional tone analysis - Add linkRelated — cross-chunk adjacency and same-topic linking for identity graph traversal - Add interview signal chunking pipeline (chunkInterviewSignals) - Add n-gram fingerprinting and cosine similarity for in-memory semantic retrieval - Add InterviewSignalExtractor — extracts structured signals from interview transcripts - Add PsychographicRationalizer (PB&J) — enhances persona backstories with psychological rationales - Add generatePersonasFromInterviews server action — end-to-end pipeline from transcripts to personas - Add pooling and sampling utilities for transcript processing - Add GazePredictionAdapter and InCharacterEvaluator for behavioral analysis - Refactor IdRagService to use the generalized store for all persona context retrieval
…ona adapter, and pipeline - Add PersonaAdapter unit tests for backstory generation and persona creation - Add pricing-analysis-e2e test — end-to-end persona analysis of a live pricing page - Add pricing-analysis-comprehensive test — multi-persona analysis with full scoring - Add pricing-analysis-full-flow test — complete simulation lifecycle including streaming - Add pricing-analysis-real-flow test — real-world flow with browser automation - Add vitest setup file with global test configuration and mocks
- Add sonner dependency for toast notifications - Update tsconfig with .next type includes and consistent formatting - Add vitest setup file configuration for jsdom test environment
- Add opencode.yml workflow triggered by /oc and /opencode commands on issues and PRs - Uses anomalyco/opencode/github action with deepseek-v4-flash model
✅ Deploy Preview for deepbound ready!
…timeout during LLM batches
Adds a 4-second setInterval heartbeat that sends keep-alive stream updates when no real progress has been received for 3+ seconds. During Promise.allSettled extraction and batch generation, the function goes silent while waiting for parallel LLM calls — this silence triggers the Netlify serverless function timeout (10s default on legacy plans, 10s for streaming functions). The heartbeat keeps the RSC stream active by sending periodic '{ step, heartbeat: true }' updates, preventing the connection drop.
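The heartbeat described here can be sketched as a timer that checks how long the stream has been silent. The `startHeartbeat` function, its callback parameters, and the exposed `tick` method are hypothetical names for illustration; only the 4-second interval, the 3-second silence threshold, and the `{ step, heartbeat: true }` payload come from the commit message.

```typescript
interface StreamUpdate {
  step: string;
  heartbeat?: boolean;
}

// Start a timer that emits a keep-alive update whenever no real progress
// has been recorded for at least `silenceMs`. `now` is injectable for tests.
function startHeartbeat(
  send: (update: StreamUpdate) => void,
  getStep: () => string,
  intervalMs: number = 4000,
  silenceMs: number = 3000,
  now: () => number = Date.now,
) {
  let lastProgressAt = now();

  const tick = (): void => {
    if (now() - lastProgressAt >= silenceMs) {
      send({ step: getStep(), heartbeat: true });
    }
  };

  const timer = setInterval(tick, intervalMs);

  return {
    // Call whenever a real progress update is sent.
    markProgress: (): void => {
      lastProgressAt = now();
    },
    // Exposed so the check can be triggered without waiting for the timer.
    tick,
    // Always stop the timer when the pipeline finishes or fails.
    stop: (): void => clearInterval(timer),
  };
}
```

In use, the pipeline would call `markProgress()` next to each real stream update and `stop()` in a `finally` block so the interval never outlives the request.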
…pipelines (up to 15 min) - Add netlify/functions/process-pipeline.ts — background function with config.background = true for 15-minute execution limit - Add PipelineStore — durable pipeline state persistence via Netlify Blobs (cross-instance, cross-invocation) - Add startPipeline server action — stores input data in Blobs, triggers background function via HTTP fetch, returns jobId - Add getPipelineStatusAction — polling endpoint for background pipeline progress/results - Rewrite useInterviewPipeline from streaming to polling pattern — submits job, polls every 2s - Add @netlify/blobs dependency for shared state storage - Configure netlify.toml with background function settings
…INGESTING -> COMPILING)
…nstead of raw Blobs fetch The background function was using raw fetch to Netlify Blobs which required NETLIFY_ACCESS_TOKEN — not automatically set in the function env. PipelineStore uses @netlify/blobs which auto-configures auth inside Netlify Functions. PipelineStore.ts has zero @/ imports so the relative import resolves correctly through the function bundler.
Summary
This PR introduces inference-time persona construction techniques grounded in the academic literature (Joshi et al. 2025, Wang et al. 2024b, Moon et al. 2024), overhauls the pricing analysis pipeline with production-grade logging and result persistence, refreshes the design system, and adds a simulations UI for managing analysis runs.
Changes
🎨 Design System Refresh
- … ring-1 on primary variants, refined outline/secondary styles
- … color-mix() on the primary token
💬 Chat Refactor
- … parseMessageContent function from PersonaChat and PersonaChatInline into a shared module
- … stream.done)
- … stream.done instead of stream.error for graceful error delivery
📊 Pricing Analysis Pipeline Overhaul
- … logs/analysis/ with auto-flush, latency tracking, and run-level metadata
- … isPricingVisibleInHtml and summarizeHtml concurrently for faster pricing detection
🌐 Browser & Vision Scouting
- … RemotePlaywrightAdapter: streamlined screenshot capture, added DOM stability detection, lazy image loading trigger, SPA framework rendering detection
- … VisionAnalysisAdapter with HTML locator strategy for pricing element detection before engaging vision LLM
🧠 LLM Service Refactor
- … createFromEnv factory
- … p-limit concurrency limiter (max 20 parallel requests) + exponential backoff retry (5 retries with jitter)
- … OpenRouterCriticAdapter, OpenRouterChatAdapter, LlmMemoryAdapter
- … PersonaAdapter with batch backstory/insight generation + PersonaPromptCompiler for centralized prompt templates
- … LlmServicePort with new methods for HTML summarization, pricing visibility detection, and persona generation
👥 Persona Variation Generation
🔬 Simulations UI
- … (/dashboard/simulations): Shows all completed/in-progress simulations with status indicators
- … (/dashboard/simulations/[id]): Persona analysis breakdowns with progress polling and reconnection support
🧩 Interview Pipeline
- … IdRagStore for both backstory and interview signal chunks with topic detection and emotional tone analysis
- … linkRelated for identity graph traversal (adjacency + same-topic linking)
- … InterviewSignalExtractor, PsychographicRationalizer, chunking pipeline
🧪 Tests
🔧 Chores
- … sonner dependency
- … .next type includes
- … .DS_Store, logs/) and untracked OS artifacts
Files Changed
~75 files changed, ~7,600+ insertions, ~1,200+ deletions