A personal research project exploring multimodal AI for detecting the Say-Do Gap in user research.
📍 Part of my Computational Product Research learning journey
📅 February 2026
🔬 Methodology validation study
Knowledge graph of 50 synthetic user interviews revealing the Say-Do Gap: 611 nodes (2 user segments, 50 interviews, 559 behavioral cues)
Research attributes 42% of startup failures to misreading market demand: teams build products users said they wanted during research, only to see them go unadopted at launch.
One of the potential culprits? Social Desirability Bias - users smooth over negative feedback to be polite.
Someone might say: "This feature is easy to use."
But their audio reveals: "It's... [pause 3s]... easy." [frustrated tone]
Traditional text-based research misses this gap entirely.
Can we automatically detect the Say-Do Gap by analyzing audio behavioral cues using multimodal AI?
- Pauses (>2 seconds)
- Vocal hesitation ("um", "uh")
- Frustrated tone
- Confused tone
- Sentiment mismatch
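The five cue categories above can be encoded as a small taxonomy. A minimal sketch in Python; the enum names and the 2-second pause threshold come from the list above, while the string values mirror the sample JSON output later in this README — everything else is illustrative:

```python
from enum import Enum

# From the methodology: only pauses longer than 2 seconds count as a cue.
PAUSE_THRESHOLD_SECONDS = 2.0

class CueType(Enum):
    """The five audio behavioral cues the pipeline extracts."""
    PAUSE = "pause"                        # silence > 2 seconds
    VOCAL_HESITATION = "vocal_marker"      # "um", "uh"
    FRUSTRATED_TONE = "frustrated"
    CONFUSED_TONE = "confused"
    SENTIMENT_MISMATCH = "sentiment_mismatch"

def is_reportable_pause(duration_seconds: float) -> bool:
    """A pause counts as a behavioral cue only above the threshold."""
    return duration_seconds > PAUSE_THRESHOLD_SECONDS
```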
Dataset: 50 synthetic user interviews (AcmeCal - a fictional camping gear rental marketplace)
Bias Injection:
- 40 Admin users (smooth experience)
- 10 End Users (friction-filled experience)
Analysis Pipeline:
- Text → Audio (OpenAI TTS)
- Audio → Behavioral Cues (Gemini 3 Pro)
- Cues → Knowledge Graph (Neo4j)
- Graph → Say-Do Consistency Score
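The four stages above can be sketched as a single data flow. This is a hypothetical orchestration, not the repo's actual code: the function bodies are stubs standing in for the real API calls (OpenAI TTS, Gemini, Neo4j), and the scoring formula is purely illustrative.

```python
def synthesize_audio(script: str) -> bytes:
    """Stage 1: Text -> Audio. Would call OpenAI TTS (tts-1) in practice."""
    return script.encode("utf-8")  # stub

def extract_cues(audio: bytes) -> list[dict]:
    """Stage 2: Audio -> Behavioral Cues. Would send the audio to Gemini."""
    return [{"type": "pause", "duration": "3 seconds"}]  # stub

def load_into_graph(interview_id: str, cues: list[dict]) -> int:
    """Stage 3: Cues -> Knowledge Graph. Would MERGE nodes into Neo4j."""
    return len(cues)  # stub: number of cue nodes created

def say_do_score(cues: list[dict], verbal_sentiment: float) -> float:
    """Stage 4: Graph -> Say-Do Score (this penalty formula is made up)."""
    friction = sum(
        1 for c in cues
        if c["type"] in {"pause", "frustrated", "vocal_marker"}
    )
    return round(max(0.0, verbal_sentiment - 10.0 * friction), 1)

def run_pipeline(interview_id: str, script: str,
                 verbal_sentiment: float) -> float:
    audio = synthesize_audio(script)
    cues = extract_cues(audio)
    load_into_graph(interview_id, cues)
    return say_do_score(cues, verbal_sentiment)
```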
Say-Do Consistency Scores:
| User Segment | Score | Interpretation |
|---|---|---|
| Admin Users | 83.0% | HIGH CONSISTENCY - Trustworthy feedback ✅ |
| End Users | 3.2% | LOW CONSISTENCY - Hidden problems detected ⚠️ |
Bias Gap: 79.8 percentage points
- End Users showed roughly 2× more behavioral friction (18.8 vs. 9.3 cues per interview)
- But expressed similar verbal sentiment to Admin users
- The audio revealed what the text concealed
| Timestamp | Quote | Behavioral Cue |
|---|---|---|
| 02:24 | "I can't tell if I selected it or not." | Frustrated tone |
| 03:06 | "Honestly, it's kind of confusing..." | Pause (4s) + vocal markers |
| 03:38 | "This is taking a while." | Frustrated tone |
Yet in verbal summaries, these users described the experience as "generally positive."
┌─────────────────┐
│  Text Scripts   │
│ (50 interviews) │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Audio Files   │
│  (OpenAI TTS)   │
└────────┬────────┘
         │
         ▼
┌─────────────────────────┐
│      Gemini 3 Pro       │
│ (Audio-Direct Analysis) │
│ - Extract pauses        │
│ - Detect vocal markers  │
│ - Analyze tone          │
└────────┬────────────────┘
         │
         ▼
┌─────────────────────┐
│   Behavioral Cues   │
│ (559 cues extracted)│
└────────┬────────────┘
         │
         ▼
┌─────────────────────┐
│  Neo4j Knowledge    │
│  Graph              │
│  - 50 Interviews    │
│  - 559 Cues         │
│  - 2 User Segments  │
└────────┬────────────┘
         │
         ▼
┌─────────────────────┐
│ Say-Do Consistency  │
│ Score Calculation   │
└─────────────────────┘
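Once cues are in the graph, segment-level friction can be read off with one Cypher aggregation. A hedged sketch: the node labels (`:Interview`, `:Cue`) and properties below are assumptions about the schema, not confirmed from the repo, and the pure-Python function reproduces the same aggregation over in-memory rows so it can be run without a live database.

```python
# Assumed schema: (:Interview {id, segment})-[:HAS_CUE]->(:Cue {type})
CUES_PER_SEGMENT = """
MATCH (i:Interview)-[:HAS_CUE]->(c:Cue)
RETURN i.segment AS segment,
       count(c) * 1.0 / count(DISTINCT i) AS avg_cues_per_interview
"""

def avg_cues_per_segment(rows: list[dict]) -> dict[str, float]:
    """Pure equivalent of the Cypher aggregation, for rows of
    {"segment": ..., "interview_id": ...} (one row per extracted cue)."""
    cue_counts: dict[str, int] = {}
    interviews: dict[str, set] = {}
    for r in rows:
        seg = r["segment"]
        cue_counts[seg] = cue_counts.get(seg, 0) + 1
        interviews.setdefault(seg, set()).add(r["interview_id"])
    return {seg: cue_counts[seg] / len(interviews[seg]) for seg in cue_counts}

# Against a live Neo4j Aura instance, the same result would come from e.g.
#   driver.execute_query(CUES_PER_SEGMENT)   # official neo4j Python driver
```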
Intelligence:
- Gemini 3 Pro (`gemini-3.0-pro`) - Multimodal audio analysis
- OpenAI TTS (`tts-1`) - Synthetic audio generation
Infrastructure:
- Neo4j Aura - Knowledge graph storage
- Python 3.11+ - Analysis pipeline
- Google Cloud Run - Deployment (planned)
🎬 Watch the demo
See the full methodology in action, including:
- Live Neo4j graph queries
- Say-Do Score calculation
- Behavioral cue extraction examples
- Evidence of the 79.8 point bias gap
📄 Read the full methodology paper
Detailed explanation of:
- Theoretical foundation (Social Desirability Bias)
- Experimental design
- Results analysis
- Limitations & future work
This is a methodology validation study with important limitations:
- Synthetic data only - Real human validation needed
- English language only - Cross-cultural generalization unknown
- Controlled TTS voices - Natural speech variation not tested
- Single product domain - B2B/Enterprise contexts may differ
Next step: a validation study across diverse populations and contexts, with research collaborators, targeting an academic conference submission (e.g. a CHI 2027 paper).
Sample behavioral cue extraction:
{
"interview_id": "end_user_script_03",
"behavioral_cues": [
{
"timestamp": "01:38",
"type": "pause",
"duration": "3 seconds",
"context": "Before confirming selection"
},
{
"timestamp": "02:24",
"type": "frustrated",
"quote": "I can't tell if I selected it or not.",
"context": "Attempting to complete task"
},
{
"timestamp": "03:06",
"type": "vocal_marker",
"quote": "Um, uh, this is kind of confusing",
"context": "Navigation confusion"
}
],
"say_do_score": 3.2,
"interpretation": "LOW CONSISTENCY - Hidden problems detected"
}
Seeking research collaborators for academic conferences (e.g. CHI 2027 paper).
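A report payload like the sample above can be loaded and summarized directly. A small sketch; the `summarize` helper is mine, not part of the repo:

```python
import json
from collections import Counter

# Trimmed copy of the sample extraction shown above.
sample = """{
  "interview_id": "end_user_script_03",
  "behavioral_cues": [
    {"timestamp": "01:38", "type": "pause", "duration": "3 seconds"},
    {"timestamp": "02:24", "type": "frustrated",
     "quote": "I can't tell if I selected it or not."},
    {"timestamp": "03:06", "type": "vocal_marker",
     "quote": "Um, uh, this is kind of confusing"}
  ],
  "say_do_score": 3.2
}"""

def summarize(report_json: str) -> dict:
    """Count extracted cues by type and carry the score through."""
    report = json.loads(report_json)
    counts = Counter(c["type"] for c in report["behavioral_cues"])
    return {"interview_id": report["interview_id"],
            "cue_counts": dict(counts),
            "say_do_score": report["say_do_score"]}
```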
If you're a UX researcher, product researcher, or in a similar product role and are interested in validating this methodology:
📧 Sign up here
What's involved:
- Provide 1-2 real user research sessions (audio recordings)
- Receive CausalTrack analysis report
- Validate findings against your expert judgment
- Co-author credit on CHI submission (if desired)
Target: researchers and practitioners across diverse domains
Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
You are free to:
- Use this methodology for academic research
- Share and adapt the approach
- Cite this work in publications
Under these terms:
- Attribution required - Credit this work appropriately
- Non-commercial - Not for commercial use without permission
For commercial licensing inquiries: kgkazakos@gmail.com
If you use this methodology in your research:
@misc{causaltrack2026,
author = {Kostas Kazakos},
title = {CausalTrack: Audio-Based Behavioral Truth Detection for User Research},
year = {2026},
month = {February},
url = {https://github.com/kgkazakos/causaltrack},
note = {Methodology validation study}
}
Built with:
- Gemini 3 Pro by Google DeepMind
- Neo4j graph database
- OpenAI TTS for synthetic audio
Inspired by decades of research on Social Desirability Bias and the Say-Do Gap in behavioral science.
Personal Research Project
Not affiliated with any employer.
📧 Email: kgkazakos@gmail.com
💼 LinkedIn: www.linkedin.com/in/kazakosk/
Last Updated: February 23, 2026
Status: Methodology Validation Phase
Next Milestone: Scaled validation for academic conference
