This repository is an index of experimental AI/LLM projects: evaluations, proofs of concept, templates, patterns, and exploratory ideas.
The linked repositories explore AI agent workflows, LLM capabilities, evaluation frameworks, and development patterns through hands-on experiments, benchmarks, and reusable templates for AI-driven development.
| Repository | Category | Key Finding or Focus |
|---|---|---|
| Whisper Fine-Tune Accuracy Eval | Speech | Smaller models improve with fine-tuning; larger models degrade unless handling code-switching |
| One-Shot Transcription Microphone Eval | Speech | Environment matters more than equipment cost for STT accuracy |
| Transcription Cleanup Eval | Speech | Compares cloud models on single-step transcription + cleanup |
| Whisper WPM Background Noise Eval | Speech | Effect of speaking pace and background noise on Whisper accuracy |
| Long Form Audio Eval | Speech | Long-form audio transcription evaluation |
| Local ASR STT Benchmark | Speech | Local ASR/STT benchmarking |
| Hebrew Image Generation Eval | Image | Hebrew text rendering in AI image generation |
| Bias Censorship Eval Tests | LLM | Testing for bias and censorship in LLMs |
| Repository | Category | Description |
|---|---|---|
| Voice Cloning Difference Test | Speech | How training data duration affects voice cloning quality |
| Text Cleanup Fine-Tuning Set | Speech | Dataset for training AI to clean up STT transcripts |
| Voice Cleanup Prompt Experiment | Speech | Comparing OpenAI vs Gemini for transcript cleanup |
| Impact Bond Policy Simulator | Multi-Agent | Simulating stakeholder reactions to policy proposals |
| Peace In The Middle East | Multi-Agent | AI simulation of geopolitical dialogue |
| Weird AI Experiment Ideator | Multi-Agent | Blind multi-pass review for generating experiment ideas |
| LLM Long Codegen Test | LLM | Testing long-form code generation |
| Single Shot Brevity Training | LLM | Training for concise responses |
| Repository | Description | Date |
|---|---|---|
| Microphone Audio Samples | Collection of microphone audio samples | 2025 |
| Repository | Description | Date |
|---|---|---|
| Hebrew Image Generation Eval | Evaluation of AI image generation models for Hebrew text rendering | 2025 |
| Repository | Description | Date |
|---|---|---|
| OSINT Missile Intelligence Agent | OSINT-focused intelligence agent | 2025 |
| Repository | Description | Date |
|---|---|---|
| GHG EBITDA Correlations | Analysis of greenhouse gas and EBITDA correlations | 2025 |
| Repository | Description | Date |
|---|---|---|
| Test Markdown Docs | Test repository for Markdown documentation | 2025 |
| Test System Prompts | Test repository for system prompts | 2025 |
| Index | Description |
|---|---|
| Speech & ASR Evaluations | Comprehensive index of speech recognition and ASR evaluation studies |
Note: This is a focused index covering experimental AI/LLM development projects. For a higher-level collection of all repository indexes and other projects, see the GitHub Master Index.
Daniel Rosehill. Contact: public@danielrosehill.com. Website: danielrosehill.com.