- Run builds (`npm run build`) to verify your changes compile correctly.
- Only run linting (`eslint ./src/`) when preparing for a pull request.
- Plugin: AI Toolbox - A personal collection of AI tools to enhance Obsidian workflows.
- Core features: AI-powered transcription and configurable chat workflows with multiple AI providers.
- Target: Obsidian Community Plugin (TypeScript → bundled JavaScript).
- Entry point: `src/main.ts` compiled to `main.js` and loaded by Obsidian.
- Required release artifacts: `main.js`, `manifest.json`, and optional `styles.css`.
- Desktop only (`isDesktopOnly: true`) due to external tool dependencies (yt-dlp, ffmpeg).
- Node.js: use current LTS (Node 18+ recommended).
- Package manager: npm (`package.json` defines npm scripts and dependencies).
- Bundler: esbuild (`esbuild.config.mjs` and build scripts depend on it).
- Types: `obsidian` type definitions.
- `npm install`
- `npm run dev`
- `npm run build`
- To use ESLint, install it from the terminal: `npm install -g eslint`
- To analyze this project with ESLint, run: `eslint main.ts`
- ESLint then creates a report with suggestions for code improvement by file and line number.
- If your source code is in a folder, such as `src`, you can analyze all files in that folder with: `eslint ./src/`
- Organize code into multiple files: Split functionality across separate modules rather than putting everything in `main.ts`.
- Source lives in `src/`. Keep `main.ts` small and focused on plugin lifecycle (loading, unloading, registering commands).
- Example file structure:

  ```text
  src/
    main.ts                      # Plugin entry point, lifecycle management
    settings/                    # Settings interfaces, types, and UI
      index.ts                   # Settings tab (Providers, Workflows, Settings tabs)
      types.ts                   # All type definitions and defaults
      providers.ts               # Provider settings UI
      workflows.ts               # Workflow settings UI
      additional-settings.ts     # Additional settings UI
    providers/                   # AI model provider implementations
      index.ts                   # Re-exports
      types.ts                   # Provider interfaces (ModelProvider, etc.)
      provider-factory.ts        # Factory for creating providers
      base-provider.ts           # Base provider class
      openai-provider.ts         # OpenAI implementation
      azure-openai-provider.ts   # Azure OpenAI implementation
    handlers/                    # Input and output handlers
      input/                     # Input handlers for acquiring media
      output/                    # Output handlers for presenting results
      context/                   # Context handling utilities
    processing/                  # Audio/video processing and workflow execution
      workflow-executor.ts       # Main workflow execution logic
      audio-processor.ts         # Audio file processing
      video-processor.ts         # Video extraction (yt-dlp)
      workflow-chaining.ts       # Workflow chaining logic
    components/                  # UI components
      collapsible-section.ts
      workflow-suggester.ts
      workflow-type-modal.ts
    tokens/                      # Token/template processing
    utils/                       # Utility functions
  ```

- Do not commit build artifacts: Never commit `node_modules/`, `main.js`, or other generated files to version control.
- Keep the plugin small. Avoid large dependencies. Prefer browser-compatible packages.
- Generated output should be placed at the plugin root or `dist/` depending on your build setup. Release artifacts must end up at the top level of the plugin folder in the vault (`main.js`, `manifest.json`, `styles.css`).
- Must include (non-exhaustive):
  - `id` (plugin ID; for local dev it should match the folder name)
  - `name`
  - `version` (Semantic Versioning `x.y.z`)
  - `minAppVersion`
  - `description`
  - `isDesktopOnly` (boolean)
- Optional: `author`, `authorUrl`, `fundingUrl` (string or map)
- Never change `id` after release. Treat it as stable API.
- Keep `minAppVersion` accurate when using newer APIs.
- Canonical requirements are coded here: https://github.com/obsidianmd/obsidian-releases/blob/master/.github/workflows/validate-plugin-entry.yml
- Manual install for testing: copy `main.js`, `manifest.json`, `styles.css` (if any) to `<Vault>/.obsidian/plugins/<plugin-id>/`.
- Reload Obsidian and enable the plugin in **Settings → Community plugins**.
- Any user-facing commands should be added via `this.addCommand(...)`.
- If the plugin has configuration, provide a settings tab and sensible defaults.
- Persist settings using `this.loadData()` / `this.saveData()`.
- Use stable command IDs; avoid renaming once released.
- Bump `version` in `manifest.json` (SemVer) and update `versions.json` to map plugin version → minimum app version.
- Create a GitHub release whose tag exactly matches `manifest.json`'s `version`. Do not use a leading `v`.
- Attach `manifest.json`, `main.js`, and `styles.css` (if present) to the release as individual assets.
- After the initial release, follow the process to add/update your plugin in the community catalog as required.
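
The tag rule in the list above can be checked mechanically before publishing. The following is a minimal sketch; `isValidReleaseTag` is a hypothetical helper, not part of this repository or of Obsidian's tooling.

```typescript
// Hypothetical helper: returns true only when the release tag is a bare
// SemVer string (x.y.z, no leading "v") that equals manifest.json's version.
function isValidReleaseTag(tag: string, manifestVersion: string): boolean {
  const semver = /^\d+\.\d+\.\d+$/;
  return semver.test(tag) && tag === manifestVersion;
}

console.log(isValidReleaseTag("1.2.0", "1.2.0"));  // true
console.log(isValidReleaseTag("v1.2.0", "1.2.0")); // false (leading "v")
```
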
Before creating a GitHub release, follow these steps to update version files:
- Bump the `version` field in `manifest.json` following Semantic Versioning (`x.y.z` format).
- Ensure the `minAppVersion` is accurate for any new Obsidian APIs used.
The versions.json file maps plugin versions to minimum Obsidian app versions. Update it based on the release type:
For patch releases (x.y.Z):
- Add the new version entry: `"x.y.z": "minimum-app-version"`
- Keep all existing entries unchanged
For minor releases (x.Y.0):
- Add the new version entry: `"x.y.0": "minimum-app-version"`
- For the previous minor version (`x.Y-1.*`), keep only the latest patch version
- Remove all other patch versions for that minor version
- Example: If releasing 1.2.0, keep only 1.1.3 (remove 1.1.0, 1.1.1, 1.1.2)
For major releases (X.0.0):
- Add the new version entry: `"x.0.0": "minimum-app-version"`
- For the previous major version (`X-1.*.*`), keep the latest two minor versions, each with their latest patch version
- Remove all other minor and patch versions for that major version
- Example: If releasing 2.0.0, keep only 1.2.3 and 1.3.2 (remove 1.0.0, 1.1.0, 1.2.1, 1.2.2, 1.3.0, 1.3.1, etc.)
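
The minor-release rule above can be sketched in code. This is an illustration only, not code from this repository: `applyMinorRelease` and `parseVersion` are hypothetical helpers, and the minimum app versions in the sample data are placeholders.

```typescript
type VersionsMap = Record<string, string>;

// Parse "x.y.z" into numeric parts; missing parts default to 0.
function parseVersion(v: string): [number, number, number] {
  const parts = v.split(".").map(Number);
  return [parts[0] ?? 0, parts[1] ?? 0, parts[2] ?? 0];
}

// Hypothetical helper for a minor release (x.Y.0): adds the new entry and,
// for the previous minor line (x.Y-1.*), keeps only its latest patch.
function applyMinorRelease(
  versions: VersionsMap,
  newVersion: string,
  minAppVersion: string
): VersionsMap {
  const [major, minor] = parseVersion(newVersion);
  const isPrevMinor = (v: string): boolean => {
    const [ma, mi] = parseVersion(v);
    return ma === major && mi === minor - 1;
  };
  // Highest patch number recorded for the previous minor line (-1 if none).
  const latestPatch = Math.max(
    -1,
    ...Object.keys(versions).filter(isPrevMinor).map((v) => parseVersion(v)[2])
  );
  const result: VersionsMap = {};
  for (const [version, minApp] of Object.entries(versions)) {
    // Drop older patches of the previous minor line; keep everything else.
    if (isPrevMinor(version) && parseVersion(version)[2] !== latestPatch) continue;
    result[version] = minApp;
  }
  result[newVersion] = minAppVersion;
  return result;
}

// Placeholder data: releasing 1.2.0 when 1.1.0 through 1.1.3 already exist.
const versionsBefore: VersionsMap = {
  "1.0.4": "1.0.0",
  "1.1.0": "1.4.0",
  "1.1.1": "1.4.0",
  "1.1.2": "1.4.0",
  "1.1.3": "1.4.0",
};
// Keeps 1.0.4 and 1.1.3, drops 1.1.0 to 1.1.2, and adds 1.2.0.
console.log(Object.keys(applyMinorRelease(versionsBefore, "1.2.0", "1.5.0")));
```

The function returns a new map instead of mutating its input, so the result can be reviewed before it is written back to `versions.json`.
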
- Run `npx eslint ./src/` to check for code quality issues
- Fix all errors and warnings before creating the release
- Common issues to watch for:
  - Unused imports (remove them)
  - Floating promises (add `await` or `void` operator)
  - Sentence case violations in UI text (use lowercase after first word)
  - Unnecessary type assertions (remove redundant `as` casts)
  - Misused promises in callbacks (use `void` for fire-and-forget async calls)
  - Console statements (only `console.warn`, `console.error`, and `console.debug` are allowed)
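
The two promise-related items above are the easiest to get wrong, so here is a minimal sketch of both fixes. The function names are hypothetical stand-ins and the counter exists only to make the effect observable; none of this is code from the plugin.

```typescript
// Stand-in for an async save such as a plugin's saveSettings().
let saveCount = 0;
async function saveSettings(): Promise<void> {
  saveCount += 1;
}

// Floating promise (flagged by the linter): the result is silently dropped.
//   saveSettings();

// Fix 1: await the promise when the caller is itself async.
async function onSettingChanged(): Promise<void> {
  await saveSettings();
}

// Fix 2: mark a deliberate fire-and-forget call with `void`, for example in
// a callback that must return void rather than a promise.
function onButtonClick(): void {
  void saveSettings();
}

onButtonClick();
void onSettingChanged();
```

Note that `void` only documents the intent for the linter; a rejection from the ignored promise still goes unhandled unless the async function catches it itself.
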
Follow Obsidian's Developer Policies and Plugin Guidelines. In particular:
- Default to local/offline operation. Only make network requests when essential to the feature.
- No hidden telemetry. If you collect optional analytics or call third-party services, require explicit opt-in and document clearly in `README.md` and in settings.
- Never execute remote code, fetch and eval scripts, or auto-update plugin code outside of normal releases.
- Minimize scope: read/write only what's necessary inside the vault. Do not access files outside the vault.
- Clearly disclose any external services used, data sent, and risks.
- Respect user privacy. Do not collect vault contents, filenames, or personal information unless absolutely necessary and explicitly consented.
- Avoid deceptive patterns, ads, or spammy notifications.
- Register and clean up all DOM, app, and interval listeners using the provided `register*` helpers so the plugin unloads safely.
- Prefer sentence case for headings, buttons, and titles.
- Use clear, action-oriented imperatives in step-by-step copy.
- Use **bold** to indicate literal UI labels. Prefer "select" for interactions.
- Use arrow notation for navigation: **Settings → Community plugins**.
- Keep in-app strings short, consistent, and free of jargon.
- Keep startup light. Defer heavy work until needed.
- Avoid long-running tasks during `onload`; use lazy initialization.
- Batch disk access and avoid excessive vault scans.
- Debounce/throttle expensive operations in response to file system events.
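
Lazy initialization, as recommended above, can be as small as a memoizing wrapper. The sketch below is illustrative; `lazy` and `getIndex` are hypothetical names, and the map stands in for whatever expensive structure a feature needs.

```typescript
// Hypothetical helper: wraps an expensive factory so it runs at most once,
// on first access, instead of during onload().
function lazy<T>(factory: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) {
      cached = { value: factory() };
    }
    return cached.value;
  };
}

let buildCount = 0;
// Stand-in for heavy work such as building an index over vault files.
const getIndex = lazy(() => {
  buildCount += 1;
  return new Map<string, number>([["example.md", 1]]);
});

console.log(buildCount); // 0: nothing is computed at startup
getIndex();
getIndex();
console.log(buildCount); // 1: computed once, then reused
```

Wrapping the cached value in an object lets the helper cache results that are themselves `undefined` without re-running the factory.
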
- TypeScript with `"strict": true` preferred.
- Keep `main.ts` minimal: Focus only on plugin lifecycle (`onload`, `onunload`, `addCommand` calls). Delegate all feature logic to separate modules.
- Split large files: If any file exceeds ~200-300 lines, consider breaking it into smaller, focused modules.
- Use clear module boundaries: Each file should have a single, well-defined responsibility.
- Bundle everything into `main.js` (no unbundled runtime deps).
- Avoid Node/Electron APIs if you want mobile compatibility; set `isDesktopOnly` accordingly.
- Prefer `async/await` over promise chains; handle errors gracefully.
- Where feasible, test on iOS and Android.
- Don't assume desktop-only behavior unless `isDesktopOnly` is `true`.
- Avoid large in-memory structures; be mindful of memory and storage constraints.
Do
- Add commands with stable IDs (don't rename once released).
- Provide defaults and validation in settings.
- Write idempotent code paths so reload/unload doesn't leak listeners or intervals.
- Use `this.register*` helpers for everything that needs cleanup.
- When refreshing collapsible settings sections, preserve the expand state by setting `callbacks.setExpandState({ workflowId: workflow.id })` before calling `callbacks.refresh()` if the section is currently expanded.
- When editing action settings that trigger a UI refresh, use the `preserveActionExpandState()` callback to maintain expanded state.
Don't
- Introduce network calls without an obvious user-facing reason and documentation.
- Ship features that require cloud services without clear disclosure and explicit opt-in.
- Store or transmit vault contents unless essential and consented.
`main.ts` (minimal, lifecycle only):

```typescript
import { Plugin } from "obsidian";
import { MySettings, DEFAULT_SETTINGS } from "./settings";
import { registerCommands } from "./commands";

export default class MyPlugin extends Plugin {
  // Assigned in onload(); the `!` satisfies strict property initialization.
  settings!: MySettings;

  async onload() {
    // Merge saved data over the defaults so new settings get a value.
    this.settings = Object.assign({}, DEFAULT_SETTINGS, await this.loadData());
    registerCommands(this);
  }
}
```

`settings.ts`:

```typescript
export interface MySettings {
  enabled: boolean;
  apiKey: string;
}

export const DEFAULT_SETTINGS: MySettings = {
  enabled: true,
  apiKey: "",
};
```

`commands/index.ts`:

```typescript
import { Plugin } from "obsidian";
import { doSomething } from "./my-command";

export function registerCommands(plugin: Plugin) {
  plugin.addCommand({
    id: "do-something",
    name: "Do something",
    callback: () => doSomething(plugin),
  });
}
```

Add a command:

```typescript
// Inside onload() of the Plugin subclass:
this.addCommand({
  id: "your-command-id",
  name: "Do the thing",
  callback: () => this.doTheThing(),
});
```

Persist settings:

```typescript
interface MySettings { enabled: boolean }
const DEFAULT_SETTINGS: MySettings = { enabled: true };

// Inside the Plugin subclass:
async onload() {
  this.settings = Object.assign({}, DEFAULT_SETTINGS, await this.loadData());
  await this.saveData(this.settings);
}
```

Register listeners safely:

```typescript
// Inside onload(); each helper removes its listener when the plugin unloads.
this.registerEvent(this.app.workspace.on("file-open", f => { /* ... */ }));
this.registerDomEvent(window, "resize", () => { /* ... */ });
this.registerInterval(window.setInterval(() => { /* ... */ }, 1000));
```

Workflows are containers for sequential actions. Each workflow can contain multiple actions that execute in order, with later actions able to reference outputs from earlier actions using tokens.

```typescript
interface WorkflowConfig {
  id: string;                      // Unique identifier
  name: string;                    // Display name
  actions: WorkflowAction[];       // Sequential list of actions
  outputType: WorkflowOutputType;  // 'popup' | 'new-note' | 'at-cursor'
  outputFolder: string;            // Folder for new-note output
}
```

There are two action types: `chat` and `transcription`.
A chat action sends a prompt to an AI model and receives a response.

```typescript
interface ChatAction extends BaseAction {
  type: 'chat';
  promptText: string;                  // The prompt text (when inline)
  promptSourceType: PromptSourceType;  // 'inline' | 'from-file'
  promptFilePath: string;              // Path to prompt file (when from-file)
  contexts?: ChatContextConfig[];      // Context sources (selection, clipboard, etc.)
}
```

Configuration options:
- Provider: Select which AI provider and model to use
- Prompt source: Inline text or load from a file in the vault
- Prompt text: The prompt template with token placeholders
A transcription action transcribes audio/video content using a speech-to-text model.

```typescript
interface TranscriptionAction extends BaseAction {
  type: 'transcription';
  transcriptionContext?: {
    mediaType: 'video' | 'audio';
    sourceUrlToken?: string;  // Token containing the URL (e.g., 'workflow.clipboard')
  };
  language?: string;          // ISO language code (e.g., 'en', 'es')
  timestampGranularity?: 'disabled' | 'segment' | 'word';
}
```

Configuration options:
- Provider: Select which AI provider and model to use (must support transcription)
- Media type: Video or audio
- Source URL: Token containing the media URL (default: `workflow.clipboard`)
- Language: Optional language hint for better accuracy
- Timestamp granularity: Disabled (default), segment-level, or word-level
Actions can reference values from workflow context and previous action outputs using {{tokenName}} syntax.

Workflow context tokens:

| Token | Description |
|---|---|
| `{{workflow.selection}}` | Currently selected text in the editor |
| `{{workflow.clipboard}}` | Contents of the system clipboard |
| `{{workflow.file.content}}` | Full contents of the active file |
| `{{workflow.file.path}}` | Path of the active file |


Chat action tokens:

| Token | Description |
|---|---|
| `{{actionId.prompt}}` | The original prompt text |
| `{{actionId.response}}` | The AI response text |


Transcription action tokens:

| Token | Description |
|---|---|
| `{{actionId.title}}` | Video title |
| `{{actionId.author}}` | Video uploader/author |
| `{{actionId.sourceUrl}}` | Original video URL |
| `{{actionId.description}}` | Video description |
| `{{actionId.tags}}` | Video tags (comma-separated) |
| `{{actionId.transcription}}` | Plain transcription text |
| `{{actionId.transcriptionWithTimestamps}}` | Transcription with `[MM:SS]` prefixes (only when timestamps enabled) |

A workflow with two actions:
- Transcription action (id: `abc123`) - transcribes a video from clipboard URL
- Chat action - summarizes the transcription
The chat action's prompt can reference the transcription output:

```text
Summarize the following video transcript:
Title: {{abc123.title}}
Author: {{abc123.author}}
Transcript:
{{abc123.transcription}}
```
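
To illustrate how such a prompt might be resolved, here is a minimal sketch of token substitution. It is not the plugin's implementation: `resolveTokens` is a hypothetical function, the sample values are made up, and leaving unknown tokens untouched is an assumption about the desired behavior.

```typescript
// Hypothetical resolver: replaces each {{name}} placeholder with its value
// from `tokens`; unknown tokens are left untouched so they stay visible.
function resolveTokens(template: string, tokens: ReadonlyMap<string, string>): string {
  return template.replace(
    /\{\{\s*([\w.]+)\s*\}\}/g,
    (match: string, name: string) => tokens.get(name) ?? match
  );
}

// Made-up values standing in for the output of transcription action abc123.
const sampleTokens = new Map<string, string>([
  ["abc123.title", "Intro to Obsidian plugins"],
  ["abc123.author", "Example Channel"],
  ["abc123.transcription", "Welcome to the video."],
]);

const samplePrompt = "Title: {{abc123.title}}\nAuthor: {{abc123.author}}\n{{missing.token}}";
console.log(resolveTokens(samplePrompt, sampleTokens));
// Title: Intro to Obsidian plugins
// Author: Example Channel
// {{missing.token}}
```
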

```text
┌─────────────────┐     ┌──────────────────┐     ┌───────────────────┐     ┌────────────────┐
│ Gather workflow │ ──▶ │ Execute Action 1 │ ──▶ │ Execute Action 2  │ ──▶ │ Output Handler │
│ context         │     │ (store tokens)   │     │ (use prev tokens) │     │ (final result) │
└─────────────────┘     └──────────────────┘     └───────────────────┘     └────────────────┘
```
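
The flow in the diagram can be sketched as a small loop. This is a simplified illustration under stated assumptions, not the code in `workflow-executor.ts`: `SketchAction`, `runWorkflow`, and the shape of an action's return value are all hypothetical.

```typescript
type Tokens = Map<string, string>;

// Hypothetical action shape: reads the tokens gathered so far and returns its
// result text plus the named outputs it wants to expose to later actions.
interface SketchAction {
  id: string;
  run(tokens: Tokens): Promise<{ result: string; tokens: Record<string, string> }>;
}

// Gathers workflow context, runs actions in order while storing each output
// as `<actionId>.<name>`, then passes the last result to the output step.
async function runWorkflow(
  context: Record<string, string>,
  actions: SketchAction[],
  output: (text: string) => void
): Promise<void> {
  const tokens: Tokens = new Map();
  for (const [name, value] of Object.entries(context)) {
    tokens.set(`workflow.${name}`, value);
  }
  let finalText = "";
  for (const action of actions) {
    const outcome = await action.run(tokens);
    for (const [name, value] of Object.entries(outcome.tokens)) {
      tokens.set(`${action.id}.${name}`, value);
    }
    finalText = outcome.result;
  }
  output(finalText);
}

// Stand-ins for a transcription action followed by a chat action.
const transcribe: SketchAction = {
  id: "abc123",
  run: async () => ({ result: "hello world", tokens: { transcription: "hello world" } }),
};
const summarize: SketchAction = {
  id: "def456",
  run: async (tokens) => {
    const response = `Summary of: ${tokens.get("abc123.transcription") ?? ""}`;
    return { result: response, tokens: { response } };
  },
};

// Logs "Summary of: hello world" once both actions have run.
void runWorkflow({ clipboard: "https://example.com/video" }, [transcribe, summarize], (text) =>
  console.log(text)
);
```
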
When editing action settings that trigger a UI refresh (e.g., changing prompt source type), preserve the action's expanded state:

```typescript
// In action-specific settings functions, use preserveActionExpandState callback
.onChange(async (value) => {
  action.someProperty = value;
  await plugin.saveSettings();
  preserveActionExpandState();  // Sets both workflowId and actionId
  callbacks.refresh();
});
```

The `ExpandOnNextRenderState` interface tracks which items should be expanded:

```typescript
interface ExpandOnNextRenderState {
  providerId?: string;
  modelId?: string;
  workflowId?: string;
  actionId?: string;  // Added for action expand state
  availableTokensExpanded?: boolean;
}
```

When creating new actions, always create fresh instances of nested objects and arrays:

```typescript
// Correct: Create new array/object instances
const newWorkflow: WorkflowConfig = {
  ...DEFAULT_WORKFLOW_CONFIG,
  id: generateId(),  // Set after the spread so the defaults cannot overwrite it
  actions: []        // New array, not shared reference
};

const newChatAction = {
  ...DEFAULT_CHAT_ACTION,
  id: generateId(),
  contexts: []  // New array
};

const newTranscriptionAction = {
  ...DEFAULT_TRANSCRIPTION_ACTION,
  id: generateId(),
  transcriptionContext: { mediaType: 'video', sourceUrlToken: 'workflow.clipboard' }  // New object
};
```

The plugin uses a handler-based architecture to separate concerns for workflow execution:
Input handlers acquire media for transcription actions. Each handler implements the InputHandler interface:

```typescript
interface InputHandler {
  getInput(context: InputContext): Promise<InputResult | null>;
}
```

Available input handlers:
- `VaultFileInputHandler` - Prompts user to select an audio file from the vault
- `ClipboardUrlInputHandler` - Extracts audio from a video URL in the clipboard (uses yt-dlp)
- `SelectionUrlInputHandler` - Extracts audio from a video URL in the current text selection
- `TokenUrlInputHandler` - Extracts audio from a URL resolved from a token value

InputResult structure:

```typescript
interface InputResult {
  audioFilePath: string;     // Absolute path to the audio file
  sourceUrl?: string;        // Source URL if extracted from video
  metadata?: VideoMetadata;  // Title, uploader, description, tags
}
```

Output handlers present workflow results to the user. Each handler implements the `OutputHandler` interface:

```typescript
interface OutputHandler {
  handleOutput(responseText: string, context: OutputContext): Promise<void>;
}
```

Available output handlers:
- `PopupOutputHandler` - Displays result in a modal popup
- `NewNoteOutputHandler` - Creates a new note with the result
- `AtCursorOutputHandler` - Inserts result at the current cursor position

- Input handler classes end with `InputHandler` (e.g., `VaultFileInputHandler`)
- Output handler classes end with `OutputHandler` (e.g., `PopupOutputHandler`)
- Handler files use kebab-case matching the class name (e.g., `vault-file-input-handler.ts`)
To add a new input handler:
- Create a new file in `src/handlers/input/` (e.g., `my-custom-input-handler.ts`)
- Implement the `InputHandler` interface
- Export from `src/handlers/input/index.ts` and `src/handlers/index.ts`
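
As an illustration of the second step, here is a minimal handler sketch. The local `InputContext` and `InputResult` types are simplified stand-ins so the example compiles on its own, `FixedPathInputHandler` is a hypothetical class, and treating `null` as "no input available" is an assumption about the interface's contract.

```typescript
// Simplified stand-ins; the real definitions live in the plugin source and
// the shape of InputContext here is an assumption.
type InputContext = Record<string, unknown>;
interface InputResult {
  audioFilePath: string;
  sourceUrl?: string;
}
interface InputHandler {
  getInput(context: InputContext): Promise<InputResult | null>;
}

// Hypothetical handler that always returns a preconfigured audio file path,
// or null when no path was configured. Following the naming convention, it
// would live in fixed-path-input-handler.ts.
class FixedPathInputHandler implements InputHandler {
  constructor(private readonly audioFilePath: string) {}

  async getInput(_context: InputContext): Promise<InputResult | null> {
    return this.audioFilePath ? { audioFilePath: this.audioFilePath } : null;
  }
}

void new FixedPathInputHandler("/tmp/sample.mp3")
  .getInput({})
  .then((result) => console.log(result?.audioFilePath)); // /tmp/sample.mp3
```
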
To add a new output handler:
- Create a new file in `src/handlers/output/` (e.g., `my-custom-output-handler.ts`)
- Implement the `OutputHandler` interface
- Export from `src/handlers/output/index.ts` and `src/handlers/index.ts`
- Add the new output type to `WorkflowOutputType` in `src/settings/types.ts`
- Update `createOutputHandler()` in `src/processing/workflow-executor.ts`
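
The steps above might look roughly like this for a made-up `"console"` output type. Everything here is a sketch under assumptions: the `OutputContext` fields, the `ConsoleOutputHandler` class, and the signature of `createOutputHandler()` are hypothetical and may differ from the real code.

```typescript
// Simplified stand-ins so the sketch compiles on its own.
interface OutputContext {
  title?: string; // Assumed field, for illustration only
}
interface OutputHandler {
  handleOutput(responseText: string, context: OutputContext): Promise<void>;
}

// Existing output types plus the hypothetical new "console" type.
type WorkflowOutputType = "popup" | "new-note" | "at-cursor" | "console";

// Hypothetical handler that writes the result to the developer console.
class ConsoleOutputHandler implements OutputHandler {
  async handleOutput(responseText: string, _context: OutputContext): Promise<void> {
    console.debug(responseText);
  }
}

// Assumed factory shape: maps an output type to its handler.
function createOutputHandler(type: WorkflowOutputType): OutputHandler {
  switch (type) {
    case "console":
      return new ConsoleOutputHandler();
    default:
      // The existing popup, new-note, and at-cursor handlers are omitted here.
      throw new Error(`Output type not covered by this sketch: ${type}`);
  }
}

void createOutputHandler("console").handleOutput("Workflow finished", {});
```
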
- Plugin doesn't load after build: ensure `main.js` and `manifest.json` are at the top level of the plugin folder under `<Vault>/.obsidian/plugins/<plugin-id>/`.
- Build issues: if `main.js` is missing, run `npm run build` or `npm run dev` to compile your TypeScript source code.
- Commands not appearing: verify `addCommand` runs after `onload` and IDs are unique.
- Settings not persisting: ensure `loadData`/`saveData` are awaited and you re-render the UI after changes.
- Mobile-only issues: confirm you're not using desktop-only APIs; check `isDesktopOnly` and adjust.
- Obsidian sample plugin: https://github.com/obsidianmd/obsidian-sample-plugin
- API documentation: https://docs.obsidian.md
- Developer policies: https://docs.obsidian.md/Developer+policies
- Plugin guidelines: https://docs.obsidian.md/Plugins/Releasing/Plugin+guidelines
- Style guide: https://help.obsidian.md/style-guide