Feature/refactor: Stacks (notebooks), chat improvements, model management #1

Open
aniketkno wants to merge 82 commits into main from REFACTOR-0001

@aniketkno
Collaborator

Summary
Major refactoring bringing together stacks (formerly notebooks), improved chat experience, and unified model management across the app.

Changes

UI Improvements

  • FAB Positioning: Chat FAB now positioned above other FABs (Stacks/Notebooks) with 80dp bottom margin
  • Pinned Chat TopBar: Top bar no longer collapses on scroll
  • AI Message Alignment: AI messages now extend to right edge of screen

Stacks (formerly Notebooks)

  • Renamed notebooks to "stacks" throughout the codebase
  • Added systemPrompt and agentPrompt support for AI context
  • Streaming AI analysis results in real-time

Chat Enhancements

  • Parts-based message architecture (Text, Reasoning, ToolCall, ToolResponse, Image, Audio)
  • Rich message rendering with Markdown and code blocks
  • Conversation persistence across sessions
  • File attachment support
  • Context panel with over-drag reveal

Model Management

  • Unified model status indicator across all tabs
  • Load/unload model functionality
  • Model download UI with progress visualization
  • Shared model state between screens

Audio Features

  • Audio recording and playback
  • Speech recognition integration
  • Audio transcription with Whisper

Data Layer

  • Database schema v6 with new fields (systemPrompt, agentPrompt)
  • New DAOs for stacks, conversations, messages
  • ONNX MiniLM embedder for semantic search
  • Vector store for retrieval-augmented generation

Stats

  • 131 files changed
  • ~25,500 insertions
  • ~2,800 deletions

Testing

Build verified with ./gradlew :app:compileDebugKotlin

aniketkno added 30 commits May 6, 2026 21:52
- Convert to Kotlin DSL build files
- Add core modules: ai, data, media, processing, ui
- Add feature modules: chat, inference, notebooks, process, settings
- Add version catalog (libs.versions.toml) for dependency management
- Configure AndroidManifest for feature modules
- Add local.defaults.properties for SDK paths
Core modules:
- core/ai: ML Kit inference bridge interface
- core/data: Room database with notebooks entity
- core/media: Media processing utilities
- core/processing: Processing utilities
- core/ui: Shared UI components

Feature modules:
- feature/chat: AI chat interface
- feature/inference: LLM inference screen
- feature/notebooks: Notebook editor with block-based UI
- feature/process: Data processing screen
- feature/settings: App settings with model management
- Add MainScreen with bottom navigation (5 tabs)
- Add MainComposeActivity for Compose entry point
- Update app/build.gradle.kts for new module structure
- Add WebSearchCapture utility class
- Update MainActivity to use Compose
- Update PenpalApplication for dependency injection
- Update AndroidManifest with feature permissions
- Update DrawingView with Compose integration
- Update themes and layouts
- Update GemmaServerClient for new inference bridge
- Update GemmaTranscriber with improved transcription
- Update HandwritingRecognizer integration
- Update InferenceService with feature module support
- Update LlmInferenceEngine for LiteRT-LM
- Update ProcessingQueueManager for feature modules
- Update SelectionFrameView with Compose integration
- Ignore build outputs for feature modules
- Add exceptions for local.defaults.properties
- Update to exclude new module directories
- Add ARCHITECTURE.md with module overview and dependency graph
- Update CHANGELOG.md with recent changes
- Update DEVELOPMENT.md with setup instructions
- Update README.md with project status
- Add MIGRATION.md documenting conversion steps
- Remove old build.gradle (replaced by build.gradle.kts)
- Remove old settings.gradle (replaced by settings.gradle.kts)
- Remove app/build.gradle (replaced by app/build.gradle.kts)
- Remove TestReflection.kt (no longer needed)
- Add ARCHITECTURE.md with system diagrams
- Add DATA_AND_PROCESSING.md with data flow
- Add FEATURES.md with feature specifications
- Add MODULES.md with module details
- Add THREADING.md with concurrency model
- Add reference images and diagrams
- Add WorkManager dependency to core:ai/build.gradle.kts
- Replace LmEngineManager with placeholder implementation
- Fix ModelDownloadManager to use getWorkInfosForUniqueWorkFlow()
- Remove unavailable LiteRT-LM imports (marked for future)
- Update gradle/libs.versions.toml with latest.release for litertlm
- Add litertlm-android dependency to core/ai/build.gradle.kts
- Rewrite LmEngineManager using com.google.ai.edge.litertlm.Engine
- Rewrite LiteRtInferenceBridge using real LiteRT-LM API:
  - Engine/Conversation pattern from InferenceService
  - GPU/CPU backend fallback support
  - MessageCallback for streaming responses
  - Image and audio content support
- Copy ModelManager from main branch (com.google.ai.edge.litertlm package)
- Supports HuggingFace and Kaggle model download sources
- Uses Android DownloadManager for reliable downloads
- Handles model file location and path persistence
- Downloads from: https://huggingface.co/litert-community/gemma-4-E2B-it-litert-lm
- SettingsViewModel: Integrate ModelManager for HuggingFace/Kaggle downloads
- SettingsViewModel: Add download polling with progress updates
- SettingsViewModel: Remove OllamaApiService dependency
- SettingsScreen: Simplify UI (remove model selector, add download dialog)
- Add ModelDownloadBottomSheet: Compose component for download flow
  - Source selection (HuggingFace/Kaggle)
  - Token input for authentication
  - Progress indicator with percentage
  - Error handling with retry option
- Add DownloadSource enum and helper functions
- Add model readiness check with isModelReady state
- Build prompt with document context from vector store
- Use placeholder assistant message for streaming updates
- Update last assistant message as inference streams
- Add isModelReady indicator to UI state
- Fix ChunkEntity import (text not content field)
- Reset conversation on clear chat
- Add ModelInfo data class with serialization
- Track model history in SharedPreferences
- listAvailableModels() scans .litertlm files across device
- loadModel() initializes specific model by path
- deleteModel(path) removes specific model file
- Persist discovered paths and support manual downloads
- Add Ollama deleteModel API endpoint
- Remove duplicate ModelManager from app module
- Update all imports to com.penpal.core.ai.ModelManager
- Fix ChatViewModel streaming to update on every token
- Model now streams partial results live instead of waiting for done
- Add AvailableModelsSection to SettingsScreen
- Show all .litertlm models with load/delete actions
- Add model list management events to SettingsViewModel
- Keep InferenceScreen for backward compatibility
- Remove Process and Inference from bottom navigation
- Navigation now: Chat, Think, Settings only
- Add ProcessBlock type for PDF/Audio/Image/URL/Code/File
- Add process toolbar buttons and FAB menu items
- ProcessBlockContent shows status with visual indicators
- Support PENDING/QUEUED/RUNNING/DONE/ERROR states
- Document LiteRT-LM Engine API integration
- Add model management architecture diagrams
- Update CHANGELOG with model download and UI changes
- Add TODO.md for tracking pending work
…line

Add production-ready parsers for PDF (PdfBox), images (ML Kit OCR),
URLs (Jsoup), code files, and audio metadata. Wire ExtractionWorker
to use ParserFactory by mime type and persist chunks to vector store
via VectorStoreProvider. Add jsoup, mlkit-text-recognition, and
onnxruntime dependencies to version catalog.
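The MIME-type dispatch described above could look roughly like the sketch below. It is not the PR's ParserFactory: `DocumentParser`, `PlainTextParser`, and `SketchParserFactory` are hypothetical names, and the wildcard fallback (`text/*`) is an assumption about how code files might be routed.

```kotlin
// Hypothetical parser contract; the PR's real interface is not shown in this conversation.
interface DocumentParser {
    fun parse(bytes: ByteArray): String
}

// Stand-in parser used for illustration only.
class PlainTextParser : DocumentParser {
    override fun parse(bytes: ByteArray): String = bytes.toString(Charsets.UTF_8)
}

// Illustrative factory: exact MIME match first, then a "type/*" wildcard.
class SketchParserFactory {
    private val parsers = mutableMapOf<String, DocumentParser>()

    fun register(mimePattern: String, parser: DocumentParser) {
        parsers[mimePattern.lowercase()] = parser
    }

    // Returns null when nothing is registered, so the caller can mark the job as failed.
    fun forMimeType(mimeType: String): DocumentParser? {
        // Drop parameters such as "; charset=utf-8" before matching.
        val normalized = mimeType.substringBefore(';').trim().lowercase()
        parsers[normalized]?.let { return it }
        val wildcard = normalized.substringBefore('/') + "/*"
        return parsers[wildcard]
    }
}
```

Returning null instead of throwing keeps the decision about unsupported types with the worker, which is one possible design and not necessarily the one used here.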
…leration

Implement OnnxMiniLmEmbedder with mean pooling and L2 normalization,
fallback to deterministic mock embedder if model missing. Add
VectorStoreProvider singleton for cross-module access from workers.
Switch LiteRtInferenceBridge and LmEngineManager to GPU-first with
CPU fallback for Gemma 4 on-device inference.
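For readers unfamiliar with the pooling step named above, here is a minimal sketch of mean pooling followed by L2 normalization over per-token embeddings. The function names and the attention-mask convention (1 = real token, 0 = padding) are assumptions for illustration; the actual OnnxMiniLmEmbedder code is not shown in this conversation.

```kotlin
import kotlin.math.sqrt

// Average the embeddings of non-padding tokens (mask value 1) into one vector.
fun meanPool(tokenEmbeddings: Array<FloatArray>, attentionMask: IntArray): FloatArray {
    require(tokenEmbeddings.isNotEmpty()) { "need at least one token" }
    require(tokenEmbeddings.size == attentionMask.size) { "mask length must match token count" }
    val dim = tokenEmbeddings[0].size
    val sum = FloatArray(dim)
    var count = 0
    for (i in tokenEmbeddings.indices) {
        if (attentionMask[i] == 0) continue // skip padding tokens
        for (d in 0 until dim) sum[d] += tokenEmbeddings[i][d]
        count++
    }
    if (count > 0) {
        for (d in 0 until dim) sum[d] = sum[d] / count
    }
    return sum
}

// Scale a vector to unit length so a dot product equals cosine similarity.
fun l2Normalize(v: FloatArray): FloatArray {
    var squared = 0.0
    for (x in v) squared += x.toDouble() * x
    val norm = sqrt(squared)
    if (norm == 0.0) return v.copyOf() // avoid dividing by zero for an all-zero vector
    return FloatArray(v.size) { (v[it] / norm).toFloat() }
}
```

Normalizing once at embedding time means the vector store can rank by plain dot product.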
…igned UI

Upgrade Room schema to v3 with ChatConversationEntity and updated
ChatMessageDao. Overhaul ChatViewModel with conversation CRUD,
notebook attachment/detachment for RAG context, and file picker
integration. Redesign ChatScreen with ModalNavigationDrawer for
history, attached notebook chips, pinned file chips, and improved
input area with paperclip attachment.

Automatically enqueue ProcessBlocks when sourceUri is set in
NotebookEditorViewModel. Observe WorkerLauncher job states and
update block status live (PENDING -> QUEUED -> RUNNING -> DONE/ERROR).
Add core:processing dependency to notebooks module.
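The status progression above (PENDING -> QUEUED -> RUNNING -> DONE/ERROR) can be modelled as a small state machine. This is a sketch only: the status names come from the commit message, but `BlockStatus`, `canTransition`, `nextStatus`, and the exact set of allowed transitions (for example, failing straight from PENDING) are assumptions.

```kotlin
// Status names taken from the commit message; the enum itself is illustrative.
enum class BlockStatus { PENDING, QUEUED, RUNNING, DONE, ERROR }

// Assumed transition table: forward-only, with ERROR reachable from any active state.
val allowedTransitions: Map<BlockStatus, Set<BlockStatus>> = mapOf(
    BlockStatus.PENDING to setOf(BlockStatus.QUEUED, BlockStatus.ERROR),
    BlockStatus.QUEUED to setOf(BlockStatus.RUNNING, BlockStatus.ERROR),
    // RUNNING -> RUNNING lets streaming updates refresh the block without a state change.
    BlockStatus.RUNNING to setOf(BlockStatus.RUNNING, BlockStatus.DONE, BlockStatus.ERROR),
    BlockStatus.DONE to emptySet(),
    BlockStatus.ERROR to emptySet()
)

fun canTransition(from: BlockStatus, to: BlockStatus): Boolean =
    to in allowedTransitions.getValue(from)

// Ignore stale or out-of-order job updates instead of moving a block backwards.
fun nextStatus(current: BlockStatus, reported: BlockStatus): BlockStatus =
    if (canTransition(current, reported)) reported else current
```

Guarding transitions this way would keep a late QUEUED update from overwriting a block that is already DONE.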
Wire ChatViewModel with all DAOs and WorkerLauncher in MainScreen.
Add OnnxMiniLmEmbedder to PenpalApplication DI with initialization
check and fallback. Configure app build for arm64-v8a and optimize
packaging to reduce APK size from 160MB to 83MB.

Update ARCHITECTURE.md to Phase 5 with Room v3, ONNX embedder,
real parsers, and chat redesign. Add [Unreleased] section to
CHANGELOG.md documenting document parsing, vector store persistence,
ONNX embeddings, GPU acceleration, and chat overhaul. Update
README.md capabilities and TODO.md completion status.
- Update ARCHITECTURE.md with MessagePart hierarchy and Flow-based inference
- Update CHANGELOG.md with structured message parts sprint details
- Add CORE_AI.md, CORE_AI_SIMPLIFY.md, CORE_AI_REFERENCES.md
- Update README.md with LiteRT-LM Engine API and parts architecture
- Update TODO.md with completed sprint items
- Update DEVELOPMENT.md, MIGRATION.md, and testingground docs
- Add GemmaSpecialTokens.kt with all Gemma 4 control token definitions
- Add StreamingTokenFilter.kt with trie-based character filtering
- Add WordPieceTokenizer.kt for tokenization support
- Add MessagePart.kt sealed class hierarchy (Text, Reasoning, ToolCall, ToolResponse, Image, Audio)
- Add tokenizer vocab.txt asset
- Update InferenceBridge interface with runInferenceFlow() and runInferenceFlowParts()
- Rewrite LiteRtInferenceBridge with Flow-based inference and MessagePart parsing
- Update OllamaInferenceBridge for parity with Flow-based API
- Update OnnxMiniLmEmbedder with mean pooling and L2 normalization
- Update ChatScreen with MessagePart rendering (Thinking blocks, ToolCall cards)
- Add MarkdownText.kt lightweight markdown renderer (code blocks, bold, italic, lists)
- Add collapsible ReasoningBlock and expandable ToolCallBlock UI components
- Update chat build.gradle.kts dependencies
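A sealed hierarchy along the lines of the MessagePart types listed above might be sketched as follows. The six variant names come from the commit message; every field name, plus the `coalesce` and `visibleText` helpers, is hypothetical and only illustrates how streamed fragments could be aggregated.

```kotlin
// Variant names from the PR; the fields are assumptions for illustration.
sealed class MessagePart {
    data class Text(val text: String) : MessagePart()
    data class Reasoning(val text: String) : MessagePart()
    data class ToolCall(val name: String, val argumentsJson: String) : MessagePart()
    data class ToolResponse(val name: String, val result: String) : MessagePart()
    data class Image(val uri: String) : MessagePart()
    data class Audio(val uri: String) : MessagePart()
}

// Merge adjacent Text (or adjacent Reasoning) fragments produced by token streaming.
fun coalesce(parts: List<MessagePart>): List<MessagePart> {
    val out = mutableListOf<MessagePart>()
    for (part in parts) {
        val last = out.lastOrNull()
        when {
            part is MessagePart.Text && last is MessagePart.Text ->
                out[out.lastIndex] = MessagePart.Text(last.text + part.text)
            part is MessagePart.Reasoning && last is MessagePart.Reasoning ->
                out[out.lastIndex] = MessagePart.Reasoning(last.text + part.text)
            else -> out.add(part)
        }
    }
    return out
}

// The user-visible answer excludes reasoning and tool traffic.
fun visibleText(parts: List<MessagePart>): String =
    parts.filterIsInstance<MessagePart.Text>().joinToString("") { it.text }
```

Because the class is sealed, a `when` over a MessagePart can be checked for exhaustiveness by the compiler, which suits per-type rendering such as thinking blocks and tool-call cards.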
…tence

- Migrate from callback-based to Flow-based inference collection
- Add MessagePart aggregation and structured message building
- Fix loadConversation() to preserve pending assistant message during streaming
- Add notebook attachment and file pinning support
- Add conversation persistence with Room database
aniketkno and others added 30 commits May 9, 2026 17:04
- Add defaultSystemPrompt to SettingsUiState
- Load/save from SharedPreferences
- Add UpdateDefaultSystemPrompt event and handler
- Add UI section for editing default system prompt
- Add ChatPromptsPanel with over-drag reveal
- Add defaultSystemPrompt to ChatUiState, loaded from SharedPreferences
- Prepend system prompt (per-conversation or default) to buildPrompt()
- Persist per-conversation system prompt to database
- Add systemPrompt, agentPrompt columns to StackEntity
- Add agentPrompt column to ChatConversationEntity
- Add updatePrompts method to StackDao
- Add optional agentPrompt parameter to enqueue() method
- Pass agentPrompt through to ExtractionWorker via workData
- Add KEY_AGENT_PROMPT constant to ExtractionWorker
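The "per-conversation or default" rule for the system prompt can be sketched as below. Only the `buildPrompt` name appears in the commit message; its signature here, the `Turn` type, `resolveSystemPrompt`, and the plain `role: text` layout are assumptions (a real Gemma prompt would use the model's control tokens instead).

```kotlin
// Hypothetical conversation turn used only for this sketch.
data class Turn(val role: String, val text: String)

// Prefer a non-blank per-conversation prompt, else fall back to the default from settings.
fun resolveSystemPrompt(conversationPrompt: String?, defaultPrompt: String?): String? =
    conversationPrompt?.takeIf { it.isNotBlank() }
        ?: defaultPrompt?.takeIf { it.isNotBlank() }

// Assemble a plain-text prompt: optional system prompt, prior turns, then the new message.
fun buildPrompt(
    conversationPrompt: String?,
    defaultPrompt: String?,
    history: List<Turn>,
    userMessage: String
): String {
    val sb = StringBuilder()
    resolveSystemPrompt(conversationPrompt, defaultPrompt)?.let {
        sb.append("System: ").append(it.trim()).append("\n\n")
    }
    for (turn in history) {
        sb.append(turn.role).append(": ").append(turn.text).append('\n')
    }
    sb.append("user: ").append(userMessage)
    return sb.toString()
}
```

Treating a blank per-conversation prompt as "not set" is a design choice in this sketch, so clearing the field falls back to the default.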
Added columns to existing entities:
- StackEntity: systemPrompt, agentPrompt
- ChatConversationEntity: agentPrompt

This is a destructive migration that clears old data.
- Update block status to RUNNING with extracted text during streaming
- Only mark as DONE after collection completes
- Show incremental text updates as AI generates them
- Fix LiteRtInferenceBridge to use getContents() instead of getContent()
  The LiteRT Message class returns a Contents object, not a simple string.
  This was causing empty responses in chat despite the model working.
- Add conversation history to prompts for multi-turn context
- Update FAB visibility: shows on tabs, hides in chat, reappears after exit
- Tab clicks from chat trigger popBackStack() (same as X button)
- Add debug logging for inference flow and model status
- Update documentation with all changes
- Add 80dp bottom margin to chat FAB on Stacks/Notebooks screens
- Pin chat TopBar to prevent collapse on scroll
- Extend AI messages to right edge with fillMaxWidth()
- Update documentation with UI polish changes
…upport

- Add syntax highlighting for multiple languages (Python, Kotlin, Java, JavaScript, TypeScript, Go, Rust, C, C++, Swift, SQL, Bash, JSON, YAML, HTML, CSS)
- Implement LaTeX block and inline rendering using MathJax via Android WebView
- Improve code block presentation with dark background and monospace font
- Add copy-to-clipboard functionality for LaTeX expressions on long press
- Maintain all existing markdown features (bold, italic, headers, lists, links, inline code)
- Hide FAB on Stacks and Notebooks routes to avoid duplication
- Add onNavigateToChat callback to StackScreen and other screens
- Improve FAB visibility logic for cleaner UI
- Add chat FAB alongside create notebook FAB
- Pass onNavigateToChat callback for chat navigation
- Maintain existing create notebook functionality
- Move Record Audio button to different position in toolbar
- Adjust marginBottom from 8dp to 16dp for consistent spacing
- Maintain all other UI elements
…cator

- Add cancel button to ChatInputArea for cancelling ongoing operations
- Replace inline message rendering with enhanced MarkdownText for assistant messages
- Show loading spinner when inference is in progress
- Conditionally enable/disable input based on loading state
- Add Cancel event to ChatEvent enum
- Implement cancelInference() to stop ongoing inference
- Add retryLastMessage() with intelligent continuation
- Fix newline character handling in message storage
- Improve inference logging for debugging
- Add combinedClickable for long-press selection mode
- Add isSelectionMode and isSelected parameters to StackCard
- Add checkbox display when in selection mode
- Add onNavigateToChat callback and chat FAB
- Add rename dialog support
… panel

- Reorganize StackScreen for better readability with improved spacing
- Extract over-drag panel logic into separate PromptsPanel component
- Add onNavigateToChat callback for chat navigation from stacks
- Improve permission launcher handling for media access
- Add isSelectionMode, selectedStackIds, showRenameDialog to StackListState
- Add enterSelectionMode, toggleStackSelection, exitSelectionMode methods
- Add showRenameDialog, dismissRenameDialog, renameStack methods
- Add deleteSelectedStacks for batch deletion
- Add autoRenameFromFirstBlock in StackEditorViewModel using AI inference
- Add hasAutoRenamed flag to prevent re-renaming
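The selection-mode methods listed above map naturally onto an immutable state object. In this sketch the field and method names follow the commit message, while the `String` id type, the extension-function style, and leaving selection mode once the last item is deselected are assumptions.

```kotlin
// Field names from the commit message; the id type is assumed to be String.
data class StackListState(
    val isSelectionMode: Boolean = false,
    val selectedStackIds: Set<String> = emptySet(),
    val showRenameDialog: Boolean = false
)

// A long press enters selection mode with the pressed stack already selected.
fun StackListState.enterSelectionMode(firstId: String): StackListState =
    copy(isSelectionMode = true, selectedStackIds = setOf(firstId))

// Toggle one stack; assumed behaviour: deselecting the last item exits selection mode.
fun StackListState.toggleStackSelection(id: String): StackListState {
    if (!isSelectionMode) return this
    val updated = if (id in selectedStackIds) selectedStackIds - id else selectedStackIds + id
    return copy(selectedStackIds = updated, isSelectionMode = updated.isNotEmpty())
}

fun StackListState.exitSelectionMode(): StackListState =
    copy(isSelectionMode = false, selectedStackIds = emptySet())
```

A ViewModel could apply these to its state holder; a batch delete would then read `selectedStackIds` before calling `exitSelectionMode()`.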
…olRegistry)

- Define Tool interface with execute() method for AI function calling
- Define ToolSchema and ToolParameter for OpenAI-compatible JSON schemas
- Define ToolResult sealed class (Success, Error, StreamOutput)
- Define MessageRole enum and ChatMessageInfo for conversation context
- Define ToolExecutionContext with tool call details and conversation history
- Implement ToolRegistry for tool registration and execution
- Implement ToolExecutor for multi-turn tool-calling loop with model
- Add WebSearchTool for DuckDuckGo HTML search
- Add FetchUrlContentTool for URL content extraction with JSoup
- Add StoreWebContentTool for storing web content in vector store
- Add ListStoredSourcesTool and DeleteStoredSourceTool for source management
- Add ToolRegistryWebSearchBuilder for easy tool registration
- Add jsoup dependency for HTML parsing
- Add SearchKnowledgeTool for vector store similarity search
- Add ReadStackTool for reading notebook content
- Add GetConversationHistoryTool for conversation context
- Add ListAttachedStacksTool for listing attached notebooks
- Add ToolRegistryBuilder for fluent tool registration
- Add ChunkDao.getAllSourceIds() for querying distinct source IDs
- Add VectorStoreRepository.getAllSourceIds() interface method
- Implement getAllSourceIds in VectorStoreRepositoryImpl
- Add toolRegistry parameter for tool execution
- Add executeToolAndContinue() for multi-turn tool calling
- Add executeToolCallsAfterInference() for post-inference tool execution
- Add buildToolContinuationPrompt() for tool response injection
- Execute detected ToolCallParts after inference completes
- Document Agent Framework Flow in ARCHITECTURE.md with tool execution loop diagram
- Add Tool Registry & Schema section with code examples
- Update phase status table with Phase 6.0: Agent Framework
- Add Agent Framework section to FEATURES.md with built-in tools list
- Add comprehensive Agent Framework documentation to TODO.md
ToolRegistry.execute() and Flow.collect() are suspend functions that must be
called from a coroutine or another suspend function.
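To make the Tool / ToolRegistry shape above concrete, here is a deliberately simplified sketch. It is synchronous so that it runs on the standard library alone, whereas the PR states that ToolRegistry.execute() is a suspend function. The argument map type, `EchoTool`, and `names()` are hypothetical, and the StreamOutput result variant is omitted.

```kotlin
// Two of the three result variants named in the PR; StreamOutput is left out here.
sealed class ToolResult {
    data class Success(val output: String) : ToolResult()
    data class Error(val message: String) : ToolResult()
}

// Simplified, non-suspend tool contract (the PR's version is suspend).
interface Tool {
    val name: String
    fun execute(arguments: Map<String, String>): ToolResult
}

class ToolRegistry {
    private val tools = linkedMapOf<String, Tool>()

    // Returning this allows fluent chaining of registrations.
    fun register(tool: Tool): ToolRegistry {
        tools[tool.name] = tool
        return this
    }

    fun names(): List<String> = tools.keys.toList()

    // Unknown tools and thrown exceptions both surface as ToolResult.Error,
    // so a tool-calling loop can feed the failure back to the model.
    fun execute(name: String, arguments: Map<String, String>): ToolResult {
        val tool = tools[name] ?: return ToolResult.Error("Unknown tool: $name")
        return try {
            tool.execute(arguments)
        } catch (e: Exception) {
            ToolResult.Error(e.message ?: "Tool failed")
        }
    }
}

// Trivial example tool used only to exercise the registry.
class EchoTool : Tool {
    override val name = "echo"
    override fun execute(arguments: Map<String, String>): ToolResult {
        val text = arguments["text"] ?: return ToolResult.Error("Missing 'text'")
        return ToolResult.Success(text)
    }
}
```

In the suspend version, the same `execute` call would have to sit inside a coroutine or another suspend function, as the commit note above points out.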

2 participants