This document describes the implementation of the Logseq directory import and file synchronization features using a simplified DDD architecture.
This implementation provides the core file processing system for importing and syncing Logseq markdown directories. It follows Domain-Driven Design principles while maintaining pragmatism suitable for a personal project.
The implementation follows a three-layer architecture:
```text
┌─────────────────────────────────────────────────────┐
│                  Application Layer                  │
│  ┌────────────────────┐  ┌────────────────────────┐ │
│  │ ImportService      │  │ SyncService            │ │
│  │ - Concurrent       │  │ - File watching        │ │
│  │   processing       │  │ - Debouncing           │ │
│  │ - Progress         │  │ - Auto-sync            │ │
│  │   tracking         │  │                        │ │
│  └────────────────────┘  └────────────────────────┘ │
└─────────────────────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────┐
│                    Domain Layer                     │
│  ┌──────────────┐  ┌──────────────┐  ┌────────────┐ │
│  │ Page         │  │ Block        │  │ Events     │ │
│  │ (Aggregate)  │  │ (Entity)     │  │            │ │
│  └──────────────┘  └──────────────┘  └────────────┘ │
│  ┌─────────────────────────────────────────────────┐│
│  │ Value Objects:                                  ││
│  │ - PageId, BlockId, Url, PageReference           ││
│  │ - LogseqDirectoryPath, ImportProgress           ││
│  └─────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────┐
│                Infrastructure Layer                 │
│  ┌────────────────────┐  ┌────────────────────────┐ │
│  │ File System        │  │ Parsers                │ │
│  │ - Discovery        │  │ - Markdown parser      │ │
│  │ - Watcher          │  │ - URL extraction       │ │
│  │ - Debouncer        │  │ - Reference extract    │ │
│  └────────────────────┘  └────────────────────────┘ │
└─────────────────────────────────────────────────────┘
```
New additions:

- `LogseqDirectoryPath`: Validated directory path containing `pages/` and `journals/` subdirectories
  - Validates that the directory exists and has the required structure
  - Provides convenient accessors for subdirectories
- `ImportProgress`: Tracks import operation progress
  - Total files, processed files, current file
  - Percentage calculation for UI display
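A minimal sketch of the `ImportProgress` value object, assuming fields and method names (`advance`, `percentage`) that may differ from the actual implementation:

```rust
/// Hypothetical sketch of the ImportProgress value object; field and
/// method names are illustrative assumptions, not the real API.
#[derive(Debug, Clone, PartialEq)]
pub struct ImportProgress {
    pub total_files: usize,
    pub processed_files: usize,
    pub current_file: Option<String>,
}

impl ImportProgress {
    pub fn new(total_files: usize) -> Self {
        Self { total_files, processed_files: 0, current_file: None }
    }

    /// Record that one more file has been processed.
    pub fn advance(&mut self, file: &str) {
        self.processed_files += 1;
        self.current_file = Some(file.to_string());
    }

    /// Percentage complete for UI display; 100.0 when there is nothing to do,
    /// so an empty import never divides by zero.
    pub fn percentage(&self) -> f64 {
        if self.total_files == 0 {
            100.0
        } else {
            self.processed_files as f64 / self.total_files as f64 * 100.0
        }
    }
}
```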
Existing (reused): `PageId`, `BlockId`, `Url`, `PageReference`, `BlockContent`, `IndentLevel`
Import Events:
- `ImportStarted` - Import operation begins
- `FileProcessed` - Individual file processed
- `ImportCompleted` - Import finished successfully
- `ImportFailed` - Import failed with errors
Sync Events:
- `SyncStarted` - Sync operation begins
- `FileCreatedEvent` - New file detected and synced
- `FileUpdatedEvent` - File modified and synced
- `FileDeletedEvent` - File deleted and synced
- `SyncCompleted` - Sync batch completed
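The sync events above could be modeled as a single enum; the variant payloads and the `event_type` helper below are assumptions based on how the events are consumed in the usage example later in this document:

```rust
use std::path::PathBuf;

/// Hypothetical sketch of the sync event types listed above;
/// variant payloads are assumptions, not the actual definitions.
#[derive(Debug, Clone, PartialEq)]
pub enum SyncEvent {
    SyncStarted,
    FileCreated { file_path: PathBuf },
    FileUpdated { file_path: PathBuf },
    FileDeleted { file_path: PathBuf },
    SyncCompleted { files_synced: usize },
}

impl SyncEvent {
    /// A stable name for logging or UI display, mirroring the
    /// event-type verification mentioned in the unit tests.
    pub fn event_type(&self) -> &'static str {
        match self {
            SyncEvent::SyncStarted => "sync_started",
            SyncEvent::FileCreated { .. } => "file_created",
            SyncEvent::FileUpdated { .. } => "file_updated",
            SyncEvent::FileDeleted { .. } => "file_deleted",
            SyncEvent::SyncCompleted { .. } => "sync_completed",
        }
    }
}
```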
Converts Logseq markdown files into Page and Block domain objects.
Features:
- Async file reading with Tokio
- Indentation-based hierarchy parsing (tabs or 2-space indents)
- Bullet point marker removal (`-`, `*`, `+`)
- URL extraction (http:// and https://)
- Page reference extraction (`[[page]]`)
- Tag extraction (`#tag`)
- Proper parent-child block relationships
Example:

```markdown
- Root block with https://example.com
  - Child block mentioning [[another page]]
  - Another child with #tag
- Second root block
```

Becomes:

```text
Page
├─ Block 1 (indent 0) + URL
│  ├─ Block 1.1 (indent 1) + PageReference
│  └─ Block 1.2 (indent 1) + Tag
└─ Block 2 (indent 0)
```
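The indentation and bullet handling above can be sketched with two pure helpers, assuming tabs or 2-space indents as stated; the function names are illustrative, not the actual parser API:

```rust
/// Compute the indent level of a line: one level per tab or per 2 spaces.
/// Sketch only; the real parser is async and builds full Block entities.
fn indent_level(line: &str) -> usize {
    let mut level = 0;
    let mut spaces = 0;
    for ch in line.chars() {
        match ch {
            '\t' => { level += 1; spaces = 0; }
            ' ' => {
                spaces += 1;
                if spaces == 2 { level += 1; spaces = 0; }
            }
            _ => break,
        }
    }
    level
}

/// Strip a leading bullet marker (`- `, `* `, `+ `) from a line,
/// returning the block content.
fn block_content(line: &str) -> &str {
    let trimmed = line.trim_start();
    for marker in ["- ", "* ", "+ "] {
        if let Some(rest) = trimmed.strip_prefix(marker) {
            return rest;
        }
    }
    trimmed
}
```

Parent-child relationships then follow from comparing each block's level with the level of the preceding block.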
Functions:
- `discover_markdown_files(dir)` - Recursively find all `.md` files
- `discover_logseq_files(dir)` - Find `.md` files in `pages/` and `journals/`

Features:
- Skips hidden directories (starting with `.`)
- Skips Logseq internal directory (`logseq/`)
- Async with Tokio
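A synchronous sketch of the recursive discovery described above (the real implementation uses async Tokio file I/O; the signature here is an assumption):

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Recursively collect `.md` files, skipping hidden directories and
/// Logseq's internal `logseq/` directory. Sync std::fs sketch only;
/// the actual implementation is async with Tokio.
fn discover_markdown_files(dir: &Path, out: &mut Vec<PathBuf>) -> std::io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let name = path.file_name().and_then(|n| n.to_str()).unwrap_or("");
        if path.is_dir() {
            // Skip hidden directories and the Logseq internal directory.
            if name.starts_with('.') || name == "logseq" {
                continue;
            }
            discover_markdown_files(&path, out)?;
        } else if path.extension().and_then(|e| e.to_str()) == Some("md") {
            out.push(path);
        }
    }
    Ok(())
}
```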
LogseqFileWatcher - Watches a directory for file changes using the `notify` crate
Features:
- Cross-platform file watching (`RecommendedWatcher`)
- Built-in debouncing (500ms default) using `notify-debouncer-mini`
- Filters to only `.md` files in `pages/` or `journals/`
- Event types: Created, Modified, Deleted
- Non-blocking (`try_recv`) and blocking (`recv`) modes
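The filtering rule above reduces to a path predicate; this helper name and exact logic are assumptions, sketching one way the watcher could decide which events to forward:

```rust
use std::path::Path;

/// Keep only `.md` files whose immediate parent directory is
/// `pages/` or `journals/`. Illustrative sketch, not the actual filter.
fn is_relevant_logseq_file(path: &Path) -> bool {
    let is_md = path.extension().and_then(|e| e.to_str()) == Some("md");
    let in_logseq_dir = path
        .parent()
        .and_then(|p| p.file_name())
        .and_then(|n| n.to_str())
        .map(|n| n == "pages" || n == "journals")
        .unwrap_or(false);
    is_md && in_logseq_dir
}
```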
Handles importing entire Logseq directories.
Features:
- Bounded concurrency: Processes 4 files concurrently (configurable with `with_concurrency()`)
- Progress tracking: Real-time progress updates via callbacks
- Error resilience: Continues processing if individual files fail
- Progress events: `Started`, `FileProcessed`, `Completed`, `Failed`
Usage:
```rust
let mut service = ImportService::new(repository)
    .with_concurrency(6);

let summary = service.import_directory(
    directory_path,
    Some(progress_callback)
).await?;

println!("Imported {}/{} files",
    summary.pages_imported,
    summary.total_files
);
```

Implementation Details:
- Uses `tokio::sync::Semaphore` for bounded concurrency
- Async channel (`mpsc`) for collecting results
- Tracks errors without stopping import
- Returns `ImportSummary` with statistics
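A sketch of the `ImportSummary` statistics type; the field set and the zero-file behavior of `success_rate()` are assumptions beyond what the text states:

```rust
/// Hypothetical shape of the ImportSummary returned by import_directory;
/// the `errors` field and exact names are assumptions.
#[derive(Debug, Default)]
pub struct ImportSummary {
    pub total_files: usize,
    pub pages_imported: usize,
    pub errors: Vec<String>,
}

impl ImportSummary {
    /// Percentage of files imported successfully; an empty import counts
    /// as fully successful to avoid dividing by zero.
    pub fn success_rate(&self) -> f64 {
        if self.total_files == 0 {
            100.0
        } else {
            self.pages_imported as f64 / self.total_files as f64 * 100.0
        }
    }
}
```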
Handles incremental updates when files change.
Features:
- File watching: Monitors directory for changes
- Debouncing: 500ms window to handle rapid changes (configurable)
- Auto-sync: Runs indefinitely watching for changes
- Event callbacks: Real-time sync event notifications
Usage:
```rust
let service = SyncService::new(
    repository,
    directory_path,
    Some(Duration::from_millis(500))
)?;

service.start_watching(Some(sync_callback)).await?;
```

Sync Operations:
- Create: Parse new file and save to repository
- Update: Re-parse modified file and update repository
- Delete: Log deletion (full implementation needs file→page mapping)
Note: File deletion handling is simplified. A production implementation would maintain a bidirectional mapping between file paths and page IDs.
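One way such a mapping could look, sketched with `String` standing in for `PageId`; all names here are hypothetical:

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Sketch of a bidirectional file path <-> page ID index for deletion
/// handling. Illustrative only; a real version would use the PageId
/// value object and be persisted alongside the repository.
#[derive(Default)]
pub struct FilePageIndex {
    by_path: HashMap<PathBuf, String>,
    by_page: HashMap<String, PathBuf>,
}

impl FilePageIndex {
    /// Record the page created or updated from a file.
    pub fn insert(&mut self, path: PathBuf, page_id: String) {
        self.by_page.insert(page_id.clone(), path.clone());
        self.by_path.insert(path, page_id);
    }

    /// On a delete event, resolve the file back to its page ID and
    /// drop both directions of the mapping.
    pub fn remove_by_path(&mut self, path: &Path) -> Option<String> {
        let page_id = self.by_path.remove(path)?;
        self.by_page.remove(&page_id);
        Some(page_id)
    }
}
```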
All components include unit tests:
- Value Objects (`value_objects.rs`)
  - `LogseqDirectoryPath` validation
  - `ImportProgress` tracking and percentage calculation
- Domain Events (`events.rs`)
  - Event type and aggregate ID verification
  - All import and sync events tested
- Markdown Parser (`logseq_markdown.rs`)
  - Indentation calculation
  - Content extraction (bullet point removal)
  - URL extraction
  - Page reference and tag extraction
  - Full markdown parsing with hierarchy
- File Discovery (`discovery.rs`)
  - Recursive file discovery
  - Logseq-specific directory filtering
  - Uses `tempfile` for isolated tests
- File Watcher (`watcher.rs`)
  - Event filtering (markdown files only)
  - Logseq directory filtering
- Import Service (`import_service.rs`)
  - Import summary statistics
  - Success rate calculation
  - Mock repository for isolated testing
Create integration tests in `backend/tests/`:

```rust
#[tokio::test]
async fn test_full_import_workflow() {
    // 1. Create temporary Logseq directory
    // 2. Add sample markdown files
    // 3. Run ImportService
    // 4. Verify all pages imported
    // 5. Verify block hierarchy preserved
}

#[tokio::test]
async fn test_file_sync_workflow() {
    // 1. Import initial files
    // 2. Modify a file
    // 3. Verify SyncService detects change
    // 4. Verify repository updated
}
```

Dependencies:

```toml
notify = "6.1"                # Cross-platform file watching
notify-debouncer-mini = "0.4" # Event debouncing
tokio = { version = "1.41", features = ["fs", "rt-multi-thread", "macros", "sync", "time"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
thiserror = "2.0"             # Error handling
anyhow = "1.0"
tracing = "0.1"               # Structured logging
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
uuid = { version = "1.11", features = ["v4", "serde"] }

# Dev dependency
tempfile = "3.14"             # Temporary directories for tests
```

Following the "pragmatic DDD for personal projects" philosophy:
- No Complex Event Sourcing: Events are for notifications, not persistence
- Direct Callbacks: No event bus/CQRS complexity
- Simple Error Handling: Continue on error, collect failures
- File System as Source of Truth: No conflict resolution needed
- In-Memory Progress: No import session persistence
- Simplified Deletion: Log only (full implementation deferred)
- SQLite Persistence: Implement `PageRepository` with SQLite
- File→Page Mapping: Enable proper deletion handling
- Error Retry: Simple retry for transient errors (file locks)
- Metrics: Track import/sync performance
- Full-Text Search: Integrate Tantivy for BM25 search
- Tauri Integration: Add commands and event emitters
- UI Progress: Real-time import/sync status display
- Configuration: Debounce duration, concurrency limits
- Semantic Search: fastembed-rs + Qdrant integration
- URL Metadata: Parse and index linked content
- Advanced Conflict Resolution: Handle simultaneous edits
- Performance Optimization: Incremental parsing, caching
Build and test:

```bash
cargo build
cargo test
cargo test --lib
cargo test --test integration_test
RUST_LOG=debug cargo test
```

Example usage:

```rust
use backend::application::{ImportService, SyncService};
use backend::domain::value_objects::LogseqDirectoryPath;
use std::sync::Arc;
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize repository (mock for now, SQLite later)
    let repository = MockPageRepository::new();

    // Validate Logseq directory
    let dir_path = LogseqDirectoryPath::new("/path/to/logseq")?;

    // Import the directory
    let mut import_service = ImportService::new(repository.clone())
        .with_concurrency(6);

    let progress_callback = Arc::new(|event| {
        match event {
            ImportProgressEvent::Started { total_files } => {
                println!("Starting import of {} files", total_files);
            }
            ImportProgressEvent::FileProcessed { file_path, progress } => {
                println!("Processed {} ({:.1}%)",
                    file_path.display(),
                    progress.percentage()
                );
            }
            ImportProgressEvent::Completed { pages_imported, duration_ms } => {
                println!("Imported {} pages in {}ms", pages_imported, duration_ms);
            }
            ImportProgressEvent::Failed { error, files_processed } => {
                eprintln!("Import failed after {} files: {}", files_processed, error);
            }
        }
    });

    let summary = import_service.import_directory(
        dir_path.clone(),
        Some(progress_callback)
    ).await?;

    println!("Import complete: {}/{} files ({}% success)",
        summary.pages_imported,
        summary.total_files,
        summary.success_rate()
    );

    // Start sync service
    let sync_service = SyncService::new(
        repository,
        dir_path,
        Some(Duration::from_millis(500))
    )?;

    let sync_callback = Arc::new(|event| {
        match event {
            SyncEvent::FileCreated { file_path } => {
                println!("New file: {}", file_path.display());
            }
            SyncEvent::FileUpdated { file_path } => {
                println!("Updated: {}", file_path.display());
            }
            SyncEvent::FileDeleted { file_path } => {
                println!("Deleted: {}", file_path.display());
            }
            _ => {}
        }
    });

    // This runs indefinitely
    sync_service.start_watching(Some(sync_callback)).await?;

    Ok(())
}
```