SpaceHoarder Project Structure

Overview

SpaceHoarder is a GTK3-based disk space visualization tool that uses interactive treemap visualizations to help users understand and manage their storage usage. The application features high-performance rendering with optional Rust acceleration.


Root Directory

/home/ray/Desktop/files/wrk/sphoard exp (copy)/
├── main.py                      # Application entry point
├── spacehoarder/                # Main application package
├── scanners/                    # Experimental scanner implementations
├── doc/                         # Documentation and benchmarks
├── benchmark_accelerator.py     # Performance testing for Rust accelerator
├── verify_accelerator.py        # Validation tests for accelerator
├── verify_import.py             # Import verification script
├── check_assignment.py          # Development utility
├── check_cairo_ptr.py           # Cairo pointer validation utility
├── goals.txt                    # Project TODOs
├── notes.md                     # Performance tuning documentation
└── .agent/                      # AI agent workflow definitions

Core Application (spacehoarder/)

The main application package containing all GTK3 UI components and business logic.

Files

| File | Purpose |
| --- | --- |
| __init__.py | Package initializer |
| main_window.py | Primary application window (SpaceHoarderApp, SpaceHoarderWindow) |
| header_bar.py | GTK HeaderBar implementation with navigation controls |
| treemap_widget.py | Core rendering engine: highly optimized treemap visualization widget |
| scanner.py | Directory scanning logic with size calculation |
| gio_scanner.py | GIO-based scanner for virtual filesystems (FTP, SMB, WebDAV, etc.) |
| resolvers.py | Path resolution strategies (FUSE → native backend URIs) |
| models.py | Data models for directory/file representation |
| config.py | Application configuration and color themes |
| utils.py | Utility functions (e.g., hex to RGB conversion) |

Key Components

main_window.py (436 lines, 26 functions/classes)

  • SpaceHoarderApp: GTK Application class handling CLI arguments and activation
  • SpaceHoarderWindow: Main window with:
    • Treemap visualization integration
    • Thread-safe directory scanning
    • Context menu operations (copy path, open, show in file manager, trash)
    • Navigation controls (up directory, refresh)
    • D-Bus integration for file manager highlighting

treemap_widget.py (Performance-Optimized)

High-performance Cairo-based treemap renderer with:

  • Squarified treemap layout algorithm
  • Size culling (configurable via CULL_SIZE constant)
  • Smart text rendering with adaptive character estimation
  • Loop-invariant optimization for color calculations
  • Interactive features: click navigation, right-click context menus
  • Theme-aware rendering (light/dark mode support)

Performance Tunables (documented in notes.md):

  • CULL_SIZE = 2.0: Skip boxes smaller than 2 pixels
  • TEXT_THRESHOLD: Minimum box size for text rendering
  • Character width estimation multiplier (0.4-0.6 range)
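
The tunables above can be illustrated with a small sketch. This is not the code in treemap_widget.py: only CULL_SIZE = 2.0 and the 0.4-0.6 multiplier range come from this document, while the TEXT_THRESHOLD value, the CHAR_WIDTH_FACTOR name, and both helper functions are assumptions made for illustration.

```python
CULL_SIZE = 2.0          # from this document: skip boxes smaller than 2 pixels
TEXT_THRESHOLD = 20.0    # assumed value; the document does not state one
CHAR_WIDTH_FACTOR = 0.5  # assumed midpoint of the documented 0.4-0.6 range


def should_cull(width: float, height: float) -> bool:
    """Return True when a box is too small to be worth drawing."""
    return width < CULL_SIZE or height < CULL_SIZE


def fit_label(label: str, box_width: float, box_height: float, font_size: float) -> str:
    """Estimate how much of `label` fits, without asking Cairo to measure text."""
    if box_width < TEXT_THRESHOLD or box_height < font_size:
        return ""  # box too small for any text
    est_char_width = font_size * CHAR_WIDTH_FACTOR
    max_chars = int(box_width / est_char_width)
    if len(label) <= max_chars:
        return label
    if max_chars < 4:
        return ""  # not enough room for even one character plus "..."
    return label[: max_chars - 3] + "..."
```

Estimating the character width from the font size trades accuracy for speed: a lower multiplier fits more text but risks overflow, and a higher one truncates earlier.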

scanner.py

Directory traversal and size calculation with:

  • Recursive directory scanning
  • Size aggregation
  • Error handling for permission issues
  • Optional Rust accelerator integration
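
A minimal sketch of recursive scanning with size aggregation and permission handling, assuming os.scandir() and a plain dict per node. The real scanner.py and models.py may be structured quite differently; the function name and the node layout here are hypothetical.

```python
import os


def scan_directory(path: str) -> dict:
    """Recursively sum file sizes under `path`, skipping unreadable entries."""
    node = {"path": path, "size": 0, "children": []}
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                try:
                    if entry.is_symlink():
                        continue  # avoid loops and double counting
                    if entry.is_dir(follow_symlinks=False):
                        child = scan_directory(entry.path)
                    else:
                        size = entry.stat(follow_symlinks=False).st_size
                        child = {"path": entry.path, "size": size, "children": []}
                except OSError:
                    continue  # entry vanished or is unreadable
                node["children"].append(child)
                node["size"] += child["size"]  # aggregate upward
    except OSError:
        pass  # e.g. PermissionError on the directory itself
    return node
```

PermissionError is a subclass of OSError, so a single except clause covers both permission problems and entries that disappear mid-scan.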

gio_scanner.py (~630 lines, Network-Optimized)

GIO-based scanner for virtual filesystems (FTP, SMB, WebDAV).

Scan Modes:

  • Deep Threading (_WorkQueueScanner): Work-queue based parallel scanning. All threads share a queue of directories. Supports adaptive concurrency (ADAPTIVE_SCANNING=True ramps threads up/down based on server stability).
  • Single-Threaded (_recursive_scan_gio): Fallback for debugging or servers rejecting concurrent connections. Triggered by MAX_SCAN_THREADS=1.
  • Async (experimental): Dispatches to gio_async.py when USE_ASYNC_SCANNER=True.
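
The work-queue idea can be sketched as follows. To stay self-contained, this example walks a local directory with os.scandir() instead of GIO, and scan_with_work_queue is a hypothetical name rather than the _WorkQueueScanner API. Passing num_threads=1 mirrors the single-threaded fallback.

```python
import os
import queue
import threading


def scan_with_work_queue(root: str, num_threads: int = 4) -> int:
    """Return total bytes under `root` using a shared queue of directories."""
    pending = queue.Queue()
    lock = threading.Lock()
    total = 0

    def worker() -> None:
        nonlocal total
        while True:
            directory = pending.get()
            if directory is None:  # sentinel: no more work
                pending.task_done()
                return
            local_bytes = 0
            try:
                with os.scandir(directory) as entries:
                    for entry in entries:
                        try:
                            if entry.is_symlink():
                                continue
                            if entry.is_dir(follow_symlinks=False):
                                pending.put(entry.path)  # any thread may pick it up
                            else:
                                local_bytes += entry.stat(follow_symlinks=False).st_size
                        except OSError:
                            continue
            except OSError:
                pass  # unreadable directory
            finally:
                with lock:
                    total += local_bytes
                pending.task_done()

    pending.put(root)
    threads = [threading.Thread(target=worker, daemon=True) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    pending.join()  # returns once every queued directory has been processed
    for _ in threads:
        pending.put(None)
    for thread in threads:
        thread.join()
    return total
```

Because subdirectories are queued before their parent is marked done, the queue's unfinished-task count cannot reach zero while work remains.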

Features:

  • Adaptive concurrency management (auto-adjusts thread count based on success/failure)
  • Retry logic for transient network errors (via gio_guards)
  • Content-type sniffing to handle misidentified files (e.g., .lnk files)
  • Protocol detection (FTP, SMB, WebDAV)
  • Integration with path resolvers to bypass FUSE layer
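
One way adaptive concurrency and retries might fit together is sketched below. The policy shown (add one thread after a run of successes, halve on failure) is an assumption chosen for illustration, and AdaptiveLimit and with_retry are hypothetical names. The actual logic in gio_scanner.py and gio_guards may differ.

```python
import time


class AdaptiveLimit:
    """Track a target thread count that grows on success and shrinks on failure."""

    def __init__(self, max_threads: int, start: int = 1, grow_after: int = 5):
        self.max_threads = max_threads
        self.current = max(1, min(start, max_threads))
        self.grow_after = grow_after
        self._streak = 0

    def record_success(self) -> None:
        self._streak += 1
        if self._streak >= self.grow_after and self.current < self.max_threads:
            self.current += 1  # ramp up slowly
            self._streak = 0

    def record_failure(self) -> None:
        self._streak = 0
        self.current = max(1, self.current // 2)  # back off quickly


def with_retry(operation, attempts: int = 3, base_delay: float = 0.1, sleep=time.sleep):
    """Call `operation`, retrying on OSError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error
            sleep(base_delay * (2 ** attempt))
```

Backing off faster than ramping up is a cautious default for servers that reject concurrent connections.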

Note: Shallow parallel scanning and fixed-thread modes were removed in Jan 2026.

  • Shallow parallel was redundant — deep threading handles both wide and deep trees.
  • Fixed threads can be achieved by setting ADAPTIVE_SCANNING=False. The DEEP_THREADING config flag is now deprecated (always True when MAX_SCAN_THREADS > 1).

resolvers.py (162 lines)

Path resolution utilities for network filesystems:

  • GMount resolution (preferred): Bypasses kernel FUSE layer entirely
  • String parsing fallback: Manual extraction from FUSE paths
  • Converts FUSE/GVfs paths to native backend URIs (e.g., ftp://, dav://)
  • Configurable fallback strategies via config.ENABLE_FUSE_FALLBACK
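
The string-parsing fallback might look roughly like this. The sketch assumes GVfs FUSE mounts under /run/user/<uid>/gvfs/ with mount names such as ftp:host=example.com,user=bob, and it handles only FTP and SFTP. SMB shares and WebDAV prefixes would need extra rules. fuse_path_to_uri is a hypothetical name, not the resolvers.py API.

```python
import re
from urllib.parse import quote

# Assumed layout: /run/user/<uid>/gvfs/<scheme>:<key=value,...>/<remote path>
_GVFS_RE = re.compile(r"^/run/user/\d+/gvfs/(?P<mount>[^/]+)(?P<rest>/.*)?$")


def fuse_path_to_uri(path: str):
    """Return a backend URI for a GVfs FUSE path, or None if unrecognised."""
    match = _GVFS_RE.match(path)
    if match is None:
        return None
    scheme, _, params = match.group("mount").partition(":")
    fields = dict(item.split("=", 1) for item in params.split(",") if "=" in item)
    host = fields.get("host")
    if scheme not in ("ftp", "sftp") or not host:
        return None  # other backends are out of scope for this sketch
    netloc = host
    if "user" in fields:
        netloc = f"{fields['user']}@{netloc}"
    if "port" in fields:
        netloc = f"{netloc}:{fields['port']}"
    rest = match.group("rest") or "/"
    return f"{scheme}://{netloc}{quote(rest)}"  # quote() keeps "/" unescaped
```

Returning None for anything unrecognised lets a caller fall back to scanning the FUSE path directly.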

Rust Acceleration (spacehoarder/treemap_accelerator_rust_source/)

Structure

treemap_accelerator_rust_source/
├── Cargo.toml          # Rust project manifest
├── Cargo.lock          # Dependency lock file
├── src/
│   └── lib.rs          # Rust layout calculation implementation
└── target/             # Compiled Rust artifacts (build output)

Purpose

Optional native Rust implementation of the treemap layout calculations, intended to speed up layout on large directory trees. Python bindings are provided via PyO3/maturin.

Related Files:

  • benchmark_accelerator.py: Performance comparison script
  • verify_accelerator.py: Correctness validation against pure Python
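
The optional-accelerator pattern and a correctness check against pure Python could be sketched like this. The module name treemap_accelerator and the slice_layout function are hypothetical. A simple strip layout also stands in for the real squarified algorithm to keep the example short.

```python
try:
    import treemap_accelerator as _accel  # hypothetical module name
except ImportError:
    _accel = None  # pure Python fallback is used instead


def slice_layout_py(sizes, x, y, width, height):
    """Pure-Python reference: split a rectangle into vertical strips by size."""
    total = float(sum(sizes))
    rects, offset = [], x
    for size in sizes:
        strip_width = width * size / total if total else 0.0
        rects.append((offset, y, strip_width, height))
        offset += strip_width
    return rects


def slice_layout(sizes, x, y, width, height):
    """Use the native implementation when available, else the reference."""
    if _accel is not None and hasattr(_accel, "slice_layout"):
        return _accel.slice_layout(sizes, x, y, width, height)
    return slice_layout_py(sizes, x, y, width, height)


def layouts_match(a, b, tol=1e-6):
    """Compare two lists of (x, y, w, h) rectangles within a tolerance."""
    return len(a) == len(b) and all(
        abs(p - q) <= tol for ra, rb in zip(a, b) for p, q in zip(ra, rb)
    )
```

Comparing within a tolerance rather than for exact equality allows for small floating-point differences between the two implementations.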

Scanner Variants (scanners/)

Experimental implementations for directory scanning optimization:

| File | Description |
| --- | --- |
| scanner.py | Pure Python scanner (baseline) |
| scanner_threaded.py | Multi-threaded FUSE-based scanner using ThreadPoolExecutor |
| only pure rs.py | Pure Rust scanner implementation |
| both.py | Hybrid Python/Rust scanner |
| ftp_async.py | Async FTP scanner experiment |

Purpose: Testing different approaches for optimal scanning performance before integration into main codebase.


Documentation (doc/)

Performance Research

| File | Content |
| --- | --- |
| benchmarks.md | Comprehensive performance benchmarks |
| count.md | File counting strategy analysis |
| scandir.md | os.scandir() vs alternatives |
| walk.md | os.walk() performance analysis |
| smooth treemap.txt | Rendering smoothness optimization notes |

Test Scripts

  • benchmark.py: Main benchmark harness
  • count.py, count_cm.py: File counting tests
  • scandir.py: Scandir performance tests
  • walk.py: Walk performance tests

Subdirectories

  • examples/: Sample data/test cases
  • images/: Screenshots and visual documentation
  • scandir-rs docs/: Documentation and benchmarks for scandir-rs library
    • README.md, benchmarks.md: Performance analysis
    • count.md, scandir.md, walk.md: Algorithm comparisons
    • Test scripts: benchmark.py, count.py, scandir.py, walk.py

Entry Point (main.py)

Purpose: Application launcher that:

  1. Sets up Python module path
  2. Parses color configuration from config.COLORS_STRING
  3. Converts hex colors to RGB tuples
  4. Instantiates and runs SpaceHoarderApp
  5. Handles graceful exit
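
Steps 2 and 3 can be sketched as below. The format of config.COLORS_STRING is not documented here, so the example assumes hex colors separated by commas or whitespace. It returns floats in the 0..1 range, which is what Cairo's set_source_rgb() expects. The project's own helper in utils.py may use a different representation.

```python
def hex_to_rgb(value: str) -> tuple:
    """Convert '#RRGGBB' (or 'RRGGBB') to an (r, g, b) tuple of floats in 0..1."""
    digits = value.strip().lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected 6 hex digits, got {value!r}")
    return tuple(int(digits[i:i + 2], 16) / 255.0 for i in (0, 2, 4))


def parse_colors(colors_string: str) -> list:
    """Parse a string of hex colors (assumed comma or space separated)."""
    return [hex_to_rgb(token) for token in colors_string.replace(",", " ").split()]
```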

Configuration

Color Theming

  • Defined in spacehoarder/config.py
  • Parsed as hex color strings → RGB tuples
  • Applied to treemap rendering based on directory depth
  • Supports light/dark theme switching
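
A minimal sketch of depth-based color selection with theme switching. The palette values and the names PALETTES and color_for_depth are made up for illustration. The real palettes live in spacehoarder/config.py.

```python
# Hypothetical palettes: one RGB tuple (0..1 floats) per nesting level.
PALETTES = {
    "light": [(0.90, 0.95, 1.00), (0.75, 0.85, 0.95), (0.60, 0.75, 0.90)],
    "dark": [(0.15, 0.20, 0.30), (0.20, 0.30, 0.45), (0.25, 0.40, 0.60)],
}


def color_for_depth(depth: int, theme: str = "light") -> tuple:
    """Pick a color for a directory depth, cycling when the palette runs out."""
    palette = PALETTES[theme]
    return palette[depth % len(palette)]
```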

Performance Tuning

See notes.md for detailed tuning guidelines:

  • Visual quality vs. Speed trade-offs
  • Pixel-level culling thresholds
  • Text rendering optimization
  • Character width estimation tuning

Build & Execution

Running the Application

python3 main.py [directory_path]

Building Rust Accelerator

cd spacehoarder/treemap_accelerator_rust_source
cargo build --release

Performance Testing

python3 benchmark_accelerator.py
python3 verify_accelerator.py

Architecture Highlights

Threading Model

  • Main Thread: GTK UI event loop
  • Scan Threads: Background directory scanning
  • Thread Safety: Scan state management with ID-based invalidation
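
ID-based invalidation can be illustrated without GTK. In this sketch each scan is given an ID, and a result is accepted only if its ID is still the latest. ScanCoordinator is a hypothetical name. In a GTK application the delivery step would typically be scheduled on the main thread, for example with GLib.idle_add, rather than called from the worker.

```python
import threading


class ScanCoordinator:
    """Accept results only from the most recently started scan."""

    def __init__(self):
        self._lock = threading.Lock()
        self._current_id = 0
        self.result = None

    def start_scan(self, scan_fn, path):
        with self._lock:
            self._current_id += 1  # invalidates every earlier scan
            scan_id = self._current_id

        def run():
            data = scan_fn(path)
            self._deliver(scan_id, data)

        thread = threading.Thread(target=run, daemon=True)
        thread.start()
        return thread

    def _deliver(self, scan_id, data) -> bool:
        with self._lock:
            if scan_id != self._current_id:
                return False  # stale: a newer scan was started meanwhile
            self.result = data
            return True
```

This avoids having to cancel a running scan: an outdated scan simply finishes and its result is dropped.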

Rendering Pipeline

  1. Directory scan → scanner.py
  2. Data model creation → models.py
  3. Layout calculation → treemap_widget.py (or Rust accelerator)
  4. Cairo rendering → treemap_widget.py
  5. Interactive events → GTK signal handlers
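
For step 3, a compact sketch of a squarified layout in the style of Bruls, Huizing and van Wijk is shown below. It is an illustration of the general technique, not the code in treemap_widget.py or the Rust accelerator. The function names are hypothetical, and rectangles are returned in descending size order.

```python
def _worst_ratio(row, side):
    """Worst aspect ratio in `row` when laid along a side of length `side`."""
    total = sum(row)
    return max(side * side * max(row) / (total * total),
               total * total / (side * side * min(row)))


def squarify(sizes, x, y, width, height):
    """Return (x, y, w, h) rectangles whose areas are proportional to `sizes`."""
    sizes = sorted((s for s in sizes if s > 0), reverse=True)
    total = sum(sizes)
    if total <= 0 or width <= 0 or height <= 0:
        return []
    scale = width * height / total
    areas = [s * scale for s in sizes]
    rects, i = [], 0
    while i < len(areas):
        side = min(width, height)  # rows are stacked along the shorter side
        if side <= 0:
            break  # guard against floating-point exhaustion
        row, j = [areas[i]], i + 1
        # Grow the row while adding the next item does not worsen the ratio.
        while j < len(areas) and _worst_ratio(row + [areas[j]], side) <= _worst_ratio(row, side):
            row.append(areas[j])
            j += 1
        thickness = sum(row) / side
        if width >= height:  # place the row as a column on the left
            cursor = y
            for area in row:
                rects.append((x, cursor, thickness, area / thickness))
                cursor += area / thickness
            x, width = x + thickness, width - thickness
        else:  # place the row as a strip along the top
            cursor = x
            for area in row:
                rects.append((cursor, y, area / thickness, thickness))
                cursor += area / thickness
            y, height = y + thickness, height - thickness
        i = j
    return rects
```

For example, `squarify([6, 6, 4, 3, 2, 2, 1], 0, 0, 6, 4)` places the two largest items as 3×2 boxes stacked on the left and lays out the rest in the remaining 3×4 area.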

Performance Optimizations

  1. Cairo Drawing: Loop-invariant hoisting, size culling
  2. Text Rendering: Adaptive character estimation, threshold-based skipping
  3. Optional Rust: Native acceleration for layout calculations
  4. Lazy Rendering: Only draw visible/significant rectangles

Current Goals (from goals.txt)

  1. ✅ Connect UI selector to scanner (scanner type selection)
  2. 🔜 Add AVX detection to prevent SIMD-related crashes
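
For the second goal, one possible approach on x86 Linux is to read the CPU flags from /proc/cpuinfo before loading SIMD-dependent code. This is only a sketch under that assumption, with hypothetical function names. Other platforms expose CPU features differently and would need their own checks.

```python
def cpu_flags_from_cpuinfo(text: str) -> set:
    """Extract the CPU feature flags from /proc/cpuinfo-style text."""
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx(cpuinfo_path: str = "/proc/cpuinfo") -> bool:
    """Return True if the CPU reports AVX; False if unknown or unreadable."""
    try:
        with open(cpuinfo_path, encoding="utf-8", errors="replace") as handle:
            return "avx" in cpu_flags_from_cpuinfo(handle.read())
    except OSError:
        return False  # treat "cannot tell" as "not available"
```

Treating an unreadable file as "no AVX" errs on the side of the pure Python path, which the application already supports.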

Dependencies

  • GTK 3.0: UI framework
  • Cairo: 2D graphics rendering
  • GLib: Event loop and utilities
  • Python 3.x: Runtime
  • Rust + Cargo (optional): For accelerator compilation

Notes

  • The project is actively optimized for sub-millisecond treemap rendering
  • Performance tuning parameters are documented in notes.md
  • Multiple scanner implementations suggest ongoing performance research
  • Rust accelerator is optional - application functions with pure Python