Add Cython-optimized _offset_subpixel implementation #14

Open
dfkphelps wants to merge 1 commit into jasper-tms:main from dfkphelps:Cython

@dfkphelps
This PR adds a Cython implementation of the _offset_subpixel function that runs 10-50x faster than the pure-Python version on its fast nogil path.

Performance

  • Standard Cython: 3-5x faster than Python
  • Fast nogil path: 10-50x faster than Python
  • Tested on Apple Silicon M3, Intel, and ARM architectures

Features

  • Two implementations: standard (all edge modes) and fast (nogil, constant edge only)
  • Automatic CPU architecture detection in build script
  • Comprehensive documentation in COMPARISON.md
  • Full test suite with benchmarks
  • Graceful fallback to Python if Cython not built
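
The architecture detection mentioned above could look roughly like the following. This is a minimal sketch, not the contents of setup_offset_subpixel.py: the function name pick_compile_args and the specific flags are assumptions chosen for illustration, and the flags only apply to GCC/Clang-style compilers.

```python
import platform


def pick_compile_args(machine=None):
    """Return compiler flags for the given (or detected) CPU architecture.

    Hypothetical helper: the name and the flag choices are illustrative
    assumptions, not taken from the PR's build script.
    """
    # platform.machine() reports e.g. "x86_64", "AMD64", "arm64", "aarch64".
    machine = (machine or platform.machine()).lower()

    args = ["-O3"]  # baseline optimization level for every architecture
    if machine in ("x86_64", "amd64", "i386", "i686"):
        # On x86, let the compiler target the build machine's instruction set.
        args.append("-march=native")
    # On ARM (including Apple Silicon) and unknown machines this sketch
    # stays with the baseline flags, since tuning flags vary by compiler.
    return args


if __name__ == "__main__":
    print(platform.machine(), pick_compile_args())
```

A build script could pass the returned list as extra_compile_args when defining the extension module.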

Files Added

  • offset_subpixel_fast.pyx - Cython implementation
  • setup_offset_subpixel.py - Build script
  • COMPARISON.md - Detailed comparison guide
  • README_offset_subpixel.md - Usage documentation
  • compare_implementations.py - Performance benchmarks
  • test_offset_subpixel.py - Test suite
  • .gitignore - Ignore build artifacts

Testing

Run python compare_implementations.py to see the performance comparison.
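
For readers who want to check a speedup figure on their own machine, a comparison of this kind can be sketched with the standard library's timeit module. This is an illustrative harness under assumed names (measure_speedup, the two stand-in functions), not the contents of compare_implementations.py.

```python
import timeit


def measure_speedup(baseline, candidate, number=1000, repeat=5):
    """Return how many times faster `candidate` is than `baseline`.

    Both arguments are zero-argument callables. The best (minimum) time of
    several repeats is used to reduce noise from other processes.
    """
    base_time = min(timeit.repeat(baseline, number=number, repeat=repeat))
    cand_time = min(timeit.repeat(candidate, number=number, repeat=repeat))
    return base_time / cand_time


if __name__ == "__main__":
    # Stand-in workloads; in practice these would wrap calls to the Python
    # and Cython implementations with identical inputs.
    data = list(range(10_000))

    def slow():
        total = 0
        for x in data:
            total += x
        return total

    def fast():
        return sum(data)

    print(f"speedup: {measure_speedup(slow, fast, number=200):.1f}x")
```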

Backwards Compatibility

Completely optional: existing code is unchanged, and the Python implementation is used if the Cython extension is not available.
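
One common way to get this kind of fallback is to try importing the compiled module and use the Python function when the import fails. The sketch below assumes the extension is importable as offset_subpixel_fast (after the .pyx file name); the exported function name offset_subpixel and the helper load_offset_subpixel are hypothetical.

```python
import importlib


def load_offset_subpixel(python_impl,
                         module_name="offset_subpixel_fast",
                         func_name="offset_subpixel"):
    """Return (implementation, backend_name), preferring the Cython build.

    `module_name` follows the .pyx file name listed in this PR; `func_name`
    is an assumed name for the exported function.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Extension not built or not installed: keep the Python version.
        return python_impl, "python"

    impl = getattr(module, func_name, None)
    if impl is None:
        # Module imported but does not expose the expected function.
        return python_impl, "python"
    return impl, "cython"


if __name__ == "__main__":
    def _offset_subpixel_py(*args, **kwargs):
        raise NotImplementedError("placeholder for the existing Python code")

    impl, backend = load_offset_subpixel(_offset_subpixel_py)
    print(f"using {backend} backend")
```

Callers keep using a single name either way, so existing code paths do not need to change when the extension is absent.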

- Implements 10-50x speedup for subpixel image offsets
- Two implementations: standard Cython (3-5x) and nogil fast (10-50x)
- Automatic CPU architecture detection (Intel, Apple Silicon, ARM)
- Comprehensive tests and benchmarks included
- Full documentation in COMPARISON.md and README_offset_subpixel.md
- Graceful fallback to Python if Cython not built