
feat: auto-download ONNX models from ModelScope #118

Merged
GreatV merged 4 commits into main from autodownload on May 15, 2026

Conversation

GreatV (Owner) commented May 15, 2026

No description provided.

gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request introduces an auto-download feature that allows the library to automatically fetch and cache OCR model files from ModelScope. It includes a new download module in the core library, a static registry of supported models with SHA-256 verification, and integration into the high-level builders. Review feedback suggests optimizing performance by avoiding redundant hashing of large files, improving network efficiency through ureq::Agent reuse, and preventing race conditions during concurrent downloads by using unique temporary filenames.
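
The unique-temporary-filename suggestion can be sketched with the standard library alone. This is an illustrative sketch, not the code merged in this PR: `unique_tmp_path`, `write_via_tmp`, and the `.part` suffix are hypothetical names, and it assumes the temporary file sits next to the destination so the final rename stays on one filesystem.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};
use std::process;
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical process-wide counter: two threads in one process never get the same name.
static TMP_COUNTER: AtomicU64 = AtomicU64::new(0);

/// Builds a sibling path of the form `<file name>.<pid>.<counter>.part`.
/// The process id separates concurrent processes; the counter separates threads.
fn unique_tmp_path(dest: &Path) -> PathBuf {
    let n = TMP_COUNTER.fetch_add(1, Ordering::Relaxed);
    let mut name = dest
        .file_name()
        .map(|s| s.to_os_string())
        .unwrap_or_default();
    name.push(format!(".{}.{}.part", process::id(), n));
    dest.with_file_name(name)
}

/// Writes `bytes` to a unique temporary file, closes it, then renames it to `dest`.
/// On any failure the temporary file is removed on a best-effort basis.
fn write_via_tmp(dest: &Path, bytes: &[u8]) -> io::Result<()> {
    let tmp = unique_tmp_path(dest);
    let result = fs::File::create(&tmp)
        .and_then(|mut file| {
            file.write_all(bytes)?;
            // The file handle is dropped (closed) when this closure returns,
            // so the rename below never operates on a file we still hold open.
            file.sync_all()
        })
        .and_then(|()| fs::rename(&tmp, dest));
    if result.is_err() {
        let _ = fs::remove_file(&tmp);
    }
    result
}
```

Because each writer owns its own temporary file, two concurrent downloads of the same model cannot interleave bytes in a shared partial file; whichever rename happens last determines the final contents.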

Comment thread oar-ocr-core/src/core/download/mod.rs
Comment thread oar-ocr-core/src/core/download/mod.rs Outdated
Comment thread oar-ocr-core/src/core/download/mod.rs Outdated

Copilot AI (Contributor) left a comment

Pull request overview

Adds an auto-download feature that lets high-level OCR/structure builders accept registered bare model filenames and automatically fetch (and cache) the corresponding files from ModelScope with SHA-256 verification via oar-ocr-core.

Changes:

  • Introduces oar_ocr_core::core::download (feature-gated) with a static registry, cache resolution rules, download + hash verification, and unit tests.
  • Wires model-path resolution into OCR and structure builders so bare filenames are resolved through the auto-download cache.
  • Updates docs/README and adds an auto_download example demonstrating the feature.
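
The bare-filename resolution described in the list above might look roughly like the following. Everything here is an assumption made for illustration: the `RegistryEntry` fields, the placeholder file name and hash, the `Resolved` enum, and the rule that anything containing a directory component is used as given. The thread does not show the actual rules (for example, whether an existing local file with a registered name takes priority), so this sketch leaves that case out.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical registry entry; the real registry's fields are not shown in this thread.
struct RegistryEntry {
    file_name: &'static str,
    sha256: &'static str,
}

// Placeholder values only, not real model names or digests.
static REGISTRY: &[RegistryEntry] = &[RegistryEntry {
    file_name: "example_det.onnx",
    sha256: "<expected sha-256 hex digest>",
}];

#[derive(Debug, PartialEq)]
enum Resolved {
    /// Use the path exactly as the caller supplied it.
    AsGiven(PathBuf),
    /// Registered bare name: the file belongs in the cache and must match `sha256`.
    Cached { path: PathBuf, sha256: &'static str },
}

/// True when the path is a single normal component, e.g. `model.onnx`
/// but not `./model.onnx` or `models/model.onnx`.
fn is_bare_file_name(path: &Path) -> bool {
    let mut comps = path.components();
    matches!((comps.next(), comps.next()), (Some(Component::Normal(_)), None))
}

/// Maps a registered bare file name into `cache_dir`; passes everything else through.
fn resolve_sketch(input: &Path, cache_dir: &Path) -> Resolved {
    if is_bare_file_name(input) {
        let entry = input
            .to_str()
            .and_then(|name| REGISTRY.iter().find(|e| e.file_name == name));
        if let Some(entry) = entry {
            return Resolved::Cached {
                path: cache_dir.join(entry.file_name),
                sha256: entry.sha256,
            };
        }
    }
    Resolved::AsGiven(input.to_path_buf())
}
```

A caller would download and verify the file only for the `Cached` variant when the cached path is missing, and would hand `AsGiven` paths straight to the model loader.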

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/oarocr/structure.rs | Resolves structure pipeline model/dict/tokenizer paths via auto-download before building. |
| src/oarocr/ocr.rs | Resolves required OCR model/dict paths via auto-download before building adapters. |
| src/oarocr/builder_utils.rs | Adds resolve_model_path and applies it to optional adapter construction. |
| src/lib.rs | Re-exports a download module when auto-download is enabled. |
| README.md | Documents the new auto-download feature and behavior. |
| oar-ocr-core/src/core/mod.rs | Feature-gates and exposes the new download module. |
| oar-ocr-core/src/core/download/registry.rs | Adds the ModelScope file registry plus registry validation tests. |
| oar-ocr-core/src/core/download/mod.rs | Implements cache resolution + download/verification logic and tests. |
| oar-ocr-core/Cargo.toml | Adds the auto-download feature and optional deps (ureq, sha2, dirs). |
| Cargo.toml | Plumbs the top-level auto-download feature through to oar-ocr-core. |
| examples/auto_download.rs | Adds a runnable example showing bare-name resolution and caching. |
| docs/models.md | Documents auto-download usage and path resolution rules. |

Comment thread oar-ocr-core/src/core/download/mod.rs Outdated
Comment thread oar-ocr-core/src/core/download/mod.rs
Comment thread docs/models.md
Comment thread src/oarocr/structure.rs
GreatV requested a review from Copilot May 15, 2026 11:41

GreatV (Owner, Author) commented May 15, 2026

/gemini review

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

Comment thread oar-ocr-core/src/core/download/mod.rs Outdated
Comment thread oar-ocr-core/src/core/download/mod.rs
Comment thread src/oarocr/builder_utils.rs

gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request introduces an auto-download feature that allows the library to automatically fetch OCR model files from ModelScope into a local cache, verified by SHA-256 hashes. It updates the high-level builders to resolve model paths transparently and adds comprehensive documentation and examples. Review feedback points out a breaking change in the ureq 3.0 API usage, suggests improving error reporting for HTTP status codes, and highlights potential platform-specific issues with atomic file renaming on Windows.
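
The point about HTTP status reporting can be illustrated independently of any HTTP client. The sketch below assumes the numeric status code has already been obtained from whichever client is in use; `DownloadError`, `check_status`, and the message wording are hypothetical and do not reflect the error types in oar-ocr-core or the ureq API.

```rust
use std::fmt;

/// Hypothetical error type carrying enough context to diagnose a failed download.
#[derive(Debug, PartialEq)]
enum DownloadError {
    HttpStatus { url: String, status: u16 },
}

impl fmt::Display for DownloadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DownloadError::HttpStatus { url, status } => {
                write!(f, "GET {} returned HTTP {}", url, status)
            }
        }
    }
}

impl std::error::Error for DownloadError {}

/// Accepts only 2xx responses; any other status becomes an error that names
/// both the URL and the code, so the body is never written to the cache.
fn check_status(url: &str, status: u16) -> Result<(), DownloadError> {
    if (200..300).contains(&status) {
        Ok(())
    } else {
        Err(DownloadError::HttpStatus {
            url: url.to_string(),
            status,
        })
    }
}
```

Checking the status before streaming the body means an HTML error page is rejected with a clear message, rather than surfacing later as a confusing SHA-256 mismatch.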

Comment thread oar-ocr-core/src/core/download/mod.rs
Comment thread oar-ocr-core/src/core/download/mod.rs
Comment thread oar-ocr-core/src/core/download/mod.rs Outdated
GreatV (Owner, Author) commented May 15, 2026

/gemini review

gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request introduces an auto-download feature that allows the library to automatically fetch OCR model files from ModelScope into a local cache directory ($OAR_HOME). It includes a new download module in oar-ocr-core for handling registry lookups, SHA-256 verification, and atomic file replacement, along with updates to the high-level builders to resolve model paths transparently. Feedback was provided regarding the thread-safety of modifying environment variables in tests and the importance of explicit error handling for non-2xx HTTP status codes during model downloads.
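
One way to sidestep the environment-variable thread-safety concern in tests is to keep the decision logic free of global state and read the environment in only one thin wrapper. The sketch below is an assumption about how that could be structured, not the approach taken in this PR: `cache_root_from`, `cache_root`, the fallback parameter, and the choice to treat an empty `OAR_HOME` as unset are all hypothetical.

```rust
use std::ffi::OsString;
use std::path::{Path, PathBuf};

/// Pure decision logic: takes the value of `OAR_HOME` as an argument, so tests
/// can cover every case without mutating the process environment.
fn cache_root_from(oar_home: Option<OsString>, fallback: &Path) -> PathBuf {
    match oar_home {
        // Assumption: an empty value is treated the same as an unset variable.
        Some(value) if !value.is_empty() => PathBuf::from(value),
        _ => fallback.to_path_buf(),
    }
}

/// Thin wrapper that is the only place the real environment is read.
fn cache_root(fallback: &Path) -> PathBuf {
    cache_root_from(std::env::var_os("OAR_HOME"), fallback)
}
```

Tests then call `cache_root_from` with explicit values and can run in parallel safely, since none of them call `std::env::set_var`.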

Comment thread oar-ocr-core/src/core/download/mod.rs
Comment thread oar-ocr-core/src/core/download/mod.rs

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

Comment thread oar-ocr-core/src/core/download/mod.rs
Comment thread oar-ocr-core/src/core/download/mod.rs
Comment thread docs/models.md
Comment thread src/oarocr/structure.rs Outdated
Comment thread docs/models.md
GreatV merged commit efad41c into main on May 15, 2026 (7 checks passed)
GreatV deleted the autodownload branch May 15, 2026 12:55

2 participants