Optimization Proposal: Consolidating Tree-sitter Dependencies and Localizing Document Parsing #118
Closed
1353604736 started this conversation in Ideas
Replies: 0 comments
Hi @safishamsi,
I've been diving deep into Graphify's architecture, particularly the Semantic Extraction pipeline in skill.md and the multi-language support in extract.py. The use of parallel subagents for vision and citation mining is an impressive way to bridge implementation and design.
Based on the current implementation, I’d like to propose two optimizations to improve maintainability and performance:
Currently, pyproject.toml manages 13 separate tree-sitter-* language packages.
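To make the size of that dependency surface concrete, here is a minimal sketch that picks the per-language grammar packages out of a list of requirement strings, such as the ones found under `[project].dependencies`. The helper name `grammar_packages` and the sample requirement strings are hypothetical and are not taken from Graphify's actual pyproject.toml.

```python
import re

# Leading package name of a requirement string; matching stops at extras,
# version specifiers, or environment markers.
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def grammar_packages(dependencies):
    """Return the normalized names of tree-sitter-* packages in `dependencies`.

    `dependencies` is a list of requirement strings. The core `tree-sitter`
    binding is excluded because it has no language suffix.
    """
    found = []
    for spec in dependencies:
        match = _NAME.match(spec)
        if not match:
            continue
        # Normalize runs of '-', '_' and '.' to a single '-', and lowercase.
        name = re.sub(r"[-_.]+", "-", match.group(1)).lower()
        if name.startswith("tree-sitter-"):
            found.append(name)
    return sorted(found)


if __name__ == "__main__":
    # Illustrative input only (assumption), not the project's real list.
    sample = [
        "tree-sitter>=0.21",
        "tree_sitter_python>=0.21",
        "Tree-Sitter-Go==0.21.0",
        "networkx",
    ]
    for name in grammar_packages(sample):
        print(name)
```

Note that a unified language pack whose name also starts with `tree-sitter-` would match the same prefix, so the result is best read as a list of candidates to review rather than a definitive count.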
The current pipeline relies heavily on Claude/Vision subagents for Part B (PDFs, Images, and Docs).
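One way to picture the localization idea is a routing step that decides, per file, whether a local parser could handle it or whether it still needs a subagent. The sketch below is only an illustration: the suffix sets, the route labels, and the function names `choose_route` and `plan` are assumptions, and it deliberately names no specific parsing library.

```python
from pathlib import Path

# Assumption: text-bearing formats a local parser could plausibly handle.
LOCAL_SUFFIXES = {".pdf", ".docx", ".md", ".txt"}
# Assumption: image formats that would still go to a vision subagent.
VISION_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def choose_route(path):
    """Return 'local', 'subagent', or 'skip' based on the file suffix."""
    suffix = Path(path).suffix.lower()
    if suffix in LOCAL_SUFFIXES:
        return "local"
    if suffix in VISION_SUFFIXES:
        return "subagent"
    return "skip"


def plan(paths):
    """Group paths by route so each group can be processed in one batch."""
    routes = {"local": [], "subagent": [], "skip": []}
    for path in paths:
        routes[choose_route(path)].append(str(path))
    return routes


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(plan(["paper.PDF", "diagram.png", "notes.docx", "data.bin"]))
```

Routing by suffix alone is a simplification. A scanned PDF with no text layer, for example, would likely still need the vision path, so a real router would probably fall back to a subagent when local extraction returns no text.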
Summary:
I'm curious to know if you've considered a unified language pack before, or if the current manual dependency management was a conscious choice for build size optimization.