This is mainly about improving scalability, or more specifically, reducing token consumption during graph building.
I’ve tried Hound on several projects, and the graph-building process is very token-intensive. I noticed that no local model could handle it effectively. However, a model like Qwen 30B Coder performed well as a “scout,” while a Qwen3 “thinking” model worked well as a “strategist.”
I read the paper, and the decision to avoid AST logic makes sense. However, have you explored a hybrid approach?
I’m suggesting this mainly from a scalability perspective. If we precompute an AST/CFG-like structure that captures the basic elements (function calls, imports, reads/writes) and then send only targeted micro-cards to the LLM for intent analysis, token usage might drop significantly. In this setup, the LLM would still handle the semantic reasoning, but it would operate on pre-extracted structures.
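To make the idea concrete, here is a rough sketch of what I mean by a micro-card. It is not Hound code and does not use any Hound API. It uses Python's standard `ast` module purely as a stand-in for whatever parser fits the target language, and the names `build_micro_cards`, `call_name`, and the card fields are all hypothetical. The read/write sets are a crude name-level approximation, not real data-flow analysis.

```python
import ast
import json

# Toy input; in practice this would be one file from the audited project.
SOURCE = '''
import os
from pathlib import Path

balance = 0

def deposit(amount):
    global balance
    balance = balance + amount
    log(amount)

def log(value):
    print(os.getcwd(), value)
'''


def call_name(node):
    """Return a printable name for a call target, or None if it is not a plain name."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return ast.unparse(func)  # requires Python 3.9+
    return None


def build_micro_cards(source):
    """Extract imports plus one small card per function (hypothetical card format)."""
    tree = ast.parse(source)

    imports = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            module = node.module or "."  # relative imports have no module name
            imports.update(f"{module}.{alias.name}" for alias in node.names)

    cards = []
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        calls, reads, writes = set(), set(), set()
        for child in ast.walk(node):
            if isinstance(child, ast.Call):
                name = call_name(child)
                if name is not None:
                    calls.add(name)
            elif isinstance(child, ast.Name):
                if isinstance(child.ctx, ast.Store):
                    writes.add(child.id)
                elif isinstance(child.ctx, ast.Load):
                    reads.add(child.id)
        cards.append({
            "function": node.name,
            "lines": [node.lineno, node.end_lineno],
            "calls": sorted(calls),
            # Drop names that only appear as direct call targets.
            "reads": sorted(reads - calls),
            "writes": sorted(writes),
        })

    return {"imports": sorted(imports), "cards": cards}


if __name__ == "__main__":
    # Each card is what would be sent to the LLM instead of the full file.
    print(json.dumps(build_micro_cards(SOURCE), indent=2))
```

The LLM would then receive one small JSON card per function, plus the relevant source lines only when it asks for them, rather than whole files. Whether that loses too much context for intent analysis is exactly the part I'm unsure about.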
In my view, this approach could preserve Hound’s agent-driven design while improving efficiency. I just wanted to ask whether you’ve tried this and if you have any suggestions or input.