fix: separate LLM and embedding token limits #3

Open
lucifery1234-create wants to merge 1 commit into nibzard:main from lucifery1234-create:fix-context-window-token-limits

@lucifery1234-create

/claim #12

Summary

This PR separates context-window handling for chat/LLM generation from embedding generation.

Previously, the app had a hard-coded truncation path for chat context and a separate one for embeddings, but model-specific tokenization and maximum-token settings were not handled consistently. As a result, large repository contexts could be truncated against the wrong model's window, especially when the chat and embedding models differ.

Changes

  • Add shared token helpers for model encoding lookup, fallback encoding, and positive integer env parsing.
  • Use LLM_MODEL_MAX_TOKENS for devcontainer generation context truncation.
  • Keep EMBEDDING_MODEL_MAX_TOKENS separate for embedding requests.
  • Fall back to cl100k_base when tiktoken does not know a configured model name.
  • Document LLM_MODEL_MAX_TOKENS in .env.example.
  • Add regression tests for unknown model fallback, env parsing, embedding truncation, and LLM truncation configuration.
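
For readers who want a concrete picture of the helpers listed above, here is a minimal sketch. It is not the code in this PR: the function names `positive_int_env`, `encoding_for`, and `truncate_to_tokens` are hypothetical, and the default limits in the usage comment are placeholders. Only `LLM_MODEL_MAX_TOKENS`, `EMBEDDING_MODEL_MAX_TOKENS`, and the `cl100k_base` fallback come from the description.

```python
import os


def positive_int_env(name: str, default: int) -> int:
    """Return a positive integer read from the environment, else `default`."""
    raw = os.environ.get(name, "").strip()
    try:
        value = int(raw)
    except ValueError:
        # Unset, empty, or non-numeric values fall back to the default.
        return default
    return value if value > 0 else default


def encoding_for(model_name: str):
    """Look up a tiktoken encoding, falling back to cl100k_base."""
    import tiktoken  # imported lazily so the other helpers work without it

    try:
        return tiktoken.encoding_for_model(model_name)
    except KeyError:
        # tiktoken raises KeyError for model names it does not recognise.
        return tiktoken.get_encoding("cl100k_base")


def truncate_to_tokens(text: str, max_tokens: int, encoding) -> str:
    """Cut `text` to at most `max_tokens` tokens under `encoding`."""
    tokens = encoding.encode(text)
    if len(tokens) <= max_tokens:
        return text
    return encoding.decode(tokens[:max_tokens])


# Hypothetical usage: each request type reads its own limit.
# The defaults below are placeholders, not values from this PR.
# llm_limit = positive_int_env("LLM_MODEL_MAX_TOKENS", 8000)
# embedding_limit = positive_int_env("EMBEDDING_MODEL_MAX_TOKENS", 8000)
```

Passing the encoding into `truncate_to_tokens` keeps the chat and embedding paths independent: each caller pairs its own model's encoding with its own limit.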

Tests

Ran:

python -m unittest test_token_helpers.py test_devcontainer_context.py
python -m compileall helpers main.py test_token_helpers.py test_devcontainer_context.py

Result:

Ran 5 tests in 1.277s
OK

Repository note

Algora still links the bounty to daytonaio/devcontainer-generator, but that GitHub repository currently returns 404 through the live GitHub API. This PR targets the public repository currently available for the codebase: nibzard/devcontainer-generator.

@lucifery1234-create

Author

This PR addresses the Algora bounty listed as daytonaio/devcontainer-generator#12:

Handle Different Context Lengths Between LLM and Embedding Models ($20)

The original linked GitHub repository, daytonaio/devcontainer-generator, currently returns 404. This PR targets the public repository currently available for the same codebase: nibzard/devcontainer-generator.
