fix: separate LLM and embedding token limits by lucifery1234-create · Pull Request #3 · nibzard/devcontainer-generator

lucifery1234-create · 2026-06-10T05:50:54Z

/claim #12

Summary

This PR separates context-window handling for chat/LLM generation from embedding generation.

Previously, the app had a hard-coded truncation path for chat context and a separate one for embeddings, but neither handled model-specific tokenization or max-token settings consistently. As a result, a large repository context could be truncated against the wrong model window, especially when the chat and embedding models differ.
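To illustrate the mismatch described above, here is a minimal sketch that truncates the same token sequence against two independent budgets. The function name `truncate_tokens` and both limit values are hypothetical and chosen only for illustration; they are not taken from this PR's diff.

```python
from typing import List, Sequence


def truncate_tokens(tokens: Sequence[int], max_tokens: int) -> List[int]:
    """Keep at most max_tokens tokens from the start of the sequence."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return list(tokens[:max_tokens])


# Hypothetical limits; real values depend on the configured models.
LLM_LIMIT = 12
EMBEDDING_LIMIT = 5

# Stand-in for an already-encoded repository context.
tokens = list(range(20))

# Each request type is truncated against its own window, so a small
# embedding window does not shrink the context sent to the chat model.
llm_tokens = truncate_tokens(tokens, LLM_LIMIT)
embedding_tokens = truncate_tokens(tokens, EMBEDDING_LIMIT)
```

With a single shared limit, the chat context here would be cut to 5 tokens or the embedding input would be allowed 12, which is the failure mode this PR aims to avoid.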

Changes

Add shared token helpers for model encoding lookup, fallback encoding, and positive integer env parsing.
Use LLM_MODEL_MAX_TOKENS for devcontainer generation context truncation.
Keep EMBEDDING_MODEL_MAX_TOKENS separate for embedding requests.
Fall back to cl100k_base when tiktoken does not know a configured model name.
Document LLM_MODEL_MAX_TOKENS in .env.example.
Add regression tests for unknown model fallback, env parsing, embedding truncation, and LLM truncation configuration.
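The helpers listed above could look roughly like the following. This is a sketch under stated assumptions, not the code from the diff: the function names, the `environ` parameter, and the choice to fall back to a default on invalid values (rather than raising) are all hypothetical. The `tiktoken` calls used (`encoding_for_model`, which raises `KeyError` for unknown model names, and `get_encoding`) are part of that library's public API.

```python
import os
from typing import Mapping, Optional

# Encoding used when tiktoken does not recognize the configured model.
FALLBACK_ENCODING = "cl100k_base"


def parse_positive_int_env(
    name: str,
    default: int,
    environ: Optional[Mapping[str, str]] = None,
) -> int:
    """Read a positive integer from the environment.

    Assumption: missing, empty, non-numeric, or non-positive values fall
    back to `default`. The actual PR may handle these cases differently.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None or not raw.strip():
        return default
    try:
        value = int(raw.strip())
    except ValueError:
        return default
    return value if value > 0 else default


def get_encoding_for_model(model_name: str):
    """Return the tiktoken encoding for a model, or the fallback encoding."""
    import tiktoken  # third-party; imported lazily so env parsing works without it

    try:
        return tiktoken.encoding_for_model(model_name)
    except KeyError:
        # tiktoken raises KeyError when it has no mapping for the model name.
        return tiktoken.get_encoding(FALLBACK_ENCODING)
```

Usage might then read each limit independently, for example `parse_positive_int_env("LLM_MODEL_MAX_TOKENS", default=...)` for generation and `parse_positive_int_env("EMBEDDING_MODEL_MAX_TOKENS", default=...)` for embeddings, with defaults chosen per deployment.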

Tests

Ran:

python -m unittest test_token_helpers.py test_devcontainer_context.py
python -m compileall helpers main.py test_token_helpers.py test_devcontainer_context.py

Result:

Ran 5 tests in 1.277s
OK

Repository note

Algora still links the bounty to daytonaio/devcontainer-generator, but that GitHub repository currently returns 404 through the live GitHub API. This PR targets the public repository currently available for the codebase: nibzard/devcontainer-generator.

lucifery1234-create · 2026-06-10T06:02:28Z

This PR addresses the Algora bounty listed as daytonaio/devcontainer-generator#12:

Handle Different Context Lengths Between LLM and Embedding Models ($20)

The original linked GitHub repository, daytonaio/devcontainer-generator, currently returns 404. This PR targets the public repository currently available for the same codebase: nibzard/devcontainer-generator.

fix: separate llm and embedding token limits

e58e0c7

lucifery1234-create wants to merge 1 commit into nibzard:main from lucifery1234-create:fix-context-window-token-limits
