feat: Allow large file uploads (up to 100 MB) for code interpreter agents with LLM prompt truncation by Copilot · Pull Request #1 · GEBIT/onyx

Copilot · 2026-03-10T14:02:28Z

Previously, all uploaded files were subject to a token limit (~100k tokens) and rejected if exceeded. This adds a two-tier file handling strategy based on whether the target agent has code interpreter access.

Behavior changes

Agents with code interpreter (PythonTool):

Accept files up to 100 MB (configurable via CODE_INTERPRETER_MAX_FILE_SIZE_BYTES)
If a file exceeds 10,000 tokens (configurable via CODE_INTERPRETER_FILE_TOKEN_THRESHOLD), only the first and last 1,000 tokens (configurable via CODE_INTERPRETER_FILE_TOKEN_CONTEXT_SIZE) are injected into the LLM prompt, with a note that the full file is available to the code interpreter
Files are still fully stored and accessible to the code interpreter
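
The truncation rule in the list above can be sketched as follows. This is a minimal illustration, not the code from this PR: truncate_for_prompt, its parameters, the whitespace stand-in for a tokenizer, and the wording of the omission notice are all assumptions. Per the description, the actual logic lives in _truncate_file_text_for_code_interpreter() and works on real tokenizer output.

```python
from typing import Callable, List

# Defaults mirror the values quoted in the PR description.
TOKEN_THRESHOLD = 10_000  # CODE_INTERPRETER_FILE_TOKEN_THRESHOLD
CONTEXT_SIZE = 1_000      # CODE_INTERPRETER_FILE_TOKEN_CONTEXT_SIZE


def truncate_for_prompt(
    text: str,
    encode: Callable[[str], List[str]] = str.split,   # stand-in tokenizer (assumption)
    decode: Callable[[List[str]], str] = " ".join,    # stand-in detokenizer (assumption)
    threshold: int = TOKEN_THRESHOLD,
    context_size: int = CONTEXT_SIZE,
) -> str:
    """Keep the first and last `context_size` tokens of an oversized file."""
    tokens = encode(text)
    # Files at or below the threshold are passed through untouched.
    if len(tokens) <= threshold:
        return text
    # Nothing to omit if head and tail would already cover the whole file.
    if 2 * context_size >= len(tokens):
        return text
    head = decode(tokens[:context_size])
    tail = decode(tokens[len(tokens) - context_size:])
    omitted = len(tokens) - 2 * context_size
    # Hypothetical notice text; the real wording is not given in the description.
    notice = (
        f"[... {omitted} tokens omitted; "
        "the full file is available to the code interpreter ...]"
    )
    return f"{head}\n{notice}\n{tail}"
```

With threshold=10 and context_size=3, a 20-token input keeps tokens 0-2 and 17-19 and reports 14 omitted tokens, while an input of exactly 10 tokens is returned unchanged.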

Agents without code interpreter: unchanged — token limit rejection still applies.
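
The two tiers can be summarized as a single decision at upload time. The sketch below is illustrative only: rejection_reason and the message strings are hypothetical, the 100 MB cap is assumed to mean 100 * 1024 * 1024 bytes, and the token limit is rounded to 100,000 because the description gives only an approximate figure. The skip-threshold global override mentioned under testing is not modeled.

```python
from typing import Optional

# Assumed values; the description gives "100 MB" and "~100k tokens".
MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024
TOKEN_LIMIT = 100_000


def rejection_reason(
    size_bytes: int,
    token_count: int,
    has_code_interpreter: bool,
) -> Optional[str]:
    """Return why a file would be rejected, or None if it is accepted."""
    if has_code_interpreter:
        # Code interpreter agents: only the size cap applies.
        if size_bytes > MAX_FILE_SIZE_BYTES:
            return "file exceeds the size cap"
        return None
    # All other agents: the token limit still applies, as before.
    if token_count > TOKEN_LIMIT:
        return "file exceeds the token limit"
    return None
```

A 5 MB file with 2 million tokens is therefore accepted for an agent with the code interpreter and rejected for one without it.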

Implementation

Upload time (projects_file_utils.py): categorize_uploaded_files() receives a has_code_interpreter flag; when it is set, the function bypasses token-count rejection and enforces the 100 MB size cap instead
Chat time (chat_utils.py): convert_chat_history() receives has_code_interpreter and a tokenizer; it calls _truncate_file_text_for_code_interpreter() for oversized files, replacing the middle of the text with an omission notice
Persona detection (db/persona.py): persona_has_code_interpreter_tool() checks if a persona has PythonTool attached; called at upload time using persona_id passed from the frontend
Chat processing (process_message.py): detects PythonTool in the constructed tool list, passes has_code_interpreter=True and a tokenizer to convert_chat_history()
New config vars in app_configs.py: CODE_INTERPRETER_FILE_TOKEN_THRESHOLD, CODE_INTERPRETER_FILE_TOKEN_CONTEXT_SIZE, CODE_INTERPRETER_MAX_FILE_SIZE_BYTES
Frontend: uploadFiles() / beginUpload() now accept and forward personaId so the backend can gate the size limit per agent
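
For reference, the three new settings and the defaults quoted above could be read from the environment roughly like this. The helper names (_int_env, load_limits) are hypothetical, the byte value chosen for "100 MB" is an assumption, and the way app_configs.py actually parses these variables may differ.

```python
import os
from typing import Dict, Mapping


def _int_env(env: Mapping[str, str], name: str, default: int) -> int:
    """Read an integer setting, falling back to the default if unset or empty."""
    raw = env.get(name)
    return int(raw) if raw else default


def load_limits(env: Mapping[str, str] = os.environ) -> Dict[str, int]:
    """Collect the code interpreter file limits named in the PR description."""
    return {
        # Files above this token count are truncated in the LLM prompt.
        "CODE_INTERPRETER_FILE_TOKEN_THRESHOLD": _int_env(
            env, "CODE_INTERPRETER_FILE_TOKEN_THRESHOLD", 10_000
        ),
        # Number of tokens kept from the start and from the end of the file.
        "CODE_INTERPRETER_FILE_TOKEN_CONTEXT_SIZE": _int_env(
            env, "CODE_INTERPRETER_FILE_TOKEN_CONTEXT_SIZE", 1_000
        ),
        # Upload size cap; 100 MB is assumed to mean 100 * 1024 * 1024 bytes.
        "CODE_INTERPRETER_MAX_FILE_SIZE_BYTES": _int_env(
            env, "CODE_INTERPRETER_MAX_FILE_SIZE_BYTES", 100 * 1024 * 1024
        ),
    }
```

Passing the environment as a mapping keeps the sketch easy to exercise in a test without touching the real process environment.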

How Has This Been Tested?

Unit tests for _truncate_file_text_for_code_interpreter() (short/exact/large files, first+last token coverage)
Unit tests for categorize_uploaded_files() covering: size cap rejection for CI agents, token limit bypass for CI agents, token limit enforcement for non-CI agents, skip-threshold global override

Additional Options

[Optional] Please cherry-pick this PR to the latest release version.
[Optional] Override Linear Check

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

disposable.github.io
- Triggering command: /home/REDACTED/work/onyx/onyx/.venv/bin/pytest pytest tests/unit/ --ignore=tests/unit/model_server -x (dns block)
huggingface.co
- Triggering command: /home/REDACTED/work/onyx/onyx/.venv/bin/pytest pytest tests/unit/onyx/server/features/test_projects_file_utils.py -xvs (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Configure Actions setup steps to set up my environment, which run before the firewall is enabled
Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)


Co-authored-by: kleincode <10796927+kleincode@users.noreply.github.com>

Initial plan

4b0b3e9

Copilot AI assigned Copilot and kleincode Mar 10, 2026

Copilot started work on behalf of kleincode March 10, 2026 14:02

Implement large file support for code interpreter agents

4ecc55c

Co-authored-by: kleincode <10796927+kleincode@users.noreply.github.com>

Copilot started work on behalf of kleincode March 10, 2026 14:50

Add 100MB file size limit for code interpreter agents and new unit tests

06b58fe

Co-authored-by: kleincode <10796927+kleincode@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Update file upload limits for code interpreter access~~ feat: Allow large file uploads (up to 100 MB) for code interpreter agents with LLM prompt truncation Mar 10, 2026

Copilot finished work on behalf of kleincode March 10, 2026 15:10

kleincode marked this pull request as ready for review March 10, 2026 16:52

Copilot wants to merge 3 commits into release/v3.0 from copilot/update-file-upload-limits


2 participants
