feat(sdk): implement LangChain Code Interpreter sandbox provider#290
feat(sdk): implement LangChain Code Interpreter sandbox provider#290yashisrani wants to merge 5 commits intovolcano-sh:mainfrom
Conversation
Signed-off-by: Yash Israni <118755067+yashisrani@users.noreply.github.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Code Review
This pull request introduces LangChain integration for the AgentCube SDK through the new AgentCubeSandbox class and associated unit tests. Additionally, the write_file method has been updated to support both string and binary content. Feedback was provided regarding the implementation of asynchronous methods in the LangChain integration, which currently execute blocking synchronous calls and should be offloaded to separate threads to avoid blocking the event loop.
| async def aexecute( | ||
| self, | ||
| command: str, | ||
| *, | ||
| timeout: int | None = None, | ||
| ) -> ExecuteResponse: | ||
| """Async version of execute. Currently wraps synchronous call.""" | ||
| return self.execute(command, timeout=timeout) | ||
|
|
||
| async def aupload_files( | ||
| self, | ||
| files: list[tuple[str, bytes]], | ||
| ) -> list[FileUploadResponse]: | ||
| """Async version of upload_files. Currently wraps synchronous call.""" | ||
| return self.upload_files(files) | ||
|
|
||
| async def adownload_files(self, paths: list[str]) -> list[FileDownloadResponse]: | ||
| """Async version of download_files. Currently wraps synchronous call.""" | ||
| return self.download_files(paths) |
There was a problem hiding this comment.
The async methods aexecute, aupload_files, and adownload_files are currently implemented as synchronous blocking calls. Since the underlying CodeInterpreterClient uses the requests library for network I/O, these methods will block the entire event loop when called in an asynchronous context. This can lead to performance issues in applications that rely on non-blocking behavior (e.g., LangChain agents running multiple tasks concurrently).
Consider using asyncio.to_thread (available in Python 3.9+) to offload these blocking calls to a separate thread. This ensures that the event loop remains responsive.
There was a problem hiding this comment.
Pull request overview
Adds a Python SDK integration so AgentCube Code Interpreter sessions can be used as a LangChain/DeepAgents-compatible sandbox provider, including binary file transfer support.
Changes:
- Introduces
AgentCubeSandboximplementing the (DeepAgents)BaseSandboxinterface with sync + async methods. - Extends
write_fileto acceptbytesend-to-end and base64-encode raw bytes in the data-plane client. - Adds a unit test suite for the LangChain sandbox wrapper.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| sdk-python/agentcube/integrations/langchain.py | Adds AgentCubeSandbox adapter with optional DeepAgents dependency handling, execute/upload/download, and async wrappers |
| sdk-python/agentcube/integrations/init.py | Introduces the integrations package |
| sdk-python/agentcube/code_interpreter.py | Broadens write_file to accept `str |
| sdk-python/agentcube/clients/code_interpreter_data_plane.py | Updates write_file to base64-encode either text or raw bytes |
| sdk-python/tests/test_langchain_integration.py | Adds unit tests for AgentCubeSandbox behavior |
Comments suppressed due to low confidence (1)
sdk-python/agentcube/code_interpreter.py:206
CodeInterpreterClient.write_filenow acceptsbytes, but the existing unit tests forCodeInterpreterClientdon't cover passing binary content. Adding a test that passesbytesand asserts the data-planewrite_filecall (and/or correct request payload) would prevent regressions in binary transfer support.
def write_file(self, content: Union[str, bytes], remote_path: str):
"""
Write content to a file in the remote environment.
Args:
content: The string or binary content to write to the file.
remote_path: The destination path of the file in the remote environment.
This path is relative to the session's working directory.
"""
self.dp_client.write_file(content, remote_path)
| @patch("os.remove") | ||
| @patch("builtins.open", new_callable=MagicMock) | ||
| def test_download_files(self, mock_open, mock_remove): |
There was a problem hiding this comment.
In this test, patching os.remove prevents the temp file created by NamedTemporaryFile(delete=False) inside download_files() from actually being deleted, so the test leaks a real file into the system temp directory. Consider not mocking os.remove, or alternatively mock tempfile.NamedTemporaryFile/os.path.exists so no real file is created.
| @patch("os.remove") | |
| @patch("builtins.open", new_callable=MagicMock) | |
| def test_download_files(self, mock_open, mock_remove): | |
| @patch("builtins.open", new_callable=MagicMock) | |
| def test_download_files(self, mock_open): |
| # Internal types for base class compliance | ||
| try: | ||
| from deepagents.backends.protocol import ( | ||
| ExecuteResponse, | ||
| FileDownloadResponse, | ||
| FileUploadResponse, | ||
| ) | ||
| from deepagents.backends.sandbox import BaseSandbox | ||
| except ImportError: | ||
| # Define fallback classes if deepagents is not installed |
There was a problem hiding this comment.
Catching a broad ImportError here can unintentionally mask real import-time failures inside deepagents (e.g., transitive dependency issues or API changes) and silently fall back to the stub types, making debugging difficult. Prefer catching ModuleNotFoundError specifically for the missing optional dependency (and/or checking e.name), and re-raising other import errors with a helpful message.
|
|
||
| @property | ||
| def id(self) -> str: | ||
| """Return the unique session ID of the sandbox instance.""" | ||
| return self._client.session_id or "unknown" |
There was a problem hiding this comment.
id can change from a real session ID to the literal string "unknown" after the underlying CodeInterpreterClient.stop() clears session_id, which can break callers that rely on sandbox IDs being stable. Consider capturing the initial session ID in __init__ and always returning that, or raising if the sandbox has been stopped instead of returning a placeholder.
| @property | |
| def id(self) -> str: | |
| """Return the unique session ID of the sandbox instance.""" | |
| return self._client.session_id or "unknown" | |
| self._id = client.session_id or "unknown" | |
| @property | |
| def id(self) -> str: | |
| """Return the unique session ID of the sandbox instance.""" | |
| return self._id |
| async def aexecute( | ||
| self, | ||
| command: str, | ||
| *, | ||
| timeout: int | None = None, | ||
| ) -> ExecuteResponse: | ||
| """Async version of execute. Currently wraps synchronous call.""" | ||
| return self.execute(command, timeout=timeout) | ||
|
|
||
| async def aupload_files( | ||
| self, | ||
| files: list[tuple[str, bytes]], | ||
| ) -> list[FileUploadResponse]: | ||
| """Async version of upload_files. Currently wraps synchronous call.""" | ||
| return self.upload_files(files) | ||
|
|
||
| async def adownload_files(self, paths: list[str]) -> list[FileDownloadResponse]: | ||
| """Async version of download_files. Currently wraps synchronous call.""" | ||
| return self.download_files(paths) |
There was a problem hiding this comment.
The async methods (aexecute/aupload_files/adownload_files) currently call the synchronous implementations directly, which will block the event loop when used from async LangChain/DeepAgents code. If these methods are part of the expected async interface, run the sync work in a thread (e.g., via asyncio.to_thread/executor) or provide a truly async client implementation.
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #290 +/- ##
==========================================
+ Coverage 43.37% 47.57% +4.19%
==========================================
Files 30 30
Lines 2610 2819 +209
==========================================
+ Hits 1132 1341 +209
+ Misses 1355 1338 -17
- Partials 123 140 +17
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| return ExecuteResponse( | ||
| output=stdout, | ||
| exit_code=0, | ||
| truncated=False, | ||
| ) | ||
| except CommandExecutionError as e: | ||
| # Map AgentCube execution error | ||
| return ExecuteResponse( | ||
| output=e.stderr or "", | ||
| exit_code=e.exit_code, | ||
| truncated=False, | ||
| ) |
There was a problem hiding this comment.
Could we preserve combined stdout/stderr in ExecuteResponse instead of returning stdout on success and stderr on failure separately?
It's the useful context for the agent
There was a problem hiding this comment.
Would it make sense to add the standard LangChain sandbox integration tests as well?
https://docs.langchain.com/oss/python/contributing/implement-langchain
There was a problem hiding this comment.
That's necessary to provide a demo
| # Internal types for base class compliance | ||
| try: | ||
| from deepagents.backends.protocol import ( | ||
| ExecuteResponse, | ||
| FileDownloadResponse, | ||
| FileUploadResponse, | ||
| ) | ||
| from deepagents.backends.sandbox import BaseSandbox | ||
| except ImportError: | ||
| # Define fallback classes if deepagents is not installed |
hzxuzhonghu
left a comment
There was a problem hiding this comment.
Overall this is a clean LangChain integration, but there are a few correctness issues to address before merging.
Notable omission: pyproject.toml has no [project.optional-dependencies] section for deepagents / langchain. Since the import is guarded with a try/except, pip install agentcube-sdk silently installs a non-functional integration. Users have no documented way to install the required extras (e.g. pip install agentcube-sdk[langchain]). Please add:
[project.optional-dependencies]
langchain = ["deepagents", "langchain-core"]| exit_code=e.exit_code, | ||
| truncated=False, | ||
| ) | ||
| except Exception as e: |
There was a problem hiding this comment.
This except Exception catch-all silently converts any unexpected error — AttributeError, TypeError, transient network glitches, programming bugs — to exit_code=1. This makes production debugging very hard: a typo in the _client plumbing would surface as a successful-looking exit_code=1 response instead of a traceback.
Only catch errors you can meaningfully translate here. Re-raise anything else:
except CommandExecutionError as e:
return ExecuteResponse(output=e.stderr or "", exit_code=e.exit_code, truncated=False)
# Let unexpected exceptions propagate — don't swallow themIf you need a safety net, at minimum log the unexpected exception before returning.
| timeout: int | None = None, | ||
| ) -> ExecuteResponse: | ||
| """Async version of execute. Currently wraps synchronous call.""" | ||
| return self.execute(command, timeout=timeout) |
There was a problem hiding this comment.
All three async methods (aexecute, aupload_files, adownload_files) call blocking synchronous I/O directly on the event-loop thread. When a LangChain agent awaits any of these, the entire asyncio event loop stalls for the full round-trip HTTP latency, blocking every other coroutine.
Use asyncio.to_thread (Python ≥ 3.9) to offload to a thread pool:
import asyncio
async def aexecute(self, command: str, *, timeout: int | None = None) -> ExecuteResponse:
return await asyncio.to_thread(self.execute, command, timeout=timeout)
async def aupload_files(self, files: list[tuple[str, bytes]]) -> list[FileUploadResponse]:
return await asyncio.to_thread(self.upload_files, files)
async def adownload_files(self, paths: list[str]) -> list[FileDownloadResponse]:
return await asyncio.to_thread(self.download_files, paths)For Python < 3.9 compatibility use loop.run_in_executor(None, ...) instead.
| self.mock_client.write_file.assert_any_call(b"world", "test2.txt") | ||
|
|
||
| @patch("os.remove") | ||
| @patch("builtins.open", new_callable=MagicMock) |
There was a problem hiding this comment.
Two patches are missing here, causing the test to touch the real filesystem:
tempfile.NamedTemporaryFileis not patched → the test creates a real temp file in/tmp. Becauseos.removeis mocked to a no-op, that file is never cleaned up and accumulates across test runs.os.path.existsis not patched → in thefinallyblock, the realos.path.existsis called on the real temp path; if the cleanup ever does run it would hit the live filesystem.
Fix by also patching these in the decorator chain:
@patch("os.path.exists", return_value=True)
@patch("os.remove")
@patch("builtins.open", new_callable=MagicMock)
@patch("tempfile.NamedTemporaryFile")
def test_download_files(self, mock_tmpfile, mock_open, mock_remove, mock_exists):
mock_tmpfile.return_value.__enter__.return_value.name = "/tmp/fake_path"
...| if isinstance(content, str): | ||
| content_bytes = content.encode('utf-8') | ||
| else: | ||
| content_bytes = content |
There was a problem hiding this comment.
can you make this change separately in another pr?
There was a problem hiding this comment.
Please move the sandboxer provider in a separate dir. Utimately, i would like to see it is accepted in langchain similar as aws agent core.
There was a problem hiding this comment.
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| """LangChain integration for AgentCube Code Interpreter.""" |
There was a problem hiding this comment.
| """LangChain integration for AgentCube Code Interpreter.""" | |
| """AgentCube Code Interpreter sandbox integration for Langchain.""" |
There was a problem hiding this comment.
That's necessary to provide a demo
Signed-off-by: Yash Israni <118755067+yashisrani@users.noreply.github.com>
Signed-off-by: Yash Israni <118755067+yashisrani@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.
Comments suppressed due to low confidence (1)
sdk-python/agentcube/code_interpreter.py:206
CodeInterpreterClient.write_filenow acceptsbytes, but it forwards directly todp_client.write_file(...). The current data-plane implementation still treatscontentasstrand calls.encode(...), which will fail forbytesinput. Either update the data-planewrite_fileto handle bytes (base64 encode bytes directly) or convert bytes to the expected wire format here before delegating.
def write_file(self, content: Union[str, bytes], remote_path: str):
"""
Write content to a file in the remote environment.
Args:
content: The string or binary content to write to the file.
remote_path: The destination path of the file in the remote environment.
This path is relative to the session's working directory.
"""
self.dp_client.write_file(content, remote_path)
| if result["exit_code"] != 0: | ||
| raise CommandExecutionError( | ||
| exit_code=result["exit_code"], | ||
| stderr=result["stderr"], | ||
| stdout=stdout, | ||
| stderr=stderr, | ||
| command=command | ||
| ) |
There was a problem hiding this comment.
The raise CommandExecutionError(...) line is over-indented relative to the surrounding if block, which is easy to miss in reviews and can trip formatters/lint rules. Align the indentation to a single level inside the if result["exit_code"] != 0: block.
| client = CodeInterpreterClient() | ||
| sandbox = AgentCubeSandbox(client) | ||
|
|
||
| response = sandbox.execute("print('hello world')") |
There was a problem hiding this comment.
The usage example calls sandbox.execute("print('hello world')"), but execute runs a shell command in the sandbox. As written, this will fail on most systems unless there happens to be a print(...) executable. Use a shell command (e.g., python3 -c ... or echo ...) in the example to match the implementation.
| response = sandbox.execute("print('hello world')") | |
| response = sandbox.execute("python3 -c \"print('hello world')\"") |
| # Note: Using text content as SDK only supports str for now (per maintainer request) | ||
| files_to_upload = [ | ||
| ("greeting.txt", b"Hello LangChain!"), | ||
| ("config.json", b'{"status": "isolated"}') | ||
| ] |
There was a problem hiding this comment.
This comment says the SDK "only supports str for now", but the code passes bytes and this PR also updates write_file to accept bytes. Please update the comment (and/or the example payload type) to reflect the actual behavior so users don't get conflicting guidance.
| # If bytes, try to decode to string as write_file currently only supports str | ||
| # SDK support for raw bytes will be added in a separate PR. | ||
| if isinstance(content, bytes): | ||
| content = content.decode("utf-8") |
There was a problem hiding this comment.
upload_files decodes bytes to UTF-8 strings before calling write_file, but this PR updates CodeInterpreterClient.write_file to accept raw bytes and the LangChain file API is explicitly bytes. Decoding will corrupt binary data and can raise UnicodeDecodeError, and it also contradicts the method type hint (list[tuple[str, bytes]]). Pass bytes through unchanged (and only accept/convert str if needed).
| # If bytes, try to decode to string as write_file currently only supports str | |
| # SDK support for raw bytes will be added in a separate PR. | |
| if isinstance(content, bytes): | |
| content = content.decode("utf-8") |
| self.assertEqual(responses[0].content, file_content) | ||
| self.assertIsNone(responses[0].error) | ||
|
|
||
| self.mock_client.download_file.assert_called_once() |
There was a problem hiding this comment.
This test asserts download_file was called with no arguments, but AgentCubeSandbox.download_files calls self._client.download_file(path, tmp_path). Update the assertion to check the expected arguments (including the temp path) so the test actually validates the integration.
| self.mock_client.download_file.assert_called_once() | |
| self.mock_client.download_file.assert_called_once_with("remote1.txt", "/tmp/fake_path") |
| def __init__(self, exit_code: int, stdout: str, stderr: str, command: str = None): | ||
| self.exit_code = exit_code |
There was a problem hiding this comment.
command is annotated as str = None, which is an invalid type/default pairing and will be flagged by type checkers. Use an optional type (e.g., command: str | None = None) and consider widening it to match callers (the data-plane client can pass a list of argv).
Signed-off-by: Yash Israni <118755067+yashisrani@users.noreply.github.com>
There was a problem hiding this comment.
u can also add a e2e test based on it
Signed-off-by: Yash Israni <118755067+yashisrani@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (1)
sdk-python/agentcube/code_interpreter.py:206
CodeInterpreterClient.write_filenow advertisesUnion[str, bytes], but it forwards directly todp_client.write_file(...)which (currently) expectsstrand calls.encode('utf-8')on the content. Passingbyteswill raise anAttributeErrorat runtime. Either update the data-plane client'swrite_fileto acceptbytes(base64-encode raw bytes) or coercebytesto the required form here before delegating.
def write_file(self, content: Union[str, bytes], remote_path: str):
"""
Write content to a file in the remote environment.
Args:
content: The string or binary content to write to the file.
remote_path: The destination path of the file in the remote environment.
This path is relative to the session's working directory.
"""
self.dp_client.write_file(content, remote_path)
| files = [ | ||
| ("test1.txt", b"hello"), | ||
| ("test2.txt", b"world") | ||
| ] | ||
|
|
||
| responses = self.sandbox.upload_files(files) | ||
|
|
||
| self.assertEqual(len(responses), 2) | ||
| self.assertEqual(responses[0].path, "test1.txt") | ||
| self.assertIsNone(responses[0].error) | ||
| self.assertEqual(responses[1].path, "test2.txt") | ||
| self.assertIsNone(responses[1].error) | ||
|
|
||
| # Verify client calls | ||
| self.assertEqual(self.mock_client.write_file.call_count, 2) | ||
| self.mock_client.write_file.assert_any_call("hello", "test1.txt") | ||
| self.mock_client.write_file.assert_any_call("world", "test2.txt") |
There was a problem hiding this comment.
This test currently asserts write_file is called with decoded strings ("hello"/"world"), but the integration is intended to support binary transfers (bytes). Once AgentCubeSandbox.upload_files passes bytes through (instead of decoding), these assertions will fail and the test won't validate binary behavior. Update the test to assert bytes are forwarded and (optionally) add a case that includes non-UTF8 bytes to ensure true binary support.
| @pytest.fixture(scope="class") | ||
| def sandbox(self) -> Iterator[AgentCubeSandbox]: | ||
| """Provide a configured AgentCubeSandbox for testing. | ||
|
|
||
| Note: This currently uses a mocked backend to allow CI execution. | ||
| To test against a real backend, provide the necessary environment variables | ||
| (ROUTER_URL, etc.) and remove the mocking logic. | ||
| """ | ||
| # For standard integration tests, we provide a mocked client | ||
| # that simulates the behavior required by the test suite. | ||
| mock_client = MagicMock(spec=CodeInterpreterClient) | ||
| mock_client.session_id = "test-session-id" | ||
|
|
||
| # Simulate successful command execution | ||
| mock_client.execute_command.return_value = "standard output" | ||
|
|
||
| # Simulate file operations | ||
| mock_client.list_files.return_value = [] | ||
|
|
||
| # Return the sandbox with the mocked client | ||
| backend = AgentCubeSandbox(client=mock_client) |
There was a problem hiding this comment.
SandboxIntegrationTests are only enabled when langchain-tests is installed, but the provided sandbox fixture uses a very minimal MagicMock backend. If langchain-tests is present, the upstream standard suite is likely to exercise more behaviors than execute_command/list_files, and even if it passes it won't validate real AgentCube integration. Consider skipping these tests unless a real backend is configured via env vars, or provide a more complete fake that matches what SandboxIntegrationTests expects.
| import os | ||
| import pytest | ||
| from agentcube import CodeInterpreterClient | ||
| from agentcube.integrations.langchain import AgentCubeSandbox | ||
|
|
||
| # Skip these tests unless the E2E environment variables are set | ||
| pytestmark = pytest.mark.skipif( | ||
| not os.getenv("ROUTER_URL") or not os.getenv("WORKLOAD_MANAGER_URL"), | ||
| reason="E2E environment variables (ROUTER_URL, WORKLOAD_MANAGER_URL) not set" | ||
| ) | ||
|
|
||
| @pytest.fixture | ||
| async def sandbox(): | ||
| """Fixture to manage the lifecycle of an AgentCubeSandbox during E2E tests.""" | ||
| client = CodeInterpreterClient(name="e2e-test-sandbox", verbose=False) | ||
| sb = AgentCubeSandbox(client) | ||
| yield sb | ||
| # Cleanup after test | ||
| client.stop() | ||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_langchain_sandbox_e2e_flow(sandbox): |
There was a problem hiding this comment.
This file defines async def tests/fixtures with @pytest.mark.asyncio, but the repo's Python SDK CI workflow installs only pytest (no pytest-asyncio). Without the plugin, pytest will error at collection before the skipif can take effect. Either add a module-level pytest.importorskip("pytest_asyncio") (and use pytest_asyncio.fixture), or ensure pytest-asyncio is installed in the SDK test job so these tests can be safely collected.
| # If bytes, try to decode to string as write_file currently only supports str | ||
| # SDK support for raw bytes will be added in a separate PR. | ||
| if isinstance(content, bytes): | ||
| content = content.decode("utf-8") | ||
| self._client.write_file(content, path) |
There was a problem hiding this comment.
upload_files unconditionally decodes bytes as UTF-8 before calling write_file. This will corrupt arbitrary binary uploads and can raise UnicodeDecodeError for non-text content. Since the SDK is being updated to support bytes, prefer passing bytes through to self._client.write_file (and broaden the type to accept str | bytes if needed) rather than decoding here.
| def __init__(self, exit_code: int, stdout: str, stderr: str, command: str = None): | ||
| self.exit_code = exit_code | ||
| self.stdout = stdout | ||
| self.stderr = stderr | ||
| self.command = command |
There was a problem hiding this comment.
Type hint mismatch: command defaults to None but is annotated as str. This should be Optional[str] (or str | None) to match the actual allowed value and avoid type-checker errors.
hzxuzhonghu
left a comment
There was a problem hiding this comment.
can you also add a more natural example like https://docs.langchain.com/oss/python/integrations/sandboxes/aws#use-with-deep-agents
What type of PR is this?
What this PR does / why we need it:
Key Changes
LangChain Integration
SDK Enhancements
Testing
Testing Performed
Which issue(s) this PR fixes:
Fixes #278
Special notes for your reviewer:
Does this PR introduce a user-facing change?: