fix: restore error handling in deliberate execution service #298

Merged
ArthurCRodrigues merged 2 commits into main from fix/deliberate-execution-error-handling
Apr 22, 2026

Conversation

@ArthurCRodrigues
Member

The except block was accidentally removed during the asset injection refactor, causing unhandled exceptions (e.g. S3 asset not found) to propagate as raw 500 errors instead of structured SYSTEM_ERROR responses.

Copilot AI review requested due to automatic review settings April 21, 2026 22:33
Contributor

Copilot AI left a comment

Pull request overview

Restores structured error handling in the Deliberate Code Execution (DCE) service so unexpected failures (e.g., during asset resolution/injection or sandbox operations) return SYSTEM_ERROR results instead of propagating as unhandled 500s.

Changes:

  • Reintroduced a broad except block in the DCE service to catch unexpected exceptions.
  • Logs the exception with traceback and returns a DeliberateCodeExecutionResponse containing a SYSTEM_ERROR result.
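
The two changes above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the dataclasses are simplified stand-ins for the real response models, and `execute` and `run_in_sandbox` are hypothetical names introduced only for this example.

```python
import logging
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional

logger = logging.getLogger("dce")


class ResponseCategory(Enum):
    # Simplified stand-in; the real enum likely has more members.
    SUCCESS = "success"
    SYSTEM_ERROR = "system_error"


@dataclass
class DeliberateCodeExecutionResult:
    output: str
    category: ResponseCategory
    error_message: Optional[str] = None
    execution_time: float = 0.0


@dataclass
class DeliberateCodeExecutionResponse:
    results: List[DeliberateCodeExecutionResult] = field(default_factory=list)


def execute(run_in_sandbox: Callable[[], str]) -> DeliberateCodeExecutionResponse:
    """Hypothetical service entry point wrapping the sandbox call."""
    try:
        output = run_in_sandbox()
        return DeliberateCodeExecutionResponse(
            results=[
                DeliberateCodeExecutionResult(
                    output=output, category=ResponseCategory.SUCCESS
                )
            ]
        )
    except Exception as e:
        # logger.exception records the message together with the traceback.
        logger.exception("Deliberate code execution failed")
        return DeliberateCodeExecutionResponse(
            results=[
                DeliberateCodeExecutionResult(
                    output="",
                    category=ResponseCategory.SYSTEM_ERROR,
                    error_message=f"Execution failed: {e}",
                    execution_time=0.0,
                )
            ]
        )
```

With this shape, a failure such as a missing S3 asset surfaces to the caller as a structured `SYSTEM_ERROR` result rather than an unhandled exception.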

Comment on lines +190 to +199
    return DeliberateCodeExecutionResponse(
        results=[
            DeliberateCodeExecutionResult(
                output="",
                category=ResponseCategory.SYSTEM_ERROR,
                error_message=f"Execution failed: {str(e)}",
                execution_time=0.0
            )
        ]
    )
Copilot AI Apr 21, 2026

On exception this returns a response with exactly one result, which breaks the service’s normal contract of returning one result per provided test case (and can confuse clients when request.test_cases has length > 1). Consider building the results list to match the number of requested test cases (e.g., len(request.test_cases or [[]])), or preserving any already-computed results and appending SYSTEM_ERROR entries for the remaining cases.
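
One way to keep the one-result-per-test-case contract on failure is sketched below. It is only an illustration of the second suggestion (preserve computed results, pad the rest): `Result` is a simplified stand-in for `DeliberateCodeExecutionResult`, and `pad_with_system_errors` is a hypothetical helper, not something that exists in the service.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Result:
    # Hypothetical, simplified stand-in for DeliberateCodeExecutionResult.
    output: str
    category: str
    error_message: Optional[str] = None
    execution_time: float = 0.0


def pad_with_system_errors(
    test_cases: Optional[Sequence[Sequence[str]]],
    completed: Sequence[Result],
    message: str,
) -> List[Result]:
    """Keep already-computed results and fill the remainder with SYSTEM_ERROR.

    A missing or empty test_cases list is treated as a single implicit case,
    mirroring the `len(request.test_cases or [[]])` idea from the review.
    """
    expected = len(test_cases or [[]])
    missing = max(expected - len(completed), 0)
    padding = [
        Result(output="", category="SYSTEM_ERROR", error_message=message)
        for _ in range(missing)
    ]
    return list(completed) + padding
```

For example, if three test cases were requested and the first completed before the failure, the helper returns that result followed by two `SYSTEM_ERROR` entries, so clients can still index results by test case.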

Comment on lines +193 to +196
    output="",
    category=ResponseCategory.SYSTEM_ERROR,
    error_message=f"Execution failed: {str(e)}",
    execution_time=0.0
Copilot AI Apr 21, 2026

The response error_message includes the raw exception string. Since this endpoint appears to be unauthenticated and potentially student-facing, this can leak internal details (e.g., S3 bucket/key names, filesystem paths). Safer approach: log the full exception (as you already do) but return a generic client message (optionally include a correlation/request ID) unless an explicit debug/internal flag is enabled.
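
A rough sketch of that approach, under stated assumptions: `client_error_message`, the `debug` flag, and the `(ref: ...)` suffix are hypothetical and chosen only for illustration, and the generic wording is a placeholder that the service could replace with its own.

```python
import logging
import uuid

logger = logging.getLogger("dce")

# Placeholder wording; an assumption made for this sketch.
GENERIC_MESSAGE = "An unexpected error occurred. Please try again later."


def client_error_message(exc: Exception, debug: bool = False) -> str:
    """Log full details server-side and return a message safe for clients."""
    correlation_id = uuid.uuid4().hex
    # Passing the exception as exc_info attaches its traceback to the log record.
    logger.error("Unhandled DCE error [%s]", correlation_id, exc_info=exc)
    if debug:
        # Only for explicitly enabled internal/debug use.
        return f"Execution failed: {exc} (ref: {correlation_id})"
    return f"{GENERIC_MESSAGE} (ref: {correlation_id})"
```

The correlation ID lets support staff match a client report to the server log entry without exposing bucket names, keys, or filesystem paths in the response.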

Member Author

@copilot apply changes based on this feedback

Contributor

Done in fb48eab. The raw exception string is no longer included in the response — the client now receives the generic message "An unexpected error occurred. Please try again later." while the full exception details continue to be logged server-side. Also addressed the first reviewer comment in the same commit: the results list is now sized to match len(request.test_cases) (or 1 when none are provided), keeping the response contract intact across all error scenarios.

… test cases

Agent-Logs-Url: https://github.com/webtech-network/autograder/sessions/f9b1d561-a098-4f12-b1b6-85aaee79515b

Co-authored-by: ArthurCRodrigues <167308923+ArthurCRodrigues@users.noreply.github.com>
Member Author

ArthurCRodrigues left a comment

OK

ArthurCRodrigues merged commit 71f0231 into main on Apr 22, 2026
2 checks passed
ArthurCRodrigues deleted the fix/deliberate-execution-error-handling branch on April 22, 2026 at 13:26

3 participants