Add a new agentic evaluation using mini-swe-agent

mini-swe-agent is a simple agent harness that interacts with tools through bash, either in the local environment or in a Docker image. We want to use this harness to build an agentic evaluation for HLS design tasks.

The simplest integration approach is to build a new evaluator for generation and editing that:

- Sets up the design directory
- Runs the agent in the design directory to generate or edit kernel code
- Evaluates all pass metrics on our end (extraction, compile, testbench, synthesis)
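The three steps above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `setup_design_dir`, `evaluate_design`, `EvalResult`, and the `AgentRunner` and `Check` callables are all hypothetical names, and the actual mini-swe-agent invocation is deliberately left behind an injected callable rather than guessed at. Skipping later stages once an earlier one fails is also an assumption.

```python
import shutil
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Dict

# Hypothetical: runs the agent inside the design directory with a task prompt.
AgentRunner = Callable[[Path, str], None]
# Hypothetical: inspects the design directory and reports pass/fail for one stage.
Check = Callable[[Path], bool]


@dataclass
class EvalResult:
    # Maps a stage name (e.g. "extraction", "compile") to whether it passed.
    passed: Dict[str, bool] = field(default_factory=dict)


def setup_design_dir(sources: Dict[str, str]) -> Path:
    """Create a fresh design directory and write the given files into it."""
    design_dir = Path(tempfile.mkdtemp(prefix="hls_design_"))
    for name, content in sources.items():
        (design_dir / name).write_text(content)
    return design_dir


def evaluate_design(
    sources: Dict[str, str],
    task: str,
    run_agent: AgentRunner,
    checks: Dict[str, Check],
) -> EvalResult:
    """Set up the directory, run the agent, then score each stage on our side."""
    design_dir = setup_design_dir(sources)
    try:
        run_agent(design_dir, task)
        result = EvalResult()
        earlier_failed = False
        # Checks run in insertion order; once one fails, later stages are
        # recorded as failed without running (an assumption of this sketch).
        for name, check in checks.items():
            ok = (not earlier_failed) and check(design_dir)
            result.passed[name] = ok
            earlier_failed = earlier_failed or not ok
        return result
    finally:
        shutil.rmtree(design_dir, ignore_errors=True)
```

Keeping the agent behind a plain callable means the same evaluator can serve both generation (empty kernel file) and editing (existing kernel file) tasks, and the agent can be replaced by a stub in tests.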

The agent should be able to run a standard C++ compiler in its environment to check that the code compiles and that the testbench passes. For now, however, the agent will not call Vitis HLS in its environment: synthesis runs take non-trivial time, and integrating Vitis HLS into the Docker image where the agent runs requires significant engineering. These challenges are being addressed, but the initial agentic evaluation will forgo this extra complexity.
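As an illustration of the compiler-only check described above, the following sketch compiles a kernel and its testbench with a plain C++ compiler and runs the resulting binary. The names `compile_and_run` and `LocalCheck`, the `g++` default, the `-std=c++17` flag, and the convention that a zero exit code means the testbench passed are all assumptions made for this example. Real HLS kernels may also need vendor headers on the include path, which this sketch does not handle.

```python
import shutil
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class LocalCheck:
    compiled: bool
    testbench_passed: bool
    log: str


def compile_and_run(
    design_dir: Path,
    sources: List[str],
    compiler: str = "g++",  # assumption: any standard C++ compiler on PATH
    timeout_s: int = 60,
) -> LocalCheck:
    """Compile the given sources in design_dir and run the testbench binary."""
    if shutil.which(compiler) is None:
        return LocalCheck(False, False, f"{compiler} not found on PATH")

    binary = (design_dir / "tb.out").resolve()
    build = subprocess.run(
        [compiler, "-std=c++17", *sources, "-o", str(binary)],
        cwd=design_dir, capture_output=True, text=True,
    )
    if build.returncode != 0:
        return LocalCheck(False, False, build.stderr)

    try:
        run = subprocess.run(
            [str(binary)], cwd=design_dir,
            capture_output=True, text=True, timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return LocalCheck(True, False, "testbench timed out")

    # Assumption: the testbench signals success with exit code 0.
    return LocalCheck(True, run.returncode == 0, run.stdout + run.stderr)
```

The same function could back both the agent's in-environment feedback and the evaluator's compile and testbench metrics, while synthesis stays on the evaluator side only.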
