Add a new agentic evaluation using mini-swe-agent #2

@stefanpie

mini-swe-agent is a simple agent harness that interacts with tools through bash, either in the local environment or in a Docker image. We want to use this harness to build an agentic evaluation for HLS design tasks.

The simplest integration approach is to build a new evaluator for generation and editing that:

  • Sets up the design directory
  • Runs the agent in the design directory to generate or edit kernel code
  • Evaluates all pass metrics on our end (extraction, compile, testbench, synthesis)
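The three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `run_agentic_eval`, `run_agent`, and the `stages` mapping are hypothetical names, the agent is passed in as a plain callable rather than through any real mini-swe-agent API, and marking later metrics as failed once an earlier one fails is an assumed policy, not something decided in this issue.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict

# A pass-metric check: takes the design directory, returns True on pass.
Stage = Callable[[Path], bool]


def run_agentic_eval(
    design_files: Dict[str, str],
    run_agent: Callable[[Path], None],
    stages: Dict[str, Stage],
) -> Dict[str, bool]:
    """Set up a design directory, run the agent in it, then score it.

    `run_agent` is a hypothetical stand-in for whatever launches the
    agent harness; `stages` maps metric names (e.g. extraction, compile,
    testbench, synthesis) to checks that the evaluator runs itself.
    """
    # 1. Set up the design directory with the task's starting files.
    workdir = Path(tempfile.mkdtemp(prefix="hls_eval_"))
    for name, text in design_files.items():
        path = workdir / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text)

    # 2. Let the agent generate or edit kernel code in that directory.
    run_agent(workdir)

    # 3. Evaluate pass metrics in order. Assumed policy: once one stage
    #    fails, later stages are recorded as failed without being run.
    results: Dict[str, bool] = {}
    failed = False
    for name, check in stages.items():
        results[name] = False if failed else check(workdir)
        failed = failed or not results[name]
    return results
```

Passing the agent and the checks in as callables keeps the evaluator independent of how the agent is launched (local or Docker) and lets the synthesis check stay on the evaluator side.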

The agent should be able to run a normal C++ compiler in its environment to check compilability and run the testbench. For now, however, the agent will not call Vitis HLS in its environment: synthesis takes non-trivial time to run, and integrating Vitis HLS into the Docker image where the agent runs requires significant engineering. These engineering challenges are being addressed, but the initial agentic evaluation will forgo this extra complexity.
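As a rough sketch of the compile and testbench check that works with only a normal C++ compiler, something like the following could be used. The function name, the `g++` default, the `-std=c++17` flag, and the 60-second timeout are all assumptions for illustration. It also assumes that any headers the kernel needs are available on the compiler's include path, and that the testbench signals success with exit code 0.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Tuple


def check_compile_and_testbench(
    workdir: Path,
    sources: List[str],
    compiler: str = "g++",
    timeout_s: int = 60,
) -> Tuple[bool, bool]:
    """Return (compiled, testbench_passed) for the sources in workdir."""
    # A missing compiler is an environment problem, not a design failure,
    # so raise instead of reporting a failed compile.
    if shutil.which(compiler) is None:
        raise FileNotFoundError(f"compiler not found on PATH: {compiler}")

    workdir = workdir.resolve()
    binary = workdir / "testbench.out"

    # Compile the kernel and testbench sources into one executable.
    build = subprocess.run(
        [compiler, "-std=c++17", *sources, "-o", str(binary)],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    if build.returncode != 0:
        return (False, False)

    # Run the testbench; a hang counts as a testbench failure.
    try:
        run = subprocess.run(
            [str(binary)],
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return (True, False)
    return (True, run.returncode == 0)
```

The same check could be run by the agent while it iterates and again by the evaluator afterwards, with synthesis remaining an evaluator-only step.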
