Bring your own judge functionality

The current FMBench evaluation implementation calls judge models on Amazon Bedrock via litellm. To add bring-your-own-judge functionality, we will need to replace this with a base evaluator class that makes predictions and calculates their cost (similar to the FMBenchPredictor base class).

With this in place, customers will be able to evaluate models using their own judge LLMs, configured to suit their own evaluation needs.
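
As a rough illustration of the proposal, a base evaluator class might look something like the sketch below. Everything here is an assumption made for discussion: the names `FMBenchEvaluator`, `JudgeResponse`, `get_evaluation`, `calculate_cost`, and the per-1K-token pricing fields are hypothetical and do not reflect existing FMBench code or the actual FMBenchPredictor interface. The `KeywordJudge` subclass is a toy stand-in that makes no LLM call; a real judge would call its own endpoint inside `get_evaluation`.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class JudgeResponse:
    """Hypothetical container for a judge's verdict and token usage."""
    verdict: str
    prompt_tokens: int
    completion_tokens: int


class FMBenchEvaluator(ABC):
    """Hypothetical base class for judges (name and methods are assumptions)."""

    def __init__(self, input_price_per_1k: float, output_price_per_1k: float):
        # Assumed pricing model: a flat price per 1,000 tokens, in and out.
        self.input_price_per_1k = input_price_per_1k
        self.output_price_per_1k = output_price_per_1k

    @abstractmethod
    def get_evaluation(self, prompt: str) -> JudgeResponse:
        """Send the evaluation prompt to the judge and return its response."""

    def calculate_cost(self, response: JudgeResponse) -> float:
        """Compute the cost of one judge call from its token counts."""
        return (
            response.prompt_tokens / 1000 * self.input_price_per_1k
            + response.completion_tokens / 1000 * self.output_price_per_1k
        )


class KeywordJudge(FMBenchEvaluator):
    """Toy judge used only to show the shape of a custom subclass."""

    def get_evaluation(self, prompt: str) -> JudgeResponse:
        # A real implementation would call the customer's own judge LLM here.
        verdict = "correct" if "paris" in prompt.lower() else "incorrect"
        # Whitespace word count stands in for a real tokenizer.
        return JudgeResponse(verdict, len(prompt.split()), 1)


if __name__ == "__main__":
    judge = KeywordJudge(input_price_per_1k=0.5, output_price_per_1k=1.5)
    result = judge.get_evaluation("The capital of France is Paris")
    print(result.verdict, judge.calculate_cost(result))
```

Keeping the cost calculation in the base class and leaving only `get_evaluation` abstract would let a custom judge reuse the same accounting, though a judge with a different pricing scheme could override `calculate_cost` as well.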

Bring your own judge functionality #283
