MLOps and machine learning

LLM evaluation harness

intermediate

Regression-test prompts and models.

Suggested stack

promptfoo or deepeval, a dataset, CI

Proves

evaluation, cost control

Milestones
  1. Golden set
  2. Metrics
  3. CI gate
  4. Cost report
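
To show how the four milestones might fit together, here is a minimal sketch in plain Python. It does not use promptfoo or deepeval. Everything in it is an assumption made for illustration: the three golden cases, the `fake_model` stand-in, the whitespace token count, the exact-match metric, the 0.6 accuracy threshold, and the price per 1,000 tokens. A real harness would replace each of these with the project's own dataset, model client, tokenizer, metrics, and provider pricing.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Case:
    prompt: str
    expected: str


# Milestone 1: a hypothetical golden set. In practice this would live in a
# versioned file next to the prompts it tests.
GOLDEN_SET: List[Case] = [
    Case("What is 2 + 2?", "4"),
    Case("Capital of France?", "Paris"),
    Case("Opposite of hot?", "cold"),
]

# Assumed price for illustration only; substitute the provider's real rate.
PRICE_PER_1K_TOKENS_USD = 0.002


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    answers = {
        "What is 2 + 2?": "4",
        "Capital of France?": "Paris",
        "Opposite of hot?": "warm",  # deliberately wrong, to exercise the gate
    }
    return answers.get(prompt, "")


def count_tokens(text: str) -> int:
    """Crude whitespace count; a real harness would use the model's tokenizer."""
    return len(text.split())


def evaluate(model: Callable[[str], str], cases: List[Case]) -> Dict[str, float]:
    """Milestones 2 and 4: exact-match accuracy plus a token and cost tally."""
    passed = 0
    tokens = 0
    for case in cases:
        output = model(case.prompt)
        tokens += count_tokens(case.prompt) + count_tokens(output)
        if output.strip().lower() == case.expected.strip().lower():
            passed += 1
    return {
        "accuracy": passed / len(cases),
        "tokens": tokens,
        "cost_usd": tokens / 1000 * PRICE_PER_1K_TOKENS_USD,
    }


def ci_gate(report: Dict[str, float], min_accuracy: float) -> bool:
    """Milestone 3: pass only if accuracy meets the threshold."""
    return report["accuracy"] >= min_accuracy


def main(min_accuracy: float = 0.6) -> int:
    """Return a process exit code: 0 if the gate passes, 1 otherwise."""
    report = evaluate(fake_model, GOLDEN_SET)
    print(f"accuracy={report['accuracy']:.2f} "
          f"tokens={report['tokens']} cost_usd={report['cost_usd']:.6f}")
    return 0 if ci_gate(report, min_accuracy) else 1


report = evaluate(fake_model, GOLDEN_SET)
```

In a CI job, the script could end with `raise SystemExit(main())` so that a drop below the threshold fails the build. Exact match is the simplest possible metric. It is used here only to keep the sketch short, and most real golden sets need something more tolerant.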