Quick Start
Get your first eval running in under 5 minutes. No signup required.
Step 1 — Run locally first
npx agentura@latest init
npx agentura@latest run --local
init creates an agentura.yaml config and a starter eval dataset in your project. run --local runs your evals on your machine, with no GitHub App, no cloud, and no login required.
Step 2 — Point it at your agent
Open agentura.yaml and update the endpoint:
version: 1
agent:
  type: http
  endpoint: https://your-agent.example.com/invoke
  timeout_ms: 10000
evals:
  - name: accuracy
    type: golden_dataset
    dataset: ./evals/accuracy.jsonl
    scorer: fuzzy_match
    threshold: 0.8
ci:
  block_on_regression: false
  compare_to: main
  post_comment: true
Your agent needs to accept POST requests with {"input": "..."} and return {"output": "..."}. No SDK required.
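To make the request and response contract concrete, here is a minimal sketch of an agent endpoint using only the Python standard library. The answer function, the /invoke path, and port 8000 are illustrative assumptions rather than anything Agentura requires; the only part taken from the text above is the {"input": "..."} request and {"output": "..."} response shape.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def answer(text: str) -> str:
    """Hypothetical stand-in for your real agent logic."""
    return f"echo: {text}"


def handle_payload(body: bytes) -> dict:
    """Turn a raw {"input": "..."} request body into an {"output": "..."} reply."""
    payload = json.loads(body)
    return {"output": answer(payload["input"])}


class InvokeHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # The /invoke path mirrors the example endpoint in agentura.yaml (assumption).
        if self.path != "/invoke":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            reply = handle_payload(self.rfile.read(length))
        except (json.JSONDecodeError, KeyError, TypeError):
            self.send_error(400, "invalid request body")
            return
        data = json.dumps(reply).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8000) -> None:
    """Start the server; the port is an arbitrary choice for local testing."""
    HTTPServer(("127.0.0.1", port), InvokeHandler).serve_forever()
```

Calling serve() starts the server. Setting endpoint to http://127.0.0.1:8000/invoke in agentura.yaml would then target it, assuming a local URL is accepted there.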
Step 3 — Create your eval dataset
{"input": "what is the return policy", "expected": "30 days"}
{"input": "what is the capital of France", "expected": "Paris"}
{"input": "what color is the sky", "expected": "blue"}
Save this as evals/accuracy.jsonl.
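Before running, it can help to check that every line of the dataset parses. The sketch below is a hypothetical helper, not part of Agentura. It assumes each row is a JSON object with string "input" and "expected" fields, as in the rows above; the real schema may allow more.

```python
import json


def validate_dataset(lines):
    """Return a list of (line_number, problem) for rows that look malformed."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(row, dict):
            problems.append((number, "row is not a JSON object"))
            continue
        # Assumed schema: both fields present and both strings.
        for key in ("input", "expected"):
            if not isinstance(row.get(key), str):
                problems.append((number, f"missing or non-string '{key}'"))
    return problems
```

Pass it any iterable of lines, for example an open file handle for evals/accuracy.jsonl. An empty list means no problems were found.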
Step 4 — Run and see results
npx agentura@latest run --local
You'll see pass/fail per case, scores against your baseline, and any regressions flagged.
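For intuition about how a fuzzy score and a threshold can interact, here is a rough sketch built on Python's difflib. It is not Agentura's fuzzy_match scorer, whose algorithm is not described here. It also assumes the threshold is compared against the mean score of the suite, which may differ from how the tool actually applies it.

```python
from difflib import SequenceMatcher


def similarity(expected: str, actual: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    a = expected.lower().strip()
    b = actual.lower().strip()
    return SequenceMatcher(None, a, b).ratio()


def score_suite(cases, threshold=0.8):
    """Score (expected, actual) pairs; return per-case scores, mean, and pass flag."""
    scores = [similarity(expected, actual) for expected, actual in cases]
    mean = sum(scores) / len(scores) if scores else 0.0
    # Assumption: the suite passes when the mean score meets the threshold.
    return scores, mean, mean >= threshold
```

Here cases is a list of (expected, actual) pairs, where actual is whatever your agent returned in "output".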
Step 5 (optional) — Add to CI with the GitHub App
Once local evals are working, install the GitHub App to run evals automatically on every pull request:
Install Agentura GitHub App →
After install, push agentura.yaml to your repo and open a PR. Results appear as a GitHub Check Run and PR comment within 30 seconds.