Submissions are open for AgenticVBench v1.0. Both vendor-native and open-source harnesses are eligible.
01
Run AgenticVBench locally
Install the CLI and run the benchmark against your agent. The harness writes scored rollouts and trajectory artifacts to a single output directory.
pip install agenticvbench
agvb run \
  --model your-model \
  --harness your-harness \
  --tasks all \
  --output ./runs/your-agent-2026-05-13
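The run step can be wrapped in a small script so the output directory always follows the dated naming pattern used above. This is a minimal sketch, not part of the AgenticVBench CLI: `run_dir`, `run_bench`, and the `DRY_RUN` variable are hypothetical names introduced here, and only the `agvb run` flags already shown on this page are used.

```shell
set -eu

# Hypothetical helper: build an output path like ./runs/<agent>-<YYYY-MM-DD>.
# The second argument (the date) defaults to today.
run_dir() {
  agent="$1"
  day="${2:-$(date +%Y-%m-%d)}"
  printf './runs/%s-%s\n' "$agent" "$day"
}

# Hypothetical wrapper around the `agvb run` flags shown above.
# With DRY_RUN=1 it prints the command instead of executing it.
run_bench() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "agvb run --model $1 --harness $2 --tasks all --output $3"
  else
    agvb run --model "$1" --harness "$2" --tasks all --output "$3"
  fi
}
```

Setting `DRY_RUN=1` before calling `run_bench your-model your-harness "$(run_dir your-agent)"` prints the command without starting a run, which is a cheap way to check the path first.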
02
Validate the run
Confirm that all 100 tasks completed, that scores are within the expected range, and that the trajectory archive is intact. Validation is deterministic, so repeating it on the same run directory gives the same result.
agvb validate ./runs/your-agent-2026-05-13
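Before calling the validator, it can help to check that the run directory exists and is not empty, so a mistyped path fails early with a clear message. The sketch below is illustrative only: `run_dir_ready` and `validate_run` are hypothetical helpers, and it assumes (this page does not confirm it) that `agvb validate` exits with a non-zero status when validation fails.

```shell
set -eu

# Hypothetical pre-check: succeed only if the directory exists and is non-empty.
run_dir_ready() {
  [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Hypothetical wrapper: run the validator only when the pre-check passes.
# Assumption: `agvb validate` returns non-zero on a failed validation.
validate_run() {
  if ! run_dir_ready "$1"; then
    echo "run directory missing or empty: $1" >&2
    return 1
  fi
  agvb validate "$1"
}
```

A call such as `validate_run ./runs/your-agent-2026-05-13` then either reports the path problem or passes the validator's own exit status through.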
03
Submit
Upload the run archive using the form below. We review submissions within 3 business days. Accepted runs appear on the leaderboard and link to your trajectory bundle.
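If the run directory still needs to be packed into a single file before upload, something like the following may work. This page does not state the expected archive format, so the `.tar.gz` choice is an assumption, and `pack_run` is a hypothetical helper rather than an `agvb` command.

```shell
set -eu

# Hypothetical helper: pack a run directory into <dir>.tar.gz beside it
# and print the archive path. The .tar.gz format is an assumption.
pack_run() {
  run="${1%/}"                      # drop a trailing slash, if any
  parent="$(dirname "$run")"
  name="$(basename "$run")"
  tar -czf "$parent/$name.tar.gz" -C "$parent" "$name"
  printf '%s\n' "$parent/$name.tar.gz"
}
```

For example, `pack_run ./runs/your-agent-2026-05-13` would write `./runs/your-agent-2026-05-13.tar.gz` with the run directory as its single top-level entry.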