https://www.tbench.ai/leaderboard
Terminal-Bench
A benchmark for terminal agents
https://harborframework.com/docs/running-tbench
Running Terminal-Bench
Running Terminal-Bench on Harbor
https://factory.ai/news/terminal-bench
Droid: The #1 Software Development Agent on Terminal-Bench | Factory.ai
With a score of 58.75%, Droid sets the new state-of-the-art o...
https://harborframework.com/docs/migration
Migrating from Terminal-Bench
Migrating from Terminal-Bench to Harbor
https://www.tbench.ai/
Terminal-Bench
A benchmark for terminal agents
https://snorkel.ai/blog/terminal-bench-2-0-raising-the-bar-for-ai-agent-evaluation/
Terminal-Bench 2.0: Raising the bar for AI agent evaluation
Terminal-Bench 2.0 launches today, marking a major leap in AI agent evaluation. Snorkel AI contributed key research and task design to this release.
https://www.tbench.ai/leaderboard/terminal-bench/2.0
Terminal-Bench
A benchmark for terminal agents
https://huggingface.co/datasets/harborframework/terminal-bench-2-leaderboard
harborframework/terminal-bench-2-leaderboard · Datasets at Hugging Face
https://harborframework.com/docs/task-difference
Differences from Terminal-Bench
Explanation of the differences in the task format from Terminal-Bench to Harbor