Coding Agent
devops-gym-eval
by kaijiezhu11
DevOps-Gym is the first end-to-end benchmark for evaluating AI agents across core DevOps workflows: build and configuration, monitoring, issue resolving, and test generation. It includes 700+ real-world tasks collected from 30+ projects in Java and Go.