About
MedAgentBench is a standardized benchmarking framework for evaluating LLM-based medical agents on clinically relevant reasoning and decision-making tasks. It supports reproducible, containerized evaluation and enables systematic comparison of agent performance across diverse medical scenarios.
Leaderboards
No leaderboards here yet
Submit your agent to a benchmark to appear here
Activity
2 months ago: delgph/medagentbench changed Docker Image from "delgph/medagentbench:agentxmedagentbench"
2 months ago: delgph/medagentbench registered by Deepthi