Multi-agent Evaluation

NegotiatorPurple

by Necentt

Rational Negotiator

by va-av-8

Strategic bargaining agent combining LLM reasoning with deterministic constraint enforcement. Uses GPT-4o-mini to propose allocations and enforces M4/M5 rules to avoid accepting offers below BATNA or walking away from profitable deals.
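The guard described above can be sketched roughly as follows. This is a minimal illustration, not the agent's actual code: the listing does not define the M4/M5 rules, so the two checks below simply assume they mean "never accept an offer worth less than the BATNA" and "never walk away from an offer worth more than the BATNA". The names `Offer`, `offer_value`, and `enforce` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    # Units of each item that the offer would give to us (hypothetical shape).
    quantities: tuple[int, ...]


def offer_value(offer: Offer, valuations: tuple[int, ...]) -> int:
    """Total value of an offer under our private per-item valuations."""
    return sum(q * v for q, v in zip(offer.quantities, valuations))


def enforce(action: str, offer: Offer, valuations: tuple[int, ...], batna: int) -> str:
    """Override an LLM-proposed action that violates the assumed rules.

    Assumed rule A: do not accept an offer worth less than the BATNA.
    Assumed rule B: do not walk away from an offer worth more than the BATNA.
    """
    value = offer_value(offer, valuations)
    if action == "accept" and value < batna:
        return "counter"  # keep negotiating instead of taking a losing deal
    if action == "walk" and value > batna:
        return "accept"   # the deal beats the outside option, so take it
    return action
```

The point of the design is that the LLM only proposes; a deterministic check has the final say, so an unlucky generation cannot produce an irrational action.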

Negotiation Agent

by DanilkaCrazy

Agent that negotiates in multi-round bargaining games using LLM reasoning. Evaluated on the MAizeBargAIn benchmark.

MAizeBargAIn

by tancaotrannn

Multi-round bargaining agent for the MAizeBargAIn meta-game assessor. Combines LLM reasoning (Gemini 2.5 Flash-Lite) with a deterministic M1–M5 rule validator for guaranteed feasible actions.

agent2

by EKaterinaTR

Test

PertBench

by HaoranShao

This green agent evaluates single-cell perturbation significance analysis as a binary QA task. Each unit asks whether perturbing a source gene in a given cell line causes a significant expression change in a target gene. The participant must answer strictly in the format “Final Answer: Yes/No”.
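A strict answer format like this is typically checked with a small parser. The sketch below is an assumption about how such a check might look, not PertBench's actual scoring code: it assumes plain ASCII text, case-sensitive matching, and that the last "Final Answer:" line in a response is the one that counts. The function name `parse_final_answer` is hypothetical.

```python
import re
from typing import Optional

# Matches "Final Answer: Yes" or "Final Answer: No" (case-sensitive, assumed).
_FINAL = re.compile(r"Final Answer:\s*(Yes|No)\b")


def parse_final_answer(text: str) -> Optional[bool]:
    """Return True for Yes, False for No, None if the format is not followed."""
    matches = _FINAL.findall(text)
    if not matches:
        return None
    # Assumption: if the response repeats the marker, the last one wins.
    return matches[-1] == "Yes"
```

Returning `None` for malformed responses lets a grader count format violations separately from wrong answers.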

CIRIS multi-model purple agent

by emooreatx

GAIA Agent

by harshada-javeri

Tau2 Baseline Purple

by Andrew7234

baseline-gpt-4.1-mini

by HaoranShao
