Expectations API¶
All expectations live in agentgate.expectations and are also available from the top-level agentgate import.
Tool Expectations¶
ExpectToolCall¶
s.expect_tool_call("search",
order=1, # absolute position (1-indexed)
before="book", # must come before "book"
after="login", # must come after "login"
with_args={"q": "X"}, # argument matching
times=2, # exact count
min_times=1, # at least N
max_times=3, # at most N
)
ExpectNoToolCall¶
ExpectToolOrder¶
s.expect_tool_order(["search", "select", "book"])
# Checks subsequence — other tools may be interleaved
State Expectations¶
Output Expectations¶
Resource Limits¶
Behavioral Quality¶
# Side effects (AgentRewardBench)
s.expect_no_side_effects(
allowed_tools=["search", "book"],
mutating_tools=["cancel", "delete"],
)
# Repetition detection
s.expect_no_repetition(max_rate=0.0)
Milestone (Partial Credit)¶
s.expect_milestone("Search done", tool="search", weight=1)
s.expect_milestone("Booked", tool="book", weight=2)
# score = weighted milestone achievement
LLM Judge¶
def my_judge(criteria, trace_text):
# Call any LLM
return (True, "Agent was professional")
s.expect_llm_judge("Agent is empathetic", judge_fn=my_judge)
Custom Expectations¶
from agentgate.expectations import Expectation
from agentgate.types import AgentTrace, ExpectationResult
class ExpectCustom(Expectation):
def check(self, trace: AgentTrace) -> ExpectationResult:
if some_condition(trace):
return ExpectationResult(True, "custom_check")
return ExpectationResult(False, "custom_check", "Failed because...")
s.expectations.append(ExpectCustom())