Eval uses root agent directly rather than through registered app, skipping retry plugin #3833

@klys-equinix

Description

I have a root agent that I run through an App so that its tool errors are handled by ReflectAndRetryToolPlugin.
When I run evals for that agent, they use root_agent directly rather than going through the App, so the retry plugin is never applied.

To Reproduce
Create a simple setup with an agent that has a failing tool and an app that wraps it, then create an eval set for it. When the eval runs, the tool's exception is not handled.

# Create the orchestrator agent with the AgentTools
root_agent = Agent(
    name="SomeAgent",
    model=ORCHESTRATOR_LLM_MODEL,
    instruction=prompt,
    tools=[
        some_failing_tool
    ],
)

app = App(
    root_agent=root_agent,
    name="app",
    plugins=[
        ReflectAndRetryToolPlugin(
            max_retries=3,
            throw_exception_if_retry_exceeded=False,  # Return message instead of raising
        ),
    ],
)
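
For a self-contained reproduction, `some_failing_tool` can be any function that always raises. A minimal sketch, assuming a plain Python function is passed as the tool; the `city` parameter, the return type, and the error message are hypothetical and not taken from the original report:

```python
def some_failing_tool(city: str) -> dict:
    """Hypothetical tool that always fails, so the retry plugin has something to handle."""
    # Assumption: any unhandled exception raised by the tool is enough to reproduce the issue.
    raise RuntimeError(f"simulated backend failure for {city!r}")
```

Any always-failing callable should work equally well here; the point is only that the failure is deterministic.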

Expected behavior
I expect the evals to use the app rather than the agent when an app is available.
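
The expected lookup order could be sketched roughly as follows. This is an illustration only, not ADK code: `resolve_eval_target` is a hypothetical helper, and the module objects are plain stand-ins built with `SimpleNamespace`.

```python
from types import SimpleNamespace


def resolve_eval_target(agent_module):
    """Prefer the module's `app` (hypothetical lookup); fall back to `root_agent`."""
    app = getattr(agent_module, "app", None)
    if app is not None:
        return "app", app
    root_agent = getattr(agent_module, "root_agent", None)
    if root_agent is None:
        raise ValueError("module defines neither `app` nor `root_agent`")
    return "root_agent", root_agent


# Stand-in modules: one exposes both names, the other only the agent.
module_with_app = SimpleNamespace(root_agent="agent-object", app="app-object")
module_without_app = SimpleNamespace(root_agent="agent-object")

print(resolve_eval_target(module_with_app))
print(resolve_eval_target(module_without_app))
```

With this ordering, existing projects that only define `root_agent` would keep working, while projects that define an `app` would get their plugins applied during evals.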

Desktop:

  • OS: macOS
  • Python version (python -V): 3.13
  • ADK version (pip show google-adk): 1.22

Model Information:

  • Are you using LiteLLM: No
  • Which model is being used (e.g. gemini-2.5-pro): gemini-2.5-pro

Additional context
Take a look at

try:
    eval_service = LocalEvalService(
        root_agent=root_agent,
        eval_sets_manager=eval_sets_manager,
        eval_set_results_manager=eval_set_results_manager,
        user_simulator_provider=user_simulator_provider,
    )
    inference_results = asyncio.run(
        _collect_inferences(
            inference_requests=inference_requests, eval_service=eval_service
        )
    )
    eval_results = asyncio.run(
        _collect_eval_results(
            inference_results=inference_results,
            eval_service=eval_service,
            eval_metrics=eval_metrics,
        )
    )
except ModuleNotFoundError as mnf:
    raise click.ClickException(MISSING_EVAL_DEPENDENCIES_MESSAGE) from mnf

It seems to me that LocalEvalService is using the root_agent directly, so the plugins registered on the App never come into play.
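
To make the consequence concrete, here is a toy model of the two code paths. It does not use or reproduce ADK internals: `failing_tool` and `call_with_retry` are hypothetical stand-ins, and the retry loop is only a rough approximation of what a retry plugin configured not to raise might do.

```python
def failing_tool() -> str:
    """Stand-in for a tool that always raises."""
    raise RuntimeError("tool failed")


def call_with_retry(tool, max_retries: int = 3) -> str:
    """Toy retry wrapper: one initial attempt plus `max_retries` retries,
    then a message is returned instead of raising."""
    last_error = None
    for _ in range(max_retries + 1):
        try:
            return tool()
        except Exception as exc:  # broad on purpose for the illustration
            last_error = exc
    return f"gave up after {max_retries} retries: {last_error}"


# Path taken when the wrapper is in place: the failure becomes a message.
print(call_with_retry(failing_tool))

# Path taken when the tool is called directly: the exception propagates.
try:
    failing_tool()
except RuntimeError as exc:
    print(f"unhandled in the direct path: {exc}")
```

The eval run described above appears to behave like the second path: nothing sits between the agent and the failing tool, so the exception surfaces.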

Labels: eval ([Component] This issue is related to evaluation)
