Add MLflow scorers integration page #1829
Adds a new integration page at docs/integrations/mlflow-scorers.md covering MLflow's five Google ADK scorers (ToolTrajectory, ResponseMatch, ResponseEvaluation, Safety, Hallucination) for agent evaluation. The integration wraps ADK's TrajectoryEvaluator, RougeEvaluator, FinalResponseMatchV2Evaluator, SafetyEvaluatorV1, and HallucinationsV1Evaluator behind MLflow's scorer interface, so ADK users can evaluate agents inside mlflow.genai.evaluate() runs without leaving the ADK ecosystem.
Complements the existing MLflow Tracing and MLflow AI Gateway integration pages by covering evaluation, the third leg of the MLflow stack for ADK.
Signed-off-by: debu-sinha <debusinha2009@gmail.com>
✅ Deploy Preview for adk-docs-preview ready!
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
I have signed the Google CLA. Please re-run the CLA check.
Signed-off-by: debu-sinha <debusinha2009@gmail.com>
@PattaraS I built this page to live alongside your tracing and gateway integrations. It follows your template (frontmatter, language-support-tag, Use cases section, cross-links between all three pages) so the three pages read as a set. Flagging in case you want to align on tone or structure before review. Happy to incorporate any feedback.
Summary
Adds a new integration page at docs/integrations/mlflow-scorers.md documenting MLflow's five Google ADK scorers for agent evaluation. The integration wraps ADK's TrajectoryEvaluator, RougeEvaluator, FinalResponseMatchV2Evaluator, SafetyEvaluatorV1, and HallucinationsV1Evaluator behind MLflow's scorer interface, so ADK users can evaluate agents inside mlflow.genai.evaluate() runs.
Complements the existing MLflow Tracing and MLflow AI Gateway integration pages by covering evaluation, the third leg of the MLflow stack for ADK.
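A minimal sketch of what evaluating inside an mlflow.genai.evaluate() run might look like. The import path is inferred from the file path cited under Verification (mlflow/genai/scorers/google_adk/__init__.py). Constructing ResponseMatch with no arguments and using the expected_response expectation key are assumptions, not something this description confirms. The sample row is hypothetical.

```python
from typing import Any


def build_eval_data() -> list[dict[str, Any]]:
    """Return a tiny hand-written dataset as a list of dicts.

    Each row carries pre-computed outputs, so no predict_fn is needed.
    The "expected_response" key is an assumption about what the scorer reads.
    """
    return [
        {
            "inputs": {"question": "What is the weather in Paris?"},
            "outputs": "It is sunny in Paris today.",
            "expectations": {"expected_response": "Sunny in Paris."},
        }
    ]


def run_adk_scorers(data: list[dict[str, Any]]) -> Any:
    """Run one ADK-backed scorer through mlflow.genai.evaluate().

    Imports are done lazily so this module loads without MLflow installed.
    Assumption: ResponseMatch() takes no required constructor arguments.
    """
    import mlflow
    from mlflow.genai.scorers.google_adk import ResponseMatch

    return mlflow.genai.evaluate(data=data, scorers=[ResponseMatch()])
```

Calling run_adk_scorers(build_eval_data()) needs an MLflow release that ships these scorers (3.11 or later for ResponseMatch, per the versions listed below) and, presumably, the google-adk package. Other scorers could be added to the same scorers list.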
Changes
docs/integrations/mlflow-scorers.md (new page), covering:
- … mlflow.genai.evaluate composition)
- … expectations["actual_tool_calls"] and TOOL span fallback)
- Safety Vertex AI requirement
Why this is useful for ADK users
ADK already exposes a rich evaluator suite in google.adk.evaluation, but users running evaluations in MLflow-tracked agent applications previously had to build their own glue code to invoke ADK evaluators on MLflow data. The integration removes that glue. ADK evaluators can now be composed in the same mlflow.genai.evaluate() call as MLflow's built-in or other third-party scorers (TruLens, DeepEval, Guardrails AI), with the same Feedback return type and the same MLflow evaluation UI.
Upstream MLflow PRs
This documentation describes work that has already shipped in MLflow:
- ToolTrajectory, ResponseMatch (MLflow 3.11)
- ResponseEvaluation, Safety, Hallucination (MLflow 3.13)
Verification
- … CONTRIBUTING.md (frontmatter, Use cases, Prerequisites, Install dependencies, Quick start, Resources).
- … mlflow/genai/scorers/google_adk/__init__.py on mlflow/mlflow master.
- catalog_icon reuses the existing /integrations/assets/mlflow.png asset already used by the other two MLflow pages.
- … /integrations/mlflow-tracing/) matching the convention in the existing pages.
- … mkdocs.yml changes needed: integration pages are excluded from nav via not_in_nav: /integrations/*.md and discovered through the catalog frontmatter.
CLA
I have signed the Google CLA through my employer Databricks (covers all OSS contributions). Happy to confirm via any process the maintainers prefer.