fix(a2ui): pin schema_manager file reads to UTF-8 #2911
Open
genisis0x wants to merge 2 commits into
A2UISchemaManager loads JSON and text files from disk through five
open() / Path() call sites — the custom catalog passed by the caller
and the four internal _load_spec_json / _load_spec_text /
_load_version_json / _load_version_text helpers that read the
upstream A2UI spec bundle and the version-pinned schemas. None of
the call sites passed an explicit encoding, so on platforms whose
default locale.getencoding() is not UTF-8 — notably Windows, which
defaults to cp1252 / cp932 — non-ASCII bytes in catalog descriptions,
component labels, or rule files raise UnicodeDecodeError mid-load
and the schema manager fails to construct.
JSON is defined over UTF-8 (RFC 8259 §8.1) and the .md rule files
follow the same convention, so pin encoding='utf-8' at every call
site.
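The failure mode and the fix can be reproduced without the schema manager. The following is a minimal sketch, not the project's code: it writes a small JSON file as UTF-8, then reads it back once with cp1252 (standing in for a Windows default locale) and once with the pinned encoding. The file name and the payload key are illustrative assumptions.

```python
import json
import tempfile
from pathlib import Path

# Illustrative payload; the key and file name are assumptions, not taken from the A2UI spec.
payload = {"description": "智能按钮"}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "catalog.json"
    # ensure_ascii=False writes the raw UTF-8 bytes rather than \uXXXX escapes.
    path.write_text(json.dumps(payload, ensure_ascii=False), encoding="utf-8")

    # Simulate a platform whose default encoding is cp1252.
    try:
        with open(path, encoding="cp1252") as f:
            wrong = json.load(f)  # may decode to mojibake instead of raising
    except UnicodeDecodeError:
        wrong = None  # bytes that are undefined in cp1252 raise here

    # Pinned read: independent of the platform locale.
    with open(path, encoding="utf-8") as f:
        right = json.load(f)

print(right == payload)  # the pinned read recovers the original data
print(wrong == payload)  # the locale-dependent read does not
```

Depending on the bytes involved, the unpinned read either raises or silently returns corrupted text; in both cases the loaded data no longer matches what was written.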
Adds test_custom_catalog_file_load_handles_non_ascii exercising the
custom-catalog file-load branch with a payload that mixes Chinese
('智能按钮') and accented Latin ('Composant interactif — étiquette')
characters across both the catalog $id and a nested component
description. Asserts the loaded schema round-trips bit-for-bit.
Local: uv run pytest test/agents/experimental/a2ui/test_schema_manager.py
-- 15/15 green (incl. new test). ruff check + format clean.
The non-ASCII regression test's tmp_path fixture lacked a type annotation, tripping mypy no-untyped-def in the type-check CI. Annotate it as Path, matching the existing fixture convention in the test suite.
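As a small illustration of the annotation fix, the sketch below shows the shape mypy accepts when untyped definitions are disallowed. The test name and body are hypothetical; only the `tmp_path: Path` annotation reflects the change described above.

```python
from pathlib import Path


# Before (reported under mypy's no-untyped-def error code):
# def test_example(tmp_path) -> None: ...


def test_example(tmp_path: Path) -> None:
    # pytest's tmp_path fixture supplies a pathlib.Path, so the annotation
    # matches the runtime type.
    target = tmp_path / "catalog.json"
    target.write_text("{}", encoding="utf-8")
    assert target.read_text(encoding="utf-8") == "{}"
```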
Codecov Report: ✅ All modified and coverable lines are covered by tests.
... and 535 files with indirect coverage changes
Problem
`A2UISchemaManager` in `autogen/agents/experimental/a2ui/schema_manager.py` loads JSON and text files from disk through five call sites:
- the custom catalog file passed by the caller;
- `_load_spec_json` and `_load_spec_text`, reading the upstream A2UI spec bundle (lines 228, 236);
- `_load_version_json` and `_load_version_text`, reading the version-pinned schemas (lines 243, 251).
None of these passed an explicit encoding, so on platforms whose default `locale.getencoding()` is not UTF-8 (notably Windows, which defaults to cp1252 / cp932), non-ASCII bytes in catalog descriptions, component labels, or rule files raise `UnicodeDecodeError` mid-load and `A2UISchemaManager.__init__` fails before the manager can be constructed.
JSON is defined over UTF-8 (RFC 8259 §8.1), and the `.md` rule files follow the same convention.
Change
- `autogen/agents/experimental/a2ui/schema_manager.py`: pass `encoding="utf-8"` at every file-read call site. No behavioural changes.
- `test/agents/experimental/a2ui/test_schema_manager.py`: new regression test `test_custom_catalog_file_load_handles_non_ascii`, which writes a catalog with Chinese ("智能按钮") and accented Latin ("Composant interactif — étiquette") characters across both the catalog `$id` and a nested component description, then asserts that the loaded schema round-trips bit-for-bit.
Same pattern as the previously-merged UTF-8 encoding pins (#2815, #2818, #2819, #2825, #2826, #2827) and the matching pins in #2909 / #2910.
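The regression test follows a write-then-load pattern. Below is a hedged sketch of that pattern rather than the test in this PR: `load_catalog` is a hypothetical stand-in for the manager's custom-catalog read path, and the URL and every catalog key other than `$id` are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Any


def load_catalog(path: Path) -> Any:
    """Hypothetical stand-in for the manager's custom-catalog read path."""
    # The pinned encoding is the point of the fix: no reliance on the locale default.
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def test_catalog_round_trips_non_ascii(tmp_path: Path) -> None:
    # Illustrative catalog: the strings come from the PR description, the
    # surrounding structure is assumed.
    catalog = {
        "$id": "https://example.com/catalogs/智能按钮",
        "components": {
            "SmartButton": {"description": "Composant interactif — étiquette"},
        },
    }
    catalog_file = tmp_path / "catalog.json"
    # ensure_ascii=False keeps raw UTF-8 bytes on disk instead of \uXXXX escapes,
    # so the read path actually has to decode non-ASCII input.
    catalog_file.write_text(json.dumps(catalog, ensure_ascii=False), encoding="utf-8")

    assert load_catalog(catalog_file) == catalog
```

Writing with `ensure_ascii=False` matters for a test like this: with the `json.dumps` default, the file would contain only ASCII escape sequences and would load correctly under any ASCII-compatible encoding, so the test would pass even without the pin.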
Validation
- `uv run python -m pytest test/agents/experimental/a2ui/test_schema_manager.py -v`: 15/15 pass (including new test).
- `uv run ruff check autogen/agents/experimental/a2ui/schema_manager.py test/agents/experimental/a2ui/test_schema_manager.py`: clean.
- `uv run ruff format --check ...`: already formatted.
Note: the `a2ui` test module is gated on the `a2a` optional dependency (`a2a-sdk>=1.0.0,<2`); installed locally via `uv pip install` before running the suite.
AI assistance
Diff drafted with assistance and reviewed end-to-end against the existing UTF-8 pin convention used in the surrounding codebase. The regression test exercises the catalog file-load branch with mixed-script bytes through both the `$id` and a nested component description, mirroring the read-side tests added in #2909 / #2910.