
perf: cache type introspection in _transform_recursive to eliminate redundant dispatch#1216

Open
giulio-leone wants to merge 3 commits into anthropics:main from giulio-leone:fix/issue-1195-transform-recursive-perf

Conversation

@giulio-leone

Summary

Fixes #1195

_transform_recursive performs type introspection (strip_annotated_type, get_origin, is_typeddict, is_list_type, is_union_type, etc.) on every recursive call, even though the type annotation is identical for all values of a given field. On large payloads (~90K messages), this consumes ~6.6% of total CPU time while producing no transformation at all, since Messages API types have no PropertyInfo annotations.

Changes

1. Cached dispatch via _cached_transform_dispatch()

Adds an @lru_cache-decorated function that precomputes the dispatch path (typeddict / dict / sequence / union / other) and extracts type args once per annotation type. Subsequent calls are O(1) dict lookups instead of re-running type introspection.

@lru_cache(maxsize=8096)
def _cached_transform_dispatch(inner_type: type) -> tuple[int, Any]:
    # strip_annotated_type, get_origin, is_typeddict, is_list_type,
    # is_union_type — all computed once and cached per annotation
    ...

2. Cached key mapping via _get_field_key_map()

Precomputes the key alias mapping for each TypedDict type. Replaces per-field _maybe_transform_key() calls with a single dict.get() lookup.
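A minimal sketch of this idea, assuming a PropertyInfo-style alias annotation (the stand-in class and field names below are illustrative, not the SDK's actual PropertyInfo or types):

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Annotated, Optional, TypedDict, get_type_hints

@dataclass(frozen=True)
class PropertyInfo:  # stand-in for the SDK's PropertyInfo metadata
    alias: Optional[str] = None

@lru_cache(maxsize=None)
def get_field_key_map(typeddict_cls: type) -> dict:
    """Precompute {field_name: wire_key} once per TypedDict class."""
    key_map = {}
    for name, annotation in get_type_hints(typeddict_cls, include_extras=True).items():
        key = name
        # Annotated[...] exposes its extra metadata via __metadata__
        for meta in getattr(annotation, "__metadata__", ()):
            if isinstance(meta, PropertyInfo) and meta.alias:
                key = meta.alias
        key_map[name] = key
    return key_map

class ExampleParams(TypedDict):  # hypothetical TypedDict for illustration
    model: str
    max_tokens: Annotated[int, PropertyInfo(alias="maxTokens")]
```

Per-field alias resolution then collapses to a single get_field_key_map(cls).get(name) lookup instead of introspecting each field's annotation on every pass.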

3. Expanded _no_transform_needed()

Now includes str and bool in addition to int and float, allowing lists of strings/bools to skip per-element recursion entirely.
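A minimal sketch of what such a fast path can look like (function names and the per-element fallback are illustrative):

```python
from typing import Any

def no_transform_needed(annotation: Any) -> bool:
    # Primitive leaf types pass through unchanged; including str and
    # bool lets lists of strings/bools skip per-element recursion.
    return annotation in (int, float, str, bool)

def transform_item(value: Any, item_type: Any) -> Any:
    ...  # per-element transformation, elided in this sketch
    return value

def transform_list(values: list, item_type: Any) -> list:
    if no_transform_needed(item_type):
        return values  # fast path: return the list as-is
    return [transform_item(v, item_type) for v in values]
```

For a Messages payload this matters because content blocks frequently contain long lists of plain strings, which previously still paid one recursive call per element.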

4. Async parity

Same optimizations applied to _async_transform_recursive and _async_transform_typeddict.

Performance Impact

For a 10,000-message payload with no PropertyInfo annotations (the standard Messages API case):

  • Before: Every recursive call runs ~6 type-introspection functions
  • After: First call per type populates cache; all subsequent calls dispatch via O(1) dict lookup

The optimization is purely internal — all existing behavior and correctness are preserved.
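The effect can be demonstrated in isolation: wrapping even a single introspection call in functools.lru_cache turns every repeated invocation into a cache hit (a generic illustration, not the SDK's code):

```python
import timeit
from functools import lru_cache
from typing import List, get_origin

def dispatch_uncached(tp):
    return get_origin(tp)  # re-runs introspection on every call

@lru_cache(maxsize=None)
def dispatch_cached(tp):
    return get_origin(tp)  # introspects once, then serves cache hits

uncached_s = timeit.timeit(lambda: dispatch_uncached(List[int]), number=100_000)
cached_s = timeit.timeit(lambda: dispatch_cached(List[int]), number=100_000)
print(f"uncached: {uncached_s:.3f}s  cached: {cached_s:.3f}s")
```

Absolute timings vary by machine, which is why the claim above is framed as "first call populates the cache" rather than a fixed speedup factor.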

Tests

  • All 56 existing tests pass unchanged
  • Added 15 new tests:
    • No-annotation TypedDict passthrough
    • Nested no-annotation structures
    • Mixed annotations (alias + non-alias fields)
    • str and bool list skip optimization
    • Cache consistency under repeated transforms
    • Large message list performance (10K messages)
    • Dispatch cache hit verification
    • Field key map cache verification

…edundant dispatch

The _transform_recursive function and its async variant performed type
introspection (strip_annotated_type, get_origin, is_typeddict,
is_list_type, is_union_type, etc.) on every recursive call, even though
the type annotation is the same for all values of a given field. On
large payloads (~90K messages), this consumed ~6.6% of total CPU time
with zero transformation output since Messages API types have no
PropertyInfo annotations.

Changes:
- Add _cached_transform_dispatch(): LRU-cached function that precomputes
  the dispatch path (typeddict/dict/sequence/union/other) and extracts
  type args once per annotation type. Subsequent calls are O(1) dict
  lookups instead of re-running type introspection.
- Add _get_field_key_map(): LRU-cached function that precomputes the
  key alias mapping for each TypedDict type, replacing per-field
  _maybe_transform_key calls with a single dict.get() lookup.
- Expand _no_transform_needed() to include str and bool, allowing
  lists of strings/bools to skip per-element recursion.
- Apply same optimizations to _async_transform_recursive and
  _async_transform_typeddict.

Fixes anthropics#1195
Copilot AI review requested due to automatic review settings March 1, 2026 16:36
@giulio-leone giulio-leone requested a review from a team as a code owner March 1, 2026 16:36

Copilot AI left a comment


Pull request overview

This PR optimizes the Python SDK’s transform machinery by caching type-introspection-driven dispatch decisions and precomputing TypedDict key-alias mappings, reducing repeated work during deep recursive walks of large payloads (per #1195).

Changes:

  • Add an @lru_cache-decorated dispatch function to avoid repeating type introspection on every recursive call in _transform_recursive / _async_transform_recursive.
  • Cache TypedDict field key-alias mappings to replace per-field alias computation with a single lookup.
  • Expand “no transform needed” fast-path to include str and bool, plus add tests for cache behavior and large-payload scenarios.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/anthropic/_utils/_transform.py Introduces cached dispatch + cached TypedDict key map; updates sync/async recursion paths to use cached results.
tests/test_transform.py Adds tests for passthrough behavior, cache hits, key-map caching, and a large-payload performance check.


Address review feedback:
1. The dict branch in _async_transform_recursive called the synchronous
   _transform_recursive, defeating async benefits. Changed to
   await _async_transform_recursive.
2. Relaxed wall-clock assertion in performance test from 2s to 10s to
   avoid flakiness in CI environments with variable load.
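The first point amounts to a change of this shape (function bodies are illustrative; only the dict branch is shown):

```python
import asyncio
from typing import Any

async def async_transform_recursive(data: Any, annotation: Any) -> Any:
    if isinstance(data, dict):
        # Before: this branch called the synchronous variant, so nested
        # dicts were transformed without ever yielding to the event loop.
        # After: await the async variant so the whole walk stays async.
        return {
            key: await async_transform_recursive(value, annotation)
            for key, value in data.items()
        }
    return data
```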


Development

Successfully merging this pull request may close these issues.

Python SDK: _transform_recursive blocking event loop on large message payloads

2 participants