Added support for ZetaSQL-free MLMD environments across TFX Resolvers and metadata extensions. When the C++ ZetaSQL engine binaries are missing at runtime, the system detects this automatically and falls back to a pure-Python, in-memory engine for lineage graph traversal and relation evaluation.
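The detect-then-fall-back pattern described above can be sketched roughly as follows. This is a minimal illustration, not the TFX implementation: the module name `zetasql_engine_binding`, the helper names, and the dictionary-based lineage graph are all assumptions made for the example.

```python
import importlib.util
from collections import deque

# Hypothetical module name; the real ZetaSQL binding may be packaged differently.
_ZETASQL_MODULE = "zetasql_engine_binding"


def zetasql_available(module_name: str = _ZETASQL_MODULE) -> bool:
    """Returns True if the (assumed) native engine module can be located."""
    return importlib.util.find_spec(module_name) is not None


def upstream_nodes(edges: dict, start: str) -> list:
    """Breadth-first walk over parent links; returns ancestors in visit order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in edges.get(node, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


def resolve_ancestors(edges: dict, start: str) -> list:
    """Uses the pure-Python traversal when the native engine is absent."""
    if zetasql_available():
        raise NotImplementedError("native query path is outside this sketch")
    return upstream_nodes(edges, start)


# Toy lineage graph: each key maps to the nodes it was derived from.
lineage = {
    "model": ["examples", "schema"],
    "examples": ["raw"],
    "schema": ["raw"],
}
ancestors = resolve_ancestors(lineage, "model")
```

Checking availability with `find_spec` avoids importing the native module just to learn that it is missing.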
Breaking Changes
Migrated proto compilation in Bazel workspaces from the deprecated py_proto_library rules to custom Starlark provider compilation macros, so that builds work on Bazel 7.x workspaces with Bzlmod enabled.
For Pipeline Authors
N/A
For Component Authors
N/A
Deprecations
Bypassed legacy test targets that checked retired Google Cloud AI Platform (CAIP) integration points, and completed the migration of Vertex AI-compatible pipeline targets.
Bug Fixes and Other Changes
Refactored the Wide & Deep functional models (taxi_utils.py, templates, and test modules) to slice the wide categorical input layers to the categories that are actually wide-encoded ([:len(_MAX_CATEGORICAL_FEATURE_VALUES)]). This prevents disconnected inputs from triggering the Keras 3 "inputs not connected to outputs" exception under Python 3.10.
Replaced the list comprehensions that instantiate Normalization layers in the Keras Functional model builders with plain for loops, so that layer connectivity is tracked correctly under Python 3.10.
Implemented pytest_ignore_collect hooks in conftest.py that use static spec checks (importlib.util.find_spec) to exclude test targets whose optional dependencies (such as Airflow, Vertex AI, and Kubeflow) are not installed. This eliminates early logging stream deadlocks and import-time crashes during test collection.
Upgraded the Docker build tools and wheel scripts to compile TFDV and TFX-BSL from source on a unified conda GCC 13/binutils toolchain using Bazel 7.7.0.
Resolved intermittent temporary directory synchronization and write finalizer errors in BulkInferrer (executor.py) that occurred when writing flattened PCollections under local runners (DirectRunner, PrismRunner, FnApiRunner). A new helper makes local executions use num_shards=1 while preserving dynamic sharding for distributed production pipelines.
Bypassed the strict committed/attempted metrics equivalence checks in the Transform ExecutorTest base class (executor_test.py). These checks crashed under recent versions of Apache Beam that use the parallel, multi-process PrismRunner backend, because task metrics are updated asynchronously. Local metric count verification is now stable.
Monkey-patched PipelineOptions in the global test conftest (conftest.py) so that standard local test jobs bypass the resource-throttled, multi-process PrismRunner and instead use the single-threaded in-memory DirectRunner (--direct_running_mode=in_memory). This reduces total unit test execution time and prevents workflow cancellations and timeouts on the Python 3.9, 3.10, 3.11, and 3.12 GHA platforms.
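The conftest.py collection hook mentioned in this list can be sketched as below. It is an illustration under assumptions rather than the code that shipped: the `OPTIONAL_DEPS` mapping and both helper names are hypothetical, and the hook signature shown is the `collection_path` form used by pytest 7 and later.

```python
import importlib.util
from pathlib import Path

# Hypothetical mapping: directory name in the test path -> module it requires.
OPTIONAL_DEPS = {
    "airflow": "airflow",
    "kubeflow": "kfp",
}


def is_installed(module_name: str) -> bool:
    """Checks for a module without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return False


def should_ignore(path: Path, optional_deps: dict = OPTIONAL_DEPS) -> bool:
    """True if the path belongs to an optional integration that is not installed."""
    parts = {part.lower() for part in path.parts}
    return any(
        fragment in parts and not is_installed(module)
        for fragment, module in optional_deps.items()
    )


def pytest_ignore_collect(collection_path, config):
    """pytest hook: returning True skips the path, None defers to other hooks."""
    return True if should_ignore(Path(collection_path)) else None
```

Because `find_spec` only locates a top-level module instead of executing it, the check runs before any test module is imported, which is what keeps import-time failures out of the collection phase.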
Dependency Updates
Upgraded target pipeline constraints to support TensorFlow 2.21.0 and Protobuf 6.x across Python 3.10, 3.11, 3.12, and 3.13.
Split the SciPy dependency constraint in test_constraints.txt using Python version markers to avoid version conflicts with JAX under Python < 3.13.
Dropped the outdated or incompatible dependencies tensorflow-decision-forests, tensorflow-ranking, tensorflow-text, and tensorflowjs from the dependency list and constraint definitions, to prevent pip resolver backtracking and keep installation stable on TF 2.21.0.
Placeholder.__format__() is now disallowed, so you cannot use placeholders
in f-strings and str.format() calls anymore. If you get an error from this,
most likely you discovered a bug and should not use an f-string in the first
place. If it is truly your intention to print the placeholder (not its
resolved value) for debugging purposes, use repr() or !r instead.
Dropped support for the Estimator API.
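The Placeholder.__format__() change above can be illustrated with a stand-in class. This is a sketch only: `FakePlaceholder` is hypothetical, and the exception type that TFX actually raises is an assumption here.

```python
class FakePlaceholder:
    """Stand-in for a TFX placeholder; not the real class."""

    def __init__(self, expression: str):
        self._expression = expression

    def __format__(self, format_spec: str) -> str:
        # f-strings and str.format() both end up here, so both are rejected.
        raise ValueError(
            "placeholders cannot be formatted; use repr() or !r for debugging"
        )

    def __repr__(self) -> str:
        return f"FakePlaceholder({self._expression!r})"


uri = FakePlaceholder("model_uri")

# Formatting the placeholder itself fails, which surfaces the likely bug.
try:
    message = f"model at {uri}"
except ValueError:
    message = None

# The !r conversion calls repr() first, so only a plain string gets formatted.
debug_message = f"model at {uri!r}"
```

The `!r` form works because the conversion happens before formatting, so `__format__` is invoked on the resulting string and never on the placeholder.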
For Pipeline Authors
N/A
For Component Authors
N/A
Deprecations
KubeflowDagRunner (KFP v1 SDK) is deprecated. Use KubeflowV2DagRunner (KFP v2 pipeline spec) instead.
Since Estimators will no longer be available in TensorFlow 2.16 and later versions, we have deprecated examples and templates that use them. We encourage you to explore Keras as a more modern and flexible high-level API for building and training models in TensorFlow.
Extend GetPipelineRunExecutions, GetPipelineRunArtifacts APIs to support
filtering by execution create_time, type.
ExampleValidator and DistributionValidator now support anomalies alert
generation. Users can use their own toolkits to extract and process the
alerts from the execution parameter.
Allow DistributionValidator baseStatistics input channel artifacts to be
empty for cold start of data validation.
ph.make_proto() allows constructing proto-valued placeholders, e.g. for
larger config protos fed to a component.
ph.join_path() is like os.path.join() but for placeholders.
Support passing in experimental_debug_stripper into the Transform
pipeline runner.
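To make the ph.join_path() entry above concrete, here is a toy model of deferred path joining. It does not use or reproduce the TFX placeholder classes: `Deferred`, `JoinPath`, and the `resolve(context)` method are hypothetical names chosen to show why a plain os.path.join() cannot be applied to a value that is only known at run time.

```python
import os.path


class Deferred:
    """Hypothetical stand-in for a placeholder: a value known only at run time."""

    def __init__(self, key: str):
        self.key = key

    def resolve(self, context: dict) -> str:
        return context[self.key]


class JoinPath(Deferred):
    """Records the path parts now and joins them once they can be resolved."""

    def __init__(self, *parts):
        self.parts = parts

    def resolve(self, context: dict) -> str:
        resolved = [
            part.resolve(context) if isinstance(part, Deferred) else part
            for part in self.parts
        ]
        return os.path.join(*resolved)


def join_path(*parts) -> JoinPath:
    """Builds a deferred join; nothing is evaluated until resolve() is called."""
    return JoinPath(*parts)


model_path = join_path(Deferred("model_uri"), "serving", "saved_model.pb")
resolved_path = model_path.resolve({"model_uri": "/tmp/model"})
```

The join is stored as an expression and evaluated later, which is the behavior the release note describes as "like os.path.join() but for placeholders".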
Breaking Changes
Placeholder and all subclasses have been moved to other modules, their
structure has been changed and they're now immutable. Most users won't care
(the main public-facing API is unchanged and behaves the same way). If you
do special operations like isinstance() or some kind of custom
serialization on placeholders, you will have to update your code.
placeholder.Placeholder.traverse() now returns more items than before,
namely also placeholder operators like _ConcatOperator (which is the
implementation of Python's + operator).
The placeholder.RuntimeInfoKey enumeration was removed. Just hard-code the
appropriate string values in your code, and reference the new Literal type placeholder.RuntimeInfoKeys if you want to ensure correctness.
Arguments to @component must now be passed as kwargs and its return type
is changed from being a Type to just being a callable that returns a new
instance (like the type's initializer). This will allow us to instead return
a factory function (which is not a Type) in future. For a given @component def C(), this means:
You should not use C as a type anymore. For instance, replace isinstance(foo, C) with something else. Depending on your use case, if
you just want to know whether it's a component, then use isinstance(foo, tfx.types.BaseComponent) or isinstance(foo, tfx.types.BaseFunctionalComponent).
If you want to know which component it is, check its .id instead.
Existing such checks will break type checking today and may additionally
break at runtime in future, if we migrate to a factory function.
You can continue to use C.test_call() like before, and it will
continue to be supported in future.
Any type declarations using foo: C break and must be replaced with foo: tfx.types.BaseComponent or foo: tfx.types.BaseFunctionalComponent.
Any references to static class members like C.EXECUTOR_SPEC break
type checking today and should be migrated away from. In particular, for .EXECUTOR_SPEC.executor_class().Do() in unit tests, use .test_call()
instead.
If your code previously asserted a wrong type declaration on C, this
can now lead to (justified) type checking errors that were previously
hidden due to C being of type Any.
ph.to_list() was renamed to ph.make_list() for consistency.
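The @component change in this list can be sketched with stand-ins to show why isinstance(foo, C) stops being usable. Nothing here is TFX code: `BaseComponent` is a hypothetical substitute for tfx.types.BaseComponent, the `use_beam` keyword is used only as an example argument, and deriving the id from the function name is an assumption of the sketch.

```python
import functools


class BaseComponent:
    """Hypothetical stand-in for tfx.types.BaseComponent."""

    def __init__(self, component_id: str, **inputs):
        self.id = component_id
        self.inputs = inputs


def component(*, use_beam: bool = False):
    """Keyword-only decorator that returns a factory function, not a class."""

    def decorator(fn):
        @functools.wraps(fn)
        def factory(**inputs):
            # Assumption of this sketch: the id is the function name.
            return BaseComponent(fn.__name__, **inputs)

        return factory

    return decorator


@component(use_beam=False)
def Trainer(examples):
    ...


foo = Trainer(examples="examples-artifact")

# Supported checks: is it a component, and which one is it?
is_component = isinstance(foo, BaseComponent)
is_trainer = foo.id == "Trainer"

# Unsupported check: Trainer is now a function, so isinstance() rejects it.
try:
    isinstance(foo, Trainer)
    old_check_works = True
except TypeError:
    old_check_works = False
```

In this sketch the old check fails at run time because the second argument to isinstance() must be a type, which matches the note that such checks may break at run time once a factory function is returned.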
For Pipeline Authors
For Component Authors
Deprecations
Deprecated Python 3.8.
Bug Fixes and Other Changes
Fixed a synchronization bug in google_cloud_ai_platform tuner.
Print best tuning trials only from the chief worker of google_cloud_ai_platform tuner.
Add a kfp dependency in the docker-image extra packages.
Fix BigQueryExampleGen failure without custom_config.