Description
What is your issue?
As discussed in #10931 and #10804, the netcdf4 backend currently claims every remote URL that does not contain "zarr" via its guess_can_open method.
xarray/xarray/backends/netCDF4_.py
Lines 718 to 731 in 6e82a3a
    if isinstance(filename_or_obj, str):
        if is_remote_uri(filename_or_obj):
            # For remote URIs, check extension (accounting for query params/fragments)
            # Remote netcdf-c can handle both regular URLs and DAP URLs
            if _has_netcdf_ext(filename_or_obj, is_remote=True):
                return True
            elif "zarr" in filename_or_obj.lower():
                return False
            # return true for non-zarr URLs so we don't have a breaking change for people relying on this
            # netcdf backend guessing true for all remote sources.
            # TODO: emit a warning here about deprecation of this behavior
            # https://github.com/pydata/xarray/pull/10931
            return True
This is overeager and has led to issues such as #10801. The behavior therefore needs to be deprecated in favor of either more explicit options (#10931 (comment)) or, at a minimum, a tighter scope for which URLs the backend claims in its guess method.
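To make the "tighter scope" option concrete, here is a rough sketch of what a narrower claim could look like. It is not the xarray implementation: `claims_url`, `NETCDF_EXTENSIONS`, and `DAP_SCHEMES` are hypothetical names, and the extension and scheme lists are assumptions chosen only for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical extension list, for illustration only.
NETCDF_EXTENSIONS = {".nc", ".nc4", ".cdf"}
# Assumption: a URL with one of these schemes is explicitly marked as DAP.
DAP_SCHEMES = {"dap2", "dap4"}


def claims_url(url: str) -> bool:
    """Return True only for URLs that clearly look like netCDF or DAP sources."""
    parsed = urlparse(url)
    if parsed.scheme in DAP_SCHEMES:
        return True
    # Use the path component only, so query params and fragments are ignored.
    suffix = PurePosixPath(parsed.path).suffix.lower()
    return suffix in NETCDF_EXTENSIONS
```

Under a rule like this, a plain URL with no recognizable extension is no longer claimed, so anyone relying on the current fallback would need to pass `engine="netcdf4"` explicitly.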
However, some users and workflows currently rely on this behavior, and changing it caused failures: #10804 (comment)
Therefore, any changes will require a long deprecation cycle with a warning emitted when a user relies on the to-be-changed guessing behavior.
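As a rough sketch of what the transition period could look like, the fallback branch can keep returning True while warning that the behavior is going away. This uses only the standard library; `guess_can_open_url` and `NETCDF_EXTENSIONS` are hypothetical names, the extension list is an assumption, and the warning text and category are placeholders rather than a decided design.

```python
import warnings
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical extension list, for illustration only.
NETCDF_EXTENSIONS = {".nc", ".nc4", ".cdf"}


def guess_can_open_url(url: str) -> bool:
    """Mimic the current guessing, but warn when the catch-all fallback is hit."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    if suffix in NETCDF_EXTENSIONS:
        return True
    if "zarr" in url.lower():
        return False
    # Fallback: still claim the URL for now, but tell the user it will change.
    warnings.warn(
        "The netcdf4 backend will stop claiming arbitrary remote URLs in a "
        "future release; pass engine='netcdf4' explicitly to keep this behavior.",
        FutureWarning,
        stacklevel=2,
    )
    return True
```

Users who already pass an explicit engine never reach the guessing code, so only workflows that depend on the fallback would see the warning.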
Questions
- What should the end state be?
Is this a wholesale change of guessing behavior for all backends, potentially removing guessing in favor of always-explicit specification? Or is it a smaller narrowing of which URLs the netcdf4 backend claims?
- What time frame?
I propose the 2026.01.0 release
Additional reading
The URL-pipeline syntax originally proposed in ZEP8, which now lives at https://github.com/jbms/url-pipeline#url-pipeline-specification, looks like it could solve the issues here (e.g. by specifying whether something is a DAP URL).