Fix #66: dataclass from Generic now serializes#68
Fix #66: dataclass from Generic now serializes#68rhaps0dy wants to merge 6 commits intoNiklasRosenstein:developfrom
Conversation
|
|
||
| # Continue with the base classes. | ||
| for base in hint.bases or hint.type.__bases__: | ||
| for base in (*hint.bases, *hint.type.__bases__): |
There was a problem hiding this comment.
I'm not 100% sure atm. whether the order plays a role here, i.e. if maybe the two unrolls should be the other way around. 🤔 Will dig a bit when I get the time, unless you want to hash out what it means.
There was a problem hiding this comment.
So, quickly: this comes from typeapi ClassTypeHint .bases , which uses get_type_hint_original_bases. That on purpose accesses __orig_bases__, which seems to be a CPython attribute of classes which is not very well documented.
According to this PEP, __orig_bases__ contains the instantiated generics for a particular type (e.g. A[int]), as opposed to just the bare types (e.g. A). The former is useful because it gives more information to Databind about how to de/serialize a particular field. However, subclasses overrides methods, and it's not clear that that's going to be in the correct order.
We can get the correct method resolution order by calling .mro() on the class that we're dealing with. Given the test case with inheritance class A(Generic[T]) -> class B(A[int]) -> class C(B), calling C.mro() would give us (C, B, A) which would correctly tell where fields come from. But, we'd be losing type information.
So, my recommendation is:
- We should just call
Class.mro()before this entire loop to get the correct iteration order - For every typehint we get, we get its
__orig_bases__to get the instantiated-Generic types. IfTypeHint(t).typematches anything in the MRO list, we use the instantiated-generic type instead of the original bare type.
There was a problem hiding this comment.
If I were to commit a solution like this, would you merge it?
There was a problem hiding this comment.
Hi @rhaps0dy , I'm sorry for the late response.
Thanks for explaining your train of thought so thoroughly. I'm following you, but I think that if this should, it should probably be on the typeapi level instead. Thatis because recursing through ClassTypeHint.bases should allow you to reconstruct the MRO while at the same type getting access to the typed bases. (Just as you can reconstruct the MRO manually by recursing through type.__bases__, only that you don't have the additional type information).
The behaviour we're seeing seems to stem from the fact that __orig_bases__ does not get set on a subclass that inherits from a generic without parameterizing it.
from typing import Generic, TypeVar
T = TypeVar("T")
class Base(Generic[T]):
a: int
class Correct(Base[T]):
b: str
class Incorrect(Base):
b: str
print(Correct.__orig_bases__)
assert "__orig_bases__" in vars(Correct)
print(Incorrect.__orig_bases__)
assert "__orig_bases__" not in vars(Incorrect)This is a relatively old issue (so the requirement to always add Generic[T] into the mix is no longer present), but it hints at the fact that inheriting from a generic should be done with parameterization (even if that is with the same or another type variable). It seems there is no clear semantic assigned to inheriting from a generic without parameterizing it.
This leads me to consider that this is not necessarily a bug in Databind or Typeapi but in your code. :)
There was a problem hiding this comment.
FWIW I think the need to add # type: ignore[type-arg] is a pretty good hint that the way the inheritance is defined is actually invalid in Python's (aka. Mypy's) type system, and thus I don't think we should support the case.
There was a problem hiding this comment.
Co-authored-by: Niklas Rosenstein <rosensteinniklas@gmail.com>
I hit this bug reported in #66 and fixed it. I think the fix is correct, is it?