
@mriise
Contributor

mriise commented Dec 12, 2025

Overview

This is the start of a series of PRs to build up avatar toggles/expressions (aka Avatar State) in Basis. Feedback and collaboration is desired as this will be fairly impactful for many framework users, end users, and content authors.

Goals of Avatar State

No Complex Behavior

Handling complex behavior is a non-goal. Complex behavior should be written in a script or handled by some other dynamic system, whether in the application or in a future system. Avatar State will handle value updates on various aspects of an avatar and not much more.

Readable and Portable Serialization

Avatar authors should be able to easily debug and modify properties on their clothing, textures, scripts, and shaders. This means it should be easy to understand what is being modified, which is not the case for Unity animation names. (See XProp.)

Ideally, authors should be able to build these directly in the context of third-party tools like Blender. This means much of the serialized data should be stored as JSON (or be easily converted to/from it).

Sparse Wire Format

Avatar state is shared sparsely (as deltas) to keep bandwidth usage low. Avatar state should include a manifest of what can be synced or modified. A sync message should only carry a reference into that manifest and the new value. This means late joiners and lost events require server-stored state.
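A minimal sketch of the manifest-plus-delta idea, to make the trade-off concrete. Everything here is an assumption for illustration: the names `AvatarStateServer`, `apply_delta`, `snapshot`, and `make_delta` are hypothetical, and referencing manifest entries by integer index is only one possible encoding, not something this PR specifies.

```python
from dataclasses import dataclass, field

_MISSING = object()  # sentinel so that a stored None still compares as a real value


@dataclass
class AvatarStateServer:
    """Hypothetical server-side store for one avatar's synced state."""

    manifest: list[str]  # addresses that may be synced, fixed per avatar
    values: dict[int, object] = field(default_factory=dict)  # manifest index -> latest value

    def apply_delta(self, delta: dict[int, object]) -> None:
        # A delta refers to manifest entries by index and carries only new values.
        for index, value in delta.items():
            if not 0 <= index < len(self.manifest):
                raise IndexError(f"delta references unknown manifest entry {index}")
            self.values[index] = value

    def snapshot(self) -> dict[int, object]:
        # What a late joiner (or a client that lost an event) would be sent.
        return dict(self.values)


def make_delta(previous: dict[int, object], current: dict[int, object]) -> dict[int, object]:
    """Return only the entries whose value changed since the last send."""
    return {i: v for i, v in current.items() if previous.get(i, _MISSING) != v}
```

In this sketch a client sends `make_delta(last_sent, now)` and the server keeps the merged result, which is the server-stored state that late joiners need.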

XProp

Unity already has ways (SerializedPropertyName) to point to a property on a script or material. However, this is not a unified or particularly readable format for content authors and users.

glTF has a much simpler scheme that boils down to JSON pointers https://datatracker.ietf.org/doc/html/rfc6901 however this is inherently tied to the file format and how it is serialized for that specific file. JSON pointers also inherently have loose typing, a specific path can refer to a single value or an entire object. This introduces complexity in the classic "everything is an object" way. Target values should be a simple scalar (float/bool/int) or very close (float3/rgb) so we can build ui and networking without doing too much work (and maybe squeak out more performance later on by being smart about compression/alignment).

Therefore I'm committing an xkcd 927.

Ok, now what?

What is consistent across many different pieces of 3D software is the hierarchy.

/Head/Hair/Hat

With just that, it's possible to refer to a specific node in the tree easily. The problem comes when we want to do something that's either on or related to that node. This is where we specify only the rough interface (aka facet) we can use to point at it.

/Head/Hair/Hat::xform

This doesn't get us to a specific value, only to the interface we use to interpret what follows. What follows can be fairly simple for well-known and consistent properties.

/Head/Hair/Hat::xform:scale

Now we can make our hat big, but what if we only want a really tall hat? Neither the implementer nor the parser really knows how to do that, so let's add some simple types that the user can modify without needing to dig into the runtime to figure out what the target property actually is.

/Head/Hair/Hat::xform<float3>:scale

What about changing its color? Well, we can define material properties easily enough with a plain "_MainColor". The issue is that there is often more than one material on a given mesh, so let's add a way to specify which one.

/Head/Hair/Hat::mat(2)<float3>:_MainColor

Interpolating between values is common when changing them; however, some common value types (quaternions and colors) behave poorly when linearly interpolated. It would also be nice for the UI to have a color picker, so let's add another type.

/Head/Hair/Hat::mat(2)<rgba>:_MainColor

Let's say we want to work with a Unity MonoBehaviour directly! We already have a way to add qualifiers, so let's add one for the type name and one for which instance it is.

/Head/Hair/Hat::behavior(ParticleSystem, 0)<bool>:enabled

Is /Head at the scene root? The root of the avatar? Relative to the script? Let's add some distinction for authors and for the sanity of admins.

#/Head/Hair/Hat::behavior(ParticleSystem, 0)<bool>:enabled
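To show how the pieces built up so far fit together, here is a rough parser sketch. It is not the grammar from XPropSpec.md: the regular expression, the `XProp` dataclass, and `parse_xprop` are all hypothetical, the escape sequences the spec supports are ignored, and a trailing `:property` is assumed to be required.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed shape: [#]/node/path::facet[(qualifiers)][<type>]:property
_XPROP = re.compile(
    r"^(?P<avatar>#)?"            # optional avatar-relative marker
    r"(?P<path>(?:/[^/:]+)+)"     # one or more /segments
    r"::(?P<facet>\w+)"           # facet name, e.g. xform, mat, behavior
    r"(?:\((?P<args>[^)]*)\))?"   # optional qualifiers, e.g. (ParticleSystem, 0)
    r"(?:<(?P<type>\w+)>)?"       # optional type hint, e.g. <float3>
    r":(?P<prop>.+)$"             # property within the facet
)


@dataclass(frozen=True)
class XProp:
    avatar_relative: bool
    path: tuple[str, ...]
    facet: str
    qualifiers: tuple[str, ...]
    type_hint: Optional[str]
    prop: str


def parse_xprop(text: str) -> XProp:
    """Split an address into its parts, or raise ValueError if it does not fit."""
    m = _XPROP.match(text)
    if m is None:
        raise ValueError(f"not a valid XProp address: {text!r}")
    args = m.group("args")
    qualifiers = tuple(a.strip() for a in args.split(",")) if args else ()
    return XProp(
        avatar_relative=m.group("avatar") is not None,
        path=tuple(m.group("path").split("/")[1:]),
        facet=m.group("facet"),
        qualifiers=qualifiers,
        type_hint=m.group("type"),
        prop=m.group("prop"),
    )
```

Parsing only establishes the shape of the address (node, facet, type); whether the node or property exists is a separate runtime question.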

Then we can canonicalize it at runtime and insert the path leading up to that avatar! XProps starting with / are now always relative to the root of the scene.

/RemoteAvatars/mriise/Head/Hair/Hat::behavior(ParticleSystem, 0)<bool>:enabled
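The canonicalization step can be sketched as a plain string rewrite. The function name `canonicalize` and the `avatar_root` parameter are hypothetical; how the runtime actually finds the avatar root is not covered here.

```python
def canonicalize(address: str, avatar_root: str) -> str:
    """Rewrite an avatar-relative address ('#/...') into a scene-absolute one ('/...')."""
    if address.startswith("#/"):
        # Drop the '#' and prepend the path leading up to the avatar.
        return avatar_root.rstrip("/") + address[1:]
    if address.startswith("/"):
        # Already relative to the scene root; leave untouched.
        return address
    raise ValueError(f"address must start with '#/' or '/': {address!r}")
```

For example, `canonicalize("#/Head/Hair/Hat::xform:scale", "/RemoteAvatars/mriise")` yields an address rooted at the scene, matching the form shown above.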

For more info read XPropSpec.md

- add Avatar State as a new package
- introduce XProp addressing
@yewnyx
Collaborator

yewnyx commented Dec 12, 2025

glTF has a much simpler scheme that boils down to JSON pointers https://datatracker.ietf.org/doc/html/rfc6901 however this is inherently tied to the file format and how it is serialized for that specific file. JSON pointers also inherently have loose typing, a specific path can refer to a single value or an entire object. This introduces complexity in the classic "everything is an object" way. Target values should be a simple scalar (float/bool/int) or very close (float3/rgb) so we can build ui and networking without doing too much work (and maybe squeak out more performance later on by being smart about compression/alignment).

Your suggestion seems to boil down to

pathselector::type(gameobjectselectors)<implicitcasttype>:typespecificselectors or something like that.

but these are all still selectors and it creates an entirely new semantic universe for one to ask questions about "wait, which type is the important one here? What was the API designer's mental model here?" (and to be wrong about their speculations).

I also don't see an example demonstrating how ambiguous/redundant names are handled.

Unfortunately, I think everything really is an object, both abstractly and concretely. The suggested language seems difficult to parse and the mental model justifying it is not strongly articulated. It seems more like it is working backwards from some data to describe its typical structure but it may not account comprehensively for weird - but plausible - hierarchies and component setups.

@yewnyx
Collaborator

yewnyx commented Dec 12, 2025

It is probably also worth noting that although exact pointers may be resolvable left-to-right, ambiguous queries (which in our case could be useful for any avatar gimmick that can attach to different parts of a hierarchy) are typically resolved from the most specific component to the least specific component.

Some kind of component index to accelerate queries may help with that, since traversing a level up the hierarchy can be assumed to be very cheap.

EDIT: silly me, the spec md didn't load for me so I skipped over it. Reading it.

@yewnyx
Collaborator

yewnyx commented Dec 12, 2025

Okay, read over the spec. My substantive critique is:

  • An XProp containing an ambiguous path cannot be known ahead of time, i.e. the expression necessarily parses correctly but the object that is queried against may contain duplicate transform names.
    • If the XProp containing an ambiguous path is considered a valid parse but an invalid query at runtime, then it is impossible to determine the semantic validity of a query outside of actually executing it.
      • As such it would seem that the query algorithm would be unimplementable as described, nor would query result caching be reliable unless invalidated on any hierarchy change.
    • If ambiguous paths are considered valid parses as well as valid at runtime, then queries
      • can succeed with multiple results, when all results fully match the query
      • cannot be forward-evaluated in case one result candidate partially matches the query
  • If the hierarchy is assumed to be mutable (which it obviously can be), then the only time the hierarchy can be completely validated is pre-build and immediately post-build in editor. This lends itself to substituting any querying solution with a unique id and as native a serialized pointer as possible (i.e. a top level component with keys pointing to transforms or components by name)
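The ambiguity described in the list above can be reproduced with a small sketch: the same well-formed path returns two results once sibling names repeat. The `Node` class and `resolve` function are hypothetical stand-ins for a transform hierarchy, not anything taken from the spec.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list["Node"] = field(default_factory=list)


def resolve(root: Node, path: list[str]) -> list[Node]:
    """Return every node reached by following `path` below `root`, left to right."""
    matches = [root]
    for segment in path:
        # Keep all children with a matching name, so duplicates fan out.
        matches = [c for node in matches for c in node.children if c.name == segment]
    return matches


# Two siblings named "Hat": the path parses fine but matches twice.
avatar = Node("Avatar", [Node("Head", [Node("Hair", [Node("Hat"), Node("Hat")])])])
hits = resolve(avatar, ["Head", "Hair", "Hat"])
```

Here `len(hits)` is 2, and nothing in the path text alone reveals that; only the hierarchy it is run against does.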

@Toys0125
Contributor

Toys0125 commented Dec 12, 2025

  • An XProp containing an ambiguous path cannot be known ahead of time, i.e. the expression necessarily parses correctly but the object that is queried against may contain duplicate transform names.

At build we require Basis avatars and objects to not have duplicate GameObject names. So it shouldn't be an issue at build, but a script, if allowed, could change the name of a GameObject, which could introduce the duplicate issue.

@mriise
Contributor Author

mriise commented Dec 12, 2025

  1. (A) Not sure what is meant by ambiguous, but correct: it's impossible to truly know what is valid until it is actually run. It is, however, possible to make use of what it should look like; this is a classic argument for data typing. By embedding what the data should look like and where it is, we can do a lot of work optimistically without needing to pull out or call into a full runtime.
  2. (B) I am making the assumption that the hierarchy, components, etc. are static for the purposes of caching. However, this is not a hard requirement; authorship tooling where things change often probably shouldn't cache references too much. There are different approaches to how and in what ways references are preserved (invalidate? path rewrite?) and I'm putting that out of scope for XProp.
  3. (C) Parsing validity != runtime guarantee. Parsing successfully only means that it conforms to a specific interface (node, facet, type).
  4. Yes, the scene as a whole is mutable, though in the case of avatars I would prefer to keep their internal hierarchy immutable for most cases at runtime. The avatar root can potentially be placed anywhere in the scene, but we already do some bookkeeping to find it in the scene. We could still keep things fully mutable by building up the references once by pointer or GUID and storing that, but that is a terrible experience for debugging and authoring.

@Toys0125
Contributor

My other comment is: couldn't this all be incorporated into cilbox, or at least into a cilbox script generator that handles state syncing? This would be similar to how VRChat only deals with the primitives of ints, floats, and bools. Making an entire underlying system that is separate from our current scripting engine would just add complexity that's not needed.

@mriise
Copy link
Contributor Author

mriise commented Dec 15, 2025

My other comment is: couldn't this all be incorporated into cilbox, or at least into a cilbox script generator that handles state syncing? This would be similar to how VRChat only deals with the primitives of ints, floats, and bools. Making an entire underlying system that is separate from our current scripting engine would just add complexity that's not needed.

It could, yes. Though the intention is for this to sync every avatar in the scene. Cilbox is interpreted CIL, which is inherently slower, not to mention non-Burst-able. Therefore I am building this as though it is an included package. If avatar state ends up being lightweight enough and someone wants to make a cilbox version, that might be the way to go. Ultimately, complex behavior and gimmicks should be done with dynamically loaded and sandboxed scripting of some kind, but that is intentionally out of scope. It is my opinion that a moderately flat and simple system will cover 85-95% of what people do with avatars (excluding scripted props like guns, knives, etc.).

VRC has a different syncing methodology: full state is sent per packet, so there isn't much the server or client needs to keep track of other than the most recent message. Parameter compression systems still work with the most recent packet but do partial syncs per packet that rotate through the full state.

We can do better by doing a bit of work on the server to keep track of avatar state. This will save on bandwidth and client complexity, but sacrifices a bit on server complexity.

@aaronfranke
Contributor

however this is inherently tied to the file format and how it is serialized for that specific file.

Yes, but this is not a problem. Basis would import these into whatever runtime format it needs. Users of Blender would still need to go through a common 3D model format to get to Basis, like glTF or FBX.

JSON pointers also inherently have loose typing, a specific path can refer to a single value or an entire object.

The typing is handled by the specification, not by the pointer itself. If you point to glTF /nodes/0/translation then that's a Vector3 and any animation values need to be Vector3. There's no need to encode the fact that it's a Vector3 inside of the pointer like /nodes/0/translation<float3> because in order to handle this you need to know what /nodes/0/translation means anyway, and it can only be a Vector3/float3.
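A small sketch of what "typing handled by the specification" can look like in practice: the consumer holds a table derived from the spec and looks pointers up in it. The table below covers only the three glTF node transform properties and is illustrative; `SPEC_TYPES` and `type_of` are hypothetical names, and the `float3`/`float4` labels are borrowed from this thread rather than from glTF.

```python
import re
from typing import Optional

# Types implied by the specification for a few known pointer shapes.
SPEC_TYPES = [
    (re.compile(r"^/nodes/\d+/translation$"), "float3"),
    (re.compile(r"^/nodes/\d+/rotation$"), "float4"),  # quaternion
    (re.compile(r"^/nodes/\d+/scale$"), "float3"),
]


def type_of(pointer: str) -> Optional[str]:
    """Return the type the table implies for `pointer`, or None if it is not listed."""
    for pattern, type_name in SPEC_TYPES:
        if pattern.match(pointer):
            return type_name
    return None
```

With this table, `type_of("/nodes/0/translation")` gives `"float3"`, while a pointer the table does not cover gives `None`.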

/Head/Hair/Hat::xform:scale

What you are describing here is almost exactly the same as Godot's NodePath syntax. https://docs.godotengine.org/en/stable/classes/class_nodepath.html

To translate your example NodePath, it would be /Head/Hair/Hat:scale or /Head/Hair/Hat:transform:scale (Godot has a scale getter directly on the node, but the transform can also be accessed).

The existence of such a syntax requires that some characters are banned from the names of GameObjects. Godot disallows . : @ / " % in node names, I suggest that Basis bans these in GameObject names as well.
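A check along these lines could run at build time. The banned set is the one listed above for Godot; `find_banned` and `sanitize` are hypothetical helpers, and replacing every banned character with an underscore is an assumption about what Basis might choose to do, not existing behavior.

```python
# Characters listed above as disallowed in Godot node names.
BANNED = set('.:@/"%')


def find_banned(name: str) -> list[str]:
    """Return the banned characters that occur in `name`, sorted and deduplicated."""
    return sorted({ch for ch in name if ch in BANNED})


def sanitize(name: str, replacement: str = "_") -> str:
    """Replace each banned character so the name is safe to use in a path."""
    return "".join(replacement if ch in BANNED else ch for ch in name)
```

A name such as `Hand.L` would be reported by `find_banned` and rewritten to `Hand_L` by `sanitize`.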

Is /Head at the scene root? root of the avatar? relative to the script?

In Godot's way, Head/Hair is relative, and /Head/Hair is absolute from the root. In Godot this means the entire scene root, but for the purposes of a BEE file, this should be the avatar's root.

At build we require Basis avatars and objects to not have duplicate GameObject names.

This is a good requirement. Blender does this, Godot's glTF importer does this, G4MF does this. This enables us to unambiguously refer to GameObjects by name.

@mriise
Contributor Author

mriise commented Dec 15, 2025

Yes, but this is not a problem. Basis would import these into whatever runtime format it needs. Users of Blender would still need to go through a common 3D model format to get to Basis, like glTF or FBX.

My thought is to keep these serialized separately during authorship, since the exact method of embedding and extracting this metadata is inconsistent or inaccessible depending on the tool being used. glTF is the best candidate for embedding it, via extras, but it is not the universal method for content authoring.

The typing is handled by the specification, not by the pointer itself. If you point to glTF /nodes/0/translation then that's a Vector3 and any animation values need to be Vector3. There's no need to encode the fact that it's a Vector3 inside of the pointer like /nodes/0/translation<float3> because in order to handle this you need to know what /nodes/0/translation means anyway, and it can only be a Vector3/float3.

This is true for known specs, but breaks down if you have systems that are dynamic or custom-built. Take, for example, a Unity scripting facet: /Head/Hair/Hat::u_script(MyComponentType, 0):foo.bar[0].baz. Unless there is an agreed-upon spec of MyComponentType (which is impractical for user-generated scripts), there is no way of knowing what the input type should be, or what the actual runtime type should be, without some reflection or trial and error. Embedding the type info in the pointer means input systems can make some distinction on the data to enforce, and output/writing systems can do type casting or produce type errors helpful in debugging. These are also just type hints at the moment, as much as I prefer to keep data types known through the process.

What you are describing here is almost exactly the same as Godot's NodePath syntax. https://docs.godotengine.org/en/stable/classes/class_nodepath.html

Hahaha, you are right, that's cool. Interesting to see how similar it is. Good to know, since this would map fairly easily over to Godot.

The existence of such a syntax requires that some characters are banned from the names of GameObjects. Godot disallows . : @ / " % in node names, I suggest that Basis bans these in GameObject names as well.

Good practice, yeah. I would get @dooly123 to sign off on that since that's more of a BEE file thing. The current spec has support for escape sequences in node names, which would prevent most of the ways those could be abused.

In Godot's way, Head/Hair is relative, and /Head/Hair is absolute from the root. In Godot this means the entire scene root, but for the purposes of a BEE file, this should be the avatar's root.

This assumes that the root is tied to the file/avatar root, which can't always be assumed. For example, if a script on an avatar wants to write a value to /Box, is it referring to the root of the scene or the root of the avatar? If it has the ability to do both, is that part of the parameter of the method call, or a separate method? It's ambiguous until specifics of the runtime or file format are exposed to users and content authors.

This is a good requirement. Blender does this, Godot's glTF importer does this, G4MF does this. This enables us to unambiguously refer to GameObjects by name.

agree.

I think XProp could be thought of as the start of a simplistic IDL/schema system, if you include avatar state as a whole. Authors serialize what an avatar can do and how they expect values to be, then the runtime can compress, morph, overwrite, etc. without needing to meddle with the internals of how authors built their avatars.

@Toys0125
Contributor

A question on characters not allowed in paths: . is the standard in Blender/Source rigging for bone children, so disallowing it in GameObject names will break many exports from Blender-based rigs/armatures. It will also break a lot of tooling designed around GameObjects keeping their names until the final build step, which does the consistency check and fixes all paths for the xform.
We would also need a warning system that tells the user it's not allowed and has the creator change to underscore naming instead.

@aaronfranke
Contributor

@Toys0125 Why would you want tooling to not perform the rename as soon as possible? I would find it very annoying to have the in-editor names not match the names after building, that sounds horrible to debug.

As for . being used in Source rigging, this may not be a problem in the future, with the convergence on an industry-wide standard rig that doesn't include . in the names: https://github.com/meshula/LabRCSF

Godot replaces . with _ in node names on import, so it's advised to not use . in names. It would be unfortunate if a given model only functions in some apps and gets altered in others, ideally avatars/etc should work everywhere.

@yewnyx
Collaborator

yewnyx commented Dec 23, 2025

We can do better by doing a bit of work on the server to keep track of avatar state. This will save on bandwidth and client complexity, but sacrifices a bit on server complexity

Note that performance capture and replay is an explicitly desirable feature and will be unavoidable to implement anyways as a result. The question isn't whether the server complexity is increased by supporting it but whether it increases more when layered on top or when refactoring becomes required to support it.

@mriise
Contributor Author

mriise commented Dec 24, 2025

A note about the <> part of the address: the contained type exists to place a strong constraint on what value is expected at this interface boundary. If type information is implicit or included outside of the path, then it has no use and should probably be omitted. Type casting is a duty outside of this boundary.

For example, a writer and reader both internally represent a value as bool, but the prop has type int. Users of XProp SHOULD validate that the value is an int between reader and writer. Therefore the reader and writer SHOULD cast bool into int before passing it through OR fail with a type error.

Part of the intent of XProp is to be able to provide a binding interface and UI generation.
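The cast-or-fail rule from this example might look like the sketch below. Only the bool-to-int case comes from the comment; the other conversion rules, the set of type names, and the function `coerce_to_prop_type` are assumptions made for illustration.

```python
def coerce_to_prop_type(value, prop_type: str):
    """Cast `value` to the prop's declared type, or raise TypeError if that is not allowed."""
    if prop_type == "bool":
        if isinstance(value, bool):
            return value
    elif prop_type == "int":
        # bool -> int is the cast described above; plain ints pass through.
        if isinstance(value, (bool, int)):
            return int(value)
    elif prop_type == "float":
        # Assumption: ints may widen to float, but bools may not.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return float(value)
    else:
        raise TypeError(f"unknown prop type {prop_type!r}")
    raise TypeError(f"cannot pass {type(value).__name__} through a prop of type {prop_type}")
```

A writer holding `True` would pass `coerce_to_prop_type(True, "int")` across the boundary, while a string value for the same prop would fail with a type error instead of being silently accepted.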

example:

fieldID: <uuid>
displayField: <toggle/slider/ect>
targetAddress: #/Head/Hair/Hat::behavior(ParticleSystem, 0)<bool>:enabled

Even that is fairly unnecessary, since a rudimentary UI system could just read a list of XProps and generate a usable interface. Writing this, however, I realize that this type info can just as well be included adjacently for UI generation, and is extraneous everywhere else since data types are typically known, especially with facets.

I will be removing typehint from the spec. (It is Christmas Eve, so not today though.)


4 participants