Add `ruby-rbs` crate: Safe Rust wrapper over `ruby-rbs-sys` #2808

alexcrocha · 2026-01-15T01:42:38Z

Description

This PR introduces ruby-rbs, a safe Rust wrapper for the RBS parser. It builds on #2807 (ruby-rbs-sys); please refer to that PR for motivation and background on the two-crate approach.

ruby-rbs Overview

- build.rs reads config.yml and generates Rust structs matching the C AST node types
- Each node struct holds a pointer to the C node with lifetime bounds
- Lifetimes ('a) tie all nodes to the parser, preventing use-after-free
- SignatureNode implements Drop to free the parser when dropped
- Includes a Visit trait for traversing the AST
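
The ownership model in the overview can be illustrated without the real C library. This is a minimal sketch, not the crate's actual code: `FakeParser`, `fake_parser_new`, `fake_parser_free`, and `ChildNode` are hypothetical stand-ins, and only the `SignatureNode` name and the `'a` lifetime idea come from the description.

```rust
use std::marker::PhantomData;

// Hypothetical stand-in for the C parser; the real type lives in ruby-rbs-sys.
struct FakeParser {
    declaration_count: usize,
}

// Hypothetical stand-ins for the C allocation and free functions.
fn fake_parser_new(declaration_count: usize) -> *mut FakeParser {
    Box::into_raw(Box::new(FakeParser { declaration_count }))
}

unsafe fn fake_parser_free(parser: *mut FakeParser) {
    unsafe { drop(Box::from_raw(parser)) }
}

// Owns the parser and frees it exactly once when dropped.
struct SignatureNode {
    parser: *mut FakeParser,
}

// A child node borrows from the signature, so it cannot outlive the parser.
struct ChildNode<'a> {
    parser: *const FakeParser,
    _marker: PhantomData<&'a SignatureNode>,
}

impl SignatureNode {
    fn new(declaration_count: usize) -> Self {
        SignatureNode { parser: fake_parser_new(declaration_count) }
    }

    fn child(&self) -> ChildNode<'_> {
        ChildNode { parser: self.parser, _marker: PhantomData }
    }
}

impl ChildNode<'_> {
    fn declaration_count(&self) -> usize {
        // SAFETY: the lifetime guarantees the owning SignatureNode is still alive.
        unsafe { (*self.parser).declaration_count }
    }
}

impl Drop for SignatureNode {
    fn drop(&mut self) {
        // SAFETY: the pointer came from fake_parser_new and is freed only here.
        unsafe { fake_parser_free(self.parser) }
    }
}
```

With this shape, calling `drop(signature)` while a `ChildNode` obtained from it is still in use is rejected by the borrow checker, which is the use-after-free protection the overview refers to.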

Changes to config.yml

This PR adds two new fields to node definitions:

- rust_name: Specifies the Rust struct name (e.g., BoolNode for RBS::AST::Bool)
- optional: Documents which fields can be NULL in the C parser

No impact on existing code. These fields are ignored by the Ruby/C code generators.

This commit introduces the `ruby-rbs` crate, which will provide a safe, high-level Rust API for the RBS C library. It follows the common Rust pattern of separating the safe wrapper from the `*-sys` crate that provides the raw FFI bindings. The `ruby-rbs` crate will depend on `ruby-rbs-sys` for the unsafe C bindings and will expose a safe, idiomatic Rust interface. This commit sets up the foundation for that structure.

The initial implementation includes:

- The basic crate structure with its own Cargo.toml, declaring a dependency on `ruby-rbs-sys`.
- A build script (`build.rs`) that will be responsible for generating safe Rust wrappers from the C API. Currently, it only generates an empty `bindings.rs` file.
- The `ruby-rbs` crate is added to the main workspace `Cargo.toml`.

While the interaction is not yet implemented, this setup paves the way for providing a robust Rust interface for RBS, which will improve safety and developer experience.

The build script now reads the config.yml file and generates corresponding Rust struct definitions for all RBS AST nodes.

Implementation details:

- Parse config.yml using serde to extract node definitions
- Generate proper Rust module hierarchy from :: namespace separators
- Apply Rust naming conventions:
  - Modules use snake_case
  - Structs remain PascalCase
- Handle Rust reserved keywords (Use -> UseDirective, Self -> SelfType)
- Smart PascalCase to snake_case conversion that correctly handles acronyms (e.g., 'AST' -> 'ast', not 'a_s_t')

The generated bindings create empty struct definitions organized in the correct module hierarchy, laying the foundation for the safe Rust API that will wrap the ruby-rbs-sys FFI bindings.
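
The naming rules in this commit message can be sketched as two small helpers. This is an illustration under assumptions, not the code from `build.rs`: the function names `to_snake_case` and `escape_reserved` are hypothetical, and the acronym rule shown (start a new word only where an uppercase run meets a lowercase letter) is one plausible way to get 'AST' -> 'ast'.

```rust
// Remaps the two names mentioned in the commit message; everything else
// passes through unchanged. (Hypothetical helper name.)
fn escape_reserved(name: &str) -> &str {
    match name {
        "Use" => "UseDirective",
        "Self" => "SelfType",
        other => other,
    }
}

// Converts PascalCase to snake_case while keeping acronyms together.
// (Hypothetical helper name.)
fn to_snake_case(name: &str) -> String {
    let chars: Vec<char> = name.chars().collect();
    let mut out = String::with_capacity(name.len() + 4);
    for (i, &c) in chars.iter().enumerate() {
        if c.is_ascii_uppercase() && i > 0 {
            let prev = chars[i - 1];
            let next_is_lower = chars
                .get(i + 1)
                .map_or(false, |next| next.is_ascii_lowercase());
            // A new word starts after a lowercase letter or digit, or at the
            // last capital of an acronym that is followed by a lowercase letter.
            let starts_word = prev.is_ascii_lowercase()
                || prev.is_ascii_digit()
                || (prev.is_ascii_uppercase() && next_is_lower);
            if starts_word {
                out.push('_');
            }
        }
        out.push(c.to_ascii_lowercase());
    }
    out
}
```

Under this rule, `ASTNode` splits between the acronym and the following word instead of between every capital letter.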

Instead of auto-generating nested module paths from RBS nested naming conventions, use explicit `rust_name` fields in `config.yml` and generate flat structs.

- Add `rust_name` field to all node definitions in `config.yml`
- Remove complex module/path parsing logic from build.rs
- Generate flat structs (e.g., `ClassNode`) instead of nested modules
- Add `Node` enum to wrap all node types

This makes the generated Rust code easier to work with.

Handle rbs_string field types when generating Rust structs from config.yml. The RBSString struct wraps rbs_string_t pointers and provides an as_bytes() method that safely calculates string length using pointer arithmetic.
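
A length computed by pointer arithmetic might look like the sketch below. It assumes the C string is a pair of `start`/`end` pointers delimiting the bytes; `RawString` is a hypothetical local mirror of that layout rather than the type from `ruby-rbs-sys`, and the wrapper here is a simplified illustration of the `RBSString` idea.

```rust
use std::marker::PhantomData;

// Assumed layout: two pointers delimiting the string's bytes.
#[repr(C)]
struct RawString {
    start: *const u8,
    end: *const u8,
}

// Thin wrapper that borrows the raw string for the lifetime 'a.
struct RBSString<'a> {
    pointer: *const RawString,
    _marker: PhantomData<&'a RawString>,
}

impl<'a> RBSString<'a> {
    fn new(raw: &'a RawString) -> Self {
        RBSString { pointer: raw, _marker: PhantomData }
    }

    fn as_bytes(&self) -> &'a [u8] {
        // SAFETY: assumes start and end point into the same live allocation
        // and that end is not before start.
        unsafe {
            let raw = &*self.pointer;
            let length = raw.end.offset_from(raw.start) as usize;
            std::slice::from_raw_parts(raw.start, length)
        }
    }
}
```

Returning bytes rather than `&str` leaves the encoding decision to the caller, which fits a parser that is configured with an explicit encoding.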

The `parse` function enables parsing RBS code from Rust. This provides a safe Rust interface to the C parser, handling memory management and encoding setup.

Since `bool` is a primitive type with direct FFI mapping between C and Rust, we don't need a wrapper struct like we do for complex types (`rbs_string_t`, etc.).

Symbol fields in RBS AST nodes store their values as constant IDs that need to be resolved through the parser's constant pool. This safe Rust wrapper (`RBSSymbol`) maintains a reference to the parser and provides access to the symbol's name bytes, similar to how `RBSString` handles string types. The build script now generates accessors for `rbs_ast_symbol` fields that properly pass both the symbol pointer and parser reference to enable constant pool lookups.

Refactor node structs to use pointer-based access and add NodeList iterator

Changes node generation from storing individual fields to holding a single pointer to the C struct. This avoids duplicating data in Rust structs and matches the pattern used in Prism's bindings. We just maintain a thin wrapper around the C pointer and dereference it in accessor methods.

Adds NodeList/NodeListIter to enable idiomatic Rust iteration over RBS's linked list structures, and implements Node::new() factory method that type-checks the C node pointer and constructs the appropriate Rust variant with proper pointer casting.

Also adds convert_name() helper to generate C identifiers from RBS node names (snake_case_t for types, UPPER_CASE for enum constants).
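
Iterating a C linked list through a Rust `Iterator` could be sketched as follows. The raw structs (`RawNode`, `RawListCell`, `RawList`) and their fields are assumptions made for illustration; only the `NodeList`/`NodeListIter` names come from the commit message, and the real iterator yields `Node` values rather than raw references.

```rust
use std::marker::PhantomData;

// Hypothetical mirror of a C node; `kind` stands in for the node type tag.
#[repr(C)]
struct RawNode {
    kind: u32,
}

// Assumed list cell: a node pointer plus a pointer to the next cell.
#[repr(C)]
struct RawListCell {
    node: *mut RawNode,
    next: *mut RawListCell,
}

#[repr(C)]
struct RawList {
    head: *mut RawListCell,
}

// Thin wrapper around the list pointer, tied to the lifetime 'a.
struct NodeList<'a> {
    pointer: *const RawList,
    _marker: PhantomData<&'a RawList>,
}

struct NodeListIter<'a> {
    current: *mut RawListCell,
    _marker: PhantomData<&'a RawList>,
}

impl<'a> NodeList<'a> {
    fn new(raw: &'a RawList) -> Self {
        NodeList { pointer: raw, _marker: PhantomData }
    }

    fn iter(&self) -> NodeListIter<'a> {
        // SAFETY: the wrapper was built from a live reference.
        let head = unsafe { (*self.pointer).head };
        NodeListIter { current: head, _marker: PhantomData }
    }
}

impl<'a> Iterator for NodeListIter<'a> {
    type Item = &'a RawNode;

    fn next(&mut self) -> Option<Self::Item> {
        if self.current.is_null() {
            return None;
        }
        // SAFETY: non-null cells are assumed valid for the lifetime 'a.
        unsafe {
            let cell = &*self.current;
            self.current = cell.next;
            Some(&*cell.node)
        }
    }
}
```

Because the iterator only follows `next` pointers, no data is copied out of the C structures, in line with the thin-wrapper approach described above.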

Many AST nodes in `config.yml` have location fields (`rbs_location`, `rbs_location_list`). This change adds the necessary wrapper structs (`RBSLocation`, `RBSLocationList`) and updates `build.rs` to generate accessors for these fields. The `RBSLocation` wrapper includes a reference to the parser to support future functionality like source extraction.

Enable nested AST traversal by exposing rbs_node and rbs_node_list fields. Nested structure traversal (e.g., class members, constant types) depends on access to rbs_node and rbs_node_list fields. Making these fields accessible aligns the Rust bindings with the C API. Fields named "type" are accessible via type_ to avoid a Rust keyword collision.

Adds `test_parse_integer()`, which parses an integer literal type alias and traverses the AST (`TypeAlias` -> `LiteralType` -> `Integer`) using pattern matching to verify node types and extract values. This validates that the generated node wrappers enable AST traversal in pure Rust with proper type safety. Also adds `Debug` derives and refactors memory management by returning a `SignatureNode` instead of a raw pointer, with a `Drop` impl to free the parser.

Refactor the previous implementation of `Symbol`/`Keyword` handling to treat them as first-class nodes in the build configuration. `Keyword` and `Symbol` represent identifiers (interned strings), not traditional AST nodes. However, the C parser defines them in `rbs_node_type` (as `RBS_KEYWORD` and `RBS_AST_SYMBOL`) and treats them as nodes (`rbs_node_t*`) in many contexts (lists, hashes). Instead of manually defining `RBSSymbol`/`RBSKeyword` structs, we now inject them into the `config.yml` node list in `build.rs`. This allows them to be generated as `SymbolNode`/`KeywordNode` variants in the `Node` enum, enabling polymorphic handling in node lists and hashes.

Add support for RBS hashes (`rbs_hash_t`), which are used in Record types and Function keyword arguments

Enable walking the AST by generating a `Visit` trait with per-node visitor methods. It uses double dispatch to route each node type to its corresponding visitor method. This avoids consumers needing to manually match on Node variants and allows overriding specific visits while inheriting default behaviour for others.

Some C struct pointer fields can be NULL (super_class when no parent class, comment when no doc comment). This metadata allows our Rust codegen to generate Option<T> return types for these accessors instead of unconditionally wrapping potentially NULL pointers.

Read `optional: true` annotations from `config.yml` and generate `Option<T>` return types with null checks, so we don't crash at runtime. The extracted helper function centralizes the accessor generation logic for pointer-based field types.
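
An accessor generated for an optional field might take the following shape. This is a sketch under assumptions: `RawClass` and `RawComment` are hypothetical mirrors of C structs, and the wrappers hold plain references instead of the pointer-plus-parser layout the crate uses.

```rust
// Hypothetical mirrors of C structs; the field may be NULL on the C side.
#[repr(C)]
struct RawComment {
    line: u32,
}

#[repr(C)]
struct RawClass {
    comment: *mut RawComment,
}

struct CommentNode<'a> {
    raw: &'a RawComment,
}

struct ClassNode<'a> {
    raw: &'a RawClass,
}

impl CommentNode<'_> {
    fn line(&self) -> u32 {
        self.raw.line
    }
}

impl<'a> ClassNode<'a> {
    // For a field marked `optional: true`, return Option<T> and check for
    // NULL before wrapping the pointer.
    fn comment(&self) -> Option<CommentNode<'a>> {
        let pointer = self.raw.comment;
        if pointer.is_null() {
            None
        } else {
            // SAFETY: non-null pointers are assumed valid for the lifetime 'a.
            Some(CommentNode { raw: unsafe { &*pointer } })
        }
    }
}
```

Callers then handle the missing case explicitly, for example with `if let Some(comment) = class.comment()`, instead of dereferencing a NULL pointer.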

The Visit trait added in #69 provided the scaffolding for AST traversal, but the visitor functions were empty stubs that didn't recurse into child nodes. Without this, the visitor pattern is incomplete, as we'd have to manually write traversal logic every time we want to walk the tree.

This commit adds the generation of visitor functions for child node traversal. We handle four field types:

- `rbs_node`: single child node
- `rbs_node_list`: list of child nodes
- `rbs_hash`: key-value pairs of nodes
- Wrapper types (`rbs_type_name`, `rbs_namespace`, etc.): each with its own visitor method

Each case handles optional fields to safely skip NULL pointers.
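
The dispatch and child traversal described in the visitor commits can be sketched on a tiny owned tree. The node types here are simplified stand-ins with hypothetical fields, and `AliasCollector` is an invented example visitor; the generated trait works over the pointer-based wrappers and covers every node type.

```rust
// Simplified stand-ins for generated node types (fields are hypothetical).
struct TypeAliasNode {
    name: String,
}

struct ClassNode {
    members: Vec<Node>,
}

enum Node {
    Class(ClassNode),
    TypeAlias(TypeAliasNode),
}

trait Visit {
    // Routes each enum variant to its per-node method.
    fn visit(&mut self, node: &Node) {
        match node {
            Node::Class(class) => self.visit_class_node(class),
            Node::TypeAlias(alias) => self.visit_type_alias_node(alias),
        }
    }

    // Default behaviour: recurse into children.
    fn visit_class_node(&mut self, node: &ClassNode) {
        visit_class_node(self, node);
    }

    // Leaf node: nothing to recurse into by default.
    fn visit_type_alias_node(&mut self, _node: &TypeAliasNode) {}
}

// Free function holding the default traversal, so an overriding visitor can
// still call it to continue walking the children.
fn visit_class_node<V: Visit + ?Sized>(visitor: &mut V, node: &ClassNode) {
    for member in &node.members {
        visitor.visit(member);
    }
}

// Example visitor: overrides one method and inherits the rest.
struct AliasCollector {
    names: Vec<String>,
}

impl Visit for AliasCollector {
    fn visit_type_alias_node(&mut self, node: &TypeAliasNode) {
        self.names.push(node.name.clone());
    }
}
```

Keeping the default traversal in free functions is a common design for generated visitors: an override that wants to inspect a class and still descend into it can call `visit_class_node(self, node)` itself.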

Each node already has location data in its C struct, but it wasn't exposed through the Rust API. This adds a generated `location()` method to every node type, making it easy to get source ranges for any part of the AST. Also removes `parser` from the location structs, as it is not needed.

Addressing some linting warnings

Adds `location()` accessor to the `Node` enum, delegating to each variant's `location()` method. A previous commit added `location()` to individual node types but missed the enum itself. This allows getting the location of the entire node definition when working with the `Node` enum directly.
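
Delegation from the enum to each variant can be sketched like this. The `Location` fields and the two node types are hypothetical placeholders; the generated code covers every variant and reads the location from the C struct.

```rust
// Hypothetical source range; the real wrapper reads from the C location struct.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Location {
    start: u32,
    end: u32,
}

struct ClassNode {
    location: Location,
}

struct TypeAliasNode {
    location: Location,
}

impl ClassNode {
    fn location(&self) -> Location {
        self.location
    }
}

impl TypeAliasNode {
    fn location(&self) -> Location {
        self.location
    }
}

enum Node {
    Class(ClassNode),
    TypeAlias(TypeAliasNode),
}

impl Node {
    // One match arm per variant, each forwarding to the variant's accessor.
    fn location(&self) -> Location {
        match self {
            Node::Class(node) => node.location(),
            Node::TypeAlias(node) => node.location(),
        }
    }
}
```

Code that holds a `Node` can then ask for its location without first matching on the variant.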

Reorder lib.rs structs alphabetically

Improve bindings code formatting

Remove TODO comments from rust crate

Some nodes don't use their parser field, but conditionally omitting it adds significant complexity. Keep parser on all nodes and suppress the warning on the parser field.

Remove debug comment from generated bindings

Adds lifetimes to make borrowing relationships clearer so the Rust compiler can validate and enforce them.

Replaced `*mut T` with `NonNull<T>` for the parser pointer to make the ‘never null’ assumption explicit. `NonNull<T>` represents a non-null raw pointer (a wrapper around `*mut T`) that guarantees the pointer is never null.
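
A small sketch of what the `NonNull<T>` change buys: the null check happens once, at construction, and the type carries the guarantee afterwards. `RawParser` and `ParserHandle` are hypothetical names used only for this illustration.

```rust
use std::ptr::NonNull;

// Hypothetical stand-in for the C parser struct.
struct RawParser {
    position: usize,
}

// Holds a pointer that is known to be non-null from construction onward.
struct ParserHandle {
    pointer: NonNull<RawParser>,
}

impl ParserHandle {
    // NonNull::new returns None for a null pointer, so the check is done once
    // here and never repeated in accessors.
    fn from_raw(pointer: *mut RawParser) -> Option<Self> {
        NonNull::new(pointer).map(|pointer| ParserHandle { pointer })
    }

    fn position(&self) -> usize {
        // SAFETY: the pointer is non-null and assumed to reference a live parser.
        unsafe { self.pointer.as_ref().position }
    }
}
```

`Option<NonNull<T>>` has the same size as `*mut T`, so making the assumption explicit in the type adds no storage overhead.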

TypeApplicationAnnotation, InstanceVariableAnnotation, ClassAliasAnnotation, and ModuleAliasAnnotation also need rust_name fields for rust binding code generation.

alexcrocha and others added 30 commits January 14, 2026 14:31

Handle RBSString types

013cd9b

Handle rbs_string field types when generating Rust structs from config.yml. The RBSString struct wraps rbs_string_t pointers and provides an as_bytes() method that safely calculates string length using pointer arithmetic.

Add parse function to Rust RBS bindings

87ce94b

The `parse` function enables parsing RBS code from Rust. This provides a safe Rust interface to the C parser, handling memory management and encoding setup.

Handle bool primitive types

a9da36c

Since `bool` is a primitive type with direct FFI mapping between C and Rust, we don't need a wrapper struct like we do for complex types (`rbs_string_t`, etc.).

Handle RBSKeyword types

431a3e1

Handle CommentNode types

fb492d6

Handle ClassSuperNode types

babe55b

Handle NamespaceNode types

e9e1e4d

Handle TypeNameNode types

3beae9f

Handle BlockTypeNode types

d312d1d

Handle RBSHash types

a188680

Add support for RBS hashes (`rbs_hash_t`), which are used in Record types and Function keyword arguments

Handle optional field types

12afcbe

Read `optional: true` annotations from `config.yml` and generate `Option<T>` return types with null checks, so we don't crash at runtime. The extracted helper function centralizes the accessor generation logic for pointer-based field types.

Use inline format args

2500def

Addressing some linting warnings

Add lifetimes

2839682

Adds lifetimes to make borrowing relationships clearer so the Rust compiler can validate and enforce them.

Use NonNull wrapper for parser pointers

12fe16a

Replaced `*mut T` with `NonNull<T>` for the parser pointer to make the ‘never null’ assumption explicit. `NonNull<T>` represents a non-null raw pointer (a wrapper around `*mut T`) that guarantees the pointer is never null.

Add must_use attributes to accessor methods

548d438

alexcrocha added 2 commits January 14, 2026 14:31

Credit Prism for code generation pattern

c801005

Add missing rust_name to annotation nodes

86fb90a

TypeApplicationAnnotation, InstanceVariableAnnotation, ClassAliasAnnotation, and ModuleAliasAnnotation also need rust_name fields for rust binding code generation.

alexcrocha marked this pull request as ready for review January 15, 2026 01:43

alexcrocha mentioned this pull request Jan 15, 2026

Add ruby-rbs-sys crate: Rust FFI bindings for the RBS parser #2807

Merged

soutaro self-assigned this Jan 15, 2026
