[Feature #21264] Replace C extension with pure Ruby implementation for Ruby >= 3.3 #155

Open
jinroq wants to merge 2 commits into ruby:master from jinroq:replace_c_to_ruby

jinroq commented Feb 15, 2026

https://bugs.ruby-lang.org/issues/21264

Summary

Rewrite the Date and DateTime C extension as pure Ruby, targeting Ruby 3.3+.
Ruby < 3.3 continues to use the existing C extension as before.

  • Ruby >= 3.3: Pure Ruby implementation (~9,500 lines across 10 files in lib/date/)
  • Ruby < 3.3: Existing C extension (ext/date/) compiled via rake-compiler

All 143 tests pass with 162,593 assertions on both paths.

Motivation

  • Improves portability: no C compiler required for Ruby 3.3+
  • Makes the codebase easier to read, debug, and contribute to
  • Enables Ractor compatibility without C-level thread safety concerns
  • Aligns with the broader Ruby ecosystem trend toward pure Ruby default gems

Architecture

The version branch (RUBY_VERSION >= "3.3") is applied at three layers:

| Layer | Ruby >= 3.3 | Ruby < 3.3 |
|---|---|---|
| lib/date.rb | require_relative pure Ruby files | require 'date_core' (C ext) |
| ext/date/extconf.rb | Generates dummy Makefile (no-op) | create_makefile('date_core') |
| Rakefile | task :compile is a no-op | Rake::ExtensionTask compiles C ext |

| C option | Purpose | Pure Ruby |
|---|---|---|
| USE_PACK | Bit-pack mon/mday/hour/min/sec into a single integer for memory efficiency | Not needed; uses standard instance variables (@nth, @jd, @df, @sf, @of, @sg) |
| TIGHT_PARSER | Stricter Date._parse (disabled by default in C via /* #define TIGHT_PARSER */) | Matches C default behavior (loose parser); TIGHT_PARSER logic is not implemented |
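
For readers unfamiliar with the USE_PACK trade-off, the following is a minimal sketch of bit-packing five small fields into one integer. The module name PackedCivil, its method names, and the bit widths are assumptions chosen for illustration; they are not copied from ext/date or from this PR.

```ruby
# Hypothetical illustration of bit-packing mon/mday/hour/min/sec.
# Widths are assumptions: sec and min fit in 6 bits (0..59),
# hour and mday in 5 bits, mon in 4 bits, 26 bits in total.
module PackedCivil
  SEC_BITS  = 6
  MIN_BITS  = 6
  HOUR_BITS = 5
  MDAY_BITS = 5
  MON_BITS  = 4

  MIN_SHIFT  = SEC_BITS
  HOUR_SHIFT = MIN_SHIFT + MIN_BITS
  MDAY_SHIFT = HOUR_SHIFT + HOUR_BITS
  MON_SHIFT  = MDAY_SHIFT + MDAY_BITS

  # Combine the five fields into a single Integer.
  def self.pack(mon, mday, hour, min, sec)
    (mon << MON_SHIFT) | (mday << MDAY_SHIFT) |
      (hour << HOUR_SHIFT) | (min << MIN_SHIFT) | sec
  end

  # Recover [mon, mday, hour, min, sec] from a packed Integer.
  def self.unpack(packed)
    [
      (packed >> MON_SHIFT)  & ((1 << MON_BITS) - 1),
      (packed >> MDAY_SHIFT) & ((1 << MDAY_BITS) - 1),
      (packed >> HOUR_SHIFT) & ((1 << HOUR_BITS) - 1),
      (packed >> MIN_SHIFT)  & ((1 << MIN_BITS) - 1),
      packed & ((1 << SEC_BITS) - 1)
    ]
  end
end

packed = PackedCivil.pack(2, 15, 7, 3, 18)
p PackedCivil.unpack(packed) # => [2, 15, 7, 3, 18]
```

In C this saves memory per object; in Ruby each instance variable is already a tagged value, which is presumably why the PR describes the technique as not needed.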

Pure Ruby file structure

| File | Lines | Description |
|---|---|---|
| lib/date/core.rb | 3,693 | Date class (civil, ordinal, commercial, JD, arithmetic, comparison) |
| lib/date/parse.rb | 2,607 | Date._parse, _iso8601, _rfc3339, _rfc2822, _xmlschema, _jisx0301 |
| lib/date/datetime.rb | 826 | DateTime subclass (hour, min, sec, offset) |
| lib/date/strptime.rb | 769 | strptime parsing |
| lib/date/strftime.rb | 600 | strftime formatting |
| lib/date/zonetab.rb | 405 | Timezone offset table |
| lib/date/patterns.rb | 403 | Regex patterns for parsing |
| lib/date/constants.rb | 182 | Calendar reform constants (ITALY, ENGLAND, GREGORIAN, etc.) |
| lib/date/time.rb | 59 | Date#to_time, Time#to_date, Time#to_datetime |
| lib/date/version.rb | 5 | Date::VERSION |

Changes

  • Rakefile: Branch on RUBY_VERSION for compile/test task setup; test depends on compile for Ruby < 3.3
  • date.gemspec: Include both lib/**/*.rb and ext/date/* files; set extensions
  • ext/date/extconf.rb: Generate dummy Makefile on Ruby >= 3.3, build C ext otherwise
  • lib/date.rb: Branch on RUBY_VERSION for require path
  • lib/date/*.rb (new): Pure Ruby implementation (10 files, ~9,500 lines)
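
The version branch and the no-op Makefile listed above can be sketched roughly as follows. This is not the code from the PR: the helper names pure_ruby_date? and write_extension_makefile, the Makefile contents, and the use of Gem::Version are assumptions for illustration. The PR itself compares RUBY_VERSION >= "3.3" as strings.

```ruby
require "rubygems" # Gem::Version; normally loaded by default
require "tmpdir"

# Hypothetical helper: decide which implementation a given Ruby would use.
def pure_ruby_date?(version = RUBY_VERSION)
  Gem::Version.new(version) >= Gem::Version.new("3.3")
end

# Illustrative Makefile whose usual targets do nothing, so installing the
# gem succeeds without a C compiler (assumes a POSIX-style make and shell).
NOOP_MAKEFILE = "all install clean distclean realclean:\n\t@:\n"

# Hypothetical stand-in for the branch in ext/date/extconf.rb.
def write_extension_makefile(dir, version = RUBY_VERSION)
  if pure_ruby_date?(version)
    File.write(File.join(dir, "Makefile"), NOOP_MAKEFILE)
    :noop
  else
    # A real extconf.rb would do: require "mkmf"; create_makefile("date_core")
    :needs_c_build
  end
end

Dir.mktmpdir do |dir|
  p write_extension_makefile(dir, "3.4.1") # => :noop
  p write_extension_makefile(dir, "3.2.9") # => :needs_c_build
end
```

One design note: a plain string comparison orders "3.10.0" before "3.3", so a version-aware comparison is the safer choice if a two-digit minor version ever has to be handled.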

Sidenote

The code has not been refactored, because the goal of this PR is only to replace C with Ruby. Refactoring will follow if this PR is merged.

The C implementation has been rewritten as faithfully as possible in pure Ruby.

[Feature #21264]

https://bugs.ruby-lang.org/issues/21264
jinroq changed the title from "Replace C extension with pure Ruby implementation for Ruby >= 3.3" to "[Feature #21264] Replace C extension with pure Ruby implementation for Ruby >= 3.3" on Feb 15, 2026
@jeremyevans
Contributor

Date was originally written in Ruby prior to Ruby 1.9.3. It was rewritten in C to significantly increase performance. When Date was written in Ruby, its low performance made it a common bottleneck in Ruby applications. For this to be considered, I think you need to provide comprehensive benchmarks showing that performance does not decrease significantly.


MONTHNAMES = [nil, "January", "February", "March", "April", "May", "June",
              "July", "August", "September", "October", "November", "December"]
             .map { |s| s&.encode(Encoding::US_ASCII)&.freeze }.freeze
Member

Put a # encoding: US-ASCII magic comment at the beginning of the file.
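
A minimal sketch of what that suggestion achieves: with the magic comment on the first line of a file, string literals in that file are US-ASCII without per-string encode calls. The constant name SKETCH_MONTHNAMES, the temporary file, and the added frozen_string_literal comment are assumptions for this demonstration only; the magic comment must be the first line of a source file, so the sketch writes one and loads it.

```ruby
require "tempfile"

# Source of a hypothetical constants file. The encoding comment must come
# first; frozen_string_literal is an extra assumption to freeze each literal.
source = <<~RUBY
  # encoding: US-ASCII
  # frozen_string_literal: true
  SKETCH_MONTHNAMES = [nil, "January", "February", "March"].freeze
RUBY

Tempfile.create(["monthnames_sketch", ".rb"]) do |file|
  file.write(source)
  file.flush
  load file.path # defines SKETCH_MONTHNAMES at the top level
end

p SKETCH_MONTHNAMES[1].encoding == Encoding::US_ASCII # => true
p SKETCH_MONTHNAMES[1].frozen?                        # => true
```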

@nobu
Member

nobu commented Feb 15, 2026

A simple benchmark to just create objects:

require 'benchmark'
require 'date'

N = 10000
Benchmark.bm do |bm|
  bm.report("Time") {N.times {Time.now}}
  bm.report("Date") {N.times {Date.today}}
end

With ruby 4.1.0dev (2026-02-14T07:03:18Z master 2065b55980) +PRISM [arm64-darwin25], and master:

$ ruby -I./lib bench.rb
          user     system      total        real
Time  0.001656   0.000023   0.001679 (  0.001675)
Date  0.002735   0.000062   0.002797 (  0.002827)

This PR:

$ ruby -I./lib bench.rb
          user     system      total        real
Time  0.001018   0.000013   0.001031 (  0.001031)
Date  0.007624   0.000151   0.007775 (  0.007776)

Interestingly, this PR makes Time.now faster.

@jeremyevans
Contributor

@nobu you should probably benchmark with benchmark-driver or benchmark-ips. With a runtime of only ~1ms, it's hard to get statistically valid results. Since I don't think date modifies the implementation of Time.now, it seems unlikely that there would be a significant performance difference.

A benchmark should include most of the methods in the library. When I was working on home_run, I had a set of comprehensive benchmarks to see the differences in performance compared to the (at the time) Ruby implementation. It included a decent set of benchmarks (https://github.com/jeremyevans/home_run/blob/master/bench/cpu_bench.rb), though I would certainly switch the backend to use benchmark-driver or benchmark-ips for this.
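
As a rough idea of what a broader, time-boxed benchmark could look like, here is a stdlib-only sketch. It is a stand-in, not benchmark-ips or benchmark-driver: the helper iterations_per_second and the choice of methods are assumptions, and the clock reads inside the loop add overhead, so the numbers are only useful for comparing two implementations on the same machine.

```ruby
require "date"

# Hypothetical helper: run the block for a fixed wall-clock window and
# return iterations per second (a simplified imitation of an ips benchmark).
def iterations_per_second(seconds: 0.5)
  clock = Process::CLOCK_MONOTONIC
  3.times { yield } # brief warm-up
  count = 0
  start = Process.clock_gettime(clock)
  deadline = start + seconds
  while Process.clock_gettime(clock) < deadline
    yield
    count += 1
  end
  count / (Process.clock_gettime(clock) - start)
end

base = Date.new(2026, 2, 15)
cases = {
  "Time.now"      => -> { Time.now },
  "Date.today"    => -> { Date.today },
  "Date.new"      => -> { Date.new(2026, 2, 15) },
  "Date.parse"    => -> { Date.parse("2026-02-15") },
  "Date#+"        => -> { base + 30 },
  "Date#strftime" => -> { base.strftime("%Y-%m-%d") }
}

cases.each do |label, work|
  ips = iterations_per_second(seconds: 0.2, &work)
  puts format("%-14s %12.0f i/s", label, ips)
end
```

Running the same script once against master and once with -I./lib pointing at the PR branch would give a per-method comparison.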

@nobu
Member

nobu commented Feb 15, 2026

In the meantime, I just tried Benchmark.ips.

master:

Warming up --------------------------------------
            Time.now   206.000 i/100ms
          Date.today    46.000 i/100ms
Calculating -------------------------------------
            Time.now      2.096k (± 0.1%) i/s  (477.11 μs/i) -     10.506k in   5.012541s
          Date.today    459.375 (± 0.7%) i/s    (2.18 ms/i) -      2.300k in   5.006967s

This PR:

Warming up --------------------------------------
            Time.now    206.000 i/100ms
          Date.today    16.000 i/100ms
Calculating -------------------------------------
            Time.now      2.143k (± 0.6%) i/s  (466.72 μs/i) -     10.918k in   5.095787s
          Date.today    166.713 (± 0.0%) i/s    (6.00 ms/i) -    848.000 in   5.086612s

I agree; there seems to be a lot of room for optimization.
The current extension is a line-by-line translation from Ruby to C and is not optimized for C.
This PR also looks like a line-by-line translation in reverse, which makes it doubly non-optimal.

@jeremyevans
Contributor

> The current extension is line-by-line translation from Ruby to C and not optimized for C.

I don't believe the line-by-line translation part is 100% accurate, though it may be true for large portions of the library. The primary implementation difference between the current C implementation and the previous (pre Ruby 1.9.3) Ruby implementation was that the previous Ruby implementation always eagerly converted from whatever the input format was to ajd (e.g. https://github.com/ruby/ruby/blob/ruby_1_9_2/lib/date.rb#L1621-L1629). That's the primary reason it was so slow. home_run pioneered the idea of not converting eagerly to ajd, only doing the conversion later when it was actually needed. That same basic approach was used by tadf when he rewrote date from Ruby to C. See https://bugs.ruby-lang.org/issues/4068 for background on that change.
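
To make the eager-versus-lazy distinction concrete, here is a minimal sketch of the lazy idea: store the civil fields as given and compute the Julian Day Number only when it is first requested. The class LazyCivilDate and its conversions counter are hypothetical and exist only for illustration; this is not the data layout of home_run, of ext/date, or of this PR, and it handles the proleptic Gregorian calendar only.

```ruby
require "date"

# Hypothetical illustration of deferred conversion.
class LazyCivilDate
  attr_reader :year, :month, :day, :conversions

  def initialize(year, month, day)
    @year = year
    @month = month
    @day = day
    @jd = nil        # not computed until someone asks for it
    @conversions = 0 # how many times the conversion actually ran
  end

  # Julian Day Number, computed on first use and cached afterwards.
  def jd
    @jd ||= begin
      @conversions += 1
      a = (14 - @month) / 12
      y = @year + 4800 - a
      m = @month + 12 * a - 3
      @day + (153 * m + 2) / 5 + 365 * y + y / 4 - y / 100 + y / 400 - 32_045
    end
  end
end

d = LazyCivilDate.new(2000, 1, 1)
p d.conversions                                     # => 0
p d.jd                                              # => 2451545
p d.jd == Date.new(2000, 1, 1, Date::GREGORIAN).jd  # => true
p d.conversions                                     # => 1
```

Code that only reads year, month, or day never pays for the conversion, which is the saving described above.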

I think we'd be willing to accept a small performance decrease in order to replace the C implementation with a Ruby implementation. However, a ~3x performance decrease is way too much to consider switching, IMO. As I mentioned earlier, Date was often a bottleneck in application code before Ruby 1.9.3, which is why I worked on home_run. So performance should be a primary consideration when deciding whether to switch to an alternative implementation.
