High-performance rolling window metrics for R and Python, implemented in C++17.

robustrolling provides numerically stable, memory-efficient rolling window
algorithms built in C++17 and exposed to both R and Python. All algorithms:
- run in O(1) time per element (O(log n) for median),
- handle NaN / NA transparently,
- support a `min_periods` parameter (pandas-compatible semantics),
- share a common CRTP base (`RollingMetric<Derived>`): zero virtual dispatch, flat ring-buffer memory layout,
- are compiled with `-O3 -flto` and include ARM NEON / AVX2 SIMD paths for `rolling_mean`.
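The NaN handling and min_periods behaviour listed above can be illustrated with a deliberately naive pure-Python reference. This is only a sketch of the pandas convention (a window yields a value once it holds at least min_periods non-NaN observations, and min_periods defaults to the window size); it assumes robustrolling skips NaN the same way, and the helper name `naive_rolling_max` is hypothetical, not part of the package.

```python
import math

def naive_rolling_max(values, window, min_periods=None):
    """O(n * window) reference; hypothetical helper, not part of robustrolling."""
    if min_periods is None:
        min_periods = window  # pandas default for fixed-size windows
    out = []
    for i in range(len(values)):
        # Trailing window ending at i, truncated at the start of the input.
        chunk = values[max(0, i - window + 1): i + 1]
        valid = [v for v in chunk if not math.isnan(v)]
        # Emit NaN until enough non-NaN observations are available.
        if valid and len(valid) >= min_periods:
            out.append(max(valid))
        else:
            out.append(math.nan)
    return out

x = [1.0, 3.0, 2.0, 5.0, 4.0]
print(naive_rolling_max(x, 3))                 # [nan, nan, 3.0, 5.0, 5.0]
print(naive_rolling_max(x, 3, min_periods=1))  # [1.0, 3.0, 3.0, 5.0, 5.0]
```

A slow reference like this is mainly useful as an oracle when testing an optimised implementation against small inputs.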
| C++ class | Algorithm | Time | R API | Python class |
|---|---|---|---|---|
| SlidingMean | Prefix sum + SIMD (ARM NEON / AVX2) | O(n) batch | rolling_mean() | SlidingMean |
| SlidingWelfordRing | Welford online + ring buffer | O(1) | rolling_variance() (method="stable") | SlidingWelford |
| SlidingMomentsPrefix | Prefix sums of raw moments | O(n) batch | rolling_variance/skewness/kurtosis() (method="fast") | SlidingMomentsPrefix |
| MonotonicMax | Monotonic deque | O(1) amortised | rolling_max() | MonotonicMax |
| MonotonicMin | Monotonic deque | O(1) amortised | rolling_min() | MonotonicMin |
| MultisetMedian | std::multiset dual-iterator | O(log n) | rolling_median() | MultisetMedian |
| SlidingMoments | Terriberry's 4th-moment recurrence | O(1) | rolling_skewness/kurtosis() (method="stable") | SlidingMoments |
| SlidingCovariance | Welford 2D online | O(1) | rolling_cov() rolling_cor() | SlidingCovariance |
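The monotonic deque technique named in the MonotonicMax row can be sketched in a few lines of Python. This is an illustration of the general algorithm under simplifying assumptions (no NaN in the input, no min_periods), not the package's C++ code; `rolling_max_deque` is a hypothetical name.

```python
from collections import deque
import math

def rolling_max_deque(values, window):
    """Rolling max via a monotonic deque; hypothetical helper for illustration."""
    out = []
    candidates = deque()  # indices; their values strictly decrease front to back
    for i, v in enumerate(values):
        # Older entries that are <= v can never be the max again, so drop them.
        while candidates and values[candidates[-1]] <= v:
            candidates.pop()
        candidates.append(i)
        # Evict the front index once it has slid out of the window.
        if candidates[0] <= i - window:
            candidates.popleft()
        out.append(values[candidates[0]] if i >= window - 1 else math.nan)
    return out

print(rolling_max_deque([1.0, 3.0, 2.0, 5.0, 4.0], 3))  # [nan, nan, 3.0, 5.0, 5.0]
```

Each index is pushed once and popped at most once, which is where the O(1) amortised cost per element comes from. Flipping the comparison to `>=` gives the rolling minimum.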
Install the R package from GitHub:
remotes::install_github("IgorPtak/rolling_window")

Or build from source:
git clone https://github.com/IgorPtak/rolling_window.git
cd rolling_window
make r-build

Requires: R ≥ 4.0, a C++17 compiler.

Install the Python package from source:
git clone https://github.com/IgorPtak/rolling_window.git
cd rolling_window
pip install py_package/

With pandas support:
pip install "py_package/[pandas]"

Requires: Python ≥ 3.10, a C++17 compiler, pybind11.
library(robustrolling)
x <- as.double(c(1, 3, 2, 5, 4))

Max / Min
rolling_max(x, 3L)
#> [1] NA NA 3 5 5
rolling_min(x, 3L)
#> [1] NA NA 1 2 2

Median
rolling_median(x, 3L)
#> [1] NA NA 2 3 4

Variance and mean
y <- as.double(c(1, 2, 3, 4, 5))
rolling_variance(y, 3L)
#> [1] NA NA 1 1 1
rolling_mean(y, 3L)
#> [1] NA NA 2 3 4

Higher moments
rolling_skewness(y, 3L)
#> [1] NA NA 0 0 0
rolling_kurtosis(y, 4L)
#> [1] NA NA NA -1.2 -1.2

Covariance and Pearson correlation
a <- as.double(c(1, 2, 3, 4, 5))
b <- as.double(c(2, 4, 6, 8, 10))
rolling_cov(a, b, 3L)
#> [1] NA NA 2 2 2
rolling_cor(a, b, 3L)
#> [1] NA NA 1 1 1

min_periods: require fewer observations
rolling_max(x, 3L, min_periods = 1L)
#> [1] 1 3 3 5 5

Fast methods: prefix-sum acceleration
method = "fast" uses prefix sums of raw moments instead of the online
Welford/Terriberry algorithm. It is 2–4x faster on large arrays but
susceptible to catastrophic cancellation for very large values with small
variance. Use it when numerical precision is not critical.
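To make the cancellation risk concrete, the sketch below compares a variance computed from prefix sums of x and x² with a slow, centred two-pass reference. It is written in pure Python with hypothetical function names and is not the package's code; the two-pass reference stands in for the stable path here, where the library itself uses Welford/Terriberry updates.

```python
from itertools import accumulate

def rolling_var_prefix(values, window):
    """Sample variance from prefix sums of x and x^2 (the 'fast' idea)."""
    s1 = [0.0] + list(accumulate(values))
    s2 = [0.0] + list(accumulate(v * v for v in values))
    out = []
    for i in range(window, len(values) + 1):
        sum_x = s1[i] - s1[i - window]
        sum_x2 = s2[i] - s2[i - window]
        # Subtracting two nearly equal large numbers is where precision is lost.
        out.append((sum_x2 - sum_x * sum_x / window) / (window - 1))
    return out

def rolling_var_two_pass(values, window):
    """Slow but well-conditioned reference: centre each window before squaring."""
    out = []
    for i in range(window, len(values) + 1):
        chunk = values[i - window:i]
        mean = sum(chunk) / window
        out.append(sum((v - mean) ** 2 for v in chunk) / (window - 1))
    return out

small = [1.0, 2.0, 3.0, 4.0, 5.0]
shifted = [v + 1e9 for v in small]       # same spread, huge offset

print(rolling_var_prefix(small, 3))      # [1.0, 1.0, 1.0]
print(rolling_var_two_pass(shifted, 3))  # [1.0, 1.0, 1.0]
print(rolling_var_prefix(shifted, 3))    # not 1.0: the spread is lost to rounding
```

With an offset of 1e9 the squared values are around 1e18, beyond the range where doubles can resolve a difference of 2, so the prefix-sum result cannot recover the true variance of 1.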
rolling_variance(y, 3L, method = "fast")
rolling_skewness(y, 3L, method = "fast")
rolling_kurtosis(y, 4L, method = "fast")

assume_finite = TRUE enables the SIMD fast path for rolling_mean when the
input is guaranteed to contain no NA values:
rolling_mean(y, 3L, assume_finite = TRUE)

All functions accept np.ndarray and pd.Series:
import numpy as np
import robustrolling as rr
x = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
rr.rolling_max(x, 3)
# array([nan, nan, 3., 5., 5.])
rr.rolling_min(x, 3)
# array([nan, nan, 1., 2., 2.])
rr.rolling_median(x, 3)
# array([nan, nan, 2., 3., 4.])

y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
rr.rolling_variance(y, 3)
# array([nan, nan, 1., 1., 1.])
rr.rolling_mean(y, 3)
# array([nan, nan, 2., 3., 4.])
rr.rolling_skewness(y, 3)
# array([nan, nan, 0., 0., 0.])
rr.rolling_kurtosis(y, 4)
# array([nan, nan, nan, -1.2, -1.2])

a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
b = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
rr.rolling_cov(a, b, 3)
# array([nan, nan, 2., 2., 2.])
rr.rolling_cor(a, b, 3)
# array([nan, nan, 1., 1., 1.])

Fast methods
# 2–4x faster, less numerically stable
rr.rolling_variance(y, 3, method="fast")
rr.rolling_skewness(y, 3, method="fast")
rr.rolling_kurtosis(y, 4, method="fast")
# SIMD mean — assumes no NaN in input
rr.rolling_mean(y, 3, assume_finite=True)

pandas Series (index and name are preserved):
import pandas as pd
prices = pd.Series(
    [100.0, 102.0, 98.0, 105.0, 103.0],
    index=pd.date_range("2024-01-01", periods=5),
    name="close",
)
rr.rolling_max(prices, 3)
# 2024-01-01 NaN
# 2024-01-02 NaN
# 2024-01-03 102.0
# 2024-01-04 105.0
# 2024-01-05 105.0
# Freq: D, Name: close, dtype: float64

Direct access to the engine objects for incremental (streaming) use:
from robustrolling import MonotonicMax, SlidingMoments, SlidingCovariance
import numpy as np
# Streaming — one value at a time
engine = MonotonicMax(3)
for v in [1.0, 3.0, 2.0, 5.0]:
    engine.update(v)
    print(engine.get_value())
# prints 1.0, 3.0, 3.0, 5.0 (one value per line)
# Batch — zero-copy NumPy buffer
engine2 = SlidingMoments(3)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(engine2.process_skewness_batch(x))
# [nan, nan, 0., 0., 0.]
# Covariance engine
cov_engine = SlidingCovariance(3)
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([2.0, 4.0, 6.0, 8.0])
print(cov_engine.process_covariance_batch(a, b))
# [nan, nan, 2., 2.]

Benchmarked on Apple M-series (ARM), window = 100, n = 1 000 000.
Best robustrolling configuration vs pandas (¹ assume_finite=True, ² method="fast").
| Function | robustrolling | pandas | speedup |
|---|---|---|---|
| rolling_mean ¹ | 0.78 ms | 4.58 ms | 5.9x |
| rolling_max | 11.5 ms | 12.3 ms | 1.1x |
| rolling_min | 11.5 ms | 12.7 ms | 1.1x |
| rolling_median | 111 ms | 233 ms | 2.1x |
| rolling_variance ² | 4.4 ms | 10.6 ms | 2.4x |
| rolling_skewness ² | 10.9 ms | 10.1 ms | ~1.0x |
| rolling_kurtosis ² | 8.4 ms | 10.0 ms | 1.2x |
| rolling_cov | 16.8 ms | 19.3 ms | 1.2x |
| rolling_cor | 16.8 ms | 39.6 ms | 2.4x |

Python, stable vs fast methods:
| Function | stable | fast | speedup |
|---|---|---|---|
| mean (assume_finite) | 3.5 ms | 0.78 ms | 4.4x |
| variance | 16.1 ms | 4.4 ms | 3.7x |
| skewness | 23.9 ms | 10.9 ms | 2.2x |
| kurtosis | 21.7 ms | 8.4 ms | 2.6x |

R, robustrolling vs slider and RcppRoll:
| Function | robustrolling | slider | RcppRoll | vs slider | vs RcppRoll |
|---|---|---|---|---|---|
| rolling_max | 15.9 ms | 349 ms | 181 ms | 22x | 11x |
| rolling_min | 15.2 ms | 353 ms | 181 ms | 23x | 12x |
| rolling_mean | 3.2 ms | 1 558 ms | 39.0 ms | 495x | 12x |
| rolling_variance | 16.9 ms | 2 578 ms | 320 ms | 152x | 19x |
| rolling_median | 114 ms | 10 254 ms | 2 014 ms | 90x | 18x |

R, stable vs fast methods:
| Function | stable | fast | speedup |
|---|---|---|---|
| mean (assume_finite) | 3.3 ms | 0.80 ms | 4.2x |
| variance | 16.8 ms | 4.4 ms | 3.9x |
| skewness | 21.9 ms | 10.6 ms | 2.1x |
| kurtosis | 21.6 ms | 8.3 ms | 2.6x |

The C++ core uses CRTP (Curiously Recurring Template Pattern) to share a common interface across all algorithm classes without virtual dispatch:
RollingMetric<Derived>
├── SlidingMean — prefix sum + ARM NEON / AVX2 SIMD
├── MonotonicMax — monotonic deque (max)
├── MonotonicMin — monotonic deque (min)
├── MultisetMedian — std::multiset + dual-iterator median tracking
├── SlidingWelfordRing — Welford variance + ring buffer eviction
├── SlidingMoments — Terriberry's 4th-moment recurrence
└── SlidingCovariance — 2D Welford for covariance and Pearson correlation
SlidingMomentsPrefix — stateless batch engine (prefix sums of raw moments)
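
The "Welford variance + ring buffer eviction" entry in the tree above combines the usual Welford update with its reverse, applied to the value leaving the window. The Python sketch below shows that idea under stated assumptions: `WelfordWindow` is a hypothetical class, a `deque` stands in for the flat ring buffer, and NaN handling and min_periods are left out. It is not a transcription of the C++ SlidingWelfordRing.

```python
from collections import deque
import math

class WelfordWindow:
    """Sliding mean/variance with O(1) add and evict; hypothetical sketch."""

    def __init__(self, window):
        self.window = window
        self.buf = deque()   # stands in for the flat ring buffer
        self.mean = 0.0
        self.m2 = 0.0        # sum of squared deviations from the mean

    def update(self, x):
        if len(self.buf) == self.window:
            self._evict(self.buf.popleft())
        self.buf.append(x)
        n = len(self.buf)
        delta = x - self.mean
        self.mean += delta / n
        self.m2 += delta * (x - self.mean)

    def _evict(self, x):
        n = len(self.buf)    # count after removal (x is already popped)
        if n == 0:
            self.mean = self.m2 = 0.0
            return
        delta = x - self.mean
        self.mean -= delta / n              # reverse of the mean update
        self.m2 -= delta * (x - self.mean)  # reverse of the M2 update

    def variance(self):
        n = len(self.buf)
        return self.m2 / (n - 1) if n == self.window and n > 1 else math.nan

w = WelfordWindow(3)
out = []
for v in [1.0, 2.0, 3.0, 4.0, 5.0]:
    w.update(v)
    out.append(w.variance())
print(out)  # [nan, nan, 1.0, 1.0, 1.0]
```

Because every update works with deviations from the running mean rather than raw squares, a large constant offset in the data does not inflate the intermediate values. A production version would likely also guard against `m2` drifting slightly below zero after many evictions.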
Bindings:
| Language | Technology | Notes |
|---|---|---|
| R | Pure R/C API (.Call) | No Rcpp dependency |
| Python | pybind11 + NumPy buffer protocol | Zero-copy batch processing |
| Tool | Version |
|---|---|
| C++ compiler | C++17 (GCC ≥ 9, Clang ≥ 10, MSVC ≥ 2019) |
| CMake | ≥ 3.14 |
| R | ≥ 4.0 |
| Python | ≥ 3.10 |
# C++ unit tests (gtest)
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --target test_core --parallel
ctest --test-dir build --output-on-failure
# R package
make r-build # sync headers + roxygen + R CMD INSTALL
make r-test # tinytest
# Python package
make py-build # editable install
make py-test # pytest
# Benchmarks
Rscript benchmarks/bench_r.R
python benchmarks/bench_python.py

MIT © Igor Ptak