Skip to content

IgorPtak/rolling_window

Repository files navigation

robustrolling

R package check C++ tests Python package

High-performance rolling window metrics for R and Python, implemented in C++17.


Overview

robustrolling provides numerically stable, memory-efficient rolling window algorithms built in C++17 and exposed to both R and Python. All algorithms:

  • run in O(1) time per element (O(log n) for median),
  • handle NaN / NA transparently,
  • support a min_periods parameter (pandas-compatible semantics),
  • share a common CRTP base (RollingMetric<Derived>) — zero virtual dispatch, flat ring-buffer memory layout,
  • are compiled with -O3 -flto and include ARM NEON / AVX2 SIMD paths for rolling_mean.

Features

C++ class Algorithm Time R API Python class
SlidingMean Prefix sum + SIMD (ARM NEON / AVX2) O(n) batch rolling_mean() SlidingMean
SlidingWelfordRing Welford online + ring buffer O(1) rolling_variance() (method="stable") SlidingWelford
SlidingMomentsPrefix Prefix sums of raw moments O(n) batch rolling_variance/skewness/kurtosis() (method="fast") SlidingMomentsPrefix
MonotonicMax Monotonic deque O(1) amortised rolling_max() MonotonicMax
MonotonicMin Monotonic deque O(1) amortised rolling_min() MonotonicMin
MultisetMedian std::multiset dual-iterator O(log n) rolling_median() MultisetMedian
SlidingMoments Terriberry's 4th-moment recurrence O(1) rolling_skewness/kurtosis() (method="stable") SlidingMoments
SlidingCovariance Welford 2D online O(1) rolling_cov() rolling_cor() SlidingCovariance

Installation

R

remotes::install_github("IgorPtak/rolling_window")

Or build from source:

git clone https://github.com/IgorPtak/rolling_window.git
cd rolling_window
make r-build

Requires: R ≥ 4.0, a C++17 compiler.

Python

git clone https://github.com/IgorPtak/rolling_window.git
cd rolling_window
pip install py_package/

With pandas support:

pip install "py_package/[pandas]"

Requires: Python ≥ 3.10, a C++17 compiler, pybind11.


Usage

R

library(robustrolling)

x <- as.double(c(1, 3, 2, 5, 4))

Max / Min

rolling_max(x, 3L)
#> [1] NA NA  3  5  5

rolling_min(x, 3L)
#> [1] NA NA  1  2  2

Median

rolling_median(x, 3L)
#> [1] NA NA  2  3  4

Variance and mean

y <- as.double(c(1, 2, 3, 4, 5))

rolling_variance(y, 3L)
#> [1] NA NA  1  1  1

rolling_mean(y, 3L)
#> [1] NA NA  2  3  4

Higher moments

rolling_skewness(y, 3L)
#> [1] NA NA  0  0  0

rolling_kurtosis(y, 4L)
#> [1] NA NA NA -1.2 -1.2

Covariance and Pearson correlation

a <- as.double(c(1, 2, 3, 4, 5))
b <- as.double(c(2, 4, 6, 8, 10))

rolling_cov(a, b, 3L)
#> [1] NA NA  2  2  2

rolling_cor(a, b, 3L)
#> [1] NA NA  1  1  1

min_periods — require fewer observations

rolling_max(x, 3L, min_periods = 1L)
#> [1]  1  3  3  5  5

Fast methods — prefix-sum acceleration

method = "fast" uses prefix sums of raw moments instead of the online Welford/Terriberry algorithm. It is 2–4x faster on large arrays but susceptible to catastrophic cancellation for very large values with small variance. Use it when numerical precision is not critical.

rolling_variance(y, 3L, method = "fast")
rolling_skewness(y, 3L, method = "fast")
rolling_kurtosis(y, 4L, method = "fast")

assume_finite = TRUE enables the SIMD fast path for rolling_mean when the input is guaranteed to contain no NA values:

rolling_mean(y, 3L, assume_finite = TRUE)

Python — high-level API

All functions accept np.ndarray and pd.Series:

import numpy as np
import robustrolling as rr

x = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

rr.rolling_max(x, 3)
# array([nan, nan,  3.,  5.,  5.])

rr.rolling_min(x, 3)
# array([nan, nan,  1.,  2.,  2.])

rr.rolling_median(x, 3)
# array([nan, nan,  2.,  3.,  4.])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

rr.rolling_variance(y, 3)
# array([nan, nan,  1.,  1.,  1.])

rr.rolling_mean(y, 3)
# array([nan, nan,  2.,  3.,  4.])

rr.rolling_skewness(y, 3)
# array([nan, nan,  0.,  0.,  0.])

rr.rolling_kurtosis(y, 4)
# array([nan, nan,  nan, -1.2, -1.2])
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
b = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

rr.rolling_cov(a, b, 3)
# array([nan, nan,  2.,  2.,  2.])

rr.rolling_cor(a, b, 3)
# array([nan, nan,  1.,  1.,  1.])

Fast methods

# 2–4x faster, less numerically stable
rr.rolling_variance(y, 3, method="fast")
rr.rolling_skewness(y, 3, method="fast")
rr.rolling_kurtosis(y, 4, method="fast")

# SIMD mean — assumes no NaN in input
rr.rolling_mean(y, 3, assume_finite=True)

pandas Series — index and name are preserved:

import pandas as pd

prices = pd.Series(
    [100.0, 102.0, 98.0, 105.0, 103.0],
    index=pd.date_range("2024-01-01", periods=5),
    name="close",
)

rr.rolling_max(prices, 3)
# 2024-01-01      NaN
# 2024-01-02      NaN
# 2024-01-03    102.0
# 2024-01-04    105.0
# 2024-01-05    105.0
# Freq: D, Name: close, dtype: float64

Python — low-level C++ classes

Direct access to the engine objects for incremental (streaming) use:

from robustrolling import MonotonicMax, SlidingMoments, SlidingCovariance
import numpy as np

# Streaming — one value at a time
engine = MonotonicMax(3)
for v in [1.0, 3.0, 2.0, 5.0]:
    engine.update(v)
    print(engine.get_value())
# 1.0 → 3.0 → 3.0 → 5.0

# Batch — zero-copy NumPy buffer
engine2 = SlidingMoments(3)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(engine2.process_skewness_batch(x))
# [nan, nan, 0., 0., 0.]

# Covariance engine
cov_engine = SlidingCovariance(3)
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([2.0, 4.0, 6.0, 8.0])
print(cov_engine.process_covariance_batch(a, b))
# [nan, nan, 2., 2.]

Performance

Benchmarked on Apple M-series (ARM), window = 100, n = 1 000 000.

Python vs pandas

Best robustrolling configuration vs pandas (¹ assume_finite=True, ² method="fast").

Function robustrolling pandas speedup
rolling_mean ¹ 0.78 ms 4.58 ms 5.9x
rolling_max 11.5 ms 12.3 ms 1.1x
rolling_min 11.5 ms 12.7 ms 1.1x
rolling_median 111 ms 233 ms 2.1x
rolling_variance ² 4.4 ms 10.6 ms 2.4x
rolling_skewness ² 10.9 ms 10.1 ms ~1.0x
rolling_kurtosis ² 8.4 ms 10.0 ms 1.2x
rolling_cov 16.8 ms 19.3 ms 1.2x
rolling_cor 16.8 ms 39.6 ms 2.4x

Python — stable vs fast

Function stable fast speedup
mean (assume_finite) 3.5 ms 0.78 ms 4.4x
variance 16.1 ms 4.4 ms 3.7x
skewness 23.9 ms 10.9 ms 2.2x
kurtosis 21.7 ms 8.4 ms 2.6x

R vs slider vs RcppRoll

Function robustrolling slider RcppRoll vs slider vs RcppRoll
rolling_max 15.9 ms 349 ms 181 ms 22x 11x
rolling_min 15.2 ms 353 ms 181 ms 23x 12x
rolling_mean 3.2 ms 1 558 ms 39.0 ms 495x 12x
rolling_variance 16.9 ms 2 578 ms 320 ms 152x 19x
rolling_median 114 ms 10 254 ms 2 014 ms 90x 18x

R — stable vs fast

Function stable fast speedup
mean (assume_finite) 3.3 ms 0.80 ms 4.2x
variance 16.8 ms 4.4 ms 3.9x
skewness 21.9 ms 10.6 ms 2.1x
kurtosis 21.6 ms 8.3 ms 2.6x

Architecture

The C++ core uses CRTP (Curiously Recurring Template Pattern) to share a common interface across all algorithm classes without virtual dispatch:

RollingMetric<Derived>
├── SlidingMean          — prefix sum + ARM NEON / AVX2 SIMD
├── MonotonicMax         — monotonic deque (max)
├── MonotonicMin         — monotonic deque (min)
├── MultisetMedian       — std::multiset + dual-iterator median tracking
├── SlidingWelfordRing   — Welford variance + ring buffer eviction
├── SlidingMoments       — Terriberry's 4th-moment recurrence
└── SlidingCovariance    — 2D Welford for covariance and Pearson correlation

SlidingMomentsPrefix     — stateless batch engine (prefix sums of raw moments)

Bindings:

Language Technology Notes
R Pure R/C API (.Call) No Rcpp dependency
Python pybind11 + NumPy buffer protocol Zero-copy batch processing

Development

Requirements

Tool Version
C++ compiler C++17 (GCC ≥ 9, Clang ≥ 10, MSVC ≥ 2019)
CMake ≥ 3.14
R ≥ 4.0
Python ≥ 3.10

Build and test

# C++ unit tests (gtest)
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --target test_core --parallel
ctest --test-dir build --output-on-failure

# R package
make r-build   # sync headers + roxygen + R CMD INSTALL
make r-test    # tinytest

# Python package
make py-build  # editable install
make py-test   # pytest

# Benchmarks
Rscript benchmarks/bench_r.R
python benchmarks/bench_python.py

License

MIT © Igor Ptak

About

High-performance rolling window metrics for R and Python, backed by C++17

Resources

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md

Stars

Watchers

Forks

Packages

 
 
 

Contributors