Rolling Windows for Financial ML: A Complete Guide
If you work with financial data and your model is not validated with a rolling window, the backtest is probably wrong.
This is one of the most common mistakes quants make, and one of the easiest to fix.
This article explains:
- Why rolling windows are mandatory in time series ML
- Rolling vs expanding windows
- Rolling forecast origin vs sliding windows
- How to construct correct training/test splits
- How volarixs implements rolling windows in all experiments
1. Why Time Series Require Rolling Windows
In traditional ML:
train on random samples → test on random samples
But in financial time series:
- data is ordered
- distributions shift
- nearby observations are correlated
- regimes change
- leakage is easy
Random splits cause your model to "see the future". This inflates Sharpe, R², and accuracy — sometimes by 3–10×.
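To make the "seeing the future" problem concrete, here is a minimal sketch that compares a random 80/20 split with a time-ordered one. The helper name `future_leak_fraction` and the 1,000-observation index are illustrative assumptions, not part of any library.

```python
import random


def future_leak_fraction(train_idx, test_idx):
    """Fraction of test points that occur before the latest training point."""
    latest_train = max(train_idx)
    leaked = sum(1 for i in test_idx if i < latest_train)
    return leaked / len(test_idx)


n = 1000
indices = list(range(n))  # position in time order, 0 = oldest observation

# Random split: shuffle the time index, then take 80% for training.
rng = random.Random(0)
shuffled = indices[:]
rng.shuffle(shuffled)
random_train, random_test = shuffled[:800], shuffled[800:]

# Time-ordered split: the first 80% trains, the last 20% tests.
time_train, time_test = indices[:800], indices[800:]

random_leak = future_leak_fraction(random_train, random_test)
time_leak = future_leak_fraction(time_train, time_test)

print(f"random split, test points with later training data: {random_leak:.0%}")
print(f"time split,   test points with later training data: {time_leak:.0%}")
```

With the random split, nearly every test observation is predicted by a model that was trained on data from after it. With the time-ordered split, none are.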
2. What Is a Rolling Window?
Given data: X₁, X₂, ..., Xₜ
For each training step, you select:
[Xt−L+1, ..., Xt] → train
Xt+h → predict
Where:
- L = lookback length (the number of observations in the window)
- h = forecast horizon (1d, 5d, 1m)
You then slide the window forward by one step: [Xt−L+2, ..., Xt+1]
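The definition above can be sketched as an index generator. This is a minimal illustration, assuming the window holds exactly L observations ending at index t and the target sits h steps after t. The function name `rolling_windows` and its parameters are hypothetical.

```python
def rolling_windows(n_obs, lookback, horizon, step=1):
    """Yield (train_indices, target_index) pairs for a fixed-size rolling window.

    Indices are 0-based positions in a time-ordered series of length n_obs.
    """
    t = lookback - 1  # last index of the first full training window
    while t + horizon < n_obs:
        train = range(t - lookback + 1, t + 1)  # exactly `lookback` observations
        yield train, t + horizon                # target is h steps after the window
        t += step                               # slide the window forward


windows = list(rolling_windows(n_obs=10, lookback=3, horizon=2))
for train, target in windows:
    print(list(train), "->", target)
```

Every training index is strictly smaller than its target index, so the ordering constraint holds by construction.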
Rolling Window Visualization
Rolling Window Structure:
Time ────────────────────────────────────────────►
│
├─ Past Data (not used)
│
├─ Training Window [63 days]
│ ┌────────────────────────────┐
│ │ Xₜ₋₆₂ ... Xₜ │ ← Fit model
│ └────────────────────────────┘
│
├─ Forecast Horizon [5 days]
│ ┌────────────┐
│ │ Xₜ₊₁ ... │ ← Predict
│ └────────────┘
│
└─ Future (unknown)
Key: Each step slides the window forward by 1 day
Model is refit on new window before predicting
Rolling windows prevent data leakage by ensuring models only use past data to predict future values. This replicates real trading conditions.
3. Rolling vs Expanding Windows
Expanding Window
Starts with initial data, keeps growing:
[X₁ ... Xₜ]
Risk: old data dominates the fit, which is a problem when the regime has shifted.
Rolling Window
Keeps fixed size:
[Xt−L+1 ... Xt]
Better for:
- non-stationary data
- regime adaptation
- ML models with temporal fragility
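The difference between the two schemes is easiest to see on a toy series with a regime shift. The sketch below uses a hypothetical helper, `train_slice`, and a made-up series of 100 zeros followed by 20 ones. It is an illustration, not a recommendation for any particular lookback.

```python
def train_slice(t, lookback, mode="rolling"):
    """Return the slice of observations usable for training at time index t."""
    if mode == "rolling":
        start = max(0, t - lookback + 1)  # fixed-size window ending at t
    elif mode == "expanding":
        start = 0                         # everything from the first observation
    else:
        raise ValueError(f"unknown mode: {mode}")
    return slice(start, t + 1)


# Toy series: an old regime at level 0, then a new regime at level 1.
series = [0.0] * 100 + [1.0] * 20
t = len(series) - 1

rolling = series[train_slice(t, lookback=20, mode="rolling")]
expanding = series[train_slice(t, lookback=20, mode="expanding")]

rolling_mean = sum(rolling) / len(rolling)
expanding_mean = sum(expanding) / len(expanding)

print(f"rolling window size:   {len(rolling)}, mean = {rolling_mean:.3f}")
print(f"expanding window size: {len(expanding)}, mean = {expanding_mean:.3f}")
```

Twenty observations into the new regime, the rolling estimate has fully adapted, while the expanding estimate is still dominated by the old regime.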
4. Rolling Forecast Origins (The Gold Standard)
The most rigorous method. For each step:
- Fit the model on the past L observations
- Predict the next h observations
- Record prediction
- Slide window
- Refit model (if required)
This replicates real trading.
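The loop above might look like the following sketch. The model here is a deliberately simple AR(1) fitted by least squares, chosen only so the example runs without dependencies. It stands in for whatever model you actually use, and `fit_ar1` and `walk_forward` are hypothetical names. For brevity the sketch records only the forecast for the observation h steps ahead rather than the whole path.

```python
def fit_ar1(window):
    """Least-squares fit of x[k+1] = a + b * x[k] on one training window."""
    x, y = window[:-1], window[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = cov_xy / var_x if var_x > 0 else 0.0
    a = mean_y - b * mean_x
    return a, b


def walk_forward(series, lookback, horizon=1):
    """Rolling forecast origin: refit on each window, predict h steps ahead."""
    records = []
    for t in range(lookback - 1, len(series) - horizon):
        window = series[t - lookback + 1 : t + 1]  # only data up to time t
        a, b = fit_ar1(window)                     # refit on the current window
        pred = window[-1]
        for _ in range(horizon):                   # iterate the one-step model h times
            pred = a + b * pred
        records.append((t + horizon, pred, series[t + horizon]))
    return records


# Toy series that follows x[k+1] = 0.5 * x[k] exactly.
series = [0.5 ** k for k in range(30)]
records = walk_forward(series, lookback=10, horizon=2)

for target_idx, pred, actual in records[:3]:
    print(target_idx, pred, actual)
```

Each record pairs a forecast with the value realised later, so any evaluation metric computed from `records` is out-of-sample by construction.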
5. Avoiding Leakage
Four common ways ML practitioners leak future data:
- Using future volatility to normalise past returns
- Using overlapping windows without care
- Using full-sample scaling (e.g. MinMax fit on whole dataset)
- Using future returns to define features (e.g. realised volatility includes future data if computed incorrectly)
Because every experiment in volarixs is split on time and run walk-forward (train on the past, predict forward, then advance), the split-level leakage in the first three items is designed out by default. Feature-level leakage (the fourth item) still depends on how a feature is defined, so feature definitions still need to be checked.
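Two of these leaks are easy to show in a few lines: full-sample scaling and a volatility feature that peeks ahead. The sketch below uses only the standard library. The helper names (`fit_scaler`, `transform`, `trailing_vol`) and the toy numbers are assumptions for illustration.

```python
import statistics


def fit_scaler(values):
    """Return (mean, std) estimated from the given values only."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        sigma = 1.0  # avoid division by zero on a constant window
    return mu, sigma


def transform(values, scaler):
    mu, sigma = scaler
    return [(v - mu) / sigma for v in values]


def trailing_vol(returns, t, window):
    """Volatility at time t from the `window` returns ending at t (no look-ahead)."""
    return statistics.pstdev(returns[t - window + 1 : t + 1])


# Scaling: the last point plays the role of the unseen test observation.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 50.0]
train, test = series[:5], series[5:]

leaky_scaler = fit_scaler(series)  # wrong: statistics include the test point
clean_scaler = fit_scaler(train)   # right: statistics from the training window only

train_scaled = transform(train, clean_scaler)
test_scaled = transform(test, clean_scaler)  # test data reuses the training statistics

# Feature check: changing the future must not change a trailing feature.
returns = [0.01, -0.02, 0.015, 0.003, -0.01, 0.02]
changed_future = returns[:4] + [0.5, -0.5]
vol_before = trailing_vol(returns, t=3, window=3)
vol_after = trailing_vol(changed_future, t=3, window=3)

print("leaky mean:", leaky_scaler[0], "clean mean:", clean_scaler[0])
print("trailing vol unchanged by future data:", vol_before == vol_after)
```

The last check is a useful habit for any feature: perturb the data after time t and confirm that the feature value at t does not move.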
6. Lookback Length: How Much History?
Rules of thumb:
- 63–126 days for equities
- 30–90 days for FX
- 90–180 days for commodities
- 21–42 days for crypto (faster regimes)
- 252 days for volatility forecasting
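Short windows adapt quickly to new regimes but give noisier estimates. Long windows give more stable estimates and capture slow cyclical behaviour, at the cost of slower adaptation.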
In volarixs the time window is set per experiment in the wizard, alongside the datasets, feature sets, model and target horizon.
7. Cost Considerations
Rolling windows require:
- repeated model fitting
- repeated feature transformation
- repeated scaling
Compute cost grows roughly linearly with the number of windows, since each window triggers a new fit. This is why the Prediction Factory uses Prefect + distributed workers.
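As a back-of-the-envelope check, the number of fits can be counted directly. The sketch below assumes a window of exactly `lookback` observations, a target `horizon` steps after the window, and a refit every `step` observations. The function name `n_windows` and the 10-year example are illustrative.

```python
def n_windows(n_obs, lookback, horizon, step=1):
    """Number of rolling windows (and therefore model fits) in one backtest."""
    usable = n_obs - lookback - horizon
    if usable < 0:
        return 0  # not enough data for even one window
    return usable // step + 1


# Roughly 10 years of daily data, 63-day lookback, 5-day horizon.
daily_refit = n_windows(n_obs=2520, lookback=63, horizon=5, step=1)
weekly_refit = n_windows(n_obs=2520, lookback=63, horizon=5, step=5)

print("fits when sliding 1 day at a time: ", daily_refit)   # 2453
print("fits when sliding 5 days at a time:", weekly_refit)  # 491
```

Sliding by more than one step reduces the number of fits roughly in proportion, which is one lever for cost when a full refit per day is not needed.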
8. How volarixs Implements Rolling Windows
Every experiment in volarixs:
- splits on time and runs walk-forward, with epochs over the rolling window
- records the time window in the run configuration
- produces multi-horizon predictions (1d / 5d / 21d / 63d and beyond) for the chosen target
- stores results — model, datasets, targets, train/test R² and status — against the run
- carries the regime context the run was generated under
The point of that record is to make experiments:
- reproducible
- comparable across runs on the same data
- regime-aware
- ready to graduate from a manual experiment into the Factory
Conclusion
Rolling windows are the foundation of correct financial ML.
Without them, results are misleading.
With them — and proper diagnostics — you can build robust, regime-aware forecasts. volarixs runs the windowing this way by default, so every experiment starts from a leak-resistant, walk-forward baseline rather than a random split.