volarixs - applied AI & ML to finance

Explore our latest posts on machine learning, market dynamics, strategy architecture and design

Feature Engineering

Shrinking the Feature Space: PCA & Autoencoders

Many features are redundant or noisy. High dimensionality = harder to generalize.

How Asset Managers Can Implement AI & Machine Learning

Part 2: Infrastructure, Governance & Roadmap. What it takes to implement AI in asset management.

AI Implementation

Neural Networks for Market Data: MLPs, CNNs & LSTMs

We are selective with deep learning. Expensive to train, easy to overfit, harder to debug.

Neural Networks

Signal Half-Life and Decay: How Long Do ML Edges Really Last?

If you discover a signal today, how long will it work?

Edge Persistence

How Asset Managers Can Use AI & Machine Learning in Investment Decisions

Part 1: Use Cases & Value. Real-world use cases: idea generation, regime analysis, risk management.

Asset Management

Modeling Market Turbulence: GARCH, EGARCH & HAR

Volatility ≠ returns: heavy tails, clustering, mean reversion. Dedicated volatility models are essential.

ARIMA, SARIMAX & VAR: When Classical Time-Series Still Win

Explicitly model temporal dependence with transparent structure.

Volatility Forecasting Benchmarks: GARCH, HAR, and ML

Compare GARCH, HAR, and ML models for volatility forecasting.

Machine Learning

How Market Regimes Break ML Models

Financial machine learning rarely fails because the model is 'bad'. It fails because the market regime changed.

Boosted Trees for Alpha: XGBoost & LightGBM

Gradient boosting dominates tabular ML. Learn how XGBoost and LightGBM deliver strong performance.

The 19 Most Important Features for Equity Return Forecasting

Most ML performance in finance doesn't come from the model — it comes from the features.

Rolling Windows for Financial ML: A Complete Guide

If you use financial data and your model does not use a rolling window, the backtest is wrong.

Rolling Windows

Beyond Sharpe: A Research Framework for Evaluating ML Trading Strategies

Sharpe ratio is dangerously incomplete for ML strategies.

Random Forests in Finance: Nonlinear Signals Without the Drama

Tree-based ensembles capture nonlinearities and interactions in market data.

From Linear Regression to Lasso: Fast, Interpretable Baselines

Linear and regularized regressions still do serious work in finance.

Linear Regression

Market Regimes, Clusters & HMMs: Teaching Models to Respect the Environment

Episodes where statistical properties are stable enough: high vol vs low vol, risk-on vs risk-off.

Building a Universe-Wide Prediction Grid

An alpha factory needs predictions for every asset at multiple horizons from multiple models.

Prediction Grid

Regime-Conditioned Performance: Measuring ML Robustness

Most backtests report a single Sharpe. But ML models fail by regime.

Models

October 8, 2025

10 min read

Random Forests in Finance: Nonlinear Signals Without the Drama

Tree-based ensembles capture nonlinearities and interactions in market data, providing robust predictions without the complexity of deep learning.

1. Why Trees Work Well on Tabular Market Data

Tree-based models excel in financial applications because they:

Capture nonlinearities and interactions (e.g., momentum only matters in low-vol regimes).
Are robust to feature scaling: splits depend only on the ordering of values, so no careful normalization is needed.
Naturally handle mixed feature types (continuous, categorical, binary).
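The scaling point can be checked by hand. The sketch below is a minimal illustration, not the volarixs implementation: `best_split_partition` is a hypothetical helper that finds the variance-minimizing split on one feature, and the data are made-up toy values. Because a log transform preserves ordering, the chosen partition of rows should not change.

```python
import math

def best_split_partition(x, y):
    """Return the row indices sent left by the SSE-minimizing split on x."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best_sse, best_left = float("inf"), None
    for k in range(1, len(order)):
        # Tied neighbours cannot be separated by any threshold.
        if x[order[k - 1]] == x[order[k]]:
            continue
        sse = 0.0
        for side in (order[:k], order[k:]):
            mean = sum(y[i] for i in side) / len(side)
            sse += sum((y[i] - mean) ** 2 for i in side)
        if sse < best_sse:
            best_sse, best_left = sse, frozenset(order[:k])
    return best_left

# Hypothetical toy data: a heavily skewed volume-like feature and a target.
volume = [120.0, 15000.0, 900.0, 48000.0, 300.0, 7200.0]
target = [0.01, -0.02, 0.012, -0.03, 0.008, -0.015]

raw_partition = best_split_partition(volume, target)
log_partition = best_split_partition([math.log(v) for v in volume], target)
print(raw_partition == log_partition)
```

The same argument applies to any strictly increasing transform, which is why standardizing or log-scaling inputs changes the thresholds a tree reports but not the rows it groups together.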

2. Random Forest vs Extra Trees

Random Forest:

Uses bootstrap samples + random feature subsets at each split.
Reduces variance by averaging many decorrelated trees.
Often more accurate than Extra Trees, but slower to train.

Extra Trees (Extremely Randomized Trees):

Draws split thresholds at random instead of searching for the best cut, so training is often faster.
Trades slightly more bias for lower variance.
Good for quick benchmarks and feature importance exploration.
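The difference between the two methods comes down to how a node picks its threshold. The sketch below is a simplified illustration under stated assumptions: `split_sse` and `choose_split` are hypothetical helpers, the data are synthetic, and bootstrap resampling of rows is left out so that only the threshold step differs.

```python
import random

def split_sse(column, y, threshold):
    """Sum of squared errors of y after splitting on column <= threshold."""
    total = 0.0
    left = [t for v, t in zip(column, y) if v <= threshold]
    right = [t for v, t in zip(column, y) if v > threshold]
    for side in (left, right):
        if side:  # an empty side contributes nothing
            mean = sum(side) / len(side)
            total += sum((t - mean) ** 2 for t in side)
    return total

def choose_split(X, y, n_candidate_features, rng, randomized_thresholds):
    """Pick (sse, feature_index, threshold) among a random subset of features."""
    best = None
    for j in rng.sample(range(len(X[0])), n_candidate_features):
        column = [row[j] for row in X]
        if randomized_thresholds:
            # Extra Trees style: one uniformly drawn cut point per feature.
            threshold = rng.uniform(min(column), max(column))
        else:
            # Random Forest style: search all midpoints for the best cut.
            values = sorted(set(column))
            midpoints = [(a + b) / 2 for a, b in zip(values, values[1:])]
            threshold = min(midpoints, key=lambda t: split_sse(column, y, t))
        score = split_sse(column, y, threshold)
        if best is None or score < best[0]:
            best = (score, j, threshold)
    return best

# Hypothetical toy data: 3 features, target driven by feature 0 plus noise.
data_rng = random.Random(0)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [0.5 * row[0] + data_rng.gauss(0, 0.1) for row in X]

# Same seed so both styles examine the same feature subset.
rf_split = choose_split(X, y, 2, random.Random(1), randomized_thresholds=False)
et_split = choose_split(X, y, 2, random.Random(1), randomized_thresholds=True)
print("RF-style SSE:", rf_split[0], "ET-style SSE:", et_split[0])
```

On any single node the exhaustive search fits the training rows at least as well as a random cut. That is the source of both the extra training cost and the lower bias, while the random cut is cheaper and decorrelates the trees further.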

3. Use Cases in volarixs

Random forests are particularly useful for:

Predicting 5d/21d returns with rich feature sets (technical + fundamental + regime).
Quick benchmarking of nonlinear capacity vs linear models.
Exploring feature importance: which factors matter most?
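For the return-prediction use case, the target has to be built before any model is fit. The sketch below assumes that "5d/21d" means simple forward returns over 5 and 21 trading days, which the text does not spell out. `forward_returns` is a hypothetical helper and the price path is synthetic.

```python
def forward_returns(prices, horizon):
    """Simple return from t to t + horizon; None where the future is unknown."""
    n = len(prices)
    return [
        prices[t + horizon] / prices[t] - 1.0 if t + horizon < n else None
        for t in range(n)
    ]

# Hypothetical price path, used only to exercise the function.
prices = [100.0 * (1.001 ** t) for t in range(30)]

# One target series per horizon; the last `horizon` rows have no label yet.
targets = {h: forward_returns(prices, h) for h in (5, 21)}
print(targets[5][0], targets[21][0])
```

Rows with a `None` target must be dropped from training, and features at time t should use only data available up to t. Otherwise the forest learns from the future.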

Feature Importance Explorer (interactive widget; selectors: Model, Regime, Horizon)

Top 3 Features:

1. momentum 21d: 35.0%
2. vol 21d: 22.0%
3. beta: 15.0%
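Importance shares like these can be produced in several ways, and the page does not say which one the explorer uses. One model-agnostic option is permutation importance: shuffle one feature, measure how much the error grows, and normalize the increases to percentages. The sketch below is an assumption-laden illustration: `toy_model` stands in for a fitted forest, the feature names are borrowed from the list above, and the numbers it prints are not the explorer's.

```python
import random

def mse(model, X, y):
    """Mean squared error of a prediction function on rows X."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

def permutation_importance_pct(model, X, y, names, rng, n_repeats=10):
    """Share of total error increase caused by shuffling each feature, in %."""
    baseline = mse(model, X, y)
    raw = []
    for j in range(len(names)):
        increase = 0.0
        for _ in range(n_repeats):
            shuffled = [row[j] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, shuffled)]
            increase += mse(model, X_perm, y) - baseline
        raw.append(max(0.0, increase / n_repeats))
    total = sum(raw) or 1.0  # avoid dividing by zero if nothing matters
    return {name: 100.0 * r / total for name, r in zip(names, raw)}

# Hypothetical stand-in for a fitted model: any callable row -> prediction works.
def toy_model(row):
    momentum, vol, beta = row
    return 0.5 * momentum - 0.2 * vol  # beta is deliberately ignored

rng = random.Random(42)
names = ["momentum_21d", "vol_21d", "beta"]
X = [[rng.gauss(0, 1) for _ in names] for _ in range(300)]
y = [toy_model(row) + rng.gauss(0, 0.05) for row in X]

importance = permutation_importance_pct(toy_model, X, y, names, rng)
for name, pct in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {pct:.1f}%")
```

Computing this on held-out data, and separately per regime or horizon, gives a view closer to what the selectors suggest. Impurity-based importances from the trees themselves are a cheaper alternative but tend to favor high-cardinality features.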

4. Limitations

While powerful, tree-based models have trade-offs:

Harder to interpret than linear regression (but feature importance helps).
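Can overfit unless tree depth and minimum samples per leaf (e.g., max_depth, min_samples_leaf) are tuned.
Predictions are not as smooth over time as those of explicit time-series (TS) models, because trees produce piecewise-constant outputs.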
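The overfitting point is easy to reproduce with a single tree on pure noise. The sketch below is a minimal one-feature regression tree written for illustration only. `fit_tree` and its `min_samples_leaf` argument are hypothetical and merely mirror the usual hyperparameter name; a depth cap would act in a similar way.

```python
import random

def sse(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def fit_tree(x, y, min_samples_leaf=1):
    """Fit a 1-D regression tree; leaves store the mean target."""
    pairs = sorted(zip(x, y))
    best = None
    # Only consider cuts that leave min_samples_leaf rows on each side.
    for k in range(min_samples_leaf, len(pairs) - min_samples_leaf + 1):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # tied feature values cannot be separated
        score = sse([v for _, v in pairs[:k]]) + sse([v for _, v in pairs[k:]])
        if best is None or score < best[0]:
            best = (score, k)
    if best is None:
        return {"value": sum(y) / len(y)}
    left, right = pairs[:best[1]], pairs[best[1]:]
    return {
        "threshold": (left[-1][0] + right[0][0]) / 2,
        "left": fit_tree([u for u, _ in left], [v for _, v in left], min_samples_leaf),
        "right": fit_tree([u for u, _ in right], [v for _, v in right], min_samples_leaf),
    }

def predict(tree, u):
    while "value" not in tree:
        tree = tree["left"] if u <= tree["threshold"] else tree["right"]
    return tree["value"]

def count_leaves(tree):
    if "value" in tree:
        return 1
    return count_leaves(tree["left"]) + count_leaves(tree["right"])

def tree_mse(tree, x, y):
    return sum((predict(tree, u) - v) ** 2 for u, v in zip(x, y)) / len(y)

# Hypothetical data: the target is pure noise, so any "fit" is overfitting.
rng = random.Random(7)
x = [i / 10 for i in range(200)]
y_train = [rng.gauss(0, 1) for _ in x]
y_test = [rng.gauss(0, 1) for _ in x]

deep = fit_tree(x, y_train, min_samples_leaf=1)
shallow = fit_tree(x, y_train, min_samples_leaf=20)
for name, tree in (("deep", deep), ("shallow", shallow)):
    print(name, count_leaves(tree), tree_mse(tree, x, y_train), tree_mse(tree, x, y_test))
```

The unconstrained tree memorizes every training point and reaches zero training error, yet there is nothing to learn, so its error on fresh noise is typically worse than that of the constrained tree. The step-shaped predictions of either tree also show why outputs jump rather than evolve smoothly.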

Random Forest

Extra Trees

Tree Ensembles

