volarixs

volarixs - applied AI & ML to finance

Models
January 5, 2026
12 min read

From Linear Regression to Lasso: Fast, Interpretable Baselines for Market Prediction

In a world obsessed with transformers and "foundation models," linear and regularized regressions still do a lot of quiet, serious work in finance.

1. Why Linear Models Still Matter

Linear models keep their place in the research stack for three practical reasons:

  • They are fast: you can run a full equity universe in seconds.
  • They are interpretable: coefficients read like a factor exposure sheet.
  • They provide a clean benchmark for more complex models.

That last point is why the linear family makes such a useful baseline. Before a tree ensemble or neural net is worth the extra complexity, it should beat what a well-regularized linear model already delivers out-of-sample. In volarixs you run the linear baseline and the heavier model as experiments on the same datasets, feature sets, and target horizon, so the comparison is apples-to-apples on the metrics each run records.

2. The Core Ideas: From OLS to Elastic Net

We use four core variants:

  • OLS / Linear Regression: minimize squared error, no regularization.
  • Ridge (L2): penalize the squared size of the coefficients, shrinking them towards zero without eliminating any.
  • Lasso (L1): penalize absolute values, which drives some coefficients exactly to zero (feature selection).
  • Elastic Net: a combination of L1 and L2; more stable than pure Lasso when features are correlated.

Model Family Overview

The objective functions:

OLS: min ||y - Xβ||₂²
Ridge: min ||y - Xβ||₂² + λ||β||₂²
Lasso: min ||y - Xβ||₂² + λ||β||₁
Elastic Net: min ||y - Xβ||₂² + λ₁||β||₁ + λ₂||β||₂²
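
The Ridge objective above has a closed-form solution, β = (XᵀX + λI)⁻¹Xᵀy, which makes the shrinkage effect easy to see. The following is a minimal sketch on synthetic data; the data, the `ridge_closed_form` helper, and the value λ = 50 are illustrative assumptions, not part of volarixs. Lasso and Elastic Net have no closed form and need an iterative solver.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (illustrative only): 200 observations, 5 standardized features.
n, p = 200, 5
X = rng.standard_normal((n, p))
true_beta = np.array([0.4, -0.3, 0.25, 0.0, 0.0])
y = X @ true_beta + 0.5 * rng.standard_normal(n)


def ridge_closed_form(X, y, lam):
    """Solve min ||y - X b||^2 + lam * ||b||^2 via the normal equations."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


beta_ols = ridge_closed_form(X, y, lam=0.0)     # lam = 0 recovers OLS
beta_ridge = ridge_closed_form(X, y, lam=50.0)  # lam > 0 shrinks the coefficients

print("OLS  :", np.round(beta_ols, 3))
print("Ridge:", np.round(beta_ridge, 3))
print("L2 norm, OLS vs Ridge:", np.linalg.norm(beta_ols), np.linalg.norm(beta_ridge))
```

No intercept is fitted here, which is reasonable only because the synthetic features and target are centered around zero by construction.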

Coefficient Shrinkage Playground: example coefficients

Factor        Coefficient   Sign
Market Beta         0.409   +
Size               -0.291   -
Value               0.255   +
Momentum            0.136   +
Volatility         -0.200   -
Quality             0.164   +
Macro               0.109   +

Example: suppose you predict 5-day forward returns for AAPL using factors such as market beta, size, value, momentum, volatility, and macro surprises.

  • OLS keeps every factor and can overfit.
  • Ridge shrinks the exposures, making the model more robust out-of-sample.
  • Lasso may drop some factors entirely if they don't help.
  • Elastic Net strikes a balance: selective but stable.
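
The four behaviours above can be sketched with scikit-learn on synthetic factor data. Everything here is an assumption for illustration: the factor names, the "true" coefficients, the noise level, and the penalty strengths are made up, and none of it is AAPL data. Note that scikit-learn scales the squared-error term by 1/(2n) for Lasso and Elastic Net, so its `alpha` is not on the same scale as λ in the objectives above.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge

rng = np.random.default_rng(42)

# Hypothetical standardized factor exposures (not real market data).
factors = ["market_beta", "size", "value", "momentum", "volatility", "quality", "macro"]
n = 500
X = rng.standard_normal((n, len(factors)))

# Assumed ground truth: only the first three factors carry signal.
true_beta = np.array([1.0, -0.8, 0.6, 0.0, 0.0, 0.0, 0.0])
y = X @ true_beta + rng.standard_normal(n)

models = {
    "OLS": LinearRegression(),
    "Ridge": Ridge(alpha=100.0),
    "Lasso": Lasso(alpha=0.2),
    "ElasticNet": ElasticNet(alpha=0.2, l1_ratio=0.5),
}

# Fit each model on the same data and keep its coefficient vector.
coefs = {name: model.fit(X, y).coef_ for name, model in models.items()}

# Side-by-side coefficient table, one row per factor.
print(f"{'factor':<12}" + "".join(f"{name:>12}" for name in coefs))
for i, factor in enumerate(factors):
    print(f"{factor:<12}" + "".join(f"{c[i]:>12.3f}" for c in coefs.values()))

# Count coefficients set exactly to zero by each model.
n_zero = {name: int(np.sum(c == 0.0)) for name, c in coefs.items()}
print("exact zeros:", n_zero)
```

In a run like this, OLS and Ridge keep every factor, while the L1 penalty is what produces exact zeros. How many factors survive depends on the penalty strength, which in practice is chosen by out-of-sample validation rather than fixed by hand.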

3. Application to Financial Time Series

Key points:

  • We transform price data into cross-sectional features at each date (returns, realized vol, rolling betas, etc.).
  • We run cross-sectional regressions (all tickers at a given date) or time-series regressions (one asset over time) depending on the setup.
  • Regularization helps with:
    • Noisy factors
    • High collinearity (value vs quality vs carry, etc.)
    • High-dimensional feature sets (text, macro, technicals)
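
As a rough illustration of the cross-sectional setup described above, the sketch below builds a few features from a price panel and fits one Ridge regression per date across all tickers. The prices are simulated random walks, and the feature names, window lengths, and `alpha=1.0` are assumptions chosen for the example, not the feature set volarixs uses.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)

# Hypothetical daily closes: 300 business days x 20 tickers (random walks).
dates = pd.bdate_range("2024-01-01", periods=300)
tickers = [f"T{i:02d}" for i in range(20)]
log_ret = rng.normal(0.0, 0.01, size=(len(dates), len(tickers)))
prices = pd.DataFrame(100 * np.exp(log_ret.cumsum(axis=0)), index=dates, columns=tickers)

# Features known at date t: trailing returns and realized volatility.
ret_1d = prices / prices.shift(1) - 1.0
features = {
    "ret_5d": prices / prices.shift(5) - 1.0,
    "ret_21d": prices / prices.shift(21) - 1.0,
    "vol_21d": ret_1d.rolling(21).std(),
}

# Target: 5-day forward return, so row t holds the return from t to t+5.
target = prices.shift(-5) / prices - 1.0

coefs = {}
for date in prices.index:
    # Cross-section at this date: one row per ticker, one column per feature.
    X = pd.DataFrame({name: f.loc[date] for name, f in features.items()})
    y = target.loc[date]
    ok = X.notna().all(axis=1) & y.notna()
    if ok.sum() < 10:  # skips the warm-up window and the last 5 days (no target)
        continue
    # Standardize cross-sectionally so the penalty treats all features alike.
    Xz = (X[ok] - X[ok].mean()) / X[ok].std()
    model = Ridge(alpha=1.0).fit(Xz, y[ok])
    coefs[date] = pd.Series(model.coef_, index=Xz.columns)

# One row per date, one column per feature.
coef_ts = pd.DataFrame(coefs).T
print(coef_ts.tail())
```

The coefficients fitted at date t depend on returns realized up to t+5, so in a backtest they could only be used for predictions made after that point. A time-series regression for a single asset would instead stack the rows of one ticker over time.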

4. How volarixs Uses This Family

In volarixs, the linear models sit alongside trees, boosting, neural nets, time-series and volatility models in the experiment wizard:

  • They serve as a baseline you select in an experiment and as cheap, fast candidates for the prediction factory.
  • Each run records:
    • Train and test R² (r2_train / r2_test), so you can see the gap between in- and out-of-sample fit at a glance
    • The model, datasets, feature sets and target horizon that produced it, plus run status
    • The regime context the run was made under: the macro state and historical analogues that let you read a result against the environment it came from

That train-vs-test R² gap is exactly where regularization earns its keep: a Ridge or Elastic Net run that holds up out-of-sample is the kind of baseline a heavier model has to beat to justify itself.
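
To make the train-vs-test gap concrete, here is a minimal sketch that computes the two numbers for an OLS and a Ridge fit using a chronological split. Only the names `r2_train` and `r2_test` come from the text above; the synthetic data, the 70/30 split, and `alpha=100.0` are assumptions, and this is not how volarixs computes its metrics internally.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import r2_score

rng = np.random.default_rng(7)

# Synthetic setup: many features, weak signal in only three of them.
n, p = 300, 40
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [0.10, -0.08, 0.05]
y = X @ beta + rng.standard_normal(n)

# Chronological split: first 70% of rows for training, the rest for testing.
# Rows are never shuffled, so the test set is strictly "later" than the train set.
split = int(n * 0.7)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

results = {}
for name, model in {"OLS": LinearRegression(), "Ridge": Ridge(alpha=100.0)}.items():
    model.fit(X_train, y_train)
    results[name] = {
        "r2_train": r2_score(y_train, model.predict(X_train)),
        "r2_test": r2_score(y_test, model.predict(X_test)),
    }

for name, r in results.items():
    gap = r["r2_train"] - r["r2_test"]
    print(f"{name:<6} r2_train={r['r2_train']:.3f}  r2_test={r['r2_test']:.3f}  gap={gap:.3f}")
```

OLS always has the higher in-sample R² because it minimizes training error directly, so the number to watch is how much of that fit survives out-of-sample. With a low signal-to-noise target, a negative test R² is common and simply means the model did worse than predicting the mean.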
