Shrinking the Feature Space: PCA & Autoencoders for Market Data
1. Why Dimensionality Reduction
- Many features are redundant or noisy.
- High dimensionality makes it harder for models to generalize.
- We often want a compact factor representation.
2. PCA (Principal Component Analysis)
PCA finds orthogonal directions of maximum variance in the data. In finance, the first few components often look like:
- Market factor
- Sector or style tilts
It works especially well for yield curves (level, slope, curvature) and cross-sectional equity returns.
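A minimal sketch of the idea on synthetic data (the asset count, betas, and noise scales below are illustrative assumptions, not real market parameters): when returns share a common driver, the first principal component absorbs most of the variance and its loadings line up with the assets' market betas.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical daily returns: 5 assets driven by one common "market" factor.
n_days, n_assets = 500, 5
market = rng.normal(scale=0.01, size=(n_days, 1))        # common driver
idio = rng.normal(scale=0.004, size=(n_days, n_assets))  # idiosyncratic noise
betas = np.array([0.8, 1.0, 1.1, 0.9, 1.2])              # assumed market betas
returns = market * betas + idio

# PCA via SVD of the centered return matrix.
Xc = returns - returns.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S ** 2 / np.sum(S ** 2)     # variance share per component

print("variance explained by PC1:", round(explained[0], 3))
print("PC1 loadings:", np.round(Vt[0], 2))  # roughly proportional to the betas
```

The first component here plays the role of the market factor: every asset loads on it with the same sign (up to an overall sign flip, which is arbitrary in PCA).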
3. Autoencoders (Roadmap Feature)
Autoencoders are neural networks that learn a compressed representation by reconstructing their input through a narrow bottleneck layer. They are a nonlinear analogue of PCA and can potentially capture structure that PCA misses.
4. How volarixs Uses Them
PCA used to:
- Compress correlated features into a few components.
- Feed components into downstream models (linear/trees/boosters).
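The compress-then-model pattern above can be sketched with a standard scikit-learn pipeline. The feature counts, target construction, and Ridge downstream model are illustrative assumptions; any linear, tree, or boosted model could sit in the final step.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(7)

# Hypothetical feature matrix: 300 observations x 20 highly correlated features
# generated from 3 latent drivers, with a target driven by the first of them.
latent = rng.normal(size=(300, 3))
X = latent @ rng.normal(size=(3, 20)) + 0.1 * rng.normal(size=(300, 20))
y = 0.5 * latent[:, 0] + 0.05 * rng.normal(size=300)

# Compress the correlated features into a few components, then fit a model.
pipe = Pipeline([
    ("pca", PCA(n_components=3)),
    ("model", Ridge(alpha=1.0)),
])
pipe.fit(X[:200], y[:200])                 # train split
r2 = pipe.score(X[200:], y[200:])          # out-of-sample fit
print(f"out-of-sample R^2 with 3 PCs: {r2:.3f}")
```

Putting PCA inside the pipeline matters: the components are fitted on the training split only, so no information from the held-out rows leaks into the compression step.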
Autoencoders considered for:
- Complex multi-modal features (prices + fundamentals + text scores).