Feature Engineering for Trading: 200+ Indicators That Actually Matter
AlphaStream computes 200+ technical indicators for every security it analyzes. But most of them are noise. The hard part isn't computing indicators — it's selecting the ones that actually predict future price movement.
The Indicator Categories
I organize indicators into 6 groups:
Trend Indicators (40+): Moving averages (SMA, EMA, WMA, DEMA, TEMA), ADX, Aroon, Ichimoku, Parabolic SAR, SuperTrend. These tell you the direction.
Momentum Indicators (35+): RSI, MACD, Stochastic, Williams %R, CCI, ROC, MFI, Ultimate Oscillator. These tell you the strength.
Volatility Indicators (25+): Bollinger Bands, ATR, Keltner Channels, Donchian Channels, Standard Deviation, Historical Volatility. These tell you the risk.
Volume Indicators (20+): OBV, VWAP, A/D Line, CMF, Force Index, Volume Profile. These tell you the conviction.
Statistical Indicators (30+): Z-Score, Skewness, Kurtosis, Hurst Exponent, Autocorrelation, Cointegration scores. These tell you the regime.
Custom/Engineered (50+): Cross-timeframe features, lag features, rolling statistics, regime indicators. These are where the alpha lives.
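To make the engineered group concrete, here is a minimal sketch of lag features, rolling statistics, and a simple regime flag, assuming a pandas DataFrame with a `close` column (the helper name and window lengths are illustrative, not AlphaStream's actual feature set):

```python
import pandas as pd

def engineered_features(df: pd.DataFrame) -> pd.DataFrame:
    # Illustrative sketch: lag returns, rolling statistics, and a
    # regime flag, all derived from a 'close' price column.
    feats = pd.DataFrame(index=df.index)
    for lag in (1, 2, 5):
        feats[f"ret_lag_{lag}"] = df["close"].pct_change(lag)
    feats["roll_mean_20"] = df["close"].rolling(20).mean()
    feats["roll_std_20"] = df["close"].rolling(20).std()
    # Regime flag: 1 when price sits above its 50-bar mean, else 0
    feats["above_ma50"] = (df["close"] > df["close"].rolling(50).mean()).astype(int)
    return feats
```

Cross-timeframe features follow the same pattern but are computed on resampled bars, as shown later in the post.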
The Feature Selection Problem
200+ features with daily data create a classic p >> n problem: a few years of history is only on the order of a thousand rows, and once the feature count approaches the number of useful observations, models overfit.
My approach:
```python
from sklearn.feature_selection import mutual_info_regression
from sklearn.ensemble import RandomForestRegressor
import numpy as np

def select_features(X, y, n_features=50):
    # Step 1: Drop one of each pair of highly correlated features (>0.95)
    corr_matrix = X.corr().abs()
    upper = corr_matrix.where(
        np.triu(np.ones(corr_matrix.shape), k=1).astype(bool)
    )
    drop_cols = [c for c in upper.columns if (upper[c] > 0.95).any()]
    X_filtered = X.drop(columns=drop_cols)

    # Step 2: Mutual information scores (capture nonlinear dependence)
    mi_scores = mutual_info_regression(X_filtered, y)

    # Step 3: Random Forest impurity-based importance
    rf = RandomForestRegressor(n_estimators=100, random_state=42)
    rf.fit(X_filtered, y)
    rf_importance = rf.feature_importances_

    # Step 4: Combined ranking (average of MI and RF ranks; rank 0 = best)
    mi_rank = np.argsort(np.argsort(-mi_scores))
    rf_rank = np.argsort(np.argsort(-rf_importance))
    combined_rank = (mi_rank + rf_rank) / 2

    # Return the top N features by combined rank
    top_idx = np.argsort(combined_rank)[:n_features]
    return X_filtered.columns[top_idx].tolist()
```
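The double argsort in Step 4 is worth unpacking, since it looks like a no-op at first glance. A toy example:

```python
import numpy as np

# argsort of the negated scores orders features best-first; a second
# argsort converts that ordering into per-feature ranks (rank 0 = best).
scores = np.array([0.2, 0.9, 0.5])
ranks = np.argsort(np.argsort(-scores))
print(ranks.tolist())  # -> [2, 0, 1]
```

Averaging these rank vectors, rather than the raw scores, keeps the two importance measures on a common scale.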
Which Indicators Actually Work
After running feature importance across 5 years of futures data (ES, NQ, CL, GC), these consistently rank in the top 20:
- ATR (14-period) — Volatility is the most predictive feature, period
- RSI divergence from price — Not raw RSI, but the divergence
- Volume relative to 20-day average — Conviction confirmation
- ADX (14-period) — Trend strength, not direction
- VWAP deviation — Institutional positioning proxy
- Bollinger Band width — Volatility regime detection
- Multi-timeframe RSI agreement — 5m, 15m, 1h RSI all agreeing
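The distinction between raw RSI and its divergence from price deserves a concrete form. Below is a minimal sketch of one way to encode divergence as a feature, using a plain SMA-based RSI; the `rsi` and `rsi_divergence` helpers are illustrative, not AlphaStream's actual implementation (which uses Wilder-smoothed RSI via a TA library):

```python
import pandas as pd

def rsi(close: pd.Series, length: int = 14) -> pd.Series:
    # Simple SMA-based RSI; Wilder smoothing (TA-Lib, pandas_ta)
    # differs slightly in the tail values.
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(length).mean()
    loss = (-delta.clip(upper=0)).rolling(length).mean()
    return 100 - 100 / (1 + gain / loss)

def rsi_divergence(close: pd.Series, length: int = 14, window: int = 20) -> pd.Series:
    # Divergence proxy: price prints a new rolling high while RSI does
    # not (bearish), or a new rolling low while RSI does not (bullish).
    r = rsi(close, length)
    price_high = close == close.rolling(window).max()
    rsi_high = r == r.rolling(window).max()
    price_low = close == close.rolling(window).min()
    rsi_low = r == r.rolling(window).min()
    bearish = (price_high & ~rsi_high).astype(int)
    bullish = (price_low & ~rsi_low).astype(int)
    return bullish - bearish  # +1 bullish, -1 bearish, 0 no divergence
```

The rolling-extreme comparison is a crude divergence detector; swing-point matching is more faithful to how discretionary traders read divergence, but this version is cheap to compute across 200+ features.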
The surprising losers: raw MACD (too lagging), Stochastic (too noisy on lower timeframes), most oscillators in trending markets.
The Cross-Timeframe Trick
The single biggest alpha improvement came from multi-timeframe features. Instead of computing indicators on one timeframe, I compute them on 4:
```python
import pandas_ta as ta

timeframes = ['5min', '15min', '1h', '4h']
for tf in timeframes:
    resampled = df.resample(tf).agg({
        'open': 'first', 'high': 'max',
        'low': 'min', 'close': 'last', 'volume': 'sum'
    })
    # Align back to the base index with a forward fill, so each base bar
    # carries the most recent higher-timeframe value
    features[f'rsi_14_{tf}'] = ta.rsi(resampled.close,
                                      length=14).reindex(df.index, method='ffill')
    features[f'atr_14_{tf}'] = ta.atr(resampled.high, resampled.low,
                                      resampled.close,
                                      length=14).reindex(df.index, method='ffill')
```
When 5m RSI is oversold but 4h RSI is neutral, that's a dip-buy. When all timeframes agree on overbought, that's a stronger signal. This cross-timeframe agreement feature alone improved prediction accuracy by 8%.
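The agreement signal itself can be reduced to a single column. A minimal sketch, assuming the per-timeframe RSI columns are already aligned to the base index (the `rsi_agreement` helper and thresholds are illustrative):

```python
import pandas as pd

def rsi_agreement(rsi_by_tf: pd.DataFrame,
                  overbought: float = 70, oversold: float = 30) -> pd.Series:
    # Counts agreeing timeframes: +k when k timeframes are overbought,
    # -k when k are oversold, values near 0 when mixed or neutral.
    ob = (rsi_by_tf > overbought).sum(axis=1)
    os_ = (rsi_by_tf < oversold).sum(axis=1)
    return ob - os_

# Two timeframes as an example: |value| == 2 means full agreement
rsis = pd.DataFrame({"rsi_5min": [75, 25, 50], "rsi_1h": [72, 28, 80]})
print(rsi_agreement(rsis).tolist())  # -> [2, -2, 1]
```

A model can then learn that extreme values of this column (all timeframes agreeing) carry more weight than any single-timeframe RSI reading.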
Avoiding Look-Ahead Bias
The most dangerous mistake in feature engineering is accidentally using future data:
- Don't use today's close to predict today's direction — use yesterday's close
- Don't use indicators computed with today's full candle — compute with the previous candle
- Don't normalize with the full dataset — normalize with a rolling window
I use strict walk-forward computation: every feature at time T is computed using only data available at time T-1.
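As one concrete instance of that rule, here is a rolling z-score normalizer that shifts by one bar, so the feature at time T only sees data through T-1 (a sketch of the pattern, not the exact production code):

```python
import pandas as pd

def rolling_zscore(x: pd.Series, window: int = 252) -> pd.Series:
    # shift(1) pushes everything back one bar: both the value being
    # normalized and the rolling mean/std end at T-1, so the feature
    # at T never touches its own (still-forming) bar.
    past = x.shift(1)
    mean = past.rolling(window).mean()
    std = past.rolling(window).std()
    return (past - mean) / std
```

Normalizing with the full dataset's mean and standard deviation, by contrast, leaks information from the end of the backtest into its beginning.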
The Reality Check
After all this engineering, my models achieve 55-58% directional accuracy on futures. That sounds low, but in trading:
- 52% accuracy with proper risk management is profitable
- 55% accuracy with 2:1 reward-to-risk is very profitable
- 60%+ accuracy usually means you're overfitting
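Those numbers follow from basic expectancy arithmetic, treating each trade as risking one unit (1R):

```python
def expectancy(win_rate: float, reward: float, risk: float = 1.0) -> float:
    # Expected value per trade, in units of risk (R)
    return win_rate * reward - (1 - win_rate) * risk

# 55% accuracy at 2:1 reward-to-risk: 0.55*2 - 0.45*1 = 0.65R per trade
print(round(expectancy(0.55, 2.0), 2))  # -> 0.65

# 52% accuracy at 1:1 is a thin but positive edge: 0.04R per trade
print(round(expectancy(0.52, 1.0), 2))  # -> 0.04
```

An edge of 0.65R per trade, repeated over thousands of trades, compounds into a meaningful return even though any individual prediction is barely better than a coin flip.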
The goal isn't prediction perfection — it's a statistical edge that compounds over thousands of trades.