Five models. Two hundred indicators. No black boxes.
A Python-based ML signal engine with 200+ technical indicators and 5 ensemble models — explainable, auditable, and open source.
The challenge
Most algorithmic trading tools are black boxes — a signal output with no visibility into why it fired, what inputs drove it, or how it would have performed historically. For a technically sophisticated trader, that's not a tool. It's a guess with a UI.
AlphaStream was built around a different premise: every signal should be explainable, every model should be auditable, and the entire system should run in Python on hardware you control.
The challenge: building a signal engine simultaneously comprehensive enough to cover 200+ technical indicators across multiple timeframes, fast enough to process live market data without falling behind, and transparent enough that a practitioner can understand exactly what drove each output.
How we built it
Data layer: market data ingestion from multiple sources, normalized into a unified OHLCV + extended data model. Indicator layer: 200+ technical indicators computed via pandas, TA-Lib, and custom implementations — RSI, MACD, Bollinger Bands, ADX, Ichimoku, custom momentum composites.
ML layer: 5 models trained per instrument/timeframe — XGBoost (gradient boosting, primary signal), LightGBM (secondary signal, speed-optimized), Random Forest (confidence calibration), Ridge Regression (trend baseline), and an Ensemble Voter combining all four with learned weights. Backtesting via walk-forward validation with held-out test sets — no look-ahead bias.
Feature engineering is where the edge lives. The 200+ indicators aren't noise — they're the vocabulary the models learn from. The ensemble architecture ensures no single model dominates, and the agreement score tells you when the models disagree (which is itself a signal).
What shipped
Python package with clean CLI and programmatic API. 200+ indicator implementations (TA-Lib + pandas + custom). 5 trained model pipeline (XGBoost, LightGBM, RF, Ridge, Ensemble). Backtesting engine with walk-forward validation.
Signal output with explainability layer (feature importance, SHAP values). Public GitHub repository: 5★, 2 forks, active maintenance. Full documentation including strategy examples.
Results
5★ GitHub rating from practitioners in the quant/algo trading community. 2 forks by external developers extending the system for their own use cases.
Walk-forward backtests across multiple instruments and timeframes demonstrating consistent signal quality. SHAP explainability output allows practitioners to understand per-signal feature attribution.
ML signal engines for trading don't require a hedge fund infrastructure team. A well-engineered Python package with the right architecture can be built, maintained, and extended by a single practitioner — and released as open source without compromising the core thesis.
Available
- GitHub repository → github.com/jteixeira/alphastream
- Strategy documentation
- Backtesting methodology notes