Machine Learning Estimators
===========================

LRDBenchmark provides production-ready machine learning estimators for Long-Range Dependence (LRD) estimation. In the benchmark summarized below, all three estimators complete with a 100% success rate, and **Gradient Boosting has the lowest error among the ML estimators** at 0.193 MAE (mean absolute error).

Overview
--------

The machine learning estimators use 50-70 engineered features per model, including:

* **Statistical Features**: Mean, standard deviation, skewness, kurtosis
* **Time Series Features**: Autocorrelation at multiple lags, variance of increments
* **Spectral Features**: Power spectrum analysis, frequency band ratios, spectral slope
* **DFA Features**: Detrended fluctuation analysis with slope calculation
* **Wavelet Features**: Wavelet variance at different scales, wavelet slope
* **R/S Analysis Features**: Rescaled range analysis with slope calculation
* **Additional Features**: Trend analysis, seasonality detection, entropy measures

SVR Estimator
-------------

Support Vector Regression estimator with RBF kernel and comprehensive feature engineering.

.. autoclass:: lrdbenchmark.analysis.machine_learning.svr_estimator_unified.SVREstimator
   :members:
   :undoc-members:
   :show-inheritance:

**Performance**: 0.202 MAE, 100% success rate, 0.009s execution time

**Key Features**:

* RBF kernel with configurable parameters (C, gamma, epsilon)
* 50+ engineered features including spectral and DFA analysis
* Model persistence with save/load functionality
* Robust error handling with fallback to R/S analysis

**Example Usage**:

.. code-block:: python

    from lrdbenchmark import SVREstimator
    import numpy as np

    # Initialize estimator
    svr = SVREstimator(kernel='rbf', C=1.0, gamma='scale')

    # Placeholder training data: 100 series of length 500 with random labels.
    # Replace with series whose Hurst exponents are known.
    X_train = np.random.randn(100, 500)
    y_train = np.random.uniform(0.2, 0.8, 100)

    # Train model
    svr.train(X_train, y_train)

    # Make prediction
    new_data = np.random.randn(1, 500)
    prediction = svr.predict(new_data)

Gradient Boosting Estimator
---------------------------

Gradient Boosting Regressor with comprehensive feature engineering. It has the lowest error of the three ML estimators on this page.

.. autoclass:: lrdbenchmark.analysis.machine_learning.gradient_boosting_estimator_unified.GradientBoostingEstimator
   :members:
   :undoc-members:
   :show-inheritance:

**Performance**: 0.193 MAE (**Best ML**), 100% success rate, 0.013s execution time

**Key Features**:

* Configurable parameters (n_estimators, learning_rate, max_depth)
* 60+ engineered features including advanced spectral and DFA analysis
* Feature importance analysis
* Model persistence with save/load functionality
* Robust error handling with fallback to R/S analysis

**Example Usage**:

.. code-block:: python

    from lrdbenchmark import GradientBoostingEstimator
    import numpy as np

    # Initialize estimator
    gb = GradientBoostingEstimator(n_estimators=50, learning_rate=0.1)

    # Placeholder training data: 100 series of length 500 with random labels.
    # Replace with series whose Hurst exponents are known.
    X_train = np.random.randn(100, 500)
    y_train = np.random.uniform(0.2, 0.8, 100)

    # Train model
    gb.train(X_train, y_train)

    # Make prediction
    new_data = np.random.randn(1, 500)
    prediction = gb.predict(new_data)

    # Get feature importance
    importance = gb.get_feature_importance()

Random Forest Estimator
-----------------------

Random Forest Regressor with comprehensive feature engineering and feature importance analysis.

.. autoclass:: lrdbenchmark.analysis.machine_learning.random_forest_estimator_unified.RandomForestEstimator
   :members:
   :undoc-members:
   :show-inheritance:

**Performance**: 0.202 MAE, 100% success rate, 2.099s execution time

**Key Features**:

* Configurable parameters (n_estimators, max_depth, min_samples_split)
* 70+ engineered features including fractal dimension and approximate entropy
* Feature importance analysis
* Model persistence with save/load functionality
* Robust error handling with fallback to R/S analysis

**Example Usage**:

.. code-block:: python

    from lrdbenchmark import RandomForestEstimator
    import numpy as np

    # Initialize estimator
    rf = RandomForestEstimator(n_estimators=50, max_depth=5)

    # Placeholder training data: 100 series of length 500 with random labels.
    # Replace with series whose Hurst exponents are known.
    X_train = np.random.randn(100, 500)
    y_train = np.random.uniform(0.2, 0.8, 100)

    # Train model
    rf.train(X_train, y_train)

    # Make prediction
    new_data = np.random.randn(1, 500)
    prediction = rf.predict(new_data)

    # Get feature importance
    importance = rf.get_feature_importance()

Neural Network Factory
----------------------

For advanced neural network configuration and training, use the Neural Network Factory, which provides a framework for creating and managing several neural network architectures.

.. note::
   See :doc:`neural_network_factory` for complete documentation of the Neural Network Factory API, including ``NNConfig``, ``NNArchitecture``, and the ``create_all_benchmark_networks`` function.

**Example Usage**:

.. code-block:: python

    from lrdbenchmark import NeuralNetworkFactory
    from lrdbenchmark.analysis.machine_learning.neural_network_factory import NNConfig, NNArchitecture
    import numpy as np

    # Create factory
    factory = NeuralNetworkFactory()

    # Configure network
    config = NNConfig(
        architecture=NNArchitecture.CNN,
        input_length=500,
        hidden_dims=[64, 32],
        learning_rate=0.001,
        epochs=20
    )

    # Create network
    network = factory.create_network(config)

    # Placeholder training data: 100 series of length 500 with random labels.
    # Replace with series whose Hurst exponents are known.
    X_train = np.random.randn(100, 500)
    y_train = np.random.uniform(0.2, 0.8, 100)

    # Train model
    history = network.train_model(X_train, y_train)

    # Make prediction
    new_data = np.random.randn(1, 500)
    prediction = network.predict(new_data)

Performance Comparison
----------------------

==================== ========== ============== ============ ===============
Method               MAE        Execution Time Success Rate Category
==================== ========== ============== ============ ===============
**LSTM**             **0.097**  0.0012s        100%         Neural Networks
**CNN**              **0.103**  0.0064s        100%         Neural Networks
**Transformer**      **0.106**  0.0026s        100%         Neural Networks
**GRU**              **0.108**  0.0007s        100%         Neural Networks
**R/S**              **0.099**  0.348s         100%         Classical
**GradientBoosting** **0.193**  0.013s         100%         ML
**SVR**              **0.202**  0.009s         100%         ML
**Whittle**          0.200      0.0002s        100%         Classical
**Periodogram**      0.205      0.0005s        100%         Classical
**CWT**              0.269      0.063s         100%         Classical
==================== ========== ============== ============ ===============

Key Advantages
--------------

* **Consistent Performance**: 0.193-0.202 MAE with a 100% success rate across the three ML estimators
* **Advanced Feature Engineering**: 50-70 engineered features per model
* **Production Ready**: Model persistence, error handling, and deployment capabilities
* **Comprehensive Testing**: 100% success rate across all test cases
* **Research Quality**: Publication-ready results with detailed performance metrics

Best Practices
--------------

1. **For Highest Accuracy**: Use the LSTM neural network (0.097 MAE)
2. **For Fast ML Performance**: Use SVR (0.009s execution time)
3. **For Feature Analysis**: Use Random Forest (feature importance available)
4. **For Production Deployment**: Use the Production ML System with its train-once, apply-many workflow
5. **For Real-time Applications**: Use the GRU neural network (0.0007s execution time)

See Also
--------

* :doc:`../examples/comprehensive_adaptive_demo` - Complete usage examples
* :doc:`../research/theory` - Theoretical foundations
* :doc:`../research/validation` - Validation methodology
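
Evaluation Metrics Sketch
-------------------------

The MAE, success rate, and execution time figures quoted on this page can be computed for any estimator with a small harness. The following is a minimal sketch, not the benchmark's actual evaluation code: ``evaluate_estimator`` and its result keys are hypothetical names, and treating a raised exception or a non-finite result as a failed run is an assumption. It uses only the standard library.

.. code-block:: python

    import math
    import time


    def evaluate_estimator(estimate, series_list, true_values):
        """Return MAE, success rate, and mean run time for one estimator.

        `estimate` is any callable mapping a sequence of floats to a Hurst
        exponent estimate. Hypothetical helper, not part of LRDBenchmark.
        """
        abs_errors = []
        run_times = []
        for series, h_true in zip(series_list, true_values):
            start = time.perf_counter()
            try:
                h_hat = float(estimate(series))
            except Exception:
                continue  # failed run: lowers the success rate, excluded from MAE
            duration = time.perf_counter() - start
            if math.isfinite(h_hat):
                abs_errors.append(abs(h_hat - h_true))
                run_times.append(duration)

        n_total = len(series_list)
        n_ok = len(abs_errors)
        return {
            "mae": sum(abs_errors) / n_ok if n_ok else float("nan"),
            "success_rate": n_ok / n_total if n_total else float("nan"),
            "mean_time_s": sum(run_times) / n_ok if n_ok else float("nan"),
        }


    # Usage: a trivial estimator that always answers 0.5.
    result = evaluate_estimator(lambda s: 0.5, [[0.0, 1.0]] * 3, [0.4, 0.5, 0.7])

Any trained estimator can be plugged in by wrapping its ``predict`` call in a function that accepts a single series and returns one number.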
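
R/S Fallback Sketch
-------------------

Each ML estimator above lists "fallback to R/S analysis" as its error-handling strategy. How LRDBenchmark implements that fallback is not shown on this page, so the following is only an illustrative sketch under stated assumptions: ``rs_hurst`` is a textbook rescaled-range estimate over dyadic window sizes, and ``estimate_with_fallback`` is a hypothetical wrapper that uses it when the primary model raises or returns a non-finite value. It uses only the standard library.

.. code-block:: python

    import math
    import random


    def rs_hurst(series, min_window=8):
        """Estimate the Hurst exponent by classical rescaled-range analysis."""
        n = len(series)
        log_sizes, log_rs = [], []
        window = min_window
        while window <= n // 2:
            ratios = []
            for start in range(0, n - window + 1, window):
                chunk = series[start:start + window]
                mean = sum(chunk) / window
                # Track the range of the cumulative mean-adjusted sums.
                cum = lo = hi = sq = 0.0
                for x in chunk:
                    dev = x - mean
                    cum += dev
                    lo, hi = min(lo, cum), max(hi, cum)
                    sq += dev * dev
                std = math.sqrt(sq / window)
                if std > 0:
                    ratios.append((hi - lo) / std)
            if ratios:
                log_sizes.append(math.log(window))
                log_rs.append(math.log(sum(ratios) / len(ratios)))
            window *= 2
        if len(log_sizes) < 2:
            raise ValueError("series too short for R/S analysis")
        # Least-squares slope of log(R/S) against log(window size).
        mx = sum(log_sizes) / len(log_sizes)
        my = sum(log_rs) / len(log_rs)
        sxx = sum((x - mx) ** 2 for x in log_sizes)
        sxy = sum((x - mx) * (y - my) for x, y in zip(log_sizes, log_rs))
        return sxy / sxx


    def estimate_with_fallback(primary, series):
        """Try `primary`; fall back to R/S if it raises or is non-finite."""
        try:
            h = float(primary(series))
            if math.isfinite(h):
                return h, "primary"
        except Exception:
            pass
        return rs_hurst(series), "rs_fallback"


    # Usage: a model that fails, so the R/S estimate is returned instead.
    def broken_model(series):
        raise RuntimeError("model not trained")

    rng = random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(2048)]
    h, source = estimate_with_fallback(broken_model, noise)

Returning the source label alongside the estimate lets callers tell apart results that came from the trained model and results that came from the fallback.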