Estimators API ============= lrdbenchmark provides a comprehensive suite of 20 estimators for detecting and quantifying long-range dependence in time series data. Base Estimator ------------- .. autoclass:: lrdbenchmark.analysis.base_estimator.BaseEstimator :members: :undoc-members: :show-inheritance: Temporal Estimators ------------------ Detrended Fluctuation Analysis (DFA) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. autoclass:: lrdbenchmark.analysis.temporal.dfa_estimator.DFAEstimator :members: :undoc-members: :show-inheritance: .. automethod:: __init__ .. automethod:: estimate Detrended Moving Average (DMA) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. autoclass:: lrdbenchmark.analysis.temporal.dma_estimator.DMAEstimator :members: :undoc-members: :show-inheritance: .. automethod:: __init__ .. automethod:: estimate Higuchi Method ~~~~~~~~~~~~~ .. autoclass:: lrdbenchmark.analysis.temporal.higuchi_estimator.HiguchiEstimator :members: :undoc-members: :show-inheritance: .. automethod:: __init__ .. automethod:: estimate Generalised Hurst Exponent (GHE) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. autoclass:: lrdbenchmark.analysis.temporal.ghe_estimator.GHEEstimator :members: :undoc-members: :show-inheritance: .. automethod:: __init__ .. automethod:: estimate R/S Analysis ~~~~~~~~~~~ .. autoclass:: lrdbenchmark.analysis.temporal.rs_estimator.RSEstimator :members: :undoc-members: :show-inheritance: .. automethod:: __init__ .. automethod:: estimate Spectral Estimators ------------------ Geweke-Porter-Hudak (GPH) ~~~~~~~~~~~~~~~~~~~~~~~~~ .. autoclass:: lrdbenchmark.analysis.spectral.gph_estimator.GPHEstimator :members: :undoc-members: :show-inheritance: .. automethod:: __init__ .. automethod:: estimate Periodogram ~~~~~~~~~~ .. autoclass:: lrdbenchmark.analysis.spectral.periodogram_estimator.PeriodogramEstimator :members: :undoc-members: :show-inheritance: .. automethod:: __init__ .. automethod:: estimate Whittle Estimator ~~~~~~~~~~~~~~~~ .. autoclass:: lrdbenchmark.analysis.spectral.whittle_estimator.WhittleEstimator :members: :undoc-members: :show-inheritance: .. automethod:: __init__ .. automethod:: estimate Wavelet Estimators ----------------- Continuous Wavelet Transform (CWT) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. autoclass:: lrdbenchmark.analysis.wavelet.cwt_estimator.CWTEstimator :members: :undoc-members: :show-inheritance: .. automethod:: __init__ .. automethod:: estimate Wavelet Variance ~~~~~~~~~~~~~~~ .. autoclass:: lrdbenchmark.analysis.wavelet.variance_estimator.WaveletVarianceEstimator :members: :undoc-members: :show-inheritance: .. automethod:: __init__ .. automethod:: estimate Wavelet Log-Variance ~~~~~~~~~~~~~~~~~~~~ .. autoclass:: lrdbenchmark.analysis.wavelet.log_variance_estimator.WaveletLogVarianceEstimator :members: :undoc-members: :show-inheritance: .. automethod:: __init__ .. automethod:: estimate Wavelet Whittle ~~~~~~~~~~~~~~~ .. autoclass:: lrdbenchmark.analysis.wavelet.whittle_estimator.WaveletWhittleEstimator :members: :undoc-members: :show-inheritance: .. automethod:: __init__ .. automethod:: estimate Multifractal Estimators ----------------------- Multifractal Detrended Fluctuation Analysis (MFDFA) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. autoclass:: lrdbenchmark.analysis.multifractal.mfdfa_estimator.MFDFAEstimator :members: :undoc-members: :show-inheritance: .. automethod:: __init__ .. automethod:: estimate Multifractal Wavelet Leaders ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. autoclass:: lrdbenchmark.analysis.multifractal.wavelet_leaders_estimator.MultifractalWaveletLeadersEstimator :members: :undoc-members: :show-inheritance: .. automethod:: __init__ .. automethod:: estimate Machine Learning Estimators --------------------------- For detailed documentation of machine learning estimators, see :doc:`machine_learning_estimators`. The following ML estimators are available: * **Random Forest**: Ensemble tree-based estimation with feature importance * **Support Vector Regression**: SVM-based estimation with RBF kernel * **Gradient Boosting**: Boosted tree estimation with comprehensive feature engineering See the dedicated :doc:`machine_learning_estimators` page for complete API documentation, performance metrics, and usage examples. Neural Network Estimators ------------------------- For detailed documentation of neural network estimators, see :doc:`neural_network_factory`. The following neural network architectures are available: * **CNN**: Convolutional Neural Networks for spatial pattern recognition * **LSTM**: Long Short-Term Memory networks for temporal sequences * **GRU**: Gated Recurrent Units for efficient temporal modeling * **Transformer**: Attention-based architectures for complex patterns See the dedicated :doc:`neural_network_factory` page for complete API documentation, architecture details, and usage examples. High Performance Estimators --------------------------- .. note:: The unified estimators automatically select optimal computation frameworks (JAX, Numba, or NumPy) based on data characteristics and hardware availability. For advanced users, the high-performance modules in ``lrdbenchmark.analysis.high_performance.jax`` and ``lrdbenchmark.analysis.high_performance.numba`` provide direct access to optimized implementations (e.g., ``DFAEstimatorJAX``, ``DFAEstimatorNumba``), but the unified estimators are recommended for most use cases as they automatically select the best framework based on data characteristics and hardware availability. Backend Modules (Strategy Pattern) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The following backend modules provide modular implementations for each estimator family, enabling JAX GPU acceleration or Numba JIT compilation: **Temporal Backends** * ``lrdbenchmark.analysis.temporal.dfa_backends``: NumPy, JAX, Numba implementations for DFA. * ``lrdbenchmark.analysis.temporal.rs_backends``: NumPy, JAX, Numba implementations for R/S. **Spectral Backends** * ``lrdbenchmark.analysis.spectral.spectral_backends``: NumPy, JAX implementations for Periodogram, Welch, Whittle. **Wavelet Backends** * ``lrdbenchmark.analysis.wavelet.wavelet_backends``: NumPy, JAX implementations for DWT-based variance estimators. **Multifractal Backends** * ``lrdbenchmark.analysis.multifractal.mfdfa_backends``: NumPy, JAX implementations for MFDFA. * ``lrdbenchmark.analysis.multifractal.wavelet_leaders_backends``: NumPy, JAX implementations for Wavelet Leaders. **Usage Example (Backend Selection)** .. code-block:: python from lrdbenchmark import DFAEstimator # Auto-select best backend (default) estimator = DFAEstimator(use_optimization='auto') # Force JAX for GPU acceleration estimator_jax = DFAEstimator(use_optimization='jax') # Force NumPy for maximum compatibility estimator_numpy = DFAEstimator(use_optimization='numpy') Usage Examples -------------- Basic Estimator Usage ~~~~~~~~~~~~~~~~~~~~~ .. code-block:: python from lrdbenchmark import DFAEstimator from lrdbenchmark import GPHEstimator from lrdbenchmark import FBMModel # Generate test data with known Hurst parameter model = FBMModel(H=0.7, sigma=1.0) data = model.generate(1000, seed=42) print(f"Generated FBM data with true H = 0.7") print(f"Data length: {len(data)}") print(f"Data mean: {data.mean():.3f}, std: {data.std():.3f}") # Use DFA estimator dfa = DFAEstimator() H_dfa = dfa.estimate(data) print(f"DFA H estimate: {H_dfa:.3f}") print(f"DFA error: {abs(H_dfa - 0.7):.3f}") # Use GPH estimator gph = GPHEstimator() H_gph = gph.estimate(data) print(f"GPH H estimate: {H_gph:.3f}") print(f"GPH error: {abs(H_gph - 0.7):.3f}") # Compare estimates print(f"\nEstimate comparison:") print(f"True H: 0.700") print(f"DFA: {H_dfa:.3f} (error: {abs(H_dfa - 0.7):.3f})") print(f"GPH: {H_gph:.3f} (error: {abs(H_gph - 0.7):.3f})") Multiple Estimators Comparison ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: python from lrdbenchmark import DFAEstimator from lrdbenchmark import RSEstimator from lrdbenchmark import GPHEstimator from lrdbenchmark import WaveletVarianceEstimator from lrdbenchmark import HiguchiEstimator from lrdbenchmark import FBMModel, FGNModel import pandas as pd # Define estimators to test estimators = { 'DFA': DFAEstimator(), 'R/S': RSEstimator(), 'GPH': GPHEstimator(), 'Wavelet Variance': WaveletVarianceEstimator(), 'Higuchi': HiguchiEstimator() } # Test on different data models test_cases = { 'FBM (H=0.7)': FBMModel(H=0.7, sigma=1.0), 'FBM (H=0.3)': FBMModel(H=0.3, sigma=1.0), 'FGN (H=0.8)': FGNModel(H=0.8, sigma=1.0) } # Store results all_results = [] for case_name, model in test_cases.items(): print(f"\n=== Testing {case_name} ===") data = model.generate(1000, seed=42) true_H = model.H case_results = {'Case': case_name, 'True_H': true_H} for name, estimator in estimators.items(): try: H_est = estimator.estimate(data) error = abs(H_est - true_H) case_results[name] = H_est case_results[f'{name}_error'] = error print(f" {name}: H = {H_est:.3f} (error: {error:.3f})") except Exception as e: print(f" {name}: Error - {e}") case_results[name] = None case_results[f'{name}_error'] = None all_results.append(case_results) # Create summary DataFrame df = pd.DataFrame(all_results) print(f"\n=== SUMMARY ===") print(df.round(3)) # Calculate average errors error_columns = [col for col in df.columns if col.endswith('_error')] avg_errors = df[error_columns].mean() print(f"\n=== AVERAGE ERRORS ===") for col in error_columns: estimator = col.replace('_error', '') print(f"{estimator}: {avg_errors[col]:.3f}") Machine Learning Estimators ~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: python from lrdbenchmark import RandomForestEstimator from lrdbenchmark import GradientBoostingEstimator from lrdbenchmark import CNNEstimator from lrdbenchmark import FBMModel, FGNModel, ARFIMAModel import numpy as np from sklearn.model_selection import train_test_split # Generate comprehensive training dataset print("Generating training dataset...") training_data = [] training_labels = [] # Create diverse training data H_values = np.linspace(0.3, 0.9, 15) # 15 different H values models = { 'FBM': FBMModel, 'FGN': FGNModel, 'ARFIMA': lambda H: ARFIMAModel(d=H-0.5, p=1, q=1) } for H in H_values: for model_name, model_class in models.items(): if model_name == 'ARFIMA': model = model_class(H) else: model = model_class(H=H, sigma=1.0) # Generate multiple realizations for i in range(20): data = model.generate(1000, seed=int(H*1000 + i)) training_data.append(data) training_labels.append(H) print(f"Generated {len(training_data)} training samples") print(f"H range: {min(training_labels):.1f} to {max(training_labels):.1f}") # Split into training and validation sets X_train, X_val, y_train, y_val = train_test_split( training_data, training_labels, test_size=0.2, random_state=42 ) # Train Random Forest estimator print("\nTraining Random Forest estimator...") rf_estimator = RandomForestEstimator( n_estimators=100, max_depth=10, random_state=42 ) rf_estimator.fit(X_train, y_train) # Train Gradient Boosting estimator print("Training Gradient Boosting estimator...") gb_estimator = GradientBoostingEstimator( n_estimators=100, learning_rate=0.1, max_depth=5, random_state=42 ) gb_estimator.fit(X_train, y_train) # Evaluate on validation set print("\n=== Validation Results ===") rf_val_pred = rf_estimator.estimate(X_val) gb_val_pred = gb_estimator.estimate(X_val) rf_mae = np.mean(np.abs(np.array(rf_val_pred) - np.array(y_val))) gb_mae = np.mean(np.abs(np.array(gb_val_pred) - np.array(y_val))) print(f"Random Forest MAE: {rf_mae:.3f}") print(f"Gradient Boosting MAE: {gb_mae:.3f}") # Test on new data print("\n=== Test on New Data ===") test_cases = [ ('FBM (H=0.6)', FBMModel(H=0.6, sigma=1.0)), ('FGN (H=0.4)', FGNModel(H=0.4, sigma=1.0)), ('ARFIMA (H=0.75)', ARFIMAModel(d=0.25, p=1, q=1)) ] for test_name, test_model in test_cases: test_data = test_model.generate(1000, seed=999) H_rf = rf_estimator.estimate([test_data])[0] H_gb = gb_estimator.estimate([test_data])[0] if 'H=' in test_name: true_H = float(test_name.split('H=')[1].split(')')[0]) else: true_H = 0.75 # For ARFIMA print(f"{test_name}:") print(f" True H: {true_H:.3f}") print(f" RF estimate: {H_rf:.3f} (error: {abs(H_rf - true_H):.3f})") print(f" GB estimate: {H_gb:.3f} (error: {abs(H_gb - true_H):.3f})") High Performance Estimators ~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. code-block:: python from lrdbenchmark import DFAEstimator, RSEstimator, FBMModel import numpy as np # Generate large dataset model = FBMModel(H=0.7, sigma=1.0) data = model.generate(10000, seed=42) # Unified estimators automatically use optimal backend (JAX/Numba/NumPy) dfa = DFAEstimator(use_optimization='auto') # Auto-selects best framework result = dfa.estimate(data) print(f"DFA H estimate: {result['hurst_parameter']:.3f}") # R/S estimator with automatic optimization rs = RSEstimator(use_optimization='auto') result = rs.estimate(data) print(f"R/S H estimate: {result['hurst_parameter']:.3f}") Parameter Tuning ~~~~~~~~~~~~~~~~ .. code-block:: python from lrdbenchmark import DFAEstimator from lrdbenchmark import GPHEstimator from lrdbenchmark import FBMModel # Generate test data model = FBMModel(H=0.7, sigma=1.0) data = model.generate(1000, seed=42) # DFA with custom parameters dfa = DFAEstimator( min_scale=4, max_scale=100, num_scales=20, polynomial_order=2 ) H_dfa = dfa.estimate(data) # GPH with custom parameters gph = GPHEstimator( num_frequencies=50, min_frequency=0.01, max_frequency=0.5 ) H_gph = gph.estimate(data) print(f"DFA (custom): H = {H_dfa:.3f}") print(f"GPH (custom): H = {H_gph:.3f}") # Note: All estimators are documented above in their respective sections # No duplicate documentation needed Error Handling -------------- .. code-block:: python from lrdbenchmark import DFAEstimator from lrdbenchmark import GPHEstimator # Test with insufficient data short_data = [1, 2, 3, 4, 5] # Too short for most estimators dfa = DFAEstimator() try: H_dfa = dfa.estimate(short_data) print(f"DFA H estimate: {H_dfa:.3f}") except ValueError as e: print(f"DFA error: {e}") gph = GPHEstimator() try: H_gph = gph.estimate(short_data) print(f"GPH H estimate: {H_gph:.3f}") except ValueError as e: print(f"GPH error: {e}") Performance Comparison ---------------------- .. code-block:: python import time from lrdbenchmark import DFAEstimator, FBMModel # Generate test data model = FBMModel(H=0.7, sigma=1.0) data = model.generate(5000, seed=42) # DFA with automatic optimization (will use JAX/Numba if available) dfa = DFAEstimator(use_optimization='auto') start_time = time.time() result = dfa.estimate(data) dfa_time = time.time() - start_time # DFA with explicit NumPy backend (no optimization) dfa_numpy = DFAEstimator(use_optimization='numpy') start_time = time.time() result_numpy = dfa_numpy.estimate(data) dfa_numpy_time = time.time() - start_time print(f"Optimized DFA: H = {result['hurst_parameter']:.3f}, Time = {dfa_time:.4f}s") print(f"NumPy DFA: H = {result_numpy['hurst_parameter']:.3f}, Time = {dfa_numpy_time:.4f}s") if dfa_numpy_time > 0: print(f"Speedup: {dfa_numpy_time/dfa_time:.2f}x") Best Practices -------------- 1. **Data Length**: Use at least 1000 samples for reliable estimates 2. **Parameter Selection**: Choose appropriate scale ranges for your data 3. **Multiple Estimators**: Compare results from different estimator types 4. **Error Handling**: Always handle potential estimation errors 5. **Performance**: Use high-performance estimators for large datasets 6. **Validation**: Test on synthetic data with known Hurst parameters .. note:: Different estimators may give slightly different results due to their underlying assumptions and methodologies. It's recommended to use multiple estimators and compare their results. .. warning:: Some estimators require specific data characteristics (e.g., stationarity for spectral methods). Always check the estimator documentation for requirements and limitations.