Quick Start Guide

This guide will get you up and running with lrdbenchmark in minutes.

📓 For comprehensive examples, see the 5 demonstration notebooks under ``notebooks/markdown/``:

- Data Generation & Visualization: All stochastic models with comprehensive plots
- Estimation & Validation: All estimator categories with statistical validation
- Custom Models & Estimators: Library extensibility and custom implementations
- Comprehensive Benchmarking: Full benchmarking system with contamination testing
- Leaderboard Generation: Performance rankings and comparative analysis

Basic Usage

Generate synthetic data and run a benchmark:

import numpy as np
from lrdbenchmark import FBMModel, RSEstimator

# Generate Fractional Brownian Motion data
model = FBMModel(H=0.7, sigma=1.0)
data = model.generate(length=1000, seed=42)

# Estimate Hurst parameter using R/S analysis
rs_estimator = RSEstimator()
result = rs_estimator.estimate(data)
hurst_estimate = result["hurst_parameter"]

print(f"True Hurst: 0.7, Estimated: {hurst_estimate:.3f}")
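
For readers who want to see what rescaled-range (R/S) analysis computes, the following is a minimal NumPy sketch of the textbook procedure. It is not the implementation behind ``RSEstimator``: the function name ``rs_hurst``, the doubling window sizes, and the ``min_window`` default are all assumptions made for illustration.

```python
import numpy as np


def rs_hurst(x, min_window=16):
    """Estimate H as the slope of log(R/S) against log(window size)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    log_sizes, log_rs = [], []
    window = min_window
    while window <= n // 2:
        n_chunks = n // window
        chunks = x[: n_chunks * window].reshape(n_chunks, window)
        # Cumulative deviations of each chunk from its own mean
        deviations = np.cumsum(chunks - chunks.mean(axis=1, keepdims=True), axis=1)
        ranges = deviations.max(axis=1) - deviations.min(axis=1)
        scales = chunks.std(axis=1)
        valid = scales > 0  # skip constant chunks to avoid division by zero
        if valid.any():
            log_sizes.append(np.log(window))
            log_rs.append(np.log(np.mean(ranges[valid] / scales[valid])))
        window *= 2  # hypothetical choice: dyadic window sizes
    slope, _intercept = np.polyfit(log_sizes, log_rs, 1)
    return float(slope)


rng = np.random.default_rng(42)
noise = rng.standard_normal(4096)
h_noise = rs_hurst(noise)             # uncorrelated series
h_walk = rs_hurst(np.cumsum(noise))   # integrated series scales more steeply
print(f"R/S slope, white noise: {h_noise:.3f}")
print(f"R/S slope, random walk: {h_walk:.3f}")
```

On short series the plain R/S slope is known to be biased upward for uncorrelated data, so estimates somewhat above 0.5 for white noise are expected with this sketch.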

Neural Network Usage

lrdbenchmark provides a neural network factory with 4 architectures aimed at a good speed-accuracy trade-off. An architecture is selected through ``NNArchitecture`` and configured through ``NNConfig``:

from lrdbenchmark import NeuralNetworkFactory
from lrdbenchmark.analysis.machine_learning.neural_network_factory import NNArchitecture, NNConfig, create_all_benchmark_networks
import numpy as np

# Create neural network factory
factory = NeuralNetworkFactory()

# Create a specific network
config = NNConfig(
    architecture=NNArchitecture.TRANSFORMER,
    input_length=500,
    hidden_dims=[64, 32],
    learning_rate=0.001,
    epochs=50
)
network = factory.create_network(config)

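# Placeholder training data: white noise with random labels, used here only to
# show the API. For meaningful training, generate series with known H instead.
X_train = np.random.randn(100, 500)  # 100 samples of length 500
y_train = np.random.uniform(0.2, 0.8, 100)  # Target Hurst parameters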

# Train the network (train-once, apply-many workflow)
history = network.train_model(X_train, y_train)

# Make predictions on new data
new_data = np.random.randn(1, 500)
prediction = network.predict(new_data)

print(f"Neural Network Prediction: {prediction[0]:.3f}")

# Create all benchmark networks
all_networks = create_all_benchmark_networks(input_length=500)
for name, network in all_networks.items():
    print(f"Created {name} network")
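
The training arrays in the example are random placeholders. A labelled training set needs series whose Hurst parameter is actually known. One way to build such a set, sketched here with plain NumPy, is exact fractional Gaussian noise sampling through a Cholesky factor of the fGn autocovariance. The helper names ``fgn_cholesky`` and ``make_training_set`` are hypothetical, and in practice the library's own data models could be used to generate the series instead.

```python
import numpy as np


def fgn_cholesky(n, hurst, rng):
    """Draw one unit-variance fGn sample of length n (exact, O(n^3) method)."""
    k = np.arange(n)
    # fGn autocovariance: 0.5 * (|k+1|^2H - 2|k|^2H + |k-1|^2H)
    gamma = 0.5 * (
        np.abs(k + 1) ** (2 * hurst)
        - 2 * np.abs(k) ** (2 * hurst)
        + np.abs(k - 1) ** (2 * hurst)
    )
    lags = np.abs(k[:, None] - k[None, :])
    cov = gamma[lags]                      # Toeplitz covariance matrix
    factor = np.linalg.cholesky(cov)
    return factor @ rng.standard_normal(n)


def make_training_set(n_samples, length, h_low=0.2, h_high=0.8, seed=0):
    """Return (X, y): fGn series and the H value each one was generated with."""
    rng = np.random.default_rng(seed)
    y = rng.uniform(h_low, h_high, n_samples)
    X = np.stack([fgn_cholesky(length, h, rng) for h in y])
    return X, y


X_train, y_train = make_training_set(n_samples=8, length=128)
print(X_train.shape, y_train.shape)
```

The Cholesky approach is simple and exact but slow for long series; it is chosen here only because it fits in a few lines.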

Machine Learning Usage

lrdbenchmark provides production-ready machine learning estimators:

from lrdbenchmark import SVREstimator, GradientBoostingEstimator, RandomForestEstimator
import numpy as np

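# Placeholder training data: white noise with random labels, used here only to
# show the API. For meaningful training, generate series with known H instead.
X_train = np.random.randn(100, 500)  # 100 samples of length 500
y_train = np.random.uniform(0.2, 0.8, 100)  # Target Hurst parameters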

# Train ML models
svr = SVREstimator(kernel='rbf', C=1.0)
svr.train(X_train, y_train)

gb = GradientBoostingEstimator(n_estimators=50, learning_rate=0.1)
gb.train(X_train, y_train)

rf = RandomForestEstimator(n_estimators=50, max_depth=5)
rf.train(X_train, y_train)

# Make predictions on new data
new_data = np.random.randn(1, 500)
svr_pred = svr.predict(new_data)
gb_pred = gb.predict(new_data)
rf_pred = rf.predict(new_data)

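# predict() receives a batch of one series; take the first element before formatting
print(f"SVR: {np.ravel(svr_pred)[0]:.3f}, Gradient Boosting: {np.ravel(gb_pred)[0]:.3f}, Random Forest: {np.ravel(rf_pred)[0]:.3f}")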

Advanced Neural Network Usage

For production deployment with neural networks, use the Neural Network Factory:

from lrdbenchmark import NeuralNetworkFactory, FBMModel
from lrdbenchmark.analysis.machine_learning.neural_network_factory import NNConfig, NNArchitecture
import numpy as np

# Create factory
factory = NeuralNetworkFactory()

# Configure network
config = NNConfig(
    architecture=NNArchitecture.CNN,
    input_length=500,
    hidden_dims=[64, 32],
    learning_rate=0.001,
    epochs=20
)

# Create and train network
network = factory.create_network(config)
X_train = np.random.randn(100, 500)
y_train = np.random.uniform(0.2, 0.8, 100)
history = network.train_model(X_train, y_train)

# Make prediction
new_data = np.random.randn(1, 500)
prediction = network.predict(new_data)

print(f"CNN Prediction: {prediction[0]:.3f}")

Data Models

lrdbenchmark provides several synthetic data models:

from lrdbenchmark import FBMModel, FGNModel, ARFIMAModel, MRWModel

# Fractional Brownian Motion
fbm = FBMModel(H=0.7, sigma=1.0)
fbm_data = fbm.generate(1000)

# Fractional Gaussian Noise
fgn = FGNModel(H=0.6, sigma=1.0)
fgn_data = fgn.generate(1000)

# ARFIMA process
arfima = ARFIMAModel(d=0.3, sigma=1.0)
arfima_data = arfima.generate(1000)

# Multifractal Random Walk
mrw = MRWModel(H=0.7, lambda_param=0.1, sigma=1.0)
mrw_data = mrw.generate(1000)
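
The ARFIMA model is parameterised by the fractional differencing order ``d`` rather than ``H``. For a stationary ARFIMA(0, d, 0) process the two are related by H = d + 0.5, so ``d=0.3`` corresponds to H = 0.8. The sketch below illustrates this with a truncated moving-average representation. It is not how ``ARFIMAModel`` is implemented; the function names and the ``burn_in`` parameter are assumptions.

```python
import numpy as np


def hurst_from_d(d):
    """Hurst parameter implied by the fractional differencing order d."""
    return d + 0.5


def arfima_ma_weights(d, n_weights):
    """MA weights of (1 - B)^(-d): psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k."""
    psi = np.empty(n_weights)
    psi[0] = 1.0
    for k in range(1, n_weights):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return psi


def arfima_0d0(n, d, sigma=1.0, seed=None, burn_in=1000):
    """Approximate ARFIMA(0, d, 0) sample via a truncated MA(infinity) filter."""
    rng = np.random.default_rng(seed)
    total = n + burn_in
    eps = sigma * rng.standard_normal(total)
    psi = arfima_ma_weights(d, total)
    # Causal filter: sample t is the sum of psi_k * eps_{t-k} for k <= t
    filtered = np.convolve(eps, psi)[:total]
    # Drop the burn-in so every kept sample sees at least `burn_in` past shocks
    return filtered[burn_in:]


series = arfima_0d0(1000, d=0.3, seed=0)
print(f"d = 0.3 corresponds to H = {hurst_from_d(0.3):.1f}")
print(arfima_ma_weights(0.3, 4))
```

The weights decay hyperbolically rather than exponentially, which is what produces long-range dependence for 0 < d < 0.5.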

Individual Estimators

Use specific estimators directly (``data`` is the series generated under Basic Usage):

from lrdbenchmark import DFAEstimator, GPHEstimator

# Detrended Fluctuation Analysis
dfa = DFAEstimator()
dfa_result = dfa.estimate(data)
H_dfa = dfa_result["hurst_parameter"]

# Geweke-Porter-Hudak estimator
gph = GPHEstimator()
gph_result = gph.estimate(data)
H_gph = gph_result["hurst_parameter"]

print(f"DFA H estimate: {H_dfa:.3f}")
print(f"GPH H estimate: {H_gph:.3f}")
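
As a rough picture of what detrended fluctuation analysis does, here is a minimal first-order DFA sketch in NumPy. It is an illustration under stated assumptions (dyadic window sizes, linear detrending, the name ``dfa_exponent``) and makes no claim about how ``DFAEstimator`` is implemented. For a noise-like input the scaling exponent approximates H; for an integrated path it approximates H + 1.

```python
import numpy as np


def dfa_exponent(x, min_window=16):
    """Slope of log F(n) against log n for first-order (linear) DFA."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())  # integrate the mean-centred series
    n = len(profile)
    log_sizes, log_fluct = [], []
    window = min_window
    while window <= n // 4:
        n_chunks = n // window
        chunks = profile[: n_chunks * window].reshape(n_chunks, window)
        t = np.arange(window)
        # Fit a straight line to every chunk at once (one chunk per column)
        coeffs = np.polyfit(t, chunks.T, 1)            # shape (2, n_chunks)
        trend = np.outer(t, coeffs[0]) + coeffs[1]     # shape (window, n_chunks)
        residual = chunks.T - trend
        log_sizes.append(np.log(window))
        # F(n) is the root-mean-square residual, so log F = 0.5 * log(mean square)
        log_fluct.append(0.5 * np.log(np.mean(residual ** 2)))
        window *= 2
    slope, _intercept = np.polyfit(log_sizes, log_fluct, 1)
    return float(slope)


rng = np.random.default_rng(7)
noise = rng.standard_normal(4096)
alpha_noise = dfa_exponent(noise)
alpha_walk = dfa_exponent(np.cumsum(noise))
print(f"DFA exponent, white noise: {alpha_noise:.3f}")
print(f"DFA exponent, random walk: {alpha_walk:.3f}")
```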

Analytics System

Track usage and performance:

from lrdbenchmark import FBMModel, RSEstimator

# Generate data and run analysis
model = FBMModel(H=0.7)
data = model.generate(1000)

# Estimate Hurst parameter
rs_estimator = RSEstimator()
result = rs_estimator.estimate(data)
hurst_estimate = result["hurst_parameter"]

print(f"Hurst estimate: {hurst_estimate:.3f}")

Enhanced ML and Neural Network Estimators

Use the new enhanced estimators with pre-trained models (``data`` is again a generated series, as in Basic Usage):

from lrdbenchmark import (
    CNNEstimator, LSTMEstimator, GRUEstimator, TransformerEstimator,
    RandomForestEstimator, SVREstimator, GradientBoostingEstimator
)

# Enhanced CNN with residual connections and attention
cnn = CNNEstimator()
cnn_result = cnn.estimate(data)
H_cnn = cnn_result["hurst_parameter"]

# Enhanced LSTM with bidirectional architecture
lstm = LSTMEstimator()
lstm_result = lstm.estimate(data)
H_lstm = lstm_result["hurst_parameter"]

# Enhanced GRU with attention mechanisms
gru = GRUEstimator()
gru_result = gru.estimate(data)
H_gru = gru_result["hurst_parameter"]

# Enhanced Transformer with self-attention
transformer = TransformerEstimator()
transformer_result = transformer.estimate(data)
H_transformer = transformer_result["hurst_parameter"]

# Traditional ML estimators
rf = RandomForestEstimator()
rf_result = rf.estimate(data)
H_rf = rf_result["hurst_parameter"]

svr = SVREstimator()
svr_result = svr.estimate(data)
H_svr = svr_result["hurst_parameter"]

gb = GradientBoostingEstimator()
gb_result = gb.estimate(data)
H_gb = gb_result["hurst_parameter"]

print(f"CNN H estimate: {H_cnn:.3f}")
print(f"LSTM H estimate: {H_lstm:.3f}")
print(f"GRU H estimate: {H_gru:.3f}")
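print(f"Transformer H estimate: {H_transformer:.3f}")
print(f"Random Forest H estimate: {H_rf:.3f}")
print(f"SVR H estimate: {H_svr:.3f}")
print(f"Gradient Boosting H estimate: {H_gb:.3f}")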

Advanced Usage

Compare several estimators on the same series:

from lrdbenchmark import FBMModel, RSEstimator, DFAEstimator

# Generate data
model = FBMModel(H=0.7)
data = model.generate(2000)

# Test multiple estimators
rs_estimator = RSEstimator()
dfa_estimator = DFAEstimator()

rs_result = rs_estimator.estimate(data)
dfa_result = dfa_estimator.estimate(data)

print(f"R/S estimate: {rs_result['hurst_parameter']:.3f}")
print(f"DFA estimate: {dfa_result['hurst_parameter']:.3f}")
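
A single series gives only one estimate per method. To compare estimators more fairly, the same comparison can be repeated over several seeds and summarised by bias and RMSE. The harness below is a generic sketch: ``run_trials`` and ``summarize_errors`` are hypothetical helpers, not part of lrdbenchmark, and the generator and estimator in the demo are stand-ins so that the block runs on its own.

```python
import numpy as np


def summarize_errors(estimates, true_h):
    """Bias, spread, and RMSE of a list of Hurst estimates against the true value."""
    estimates = np.asarray(estimates, dtype=float)
    errors = estimates - true_h
    return {
        "bias": float(errors.mean()),
        "std": float(estimates.std(ddof=1)) if len(estimates) > 1 else 0.0,
        "rmse": float(np.sqrt(np.mean(errors ** 2))),
    }


def run_trials(generate, estimators, true_h, n_trials=20):
    """generate(seed) -> 1-D array; estimators maps name -> callable(series) -> float."""
    collected = {name: [] for name in estimators}
    for seed in range(n_trials):
        series = generate(seed)
        for name, estimate in estimators.items():
            collected[name].append(estimate(series))
    return {name: summarize_errors(vals, true_h) for name, vals in collected.items()}


# Stand-in generator and estimator so the sketch runs without lrdbenchmark
def white_noise(seed):
    return np.random.default_rng(seed).standard_normal(256)


results = run_trials(
    generate=white_noise,
    estimators={"always_half": lambda series: 0.5},
    true_h=0.5,
    n_trials=5,
)
for name, stats in results.items():
    print(name, stats)
```

To use it with the library, ``generate`` could wrap ``FBMModel(H=0.7).generate(length=2000, seed=seed)`` and each estimator callable could return ``RSEstimator().estimate(series)["hurst_parameter"]``, following the calls shown in the examples above.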

Integration note: HPFracc

Optional HPFracc integration is not required for core lrdbenchmark usage. If you rely on HPFracc, pin versions against its upstream documentation and adapt any legacy glue code to the current estimator APIs in this package.

Visualization

Plot results and data:

import matplotlib.pyplot as plt
from lrdbenchmark import FBMModel

# Generate data with different H values
H_values = [0.3, 0.5, 0.7, 0.9]
datasets = {}

for H in H_values:
    model = FBMModel(H=H, sigma=1.0)
    datasets[f'H={H}'] = model.generate(1000)

# Plot
plt.figure(figsize=(12, 8))
for name, data in datasets.items():
    plt.plot(data[:200], label=name, alpha=0.7)

plt.title('Fractional Brownian Motion with Different H Values')
plt.xlabel('Time')
plt.ylabel('Value')
plt.legend()
plt.grid(True)
plt.show()

Performance Tips

Use GPU acceleration when available
Process large datasets in batches
Enable analytics for monitoring
Use appropriate data lengths (1000+ samples recommended)

Nonstationarity Testing

Test estimator robustness under nonstationary conditions:

from lrdbenchmark.generation import (
    RegimeSwitchingProcess,
    ContinuousDriftProcess,
    StructuralBreakProcess
)

# Regime switching: H jumps from 0.3 to 0.8 at midpoint
gen = RegimeSwitchingProcess(h_regimes=[0.3, 0.8], change_points=[0.5])
result = gen.generate(1000)
signal = result['signal']
h_trajectory = result['h_trajectory']  # True H at each timepoint

# Continuous linear drift from H=0.3 to H=0.8
gen = ContinuousDriftProcess(h_start=0.3, h_end=0.8, drift_type='linear')
result = gen.generate(1000)

# Structural break with level shift
gen = StructuralBreakProcess(h_before=0.7, h_after=0.4, break_severity=0.3)
result = gen.generate(1000)
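
To make the ``h_trajectory`` idea concrete, the sketch below builds the per-sample true-H sequence for the regime-switching and linear-drift cases. It assumes, as the midpoint example suggests, that ``change_points`` are fractions of the series length. The rounding rule and the function names are assumptions, and the library may construct its trajectories differently.

```python
import numpy as np


def regime_h_trajectory(n, h_regimes, change_points):
    """Piecewise-constant H: one value per regime, switching at fractional positions."""
    if len(h_regimes) != len(change_points) + 1:
        raise ValueError("need exactly one more regime than change points")
    # Convert fractional positions to sample indices (rounding rule is an assumption)
    boundaries = [int(round(p * n)) for p in change_points]
    edges = [0] + boundaries + [n]
    trajectory = np.empty(n)
    for h, start, stop in zip(h_regimes, edges[:-1], edges[1:]):
        trajectory[start:stop] = h
    return trajectory


def linear_drift_h_trajectory(n, h_start, h_end):
    """H moving linearly from h_start at the first sample to h_end at the last."""
    return np.linspace(h_start, h_end, n)


switching = regime_h_trajectory(1000, h_regimes=[0.3, 0.8], change_points=[0.5])
drift = linear_drift_h_trajectory(1000, h_start=0.3, h_end=0.8)
print(switching[498:502])
print(drift[:3], drift[-3:])
```

Comparing an estimator's windowed output against such a trajectory shows how quickly, and how accurately, it tracks a changing H.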

Critical Regime Models

Test estimators in physics-motivated critical regimes:

from lrdbenchmark.generation import (
    OrnsteinUhlenbeckProcess,
    FractionalLevyMotion,
    SOCAvalancheModel
)

# OU with time-varying friction (transient criticality)
gen = OrnsteinUhlenbeckProcess(theta_start=0.1, theta_end=1.0)
result = gen.generate(1000)

# Heavy-tailed fractional Lévy motion (α<2 stable)
gen = FractionalLevyMotion(H=0.7, alpha=1.5)
result = gen.generate(1000)

# Self-organized criticality avalanche model
gen = SOCAvalancheModel(grid_size=32)
result = gen.generate(500)
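
The first case can be pictured with a short Euler-Maruyama simulation of an Ornstein-Uhlenbeck process whose friction ``theta`` changes over time. Because the relaxation time of an OU process is 1/theta, small ``theta`` means slow, near-critical relaxation. The linear ramp, the step size ``dt``, and the function name below are assumptions for illustration and are not taken from ``OrnsteinUhlenbeckProcess``.

```python
import numpy as np


def ou_time_varying(n, theta_start, theta_end, sigma=1.0, dt=0.1, x0=0.0, seed=None):
    """Euler-Maruyama path of dx = -theta(t) * x dt + sigma dW with a linear theta ramp."""
    rng = np.random.default_rng(seed)
    theta = np.linspace(theta_start, theta_end, n)  # assumed: linear ramp
    x = np.empty(n)
    x[0] = x0
    for t in range(1, n):
        shock = sigma * np.sqrt(dt) * rng.standard_normal()
        x[t] = x[t - 1] - theta[t - 1] * x[t - 1] * dt + shock
    return x, theta


path, theta = ou_time_varying(1000, theta_start=0.1, theta_end=1.0, seed=0)
print(f"relaxation time at start: {1 / theta[0]:.1f}, at end: {1 / theta[-1]:.1f}")
```

With ``sigma=0`` the path reduces to a deterministic decay towards zero, which is a convenient sanity check for the update rule.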

Structural Break Detection

Detect stationarity violations before running classical estimators:

from lrdbenchmark.analysis.diagnostics import StructuralBreakDetector

detector = StructuralBreakDetector(significance_level=0.05)
result = detector.detect_all(data)

if result['any_break_detected']:
    print("⚠️ Warning: Stationarity violated!")
    print(result['warnings'])
else:
    print("Data appears stationary; proceed with classical estimation")
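
The source does not state which tests ``detect_all`` runs. As one example of the kind of check a break detector can perform, here is a minimal CUSUM test for a shift in the mean. The function name and return keys are hypothetical. The critical value 1.358 is the 5% point of the supremum of a Brownian bridge, which is valid for independent data only; under long-range dependence this test tends to reject too often.

```python
import numpy as np


def cusum_mean_shift(x, critical_value=1.358):
    """CUSUM test for a single mean shift, using the i.i.d. 5% critical value."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    partial_sums = np.cumsum(x - x.mean())
    stat_path = np.abs(partial_sums) / (x.std(ddof=1) * np.sqrt(n))
    k = int(np.argmax(stat_path))
    return {
        "statistic": float(stat_path[k]),
        "break_index": k + 1,  # 0-based index of the first sample after the break
        "break_detected": bool(stat_path[k] > critical_value),
    }


rng = np.random.default_rng(0)
shifted = rng.standard_normal(400)
shifted[200:] += 3.0  # level shift halfway through
report = cusum_mean_shift(shifted)
print(report)
```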

Surrogate Data Testing

Generate surrogates of an observed series (``original_data``) for hypothesis testing:

from lrdbenchmark.generation import IAFFTSurrogate, PhaseRandomizedSurrogate

# IAAFT (class name IAFFTSurrogate): preserve spectrum AND amplitude distribution
gen = IAFFTSurrogate()
result = gen.generate(original_data, n_surrogates=100)
surrogates = result['surrogates']

# Phase randomization: preserve spectrum only
gen = PhaseRandomizedSurrogate()
result = gen.generate(original_data, n_surrogates=100)
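
To show what "preserve spectrum only" means, the following NumPy sketch produces a phase-randomized surrogate: every Fourier coefficient keeps its magnitude but is rotated by a random phase. The function name is hypothetical and the code is independent of ``PhaseRandomizedSurrogate``. IAAFT builds on the same idea and additionally iterates to restore the original amplitude distribution, which this sketch does not attempt.

```python
import numpy as np


def phase_randomized_surrogate(x, rng):
    """Return a series with the same amplitude spectrum as x and random phases."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, len(spectrum))
    phases[0] = 0.0           # keep the DC term (and hence the mean) unchanged
    if n % 2 == 0:
        phases[-1] = 0.0      # the Nyquist term must stay real for even n
    rotated = spectrum * np.exp(1j * phases)
    return np.fft.irfft(rotated, n=n)


rng = np.random.default_rng(5)
original = np.cumsum(rng.standard_normal(512))
surrogate = phase_randomized_surrogate(original, rng)

same_spectrum = np.allclose(
    np.abs(np.fft.rfft(surrogate)), np.abs(np.fft.rfft(original))
)
print("amplitude spectrum preserved:", same_spectrum)
```

A typical use is to compute a statistic on the original series and on many surrogates; if the original value lies far outside the surrogate distribution, the null hypothesis of a linear Gaussian process with that spectrum is called into question.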

Running Failure Benchmarks

Systematically test classical estimators under nonstationarity:

# Quick screening (~5 min)
python scripts/benchmarks/run_classical_failure_benchmark.py --profile quick

# Standard analysis (~1 hour)
python scripts/benchmarks/run_classical_failure_benchmark.py --profile standard

# Full publication run (~8-10 hours)
python scripts/benchmarks/run_classical_failure_benchmark.py --profile full

Next Steps

Demonstration Notebooks Overview - Start here: Comprehensive demonstration notebooks
Installation Guide - Detailed installation guide
Data Models API - Learn about data models
Estimators API - Explore available estimators
Comprehensive LRDBench Demonstration - More examples and use cases

Recommended Learning Path:

Follow the tutorials: Begin with Data Generation and Visualisation (or open notebooks/markdown/01_data_generation_and_visualisation.md)
Explore API: Use the quickstart examples above
Advanced Usage: Try the comprehensive benchmarking examples
Custom Development: Learn extensibility from the custom models notebook