Quick Start Guide
This guide will get you up and running with lrdbenchmark in minutes.
📓 For comprehensive examples, see the five demonstration notebooks in the `notebooks/` directory:
- Data Generation & Visualization: All stochastic models with comprehensive plots
- Estimation & Validation: All estimator categories with statistical validation
- Custom Models & Estimators: Library extensibility and custom implementations
- Comprehensive Benchmarking: Full benchmarking system with contamination testing
- Leaderboard Generation: Performance rankings and comparative analysis
Basic Usage
Generate synthetic data and estimate its Hurst parameter:
import numpy as np
from lrdbenchmark import FBMModel, RSEstimator
# Generate Fractional Brownian Motion data
model = FBMModel(H=0.7, sigma=1.0)
data = model.generate(length=1000, seed=42)
# Estimate Hurst parameter using R/S analysis
rs_estimator = RSEstimator()
result = rs_estimator.estimate(data)
hurst_estimate = result["hurst_parameter"]
print(f"True Hurst: 0.7, Estimated: {hurst_estimate:.3f}")
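For intuition about what an R/S estimate measures, here is a minimal NumPy sketch of classical rescaled-range analysis. It is not lrdbenchmark's implementation: the function name `rs_hurst_sketch` and the default window sizes are assumptions made for illustration, and the input is assumed to be a noise-like (increment) series rather than a cumulative path.

```python
import numpy as np

def rs_hurst_sketch(x, window_sizes=(16, 32, 64, 128, 256)):
    """Slope of log(R/S) against log(window size) for an increment series."""
    x = np.asarray(x, dtype=float)
    log_n, log_rs = [], []
    for n in window_sizes:
        n_windows = len(x) // n
        rs_values = []
        for i in range(n_windows):
            w = x[i * n:(i + 1) * n]
            dev = np.cumsum(w - w.mean())  # cumulative deviation from the window mean
            r = dev.max() - dev.min()      # range of the cumulative deviation
            s = w.std()                    # standard deviation of the window
            if s > 0:
                rs_values.append(r / s)
        if rs_values:
            log_n.append(np.log(n))
            log_rs.append(np.log(np.mean(rs_values)))
    # The Hurst exponent is estimated as the slope of the log-log regression
    slope, _ = np.polyfit(log_n, log_rs, 1)
    return slope

rng = np.random.default_rng(42)
white_noise = rng.standard_normal(4096)  # uncorrelated noise, theoretical H = 0.5
h_hat = rs_hurst_sketch(white_noise)
print(f"R/S sketch on white noise: {h_hat:.3f}")
```

On short windows, plain R/S is known to be biased upward, so an estimate somewhat above 0.5 for white noise is expected with this sketch.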
Neural Network Usage
lrdbenchmark provides a comprehensive neural network factory with 4 architectures that achieve excellent speed-accuracy trade-offs:
from lrdbenchmark import NeuralNetworkFactory
from lrdbenchmark.analysis.machine_learning.neural_network_factory import NNArchitecture, NNConfig, create_all_benchmark_networks
import numpy as np
# Create neural network factory
factory = NeuralNetworkFactory()
# Create a specific network
config = NNConfig(
    architecture=NNArchitecture.TRANSFORMER,
    input_length=500,
    hidden_dims=[64, 32],
    learning_rate=0.001,
    epochs=50
)
network = factory.create_network(config)
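# Generate placeholder training data (random noise with random labels;
# in practice, simulate series with known H so the labels are meaningful)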
X_train = np.random.randn(100, 500) # 100 samples of length 500
y_train = np.random.uniform(0.2, 0.8, 100) # True Hurst parameters
# Train the network (train-once, apply-many workflow)
history = network.train_model(X_train, y_train)
# Make predictions on new data
new_data = np.random.randn(1, 500)
prediction = network.predict(new_data)
print(f"Neural Network Prediction: {prediction[0]:.3f}")
# Create all benchmark networks
all_networks = create_all_benchmark_networks(input_length=500)
for name, network in all_networks.items():
    print(f"Created {name} network")
Machine Learning Usage
lrdbenchmark provides production-ready machine learning estimators:
from lrdbenchmark import SVREstimator, GradientBoostingEstimator, RandomForestEstimator
import numpy as np
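# Generate placeholder training data (random noise with random labels;
# in practice, simulate series with known H so the labels are meaningful)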
X_train = np.random.randn(100, 500) # 100 samples of length 500
y_train = np.random.uniform(0.2, 0.8, 100) # True Hurst parameters
# Train ML models
svr = SVREstimator(kernel='rbf', C=1.0)
svr.train(X_train, y_train)
gb = GradientBoostingEstimator(n_estimators=50, learning_rate=0.1)
gb.train(X_train, y_train)
rf = RandomForestEstimator(n_estimators=50, max_depth=5)
rf.train(X_train, y_train)
# Make predictions on new data
new_data = np.random.randn(1, 500)
svr_pred = svr.predict(new_data)
gb_pred = gb.predict(new_data)
rf_pred = rf.predict(new_data)
print(f"SVR: {svr_pred:.3f}, Gradient Boosting: {gb_pred:.3f}, Random Forest: {rf_pred:.3f}")
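The random arrays used for training above carry no relationship between the series and their labels. The following sketch shows one way to build a labelled training set in which each row really has the Hurst parameter stored in `y_train`. It uses an exact Cholesky-based fractional Gaussian noise generator written directly in NumPy; the helper names `fgn_cholesky` and `make_training_set` are hypothetical and are not part of lrdbenchmark. Whether the library's estimators expect noise-like or path-like inputs is not stated here, so treat the choice of fGn as an assumption.

```python
import numpy as np

def fgn_cholesky(n, hurst, rng):
    """Exact unit-variance fGn sample of length n via Cholesky factorisation (O(n^3))."""
    k = np.arange(n)
    # Autocovariance of fGn: 0.5 * (|k+1|^2H - 2|k|^2H + |k-1|^2H)
    gamma = 0.5 * (np.abs(k + 1) ** (2 * hurst)
                   - 2 * np.abs(k) ** (2 * hurst)
                   + np.abs(k - 1) ** (2 * hurst))
    cov = gamma[np.abs(k[:, None] - k[None, :])]  # Toeplitz covariance matrix
    return np.linalg.cholesky(cov) @ rng.standard_normal(n)

def make_training_set(n_samples, length, rng, h_range=(0.2, 0.8)):
    """Draw H uniformly from h_range and simulate one fGn series per label."""
    y = rng.uniform(*h_range, size=n_samples)
    X = np.vstack([fgn_cholesky(length, h, rng) for h in y])
    return X, y

rng = np.random.default_rng(0)
X_train, y_train = make_training_set(n_samples=20, length=128, rng=rng)
print(X_train.shape, y_train.shape)
```

The Cholesky approach is exact but cubic in the series length, so it suits short training series; for long series a faster generator would be preferable.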
Advanced Neural Network Usage
For production deployment with neural networks, use the Neural Network Factory:
from lrdbenchmark import NeuralNetworkFactory, FBMModel
from lrdbenchmark.analysis.machine_learning.neural_network_factory import NNConfig, NNArchitecture
import numpy as np
# Create factory
factory = NeuralNetworkFactory()
# Configure network
config = NNConfig(
    architecture=NNArchitecture.CNN,
    input_length=500,
    hidden_dims=[64, 32],
    learning_rate=0.001,
    epochs=20
)
# Create and train network
network = factory.create_network(config)
X_train = np.random.randn(100, 500)
y_train = np.random.uniform(0.2, 0.8, 100)
history = network.train_model(X_train, y_train)
# Make prediction
new_data = np.random.randn(1, 500)
prediction = network.predict(new_data)
print(f"CNN Prediction: {prediction[0]:.3f}")
Data Models
lrdbenchmark provides several synthetic data models:
from lrdbenchmark import FBMModel
from lrdbenchmark import FGNModel, ARFIMAModel, MRWModel
# Fractional Brownian Motion
fbm = FBMModel(H=0.7, sigma=1.0)
fbm_data = fbm.generate(1000)
# Fractional Gaussian Noise
fgn = FGNModel(H=0.6, sigma=1.0)
fgn_data = fgn.generate(1000)
# ARFIMA process
arfima = ARFIMAModel(d=0.3, sigma=1.0)
arfima_data = arfima.generate(1000)
# Multifractal Random Walk
mrw = MRWModel(H=0.7, lambda_param=0.1, sigma=1.0)
mrw_data = mrw.generate(1000)
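The models above are parameterised differently: FBM and FGN take `H`, while ARFIMA takes `d`. As a rough guide, the sketch below applies the standard relation H = d + 0.5 for a stationary ARFIMA(0, d, 0) process and shows that a noise series and its cumulative path are linked by `cumsum` and `diff`, which is the same relationship that connects fGn and fBm. The helper names are hypothetical, and whether lrdbenchmark follows exactly these conventions is an assumption.

```python
import numpy as np

def hurst_from_d(d):
    """Hurst parameter implied by the fractional differencing order d."""
    if not -0.5 < d < 0.5:
        raise ValueError("d must lie in (-0.5, 0.5) for a stationary, invertible process")
    return d + 0.5

def d_from_hurst(hurst):
    """Fractional differencing order implied by the Hurst parameter."""
    if not 0.0 < hurst < 1.0:
        raise ValueError("H must lie in (0, 1)")
    return hurst - 0.5

print(f"d = 0.3 corresponds to H = {hurst_from_d(0.3):.1f}")

# A path is the cumulative sum of its increments; differencing recovers them.
rng = np.random.default_rng(0)
noise = rng.standard_normal(1000)  # white noise stands in for an fGn sample here
path = np.cumsum(noise)
recovered = np.diff(path)          # equals noise[1:]
```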
Individual Estimators
Use specific estimators directly:
from lrdbenchmark import DFAEstimator, GPHEstimator
# Detrended Fluctuation Analysis
dfa = DFAEstimator()
dfa_result = dfa.estimate(data)
H_dfa = dfa_result["hurst_parameter"]
# Geweke-Porter-Hudak estimator
gph = GPHEstimator()
gph_result = gph.estimate(data)
H_gph = gph_result["hurst_parameter"]
print(f"DFA H estimate: {H_dfa:.3f}")
print(f"GPH H estimate: {H_gph:.3f}")
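To make the GPH idea concrete, here is a minimal sketch of log-periodogram regression in NumPy. It is an illustration under stated assumptions, not the code behind `GPHEstimator`: the function name `gph_sketch`, the bandwidth rule m = n^0.5, and the conversion H = d + 0.5 are choices made for this example.

```python
import numpy as np

def gph_sketch(x, power=0.5):
    """Estimate d by regressing the log periodogram on log(4 sin^2(freq / 2))."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    m = int(n ** power)  # number of low Fourier frequencies used
    freqs = 2.0 * np.pi * np.arange(1, m + 1) / n
    fft_vals = np.fft.fft(x - x.mean())[1:m + 1]
    periodogram = np.abs(fft_vals) ** 2 / (2.0 * np.pi * n)
    regressor = np.log(4.0 * np.sin(freqs / 2.0) ** 2)
    slope, _ = np.polyfit(regressor, np.log(periodogram), 1)
    d_hat = -slope  # the regression slope estimates -d
    return d_hat, d_hat + 0.5

rng = np.random.default_rng(7)
white_noise = rng.standard_normal(4096)  # theoretical d = 0, H = 0.5
d_hat, h_hat = gph_sketch(white_noise)
print(f"d estimate: {d_hat:.3f}, implied H: {h_hat:.3f}")
```

Like R/S, this regression assumes an increment-type (stationary) input; applying it to a cumulative path would require differencing first.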
Analytics System
Track usage and performance:
from lrdbenchmark import FBMModel, RSEstimator
# Generate data and run analysis
model = FBMModel(H=0.7)
data = model.generate(1000)
# Estimate Hurst parameter
rs_estimator = RSEstimator()
result = rs_estimator.estimate(data)
hurst_estimate = result["hurst_parameter"]
print(f"Hurst estimate: {hurst_estimate:.3f}")
Enhanced ML and Neural Network Estimators
Use the new enhanced estimators with pre-trained models:
from lrdbenchmark import (
    CNNEstimator, LSTMEstimator, GRUEstimator, TransformerEstimator,
    RandomForestEstimator, SVREstimator, GradientBoostingEstimator
)
# Enhanced CNN with residual connections and attention
cnn = CNNEstimator()
cnn_result = cnn.estimate(data)
H_cnn = cnn_result["hurst_parameter"]
# Enhanced LSTM with bidirectional architecture
lstm = LSTMEstimator()
lstm_result = lstm.estimate(data)
H_lstm = lstm_result["hurst_parameter"]
# Enhanced GRU with attention mechanisms
gru = GRUEstimator()
gru_result = gru.estimate(data)
H_gru = gru_result["hurst_parameter"]
# Enhanced Transformer with self-attention
transformer = TransformerEstimator()
transformer_result = transformer.estimate(data)
H_transformer = transformer_result["hurst_parameter"]
# Traditional ML estimators
rf = RandomForestEstimator()
rf_result = rf.estimate(data)
H_rf = rf_result["hurst_parameter"]
svr = SVREstimator()
svr_result = svr.estimate(data)
H_svr = svr_result["hurst_parameter"]
gb = GradientBoostingEstimator()
gb_result = gb.estimate(data)
H_gb = gb_result["hurst_parameter"]
print(f"CNN H estimate: {H_cnn:.3f}")
print(f"LSTM H estimate: {H_lstm:.3f}")
print(f"GRU H estimate: {H_gru:.3f}")
print(f"Transformer H estimate: {H_transformer:.3f}")
Advanced Usage
Compare several estimators on the same series:
from lrdbenchmark import FBMModel, RSEstimator, DFAEstimator
# Generate data
model = FBMModel(H=0.7)
data = model.generate(2000)
# Test multiple estimators
rs_estimator = RSEstimator()
dfa_estimator = DFAEstimator()
rs_result = rs_estimator.estimate(data)
dfa_result = dfa_estimator.estimate(data)
print(f"R/S estimate: {rs_result['hurst_parameter']:.3f}")
print(f"DFA estimate: {dfa_result['hurst_parameter']:.3f}")
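When more than two estimators are involved, a small helper keeps the comparison tidy. The sketch below relies only on the interface used throughout this guide, namely an `estimate(data)` method returning a dictionary with a `"hurst_parameter"` key. The helper `compare_estimators` and the stand-in class `ConstantEstimator` are hypothetical and exist only so the example runs without the library.

```python
def compare_estimators(estimators, data, true_h=None):
    """Run each estimator on the same data and collect the results as rows."""
    rows = []
    for name, estimator in estimators.items():
        h_hat = float(estimator.estimate(data)["hurst_parameter"])
        row = {"estimator": name, "hurst": h_hat}
        if true_h is not None:
            row["abs_error"] = abs(h_hat - true_h)
        rows.append(row)
    return rows

class ConstantEstimator:
    """Stand-in with the same estimate() interface; always returns a fixed value."""
    def __init__(self, value):
        self.value = value

    def estimate(self, data):
        return {"hurst_parameter": self.value}

rows = compare_estimators(
    {"A": ConstantEstimator(0.65), "B": ConstantEstimator(0.72)},
    data=[0.0, 1.0, 2.0],
    true_h=0.7,
)
for row in rows:
    print(f"{row['estimator']}: H = {row['hurst']:.3f}, |error| = {row['abs_error']:.3f}")
```

In real use, the dictionary would hold instances such as `RSEstimator()` and `DFAEstimator()` in place of the stand-ins.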
Integration note: HPFracc
An updated HPFracc API is available; see documentation_summaries/PROJECT_CLEANUP_SUMMARY.md for the current reference and adapt example code accordingly. The integration is optional and not required for core lrdbenchmark usage.
Visualization
Plot results and data:
import matplotlib.pyplot as plt
from lrdbenchmark import FBMModel
# Generate data with different H values
H_values = [0.3, 0.5, 0.7, 0.9]
datasets = {}
for H in H_values:
    model = FBMModel(H=H, sigma=1.0)
    datasets[f'H={H}'] = model.generate(1000)
# Plot
plt.figure(figsize=(12, 8))
for name, data in datasets.items():
    plt.plot(data[:200], label=name, alpha=0.7)
plt.title('Fractional Brownian Motion with Different H Values')
plt.xlabel('Time')
plt.ylabel('Value')
plt.legend()
plt.grid(True)
plt.show()
Performance Tips
- Use GPU acceleration when available
- Process large datasets in batches
- Enable analytics for monitoring
- Use appropriate data lengths (1000+ samples recommended)
Nonstationarity Testing
Test estimator robustness under nonstationary conditions:
from lrdbenchmark.generation import (
    RegimeSwitchingProcess,
    ContinuousDriftProcess,
    StructuralBreakProcess
)
# Regime switching: H jumps from 0.3 to 0.8 at midpoint
gen = RegimeSwitchingProcess(h_regimes=[0.3, 0.8], change_points=[0.5])
result = gen.generate(1000)
signal = result['signal']
h_trajectory = result['h_trajectory'] # True H at each timepoint
# Continuous linear drift from H=0.3 to H=0.8
gen = ContinuousDriftProcess(h_start=0.3, h_end=0.8, drift_type='linear')
result = gen.generate(1000)
# Structural break with level shift
gen = StructuralBreakProcess(h_before=0.7, h_after=0.4, break_severity=0.3)
result = gen.generate(1000)
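The `h_trajectory` returned above records the true H at each time point. The sketch below shows what such trajectories might look like for the regime-switching and linear-drift cases, assuming that `change_points` are given as fractions of the series length (consistent with 0.5 meaning the midpoint). Both helper functions are hypothetical illustrations, not the library's internals.

```python
import numpy as np

def regime_h_trajectory(n, h_regimes, change_points):
    """Piecewise-constant H; change_points are fractions of the series length."""
    boundaries = [0] + [int(round(cp * n)) for cp in change_points] + [n]
    trajectory = np.empty(n)
    for h, start, end in zip(h_regimes, boundaries[:-1], boundaries[1:]):
        trajectory[start:end] = h
    return trajectory

def linear_drift_h_trajectory(n, h_start, h_end):
    """H moving linearly from h_start to h_end over n time points."""
    return np.linspace(h_start, h_end, n)

switching = regime_h_trajectory(1000, h_regimes=[0.3, 0.8], change_points=[0.5])
drift = linear_drift_h_trajectory(1000, h_start=0.3, h_end=0.8)
print(switching[499], switching[500])  # last point of regime 1, first point of regime 2
print(drift[0], drift[-1])
```

Comparing an estimator's windowed output against such a trajectory is one way to quantify how well it tracks a changing H.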
Critical Regime Models
Test estimators in physics-motivated critical regimes:
from lrdbenchmark.generation import (
    OrnsteinUhlenbeckProcess,
    FractionalLevyMotion,
    SOCAvalancheModel
)
# OU with time-varying friction (transient criticality)
gen = OrnsteinUhlenbeckProcess(theta_start=0.1, theta_end=1.0)
result = gen.generate(1000)
# Heavy-tailed fractional Lévy motion (α<2 stable)
gen = FractionalLevyMotion(H=0.7, alpha=1.5)
result = gen.generate(1000)
# Self-organized criticality avalanche model
gen = SOCAvalancheModel(grid_size=32)
result = gen.generate(500)
Structural Break Detection
Detect stationarity violations before running classical estimators:
from lrdbenchmark.analysis.diagnostics import StructuralBreakDetector
detector = StructuralBreakDetector(significance_level=0.05)
result = detector.detect_all(data)
if result['any_break_detected']:
    print("⚠️ Warning: Stationarity violated!")
    print(result['warnings'])
else:
    print("Data appears stationary; proceed with classical estimation")
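As an illustration of the kind of check a break detector can perform, the sketch below computes a basic CUSUM statistic for a shift in the mean. It is not what `StructuralBreakDetector` does internally, which is not described here; the function name and the test series are assumptions. The 1.358 threshold is the asymptotic 5% critical value for independent data, and under long-range dependence it tends to flag breaks too often, so this is a sketch of the idea rather than a usable test for LRD series.

```python
import numpy as np

def cusum_statistic(x):
    """Maximum absolute standardised cumulative sum of deviations from the mean."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    partial_sums = np.cumsum(x - x.mean())
    return np.max(np.abs(partial_sums)) / (x.std(ddof=1) * np.sqrt(n))

# Asymptotic 5% critical value (supremum of a Brownian bridge), valid for i.i.d. data
CRITICAL_5_PERCENT = 1.358

rng = np.random.default_rng(1)
stable = rng.standard_normal(1000)
shifted = stable.copy()
shifted[500:] += 2.0  # level shift of two standard deviations at the midpoint

for label, series in [("stable", stable), ("shifted", shifted)]:
    stat = cusum_statistic(series)
    print(f"{label}: CUSUM = {stat:.2f}, break flagged = {stat > CRITICAL_5_PERCENT}")
```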
Surrogate Data Testing
Generate surrogates for hypothesis testing:
from lrdbenchmark.generation import IAFFTSurrogate, PhaseRandomizedSurrogate
# original_data is the observed time series to be tested
# IAAFT: preserve spectrum AND amplitude distribution
gen = IAFFTSurrogate()
result = gen.generate(original_data, n_surrogates=100)
surrogates = result['surrogates']
# Phase randomization: preserve spectrum only
gen = PhaseRandomizedSurrogate()
result = gen.generate(original_data, n_surrogates=100)
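For readers unfamiliar with surrogates, the following sketch shows the idea behind phase randomization in plain NumPy: keep the Fourier amplitudes of the series, replace the phases with random ones, and transform back. The function name is hypothetical and this is not the code behind `PhaseRandomizedSurrogate`; it is a minimal version of the standard technique.

```python
import numpy as np

def phase_randomized_surrogate(x, rng=None):
    """Return a series with the same amplitude spectrum as x but random phases."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = len(x)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    surrogate_spectrum = np.abs(spectrum) * np.exp(1j * phases)
    surrogate_spectrum[0] = spectrum[0]        # keep the zero-frequency term (the mean)
    if n % 2 == 0:
        surrogate_spectrum[-1] = spectrum[-1]  # Nyquist term must stay real for even n
    return np.fft.irfft(surrogate_spectrum, n=n)

rng = np.random.default_rng(3)
original = np.cumsum(rng.standard_normal(512))  # example series with strong correlations
surrogate = phase_randomized_surrogate(original, rng)

same_spectrum = np.allclose(np.abs(np.fft.rfft(surrogate)), np.abs(np.fft.rfft(original)))
print(f"Amplitude spectrum preserved: {same_spectrum}")
```

Because only the phases change, the surrogate keeps the linear correlation structure of the original while destroying any nonlinear structure, which is what makes it useful as a null model.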
Running Failure Benchmarks
Systematically test classical estimators under nonstationarity:
# Quick screening (~5 min)
python scripts/benchmarks/run_classical_failure_benchmark.py --profile quick
# Standard analysis (~1 hour)
python scripts/benchmarks/run_classical_failure_benchmark.py --profile standard
# Full publication run (~8-10 hours)
python scripts/benchmarks/run_classical_failure_benchmark.py --profile full
Next Steps
Demonstration Notebooks Overview - Start here: Comprehensive demonstration notebooks
Installation Guide - Detailed installation guide
Data Models API - Learn about data models
Estimators API - Explore available estimators
Comprehensive LRDBench Demonstration - More examples and use cases
Recommended Learning Path:
1. Follow the tutorials: Begin with Data Generation and Visualisation (or open notebooks/markdown/01_data_generation_and_visualisation.md)
2. Explore API: Use the quickstart examples above
3. Advanced Usage: Try the comprehensive benchmarking examples
4. Custom Development: Learn extensibility from the custom models notebook