Analytics API

lrdbenchmark provides a comprehensive analytics system for tracking usage, monitoring performance, analyzing errors, and understanding user workflows.

Analytics Dashboard

class lrdbenchmark.analytics.dashboard.AnalyticsDashboard(storage_path: str = '~/.lrdbench/analytics')[source]

Bases: object

Comprehensive analytics dashboard for LRDBench

Provides easy access to all analytics data and generates comprehensive reports and visualizations, including stratified summaries.

__init__(storage_path: str = '~/.lrdbench/analytics')[source]: Initialize the analytics dashboard

get_comprehensive_summary(days: int = 30) → Dict[str, Any][source]

Get comprehensive summary of all analytics data

Parameters: days – Number of days to analyze
Returns: Dictionary containing all analytics summaries

generate_usage_report(days: int = 30, output_path: str | None = None) → str[source]: Generate comprehensive usage report

generate_performance_report(days: int = 30, output_path: str | None = None) → str[source]: Generate comprehensive performance report

generate_reliability_report(days: int = 30, output_path: str | None = None) → str[source]: Generate comprehensive reliability report

generate_workflow_report(days: int = 30, output_path: str | None = None) → str[source]: Generate comprehensive workflow report

generate_comprehensive_report(days: int = 30, output_dir: str | None = None) → str[source]: Generate comprehensive analytics report with all sections

generate_stratified_report(results_path: str, output_path: str | None = None) → str[source]: Generate a stratified benchmark report from a saved comprehensive benchmark JSON.

create_advanced_diagnostics_visuals(advanced_results_path: str, output_dir: str | None = None) → Dict[str, str][source]: Create scaling and robustness visualisations from advanced benchmark artefacts.

create_visualizations(days: int = 30, output_dir: str | None = None) → Dict[str, str][source]: Create visualizations for analytics data

export_all_data(output_dir: str | None = None, days: int = 30) → Dict[str, str][source]: Export all analytics data to files

Usage Tracking

class lrdbenchmark.analytics.usage_tracker.UsageTracker(storage_path: str = '~/.lrdbench/analytics', enable_tracking: bool = True, privacy_mode: bool = True)[source]

Bases: object

Comprehensive usage tracking system for LRDBench

Features:

- Real-time event tracking
- Privacy-preserving user identification
- Performance monitoring
- Error analysis
- Usage pattern detection

__init__(storage_path: str = '~/.lrdbench/analytics', enable_tracking: bool = True, privacy_mode: bool = True)[source]

Initialize the usage tracker

Parameters:

storage_path – Directory to store analytics data
enable_tracking – Whether to enable usage tracking
privacy_mode – Enable privacy-preserving features

track_estimator_usage(estimator_name: str, parameters: Dict[str, Any], execution_time: float, success: bool, error_message: str | None = None, data_length: int = 0, user_id: str | None = None) → None[source]

Track usage of an estimator

Parameters:

estimator_name – Name of the estimator used
parameters – Parameters passed to the estimator
execution_time – Time taken for execution
success – Whether the estimation was successful
error_message – Error message if failed
data_length – Length of input data
user_id – Optional user identifier

track_benchmark_run(benchmark_type: str, estimators_used: List[str], total_time: float, success_count: int, total_count: int, data_models: List[str]) → None[source]

Track benchmark execution

Parameters:

benchmark_type – Type of benchmark run
estimators_used – List of estimators used
total_time – Total execution time
success_count – Number of successful runs
total_count – Total number of runs
data_models – Data models tested

get_usage_summary(days: int = 30) → UsageSummary[source]

Get usage summary for the specified time period

Parameters: days – Number of days to analyze
Returns: UsageSummary object with aggregated statistics

_generate_session_id() → str[source]: Generate a unique session ID

_load_existing_data()[source]: Load existing analytics data from storage

_start_background_processing()[source]: Start background thread for data processing

_sanitize_parameters(params: Dict[str, Any]) → Dict[str, Any][source]: Sanitize parameters for privacy and storage

_hash_user_id(user_id: str) → str[source]: Hash user ID for privacy
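The privacy-preserving identification described above can be pictured as a one-way hash of the user identifier. The following is a minimal sketch only, not the library's implementation: the function name `hash_user_id`, the salt value, and the 16-character truncation are all assumptions made for illustration.

```python
import hashlib


def hash_user_id(user_id: str, salt: str = "lrdbench", length: int = 16) -> str:
    """Return a truncated, salted SHA-256 digest of a user identifier.

    The salt and the truncation length are illustrative assumptions.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return digest[:length]


# The same input always maps to the same pseudonym, so events can still be
# grouped per user without storing the raw identifier.
pseudonym = hash_user_id("alice@example.com")
print(pseudonym)
```

Because the digest is deterministic, counts such as unique_users remain computable from the stored pseudonyms alone.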

_get_length_range(length: int) → str[source]: Convert data length to range category
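Converting a raw data length into a range category makes a field such as data_length_distribution a small histogram rather than a list of exact lengths. A possible sketch is shown below; the bucket edges and labels are assumptions chosen for illustration, and the library may use different ones.

```python
def get_length_range(length: int) -> str:
    """Map a series length to a coarse bucket label."""
    # Bucket edges are illustrative assumptions, not the library's values.
    edges = [
        (100, "<100"),
        (500, "100-499"),
        (1000, "500-999"),
        (5000, "1000-4999"),
    ]
    for upper, label in edges:
        if length < upper:
            return label
    return ">=5000"


for n in (50, 256, 1000, 20000):
    print(n, get_length_range(n))
```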

_save_data()[source]: Save analytics data to storage

_cleanup_old_data(max_age_days: int = 90)[source]: Remove old analytics data

export_summary(output_path: str, days: int = 30) → None[source]

Export usage summary to file

Parameters:

output_path – Path to save the summary
days – Number of days to analyze

get_popular_estimators(top_n: int = 10) → List[tuple][source]: Get top N most popular estimators

get_performance_trends(days: int = 7) → Dict[str, List[float]][source]: Get performance trends over time

class lrdbenchmark.analytics.usage_tracker.UsageEvent(timestamp: str, event_type: str, estimator_name: str, parameters: Dict[str, Any], execution_time: float, success: bool, error_message: str | None, data_length: int, user_id: str | None, session_id: str)[source]

Bases: object

Represents a single usage event

timestamp: str

event_type: str

estimator_name: str

parameters: Dict[str, Any]

execution_time: float

success: bool

error_message: str | None

data_length: int

user_id: str | None

session_id: str

__init__(timestamp: str, event_type: str, estimator_name: str, parameters: Dict[str, Any], execution_time: float, success: bool, error_message: str | None, data_length: int, user_id: str | None, session_id: str) → None

class lrdbenchmark.analytics.usage_tracker.UsageSummary(total_events: int, unique_users: int, estimator_usage: Dict[str, int], parameter_frequency: Dict[str, Dict[str, int]], success_rate: float, avg_execution_time: float, common_errors: Dict[str, int], data_length_distribution: Dict[str, int])[source]

Bases: object

Aggregated usage statistics

total_events: int

unique_users: int

estimator_usage: Dict[str, int]

parameter_frequency: Dict[str, Dict[str, int]]

success_rate: float

avg_execution_time: float

common_errors: Dict[str, int]

data_length_distribution: Dict[str, int]

__init__(total_events: int, unique_users: int, estimator_usage: Dict[str, int], parameter_frequency: Dict[str, Dict[str, int]], success_rate: float, avg_execution_time: float, common_errors: Dict[str, int], data_length_distribution: Dict[str, int]) → None

Performance Monitoring

class lrdbenchmark.analytics.performance_monitor.PerformanceMonitor(storage_path: str = '~/.lrdbench/analytics')[source]

Bases: object

Comprehensive performance monitoring system

Features:

- Real-time performance tracking
- Memory usage monitoring
- CPU utilization tracking
- Performance trend analysis
- Bottleneck identification

__init__(storage_path: str = '~/.lrdbench/analytics')[source]: Initialize the performance monitor

start_monitoring(estimator_name: str, data_length: int, parameters: Dict[str, str]) → str[source]

Start monitoring a new execution

Parameters:

estimator_name – Name of the estimator
data_length – Length of input data
parameters – Estimator parameters

Returns:

Monitoring session ID

stop_monitoring(session_id: str) → None[source]

Stop monitoring and record metrics

Parameters: session_id – Monitoring session ID

get_performance_summary(days: int = 30) → PerformanceSummary[source]

Get performance summary for the specified time period

Parameters: days – Number of days to analyze
Returns: PerformanceSummary object

_load_existing_data()[source]: Load existing performance data

timer(name: str)[source]

Context manager for timing code blocks.

Parameters: name – Name of the timer

Usage:

    with monitor.timer('my_operation'):
        # code to time
        pass

get_stats() → Dict[str, Dict[str, float]][source]

Get statistics for all timers.

Returns: Dictionary mapping timer names to statistics (mean, std, min, max, count)
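To show how a named timing context manager and a statistics accessor can fit together, here is a small stand-alone sketch. `SimpleTimer` is a hypothetical class written for this example; it mirrors the timer/get_stats interface described above but makes no claim about how PerformanceMonitor is implemented.

```python
import statistics
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import Dict, Iterator, List


class SimpleTimer:
    """Hypothetical stand-in that records durations per timer name."""

    def __init__(self) -> None:
        self._durations: Dict[str, List[float]] = defaultdict(list)

    @contextmanager
    def timer(self, name: str) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the timed block raises.
            self._durations[name].append(time.perf_counter() - start)

    def get_stats(self) -> Dict[str, Dict[str, float]]:
        stats: Dict[str, Dict[str, float]] = {}
        for name, values in self._durations.items():
            stats[name] = {
                "mean": statistics.fmean(values),
                # stdev needs at least two samples; report 0.0 otherwise.
                "std": statistics.stdev(values) if len(values) > 1 else 0.0,
                "min": min(values),
                "max": max(values),
                "count": len(values),
            }
        return stats


monitor = SimpleTimer()
for _ in range(3):
    with monitor.timer("my_operation"):
        sum(i * i for i in range(10_000))

stats = monitor.get_stats()
print(stats["my_operation"])
```

Recording the duration in a finally clause means a failing block still contributes a sample, which keeps the count consistent with the number of attempts.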

_analyze_performance_trend(metrics: List[PerformanceMetrics]) → str[source]: Analyze performance trend over time

_identify_bottlenecks(metrics: List[PerformanceMetrics]) → List[str][source]: Identify estimators with performance bottlenecks
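One simple way to think about bottleneck identification is to compare each estimator's mean execution time against a baseline across all estimators. The sketch below uses the median of the per-estimator means and a multiplier of 2.0; both the rule and the name `identify_bottlenecks` are assumptions for illustration, and the library's criterion may differ.

```python
import statistics
from typing import Dict, List


def identify_bottlenecks(
    timings: Dict[str, List[float]], factor: float = 2.0
) -> List[str]:
    """Return estimators whose mean time exceeds factor x the median mean."""
    means = {name: statistics.fmean(v) for name, v in timings.items() if v}
    if not means:
        return []
    baseline = statistics.median(means.values())
    return sorted(name for name, m in means.items() if m > factor * baseline)


# Hypothetical execution times in seconds, keyed by estimator name.
timings = {
    "dfa": [1.0, 1.2],
    "rs": [0.9, 1.1],
    "wavelet": [5.0, 6.0],
}
print(identify_bottlenecks(timings))
```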

get_estimator_performance(estimator_name: str, days: int = 30) → Dict[str, float][source]: Get performance metrics for a specific estimator

export_metrics(output_path: str, days: int = 30) → None[source]: Export performance metrics to file

get_memory_trends(days: int = 7) → Dict[str, List[float]][source]: Get memory usage trends over time

class lrdbenchmark.analytics.performance_monitor.PerformanceMetrics(timestamp: str, estimator_name: str, execution_time: float, memory_before: float, memory_after: float, memory_peak: float, cpu_percent: float, data_length: int, parameters: Dict[str, str])[source]

Bases: object

Performance metrics for a single execution

timestamp: str

estimator_name: str

execution_time: float

memory_before: float

memory_after: float

memory_peak: float

cpu_percent: float

data_length: int

parameters: Dict[str, str]

__init__(timestamp: str, estimator_name: str, execution_time: float, memory_before: float, memory_after: float, memory_peak: float, cpu_percent: float, data_length: int, parameters: Dict[str, str]) → None

class lrdbenchmark.analytics.performance_monitor.PerformanceSummary(total_executions: int, avg_execution_time: float, std_execution_time: float, min_execution_time: float, max_execution_time: float, avg_memory_usage: float, memory_efficiency: float, performance_trend: str, bottleneck_estimators: List[str])[source]

Bases: object

Aggregated performance statistics

total_executions: int

avg_execution_time: float

std_execution_time: float

min_execution_time: float

max_execution_time: float

avg_memory_usage: float

memory_efficiency: float

performance_trend: str

bottleneck_estimators: List[str]

__init__(total_executions: int, avg_execution_time: float, std_execution_time: float, min_execution_time: float, max_execution_time: float, avg_memory_usage: float, memory_efficiency: float, performance_trend: str, bottleneck_estimators: List[str]) → None

Error Analysis

class lrdbenchmark.analytics.error_analyzer.ErrorAnalyzer(storage_path: str = '~/.lrdbench/analytics')[source]

Bases: object

Comprehensive error analysis system

Features:

- Error pattern recognition
- Failure mode analysis
- Reliability scoring
- Trend analysis
- Improvement recommendations

__init__(storage_path: str = '~/.lrdbench/analytics')[source]: Initialize the error analyzer

record_error(estimator_name: str, error_message: str, stack_trace: str | None = None, parameters: Dict[str, str] | None = None, data_length: int = 0, user_id: str | None = None, session_id: str | None = None) → None[source]

Record a new error event

Parameters:

estimator_name – Name of the estimator that failed
error_message – Error message
stack_trace – Optional stack trace
parameters – Estimator parameters
data_length – Length of input data
user_id – Optional user identifier
session_id – Optional session identifier

get_error_summary(days: int = 30) → ErrorSummary[source]

Get error summary for the specified time period

Parameters: days – Number of days to analyze
Returns: ErrorSummary object

get_improvement_recommendations(days: int = 30) → List[str][source]: Get recommendations for improving reliability

_load_existing_data()[source]: Load existing error data

_categorize_error(error_message: str) → str[source]: Categorize error based on message patterns
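Categorizing errors from message patterns can be done with an ordered list of regular expressions, where the first match wins. The categories and patterns below are invented for illustration and are not taken from lrdbenchmark.

```python
import re

# Ordered (category, pattern) pairs; the first matching pattern wins.
# Both the category names and the patterns are illustrative assumptions.
ERROR_PATTERNS = [
    ("data", re.compile(r"too short|insufficient|empty|length", re.IGNORECASE)),
    ("parameter", re.compile(r"invalid|parameter|out of range", re.IGNORECASE)),
    ("numerical", re.compile(r"converg|singular|\bnan\b|\binf\b|overflow", re.IGNORECASE)),
    ("memory", re.compile(r"memory", re.IGNORECASE)),
]


def categorize_error(error_message: str) -> str:
    """Return the first category whose pattern matches, else 'other'."""
    for category, pattern in ERROR_PATTERNS:
        if pattern.search(error_message):
            return category
    return "other"


for message in (
    "Invalid parameter value",
    "Time series too short for DFA",
    "Matrix is singular",
):
    print(message, "->", categorize_error(message))
```

Keeping the patterns in a list rather than a dict makes the precedence between overlapping categories explicit.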

record_uncertainty_calibration(estimator_name: str, data_model: str | None, ci_lower: float | None, ci_upper: float | None, estimate: float | None, true_value: float | None, method: str | None, coverage_flag: bool | None, metadata: Dict[str, Any] | None = None) → None[source]: Record a new uncertainty calibration event.

_persist_uncertainty_events() → None[source]: Persist uncertainty events to disk.

_analyze_error_trends(errors: List[ErrorEvent]) → Dict[str, str][source]: Analyze error trends over time

get_estimator_reliability(estimator_name: str, days: int = 30) → Dict[str, float][source]: Get reliability metrics for a specific estimator

export_errors(output_path: str, days: int = 30) → None[source]: Export error data to file

get_error_correlation(days: int = 30) → Dict[str, Dict[str, float]][source]: Analyze correlations between different error types

get_uncertainty_summary(days: int = 30) → Dict[str, Any][source]: Summarise uncertainty coverage over the requested horizon.

export_uncertainty_calibration(output_path: str, days: int = 30) → None[source]: Export uncertainty calibration events to a JSON file.

summarise_uncertainty_calibration(days: int = 30, min_samples: int = 3) → List[Dict[str, Any]][source]: Aggregate empirical coverage rates per estimator/method.
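Empirical coverage is the fraction of recorded confidence intervals that contain the true value. The sketch below aggregates it per estimator and method from plain dictionaries whose keys follow the parameters of record_uncertainty_calibration; the function name `summarise_coverage`, the output keys, and the handling of incomplete records are assumptions for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


def summarise_coverage(
    records: List[Dict[str, Any]], min_samples: int = 3
) -> List[Dict[str, Any]]:
    """Aggregate empirical coverage per (estimator_name, method) group."""
    grouped: Dict[Tuple[str, str], List[bool]] = defaultdict(list)
    for rec in records:
        lo, hi, truth = rec.get("ci_lower"), rec.get("ci_upper"), rec.get("true_value")
        if lo is None or hi is None or truth is None:
            continue  # incomplete records cannot be scored
        grouped[(rec["estimator_name"], rec["method"])].append(lo <= truth <= hi)

    summary = []
    for (estimator, method), flags in sorted(grouped.items()):
        if len(flags) < min_samples:
            continue  # too few samples for a meaningful rate
        summary.append({
            "estimator_name": estimator,
            "method": method,
            "n": len(flags),
            "coverage": sum(flags) / len(flags),
        })
    return summary


# Hypothetical calibration records with a true Hurst exponent of 0.7.
records = [
    {"estimator_name": "dfa", "method": "bootstrap", "ci_lower": 0.60, "ci_upper": 0.80, "true_value": 0.7},
    {"estimator_name": "dfa", "method": "bootstrap", "ci_lower": 0.65, "ci_upper": 0.75, "true_value": 0.7},
    {"estimator_name": "dfa", "method": "bootstrap", "ci_lower": 0.72, "ci_upper": 0.90, "true_value": 0.7},
    {"estimator_name": "dfa", "method": "bootstrap", "ci_lower": 0.55, "ci_upper": 0.71, "true_value": 0.7},
    {"estimator_name": "rs", "method": "bootstrap", "ci_lower": 0.50, "ci_upper": 0.90, "true_value": 0.7},
    {"estimator_name": "rs", "method": "bootstrap", "ci_lower": 0.10, "ci_upper": 0.20, "true_value": 0.7},
]
coverage_summary = summarise_coverage(records)
print(coverage_summary)
```

Comparing each group's coverage with the nominal level of its intervals (for example 0.95) indicates whether the intervals are too narrow or too wide.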

plot_uncertainty_calibration(output_path: str, days: int = 30, min_samples: int = 3) → str | None[source]: Create a nominal vs empirical coverage plot from calibration records.

_calculate_correlation(session_errors: Dict, error_type1: str, error_type2: str) → float[source]: Calculate correlation between two error types
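A correlation between two error types can be computed from per-session indicators: for each session, record whether each type occurred, then take the Pearson correlation of the two 0/1 vectors (the phi coefficient). The sketch below assumes session_errors maps a session ID to the list of error types seen in that session; that layout, and the choice of Pearson correlation, are assumptions rather than the library's documented behaviour.

```python
import math
from typing import Dict, List


def error_type_correlation(
    session_errors: Dict[str, List[str]], type_a: str, type_b: str
) -> float:
    """Pearson correlation of per-session presence indicators."""
    xs = [1.0 if type_a in errs else 0.0 for errs in session_errors.values()]
    ys = [1.0 if type_b in errs else 0.0 for errs in session_errors.values()]
    n = len(xs)
    if n == 0:
        return 0.0
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined when one type never (or always) occurs
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sessions and the error types observed in each.
sessions = {
    "s1": ["memory", "numerical"],
    "s2": ["memory", "numerical"],
    "s3": [],
    "s4": ["parameter"],
}
print(error_type_correlation(sessions, "memory", "numerical"))
print(error_type_correlation(sessions, "memory", "parameter"))
```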

class lrdbenchmark.analytics.error_analyzer.ErrorEvent(timestamp: str, estimator_name: str, error_type: str, error_message: str, stack_trace: str | None, parameters: Dict[str, str], data_length: int, user_id: str | None, session_id: str)[source]

Bases: object

Represents a single error event

timestamp: str

estimator_name: str

error_type: str

error_message: str

stack_trace: str | None

parameters: Dict[str, str]

data_length: int

user_id: str | None

session_id: str

__init__(timestamp: str, estimator_name: str, error_type: str, error_message: str, stack_trace: str | None, parameters: Dict[str, str], data_length: int, user_id: str | None, session_id: str) → None

class lrdbenchmark.analytics.error_analyzer.ErrorSummary(total_errors: int, unique_errors: int, error_rate: float, most_common_errors: List[Tuple[str, int]], error_by_estimator: Dict[str, int], error_by_type: Dict[str, int], error_trends: Dict[str, str], reliability_score: float)[source]

Bases: object

Aggregated error statistics

total_errors: int

unique_errors: int

error_rate: float

most_common_errors: List[Tuple[str, int]]

error_by_estimator: Dict[str, int]

error_by_type: Dict[str, int]

error_trends: Dict[str, str]

reliability_score: float

__init__(total_errors: int, unique_errors: int, error_rate: float, most_common_errors: List[Tuple[str, int]], error_by_estimator: Dict[str, int], error_by_type: Dict[str, int], error_trends: Dict[str, str], reliability_score: float) → None

Workflow Analysis

class lrdbenchmark.analytics.workflow_analyzer.WorkflowAnalyzer(storage_path: str = '~/.lrdbench/analytics')[source]

Bases: object

Comprehensive workflow analysis system

Features:

- Workflow pattern recognition
- Sequence analysis
- User behavior modeling
- Optimization recommendations
- Feature usage analysis

__init__(storage_path: str = '~/.lrdbench/analytics')[source]: Initialize the workflow analyzer

get_workflow_summary(days: int = 30) → WorkflowSummary[source]

Get workflow summary for the specified time period

Parameters: days – Number of days to analyze
Returns: WorkflowSummary object

_load_existing_data()[source]: Load existing workflow data

_reconstruct_workflow(workflow_data: Dict) → Workflow | None[source]: Reconstruct workflow object from stored data

start_workflow_session(session_id: str, user_id: str | None = None) → None[source]: Start tracking a new workflow session

add_workflow_step(session_id: str, step_type: str, estimator_name: str | None = None, parameters: Dict[str, str] | None = None, data_length: int = 0, user_id: str | None = None) → None[source]

Add a step to the current workflow session

Parameters:

session_id – Session identifier
step_type – Type of workflow step
estimator_name – Name of estimator used (if applicable)
parameters – Parameters for the step
data_length – Length of input data
user_id – Optional user identifier

end_workflow_session(session_id: str) → str | None[source]

End a workflow session and create workflow record

Parameters: session_id – Session identifier
Returns: Workflow ID if successful, None otherwise

_analyze_workflow_patterns(workflows: List[Workflow]) → List[Tuple[List[str], int]][source]: Analyze common workflow patterns
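The List[Tuple[List[str], int]] return type suggests a list of step sequences paired with how often each occurred. A minimal sketch of that idea, using plain lists of step_type strings instead of Workflow objects, is shown below; the function name `common_patterns` and the example step names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def common_patterns(
    workflows: List[List[str]], top_n: int = 3
) -> List[Tuple[List[str], int]]:
    """Count identical step sequences and return the most frequent ones."""
    # Lists are not hashable, so each sequence is counted as a tuple.
    counts = Counter(tuple(steps) for steps in workflows)
    return [(list(pattern), n) for pattern, n in counts.most_common(top_n)]


# Hypothetical step_type sequences, one list per workflow.
workflows = [
    ["generate", "estimate", "report"],
    ["generate", "estimate"],
    ["generate", "estimate", "report"],
]
patterns = common_patterns(workflows)
print(patterns)
```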

_analyze_estimator_sequences(workflows: List[Workflow]) → List[Tuple[List[str], int]][source]: Analyze popular estimator sequences

_analyze_workflow_complexity(workflows: List[Workflow]) → Dict[str, int][source]: Analyze workflow complexity distribution

get_user_workflow_patterns(user_id: str, days: int = 30) → Dict[str, Any][source]: Get workflow patterns for a specific user

get_workflow_optimization_recommendations(days: int = 30) → List[str][source]: Get recommendations for workflow optimization

export_workflows(output_path: str, days: int = 30) → None[source]: Export workflow data to file

get_feature_usage_analysis(days: int = 30) → Dict[str, Any][source]: Analyze feature usage patterns

_get_length_range(length: int) → str[source]: Convert data length to range category

class lrdbenchmark.analytics.workflow_analyzer.WorkflowStep(timestamp: str, step_type: str, estimator_name: str | None, parameters: Dict[str, str], data_length: int, session_id: str, user_id: str | None)[source]

Bases: object

Represents a single step in a user workflow

timestamp: str

step_type: str

estimator_name: str | None

parameters: Dict[str, str]

data_length: int

session_id: str

user_id: str | None

__init__(timestamp: str, step_type: str, estimator_name: str | None, parameters: Dict[str, str], data_length: int, session_id: str, user_id: str | None) → None

class lrdbenchmark.analytics.workflow_analyzer.Workflow(workflow_id: str, session_id: str, user_id: str | None, steps: List[WorkflowStep], start_time: str, end_time: str, total_duration: float, step_count: int)[source]

Bases: object

Represents a complete user workflow

workflow_id: str

session_id: str

user_id: str | None

steps: List[WorkflowStep]

start_time: str

end_time: str

total_duration: float

step_count: int

__init__(workflow_id: str, session_id: str, user_id: str | None, steps: List[WorkflowStep], start_time: str, end_time: str, total_duration: float, step_count: int) → None

Convenience Functions

Note

Convenience functions are provided via the analytics submodule. Import from lrdbenchmark.analytics rather than top-level lrdbenchmark.

Usage Examples

Basic Analytics Setup

from lrdbenchmark.analytics import enable_analytics, get_analytics_summary
from lrdbenchmark.analytics.dashboard import AnalyticsDashboard

# Enable analytics system
print("Enabling LRDBench analytics system...")
enable_analytics()

# Your analysis code here
from lrdbenchmark import FBMModel, FGNModel, ComprehensiveBenchmark
import time

print("Running analysis with analytics tracking...")

# Generate data with different models
models = {
    'FBM (H=0.7)': FBMModel(H=0.7, sigma=1.0),
    'FBM (H=0.3)': FBMModel(H=0.3, sigma=1.0),
    'FGN (H=0.8)': FGNModel(H=0.8, sigma=1.0)
}

for model_name, model in models.items():
    print(f"Generating {model_name} data...")
    data = model.generate(1000, seed=42)

    # Run benchmark
    benchmark = ComprehensiveBenchmark()
    results = benchmark.run_comprehensive_benchmark(
        data_length=1000,
        n_runs=5
    )

    print(f"Completed benchmark for {model_name}")

# Get comprehensive analytics summary
print("\n=== ANALYTICS SUMMARY ===")
summary = get_analytics_summary()
print(summary)

# Create dashboard for detailed analysis
dashboard = AnalyticsDashboard()

# Generate specific reports
print("\n=== USAGE REPORT ===")
usage_report = dashboard.generate_usage_report()
print(usage_report)

print("\n=== PERFORMANCE REPORT ===")
performance_report = dashboard.generate_performance_report()
print(performance_report)

print("\n=== RELIABILITY REPORT ===")
reliability_report = dashboard.generate_reliability_report()
print(reliability_report)

Usage Tracking with Decorators

from lrdbenchmark import FBMModel
from lrdbenchmark.analytics import track_usage

@track_usage
def analyze_fbm_data(H=0.7, length=1000):
    """Analyze FBM data with given parameters."""
    model = FBMModel(H=H, sigma=1.0)
    data = model.generate(length, seed=42)

    # Perform analysis
    return data.mean(), data.std()

# Function calls will be automatically tracked
mean_val, std_val = analyze_fbm_data(H=0.8, length=2000)
mean_val2, std_val2 = analyze_fbm_data(H=0.6, length=1000)

Performance Monitoring

from lrdbenchmark import ComprehensiveBenchmark
from lrdbenchmark.analytics import monitor_performance

@monitor_performance
def run_benchmark_analysis():
    """Run comprehensive benchmark analysis."""
    benchmark = ComprehensiveBenchmark()
    results = benchmark.run_comprehensive_benchmark(
        data_length=1000,
        n_runs=10
    )
    return results

# Performance will be automatically monitored
results = run_benchmark_analysis()

# Get performance summary
from lrdbenchmark.analytics import PerformanceMonitor
monitor = PerformanceMonitor()
perf_summary = monitor.get_performance_summary()
print(f"Average execution time: {perf_summary.avg_execution_time:.2f}s")

Error Tracking

from lrdbenchmark import ComprehensiveBenchmark
from lrdbenchmark.analytics import track_errors

@track_errors
def run_estimator_analysis():
    """Run estimator analysis with error tracking."""
    benchmark = ComprehensiveBenchmark()

    try:
        results = benchmark.run_comprehensive_benchmark(
            data_length=1000,
            n_runs=5
        )
        return results
    except Exception:
        # Re-raise unchanged so the decorator can track the error
        raise

# Run analysis
try:
    results = run_estimator_analysis()
except Exception as e:
    print(f"Analysis failed: {e}")

# Get error summary
from lrdbenchmark.analytics import ErrorAnalyzer
error_analyzer = ErrorAnalyzer()
error_summary = error_analyzer.get_error_summary()
print(f"Total errors: {error_summary.total_errors}")

Workflow Tracking

from lrdbenchmark import FBMModel, ComprehensiveBenchmark
from lrdbenchmark.analytics import track_workflow

@track_workflow
def complete_analysis_workflow():
    """Complete analysis workflow with tracking."""
    # Step 1: Data generation
    model = FBMModel(H=0.7, sigma=1.0)
    data = model.generate(1000, seed=42)

    # Step 2: Benchmark execution
    benchmark = ComprehensiveBenchmark()
    results = benchmark.run_comprehensive_benchmark(
        data_length=1000,
        n_runs=5
    )

    # Step 3: Results analysis
    summary = results.get_summary()

    return summary

# Workflow will be automatically tracked
summary = complete_analysis_workflow()

# Get workflow summary
from lrdbenchmark.analytics import WorkflowAnalyzer
workflow_analyzer = WorkflowAnalyzer()
workflow_summary = workflow_analyzer.get_workflow_summary()
print(f"Workflows completed: {workflow_summary.total_workflows}")

Advanced Analytics Dashboard

from lrdbenchmark.analytics.dashboard import AnalyticsDashboard

# Create analytics dashboard
dashboard = AnalyticsDashboard()

# Generate comprehensive analytics report
report = dashboard.get_comprehensive_summary()
print("=== COMPREHENSIVE ANALYTICS REPORT ===")
print(report)

# Generate specific reports
usage_report = dashboard.generate_usage_report()
performance_report = dashboard.generate_performance_report()
reliability_report = dashboard.generate_reliability_report()
workflow_report = dashboard.generate_workflow_report()

print("\n=== USAGE REPORT ===")
print(usage_report)

print("\n=== PERFORMANCE REPORT ===")
print(performance_report)

print("\n=== RELIABILITY REPORT ===")
print(reliability_report)

print("\n=== WORKFLOW REPORT ===")
print(workflow_report)

Custom Analytics Configuration

from lrdbenchmark.analytics import (
    UsageTracker, PerformanceMonitor, ErrorAnalyzer, WorkflowAnalyzer
)

# Create analytics components that share one storage directory.
# Only the constructor arguments documented above are used.
storage_path = '~/.lrdbench/analytics'

usage_tracker = UsageTracker(
    storage_path=storage_path,
    enable_tracking=True,
    privacy_mode=True
)
performance_monitor = PerformanceMonitor(storage_path=storage_path)
error_analyzer = ErrorAnalyzer(storage_path=storage_path)
workflow_analyzer = WorkflowAnalyzer(storage_path=storage_path)

# Record one estimator run
usage_tracker.track_estimator_usage(
    estimator_name='dfa',
    parameters={'min_scale': 4, 'max_scale': 100},
    execution_time=1.23,
    success=True
)

# start_monitoring returns a session ID that stop_monitoring requires
session_id = performance_monitor.start_monitoring(
    estimator_name='dfa',
    data_length=1000,
    parameters={'min_scale': '4', 'max_scale': '100'}
)
# ... your code here ...
performance_monitor.stop_monitoring(session_id)

# Record a failure against the estimator that raised it
error_analyzer.record_error(
    estimator_name='dfa',
    error_message='Invalid parameter value',
    parameters={'H': '1.5'},
    data_length=1000
)

Data Export and Visualization

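from lrdbenchmark.analytics.dashboard import AnalyticsDashboard
import pandas as pd
import matplotlib.pyplot as plt

dashboard = AnalyticsDashboard()

# Export analytics data. export_all_data returns a Dict[str, str], assumed
# here to map each dataset name to the path of the file written for it.
exported_files = dashboard.export_all_data()

# Load the usage events into a pandas DataFrame. The 'usage_events' key and
# the JSON file format are assumptions; inspect exported_files to confirm.
df = pd.read_json(exported_files['usage_events'])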

# Create visualizations
plt.figure(figsize=(12, 8))

# Usage by estimator
plt.subplot(2, 2, 1)
estimator_counts = df['estimator_name'].value_counts()
estimator_counts.plot(kind='bar')
plt.title('Usage by Estimator')
plt.xticks(rotation=45)

# Execution time distribution
plt.subplot(2, 2, 2)
plt.hist(df['execution_time'], bins=20, alpha=0.7)
plt.title('Execution Time Distribution')
plt.xlabel('Time (seconds)')

# Success rate over time
plt.subplot(2, 2, 3)
df['date'] = pd.to_datetime(df['timestamp']).dt.date
success_rate = df.groupby('date')['success'].mean()
success_rate.plot(kind='line')
plt.title('Success Rate Over Time')
plt.ylabel('Success Rate')

# Parameter usage heatmap
plt.subplot(2, 2, 4)
# Create heatmap of parameter usage
plt.title('Parameter Usage Heatmap')

plt.tight_layout()
plt.show()

Real-time Analytics Monitoring

from lrdbenchmark.analytics.dashboard import AnalyticsDashboard
import time
import threading

dashboard = AnalyticsDashboard()

def monitor_analytics():
    """Monitor analytics in real-time."""
    while True:
        summary = dashboard.get_comprehensive_summary()
        print("\n" + "="*50)
        print("REAL-TIME ANALYTICS UPDATE")
        print("="*50)
        print(summary)
        time.sleep(60)  # Update every minute

# Start monitoring in background
monitor_thread = threading.Thread(target=monitor_analytics, daemon=True)
monitor_thread.start()

# Your analysis code here
from lrdbenchmark import FBMModel, ComprehensiveBenchmark

for i in range(5):
    model = FBMModel(H=0.5 + i*0.1, sigma=1.0)
    data = model.generate(1000, seed=i)

    benchmark = ComprehensiveBenchmark()
    results = benchmark.run_comprehensive_benchmark(
        data_length=1000,
        n_runs=2
    )

    time.sleep(30)  # Wait between runs

Analytics Integration with Benchmarks

from lrdbenchmark import ComprehensiveBenchmark
from lrdbenchmark.analytics import enable_analytics
from lrdbenchmark.analytics.dashboard import AnalyticsDashboard

# Enable analytics
enable_analytics()

# Create benchmark with analytics integration
benchmark = ComprehensiveBenchmark()

# Run benchmark with analytics tracking
results = benchmark.run_comprehensive_benchmark(
    data_length=1000,
    n_runs=10,
    enable_analytics=True  # Enable analytics tracking
)

# Get analytics dashboard
dashboard = AnalyticsDashboard()

# Generate a report covering all analytics sections
integrated_report = dashboard.generate_comprehensive_report(days=30)

print("=== INTEGRATED BENCHMARK & ANALYTICS REPORT ===")
print(integrated_report)

Best Practices

Enable Early: Enable analytics at the start of your analysis
Use Decorators: Use the provided decorators for automatic tracking
Monitor Performance: Track execution times for optimization
Error Handling: Always track errors for debugging
Workflow Analysis: Track complete workflows for optimization
Regular Reports: Generate regular analytics reports
Data Export: Export analytics data for external analysis
Privacy: Be mindful of sensitive data in analytics

Configuration Options

Analytics Configuration

from lrdbenchmark.analytics import AnalyticsConfig

# Configure analytics system
config = AnalyticsConfig(
    # Usage tracking
    track_user_id=True,
    track_parameters=True,
    track_timing=True,

    # Performance monitoring
    track_memory=True,
    track_cpu=True,
    track_gpu=True,

    # Error analysis
    categorize_errors=True,
    track_stack_traces=True,
    generate_recommendations=True,

    # Workflow analysis
    track_step_dependencies=True,
    analyze_patterns=True,
    generate_optimizations=True,

    # Data retention
    max_events=10000,
    retention_days=30,

    # Privacy
    anonymize_user_ids=True,
    sanitize_parameters=True
)

# Apply configuration
from lrdbenchmark.analytics import configure_analytics
configure_analytics(config)

Privacy and Security

from lrdbenchmark.analytics import UsageTracker

# Create privacy-aware usage tracker
usage_tracker = UsageTracker(
    enable_tracking=True,
    privacy_mode=True  # Enable privacy-preserving features
)

# Track usage with privacy protection
usage_tracker.track_estimator_usage(
    estimator_name='dfa',
    parameters={'min_scale': 4, 'max_scale': 100},  # Sanitized before storage
    execution_time=1.23,
    success=True
)

Note

The analytics system is designed to be privacy-aware and can be configured to protect sensitive information while still providing valuable insights.

Warning

When using analytics in production environments, ensure compliance with data protection regulations and implement appropriate privacy controls.