Analytics API

lrdbenchmark provides a comprehensive analytics system for tracking usage, monitoring performance, analyzing errors, and understanding user workflows.

Analytics Dashboard

class lrdbenchmark.analytics.dashboard.AnalyticsDashboard(storage_path: str = '~/.lrdbench/analytics')[source]

Bases: object

Comprehensive analytics dashboard for LRDBench

Provides easy access to all analytics data and generates comprehensive reports and visualizations, including stratified summaries.

__init__(storage_path: str = '~/.lrdbench/analytics')[source]

Initialize the analytics dashboard

get_comprehensive_summary(days: int = 30) Dict[str, Any][source]

Get comprehensive summary of all analytics data

Parameters:

days – Number of days to analyze

Returns:

Dictionary containing all analytics summaries

generate_usage_report(days: int = 30, output_path: str | None = None) str[source]

Generate comprehensive usage report

generate_performance_report(days: int = 30, output_path: str | None = None) str[source]

Generate comprehensive performance report

generate_reliability_report(days: int = 30, output_path: str | None = None) str[source]

Generate comprehensive reliability report

generate_workflow_report(days: int = 30, output_path: str | None = None) str[source]

Generate comprehensive workflow report

generate_comprehensive_report(days: int = 30, output_dir: str | None = None) str[source]

Generate comprehensive analytics report with all sections

generate_stratified_report(results_path: str, output_path: str | None = None) str[source]

Generate a stratified benchmark report from a saved comprehensive benchmark JSON.

create_advanced_diagnostics_visuals(advanced_results_path: str, output_dir: str | None = None) Dict[str, str][source]

Create scaling and robustness visualisations from advanced benchmark artefacts.

create_visualizations(days: int = 30, output_dir: str | None = None) Dict[str, str][source]

Create visualizations for analytics data

export_all_data(output_dir: str | None = None, days: int = 30) Dict[str, str][source]

Export all analytics data to files

Usage Tracking

class lrdbenchmark.analytics.usage_tracker.UsageTracker(storage_path: str = '~/.lrdbench/analytics', enable_tracking: bool = True, privacy_mode: bool = True)[source]

Bases: object

Comprehensive usage tracking system for LRDBench

Features:
  • Real-time event tracking
  • Privacy-preserving user identification
  • Performance monitoring
  • Error analysis
  • Usage pattern detection

__init__(storage_path: str = '~/.lrdbench/analytics', enable_tracking: bool = True, privacy_mode: bool = True)[source]

Initialize the usage tracker

Parameters:
  • storage_path – Directory to store analytics data

  • enable_tracking – Whether to enable usage tracking

  • privacy_mode – Enable privacy-preserving features

track_estimator_usage(estimator_name: str, parameters: Dict[str, Any], execution_time: float, success: bool, error_message: str | None = None, data_length: int = 0, user_id: str | None = None) None[source]

Track usage of an estimator

Parameters:
  • estimator_name – Name of the estimator used

  • parameters – Parameters passed to the estimator

  • execution_time – Time taken for execution

  • success – Whether the estimation was successful

  • error_message – Error message if failed

  • data_length – Length of input data

  • user_id – Optional user identifier

track_benchmark_run(benchmark_type: str, estimators_used: List[str], total_time: float, success_count: int, total_count: int, data_models: List[str]) None[source]

Track benchmark execution

Parameters:
  • benchmark_type – Type of benchmark run

  • estimators_used – List of estimators used

  • total_time – Total execution time

  • success_count – Number of successful runs

  • total_count – Total number of runs

  • data_models – Data models tested

get_usage_summary(days: int = 30) UsageSummary[source]

Get usage summary for the specified time period

Parameters:

days – Number of days to analyze

Returns:

UsageSummary object with aggregated statistics

_generate_session_id() str[source]

Generate a unique session ID

_load_existing_data()[source]

Load existing analytics data from storage

_start_background_processing()[source]

Start background thread for data processing

_sanitize_parameters(params: Dict[str, Any]) Dict[str, Any][source]

Sanitize parameters for privacy and storage

_hash_user_id(user_id: str) str[source]

Hash user ID for privacy
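
As an illustration of what privacy-preserving identification can look like, the sketch below hashes an identifier with SHA-256 from the standard library. The function name `hash_user_id`, the optional `salt` argument, and the 16-character truncation are assumptions made for this example; the source does not state which algorithm `_hash_user_id` uses.

```python
import hashlib


def hash_user_id(user_id: str, salt: str = "") -> str:
    """Return a short, non-reversible token for a user identifier.

    Hypothetical helper: the salt and the truncation length are
    illustrative choices, not documented LRDBench behaviour.
    """
    digest = hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()
    return digest[:16]  # keep the first 16 hex characters for readability


token = hash_user_id("alice@example.com")
print(token)
```

The same input always maps to the same token, so per-user counts still work while the raw identifier never reaches storage.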

_get_length_range(length: int) str[source]

Convert data length to range category
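
A minimal sketch of this kind of bucketing is shown below. The bin edges and labels are hypothetical; the source does not list the categories that `_get_length_range` returns.

```python
def get_length_range(length: int) -> str:
    """Map a series length to a coarse, human-readable bucket."""
    # Hypothetical bin edges, chosen for illustration only.
    edges = [(100, "<100"), (1_000, "100-999"), (10_000, "1000-9999")]
    for upper, label in edges:
        if length < upper:
            return label
    return ">=10000"


for n in (50, 1000, 25_000):
    print(n, get_length_range(n))
```

Storing a bucket instead of the exact length keeps distributions such as `data_length_distribution` compact.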

_save_data()[source]

Save analytics data to storage

_cleanup_old_data(max_age_days: int = 90)[source]

Remove old analytics data
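
The retention step can be sketched as a filter on event timestamps. This example assumes events are dictionaries with an ISO-8601 `timestamp` string, which matches the `timestamp: str` field of `UsageEvent` but is otherwise an assumption; the function name and the `now` argument are hypothetical.

```python
from datetime import datetime, timedelta


def cleanup_old_events(events, max_age_days=90, now=None):
    """Keep only events whose timestamp falls within max_age_days of now."""
    now = now or datetime.now()
    cutoff = now - timedelta(days=max_age_days)
    return [e for e in events if datetime.fromisoformat(e["timestamp"]) >= cutoff]


now = datetime(2024, 6, 1)
events = [
    {"timestamp": "2024-05-20T10:00:00", "estimator_name": "dfa"},
    {"timestamp": "2023-12-01T09:30:00", "estimator_name": "rs"},
]
recent = cleanup_old_events(events, max_age_days=90, now=now)
print([e["estimator_name"] for e in recent])
```

Passing `now` explicitly makes the cutoff reproducible in tests.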

export_summary(output_path: str, days: int = 30) None[source]

Export usage summary to file

Parameters:
  • output_path – Path to save the summary

  • days – Number of days to analyze

get_popular_estimators(top_n: int = 10) List[tuple][source]

Get top N most popular estimators
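
The return type `List[tuple]` suggests (name, count) pairs. A sketch of that ranking with `collections.Counter` follows; the event layout and the function name are assumptions for illustration.

```python
from collections import Counter


def popular_estimators(events, top_n=10):
    """Return (estimator_name, count) pairs, most frequent first."""
    counts = Counter(e["estimator_name"] for e in events)
    return counts.most_common(top_n)


events = [{"estimator_name": n} for n in ["dfa", "rs", "dfa", "gph", "dfa", "rs"]]
top = popular_estimators(events, top_n=2)
print(top)
```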

get_performance_trends(days: int = 7) Dict[str, List[float]][source]

Get performance trends over time

class lrdbenchmark.analytics.usage_tracker.UsageEvent(timestamp: str, event_type: str, estimator_name: str, parameters: Dict[str, Any], execution_time: float, success: bool, error_message: str | None, data_length: int, user_id: str | None, session_id: str)[source]

Bases: object

Represents a single usage event

timestamp: str
event_type: str
estimator_name: str
parameters: Dict[str, Any]
execution_time: float
success: bool
error_message: str | None
data_length: int
user_id: str | None
session_id: str
__init__(timestamp: str, event_type: str, estimator_name: str, parameters: Dict[str, Any], execution_time: float, success: bool, error_message: str | None, data_length: int, user_id: str | None, session_id: str) None
class lrdbenchmark.analytics.usage_tracker.UsageSummary(total_events: int, unique_users: int, estimator_usage: Dict[str, int], parameter_frequency: Dict[str, Dict[str, int]], success_rate: float, avg_execution_time: float, common_errors: Dict[str, int], data_length_distribution: Dict[str, int])[source]

Bases: object

Aggregated usage statistics

total_events: int
unique_users: int
estimator_usage: Dict[str, int]
parameter_frequency: Dict[str, Dict[str, int]]
success_rate: float
avg_execution_time: float
common_errors: Dict[str, int]
data_length_distribution: Dict[str, int]
__init__(total_events: int, unique_users: int, estimator_usage: Dict[str, int], parameter_frequency: Dict[str, Dict[str, int]], success_rate: float, avg_execution_time: float, common_errors: Dict[str, int], data_length_distribution: Dict[str, int]) None

Performance Monitoring

class lrdbenchmark.analytics.performance_monitor.PerformanceMonitor(storage_path: str = '~/.lrdbench/analytics')[source]

Bases: object

Comprehensive performance monitoring system

Features:
  • Real-time performance tracking
  • Memory usage monitoring
  • CPU utilization tracking
  • Performance trend analysis
  • Bottleneck identification

__init__(storage_path: str = '~/.lrdbench/analytics')[source]

Initialize the performance monitor

start_monitoring(estimator_name: str, data_length: int, parameters: Dict[str, str]) str[source]

Start monitoring a new execution

Parameters:
  • estimator_name – Name of the estimator

  • data_length – Length of input data

  • parameters – Estimator parameters

Returns:

Monitoring session ID

stop_monitoring(session_id: str) None[source]

Stop monitoring and record metrics

Parameters:

session_id – Monitoring session ID

get_performance_summary(days: int = 30) PerformanceSummary[source]

Get performance summary for the specified time period

Parameters:

days – Number of days to analyze

Returns:

PerformanceSummary object

_load_existing_data()[source]

Load existing performance data

timer(name: str)[source]

Context manager for timing code blocks.

Parameters:

name – Name of the timer

Usage:

    with monitor.timer('my_operation'):
        # code to time
        pass

get_stats() Dict[str, Dict[str, float]][source]

Get statistics for all timers.

Returns:

Dictionary mapping timer names to statistics (mean, std, min, max, count)
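
To make the relationship between named timers and their statistics concrete, here is a self-contained sketch built on `contextlib.contextmanager` and the `statistics` module. `TimerRegistry` is a hypothetical stand-in and not the `PerformanceMonitor` implementation; the use of the population standard deviation is also an assumption.

```python
import statistics
import time
from collections import defaultdict
from contextlib import contextmanager


class TimerRegistry:
    """Collects durations per timer name and summarises them."""

    def __init__(self):
        self._durations = defaultdict(list)

    @contextmanager
    def timer(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the timed block raises.
            self._durations[name].append(time.perf_counter() - start)

    def get_stats(self):
        return {
            name: {
                "mean": statistics.fmean(values),
                "std": statistics.pstdev(values),
                "min": min(values),
                "max": max(values),
                "count": float(len(values)),
            }
            for name, values in self._durations.items()
        }


registry = TimerRegistry()
for _ in range(3):
    with registry.timer("my_operation"):
        sum(range(10_000))
stats = registry.get_stats()
print(stats["my_operation"]["count"])
```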

_analyze_performance_trend(metrics: List[PerformanceMetrics]) str[source]

Analyze performance trend over time

_identify_bottlenecks(metrics: List[PerformanceMetrics]) List[str][source]

Identify estimators with performance bottlenecks
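
One plausible bottleneck rule is to flag estimators whose mean execution time is well above the overall mean. The sketch below implements that rule; the threshold `factor`, the function name, and the dictionary layout are assumptions, since the source does not describe the criterion `_identify_bottlenecks` applies.

```python
from collections import defaultdict
from statistics import fmean


def identify_bottlenecks(metrics, factor=2.0):
    """Return estimators whose mean time exceeds factor x the overall mean."""
    by_estimator = defaultdict(list)
    for m in metrics:
        by_estimator[m["estimator_name"]].append(m["execution_time"])
    all_times = [t for times in by_estimator.values() for t in times]
    if not all_times:
        return []
    overall = fmean(all_times)
    return sorted(
        name for name, times in by_estimator.items()
        if fmean(times) > factor * overall
    )


metrics = [
    {"estimator_name": "dfa", "execution_time": 0.2},
    {"estimator_name": "dfa", "execution_time": 0.3},
    {"estimator_name": "rs", "execution_time": 0.1},
    {"estimator_name": "wavelet", "execution_time": 5.0},
]
slow = identify_bottlenecks(metrics)
print(slow)
```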

get_estimator_performance(estimator_name: str, days: int = 30) Dict[str, float][source]

Get performance metrics for a specific estimator

export_metrics(output_path: str, days: int = 30) None[source]

Export performance metrics to file

get_memory_trends(days: int = 7) Dict[str, List[float]][source]

Get memory usage trends over time

class lrdbenchmark.analytics.performance_monitor.PerformanceMetrics(timestamp: str, estimator_name: str, execution_time: float, memory_before: float, memory_after: float, memory_peak: float, cpu_percent: float, data_length: int, parameters: Dict[str, str])[source]

Bases: object

Performance metrics for a single execution

timestamp: str
estimator_name: str
execution_time: float
memory_before: float
memory_after: float
memory_peak: float
cpu_percent: float
data_length: int
parameters: Dict[str, str]
__init__(timestamp: str, estimator_name: str, execution_time: float, memory_before: float, memory_after: float, memory_peak: float, cpu_percent: float, data_length: int, parameters: Dict[str, str]) None
class lrdbenchmark.analytics.performance_monitor.PerformanceSummary(total_executions: int, avg_execution_time: float, std_execution_time: float, min_execution_time: float, max_execution_time: float, avg_memory_usage: float, memory_efficiency: float, performance_trend: str, bottleneck_estimators: List[str])[source]

Bases: object

Aggregated performance statistics

total_executions: int
avg_execution_time: float
std_execution_time: float
min_execution_time: float
max_execution_time: float
avg_memory_usage: float
memory_efficiency: float
performance_trend: str
bottleneck_estimators: List[str]
__init__(total_executions: int, avg_execution_time: float, std_execution_time: float, min_execution_time: float, max_execution_time: float, avg_memory_usage: float, memory_efficiency: float, performance_trend: str, bottleneck_estimators: List[str]) None

Error Analysis

class lrdbenchmark.analytics.error_analyzer.ErrorAnalyzer(storage_path: str = '~/.lrdbench/analytics')[source]

Bases: object

Comprehensive error analysis system

Features:
  • Error pattern recognition
  • Failure mode analysis
  • Reliability scoring
  • Trend analysis
  • Improvement recommendations

__init__(storage_path: str = '~/.lrdbench/analytics')[source]

Initialize the error analyzer

record_error(estimator_name: str, error_message: str, stack_trace: str | None = None, parameters: Dict[str, str] | None = None, data_length: int = 0, user_id: str | None = None, session_id: str | None = None) None[source]

Record a new error event

Parameters:
  • estimator_name – Name of the estimator that failed

  • error_message – Error message

  • stack_trace – Optional stack trace

  • parameters – Estimator parameters

  • data_length – Length of input data

  • user_id – Optional user identifier

  • session_id – Optional session identifier

get_error_summary(days: int = 30) ErrorSummary[source]

Get error summary for the specified time period

Parameters:

days – Number of days to analyze

Returns:

ErrorSummary object

get_improvement_recommendations(days: int = 30) List[str][source]

Get recommendations for improving reliability

_load_existing_data()[source]

Load existing error data

_categorize_error(error_message: str) str[source]

Categorize error based on message patterns
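
Pattern-based categorisation can be sketched with a short, ordered list of regular expressions. The category names and patterns below are hypothetical; the source does not enumerate the categories used by `_categorize_error`.

```python
import re

# Hypothetical categories and patterns, for illustration only.
# Order matters: the first matching pattern wins.
ERROR_PATTERNS = [
    ("memory", re.compile(r"memory|alloc", re.IGNORECASE)),
    ("convergence", re.compile(r"converge|singular", re.IGNORECASE)),
    ("data", re.compile(r"\b(nan|inf)\b|empty|too short", re.IGNORECASE)),
    ("parameter", re.compile(r"invalid|parameter|argument", re.IGNORECASE)),
]


def categorize_error(error_message: str) -> str:
    """Return the first category whose pattern matches the message."""
    for category, pattern in ERROR_PATTERNS:
        if pattern.search(error_message):
            return category
    return "unknown"


for msg in ("Invalid parameter value", "Input contains NaN", "something odd"):
    print(msg, "->", categorize_error(msg))
```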

record_uncertainty_calibration(estimator_name: str, data_model: str | None, ci_lower: float | None, ci_upper: float | None, estimate: float | None, true_value: float | None, method: str | None, coverage_flag: bool | None, metadata: Dict[str, Any] | None = None) None[source]

Record a new uncertainty calibration event.

_persist_uncertainty_events() None[source]

Persist uncertainty events to disk.

_analyze_error_trends(errors: List[ErrorEvent]) Dict[str, str][source]

Analyze error trends over time

get_estimator_reliability(estimator_name: str, days: int = 30) Dict[str, float][source]

Get reliability metrics for a specific estimator

export_errors(output_path: str, days: int = 30) None[source]

Export error data to file

get_error_correlation(days: int = 30) Dict[str, Dict[str, float]][source]

Analyze correlations between different error types

get_uncertainty_summary(days: int = 30) Dict[str, Any][source]

Summarise uncertainty coverage over the requested horizon.

export_uncertainty_calibration(output_path: str, days: int = 30) None[source]

Export uncertainty calibration events to a JSON file.

summarise_uncertainty_calibration(days: int = 30, min_samples: int = 3) List[Dict[str, Any]][source]

Aggregate empirical coverage rates per estimator/method.
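
Empirical coverage is the fraction of confidence intervals that contain the true value. The sketch below groups records by estimator and method and drops groups with fewer than `min_samples` records. The record layout mirrors the arguments of `record_uncertainty_calibration`, but it assumes every field is present (the real method accepts `None`), and the output keys are hypothetical.

```python
from collections import defaultdict


def summarise_coverage(records, min_samples=3):
    """Aggregate empirical CI coverage per (estimator, method) group."""
    groups = defaultdict(list)
    for r in records:
        covered = r["ci_lower"] <= r["true_value"] <= r["ci_upper"]
        groups[(r["estimator_name"], r["method"])].append(covered)

    summary = []
    for (estimator, method), flags in sorted(groups.items()):
        if len(flags) < min_samples:
            continue  # too few samples for a meaningful rate
        summary.append({
            "estimator_name": estimator,
            "method": method,
            "n": len(flags),
            "empirical_coverage": sum(flags) / len(flags),
        })
    return summary


def rec(name, lo, hi, true_value=0.7, method="bootstrap"):
    return {"estimator_name": name, "method": method,
            "ci_lower": lo, "ci_upper": hi, "true_value": true_value}


records = [
    rec("dfa", 0.60, 0.80),
    rec("dfa", 0.50, 0.65),   # interval misses the true value
    rec("dfa", 0.65, 0.75),
    rec("dfa", 0.62, 0.90),
    rec("rs", 0.50, 0.90),    # single sample, filtered out by min_samples
]
summary = summarise_coverage(records)
print(summary)
```

Comparing `empirical_coverage` with the nominal level (for example 0.95) shows whether the intervals are too narrow or too wide.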

plot_uncertainty_calibration(output_path: str, days: int = 30, min_samples: int = 3) str | None[source]

Create a nominal vs empirical coverage plot from calibration records.

_calculate_correlation(session_errors: Dict, error_type1: str, error_type2: str) float[source]

Calculate correlation between two error types
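
One way to correlate two error types is to compute the Pearson correlation between their per-session presence indicators (the phi coefficient). The sketch below assumes `session_errors` maps a session ID to the set of error types seen in that session; both that layout and the choice of statistic are assumptions, as the source does not specify them.

```python
import math


def error_type_correlation(session_errors, type_a, type_b):
    """Phi coefficient between per-session presence of two error types."""
    a = [1 if type_a in errs else 0 for errs in session_errors.values()]
    b = [1 if type_b in errs else 0 for errs in session_errors.values()]
    n = len(a)
    if n == 0:
        return 0.0
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined; treated as zero in this sketch
    return cov / math.sqrt(var_a * var_b)


session_errors = {
    "s1": {"memory", "convergence"},
    "s2": {"memory", "convergence"},
    "s3": {"data"},
    "s4": set(),
}
r_same = error_type_correlation(session_errors, "memory", "convergence")
r_diff = error_type_correlation(session_errors, "memory", "data")
print(r_same, r_diff)
```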

class lrdbenchmark.analytics.error_analyzer.ErrorEvent(timestamp: str, estimator_name: str, error_type: str, error_message: str, stack_trace: str | None, parameters: Dict[str, str], data_length: int, user_id: str | None, session_id: str)[source]

Bases: object

Represents a single error event

timestamp: str
estimator_name: str
error_type: str
error_message: str
stack_trace: str | None
parameters: Dict[str, str]
data_length: int
user_id: str | None
session_id: str
__init__(timestamp: str, estimator_name: str, error_type: str, error_message: str, stack_trace: str | None, parameters: Dict[str, str], data_length: int, user_id: str | None, session_id: str) None
class lrdbenchmark.analytics.error_analyzer.ErrorSummary(total_errors: int, unique_errors: int, error_rate: float, most_common_errors: List[Tuple[str, int]], error_by_estimator: Dict[str, int], error_by_type: Dict[str, int], error_trends: Dict[str, str], reliability_score: float)[source]

Bases: object

Aggregated error statistics

total_errors: int
unique_errors: int
error_rate: float
most_common_errors: List[Tuple[str, int]]
error_by_estimator: Dict[str, int]
error_by_type: Dict[str, int]
error_trends: Dict[str, str]
reliability_score: float
__init__(total_errors: int, unique_errors: int, error_rate: float, most_common_errors: List[Tuple[str, int]], error_by_estimator: Dict[str, int], error_by_type: Dict[str, int], error_trends: Dict[str, str], reliability_score: float) None

Workflow Analysis

class lrdbenchmark.analytics.workflow_analyzer.WorkflowAnalyzer(storage_path: str = '~/.lrdbench/analytics')[source]

Bases: object

Comprehensive workflow analysis system

Features:
  • Workflow pattern recognition
  • Sequence analysis
  • User behavior modeling
  • Optimization recommendations
  • Feature usage analysis

__init__(storage_path: str = '~/.lrdbench/analytics')[source]

Initialize the workflow analyzer

get_workflow_summary(days: int = 30) WorkflowSummary[source]

Get workflow summary for the specified time period

Parameters:

days – Number of days to analyze

Returns:

WorkflowSummary object

_load_existing_data()[source]

Load existing workflow data

_reconstruct_workflow(workflow_data: Dict) Workflow | None[source]

Reconstruct workflow object from stored data

start_workflow_session(session_id: str, user_id: str | None = None) None[source]

Start tracking a new workflow session

add_workflow_step(session_id: str, step_type: str, estimator_name: str | None = None, parameters: Dict[str, str] | None = None, data_length: int = 0, user_id: str | None = None) None[source]

Add a step to the current workflow session

Parameters:
  • session_id – Session identifier

  • step_type – Type of workflow step

  • estimator_name – Name of estimator used (if applicable)

  • parameters – Parameters for the step

  • data_length – Length of input data

  • user_id – Optional user identifier

end_workflow_session(session_id: str) str | None[source]

End a workflow session and create workflow record

Parameters:

session_id – Session identifier

Returns:

Workflow ID if successful, None otherwise

_analyze_workflow_patterns(workflows: List[Workflow]) List[Tuple[List[str], int]][source]

Analyze common workflow patterns

_analyze_estimator_sequences(workflows: List[Workflow]) List[Tuple[List[str], int]][source]

Analyze popular estimator sequences
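
The return type `List[Tuple[List[str], int]]` suggests (sequence, count) pairs. A sketch that counts consecutive estimator pairs across workflows is shown below. Representing each workflow as a list of estimator names, skipping steps without an estimator, and the window `length` are all assumptions made for illustration.

```python
from collections import Counter


def popular_sequences(workflows, length=2, top_n=5):
    """Count consecutive estimator runs of the given length across workflows."""
    counts = Counter()
    for steps in workflows:
        names = [s for s in steps if s is not None]  # skip non-estimator steps
        for i in range(len(names) - length + 1):
            counts[tuple(names[i:i + length])] += 1
    return [(list(seq), n) for seq, n in counts.most_common(top_n)]


workflows = [
    ["dfa", "rs", "gph"],
    ["dfa", "rs"],
    [None, "dfa", "rs", "whittle"],
]
top_sequences = popular_sequences(workflows)
print(top_sequences[0])
```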

_analyze_workflow_complexity(workflows: List[Workflow]) Dict[str, int][source]

Analyze workflow complexity distribution

get_user_workflow_patterns(user_id: str, days: int = 30) Dict[str, Any][source]

Get workflow patterns for a specific user

get_workflow_optimization_recommendations(days: int = 30) List[str][source]

Get recommendations for workflow optimization

export_workflows(output_path: str, days: int = 30) None[source]

Export workflow data to file

get_feature_usage_analysis(days: int = 30) Dict[str, Any][source]

Analyze feature usage patterns

_get_length_range(length: int) str[source]

Convert data length to range category

class lrdbenchmark.analytics.workflow_analyzer.WorkflowStep(timestamp: str, step_type: str, estimator_name: str | None, parameters: Dict[str, str], data_length: int, session_id: str, user_id: str | None)[source]

Bases: object

Represents a single step in a user workflow

timestamp: str
step_type: str
estimator_name: str | None
parameters: Dict[str, str]
data_length: int
session_id: str
user_id: str | None
__init__(timestamp: str, step_type: str, estimator_name: str | None, parameters: Dict[str, str], data_length: int, session_id: str, user_id: str | None) None
class lrdbenchmark.analytics.workflow_analyzer.Workflow(workflow_id: str, session_id: str, user_id: str | None, steps: List[WorkflowStep], start_time: str, end_time: str, total_duration: float, step_count: int)[source]

Bases: object

Represents a complete user workflow

workflow_id: str
session_id: str
user_id: str | None
steps: List[WorkflowStep]
start_time: str
end_time: str
total_duration: float
step_count: int
__init__(workflow_id: str, session_id: str, user_id: str | None, steps: List[WorkflowStep], start_time: str, end_time: str, total_duration: float, step_count: int) None

Convenience Functions

Note

Convenience functions are provided via the analytics submodule. Import from lrdbenchmark.analytics rather than top-level lrdbenchmark.

Usage Examples

Basic Analytics Setup

from lrdbenchmark.analytics import enable_analytics, get_analytics_summary
from lrdbenchmark.analytics.dashboard import AnalyticsDashboard

# Enable analytics system
print("Enabling LRDBench analytics system...")
enable_analytics()

# Your analysis code here
from lrdbenchmark import FBMModel, FGNModel, ComprehensiveBenchmark
import time

print("Running analysis with analytics tracking...")

# Generate data with different models
models = {
    'FBM (H=0.7)': FBMModel(H=0.7, sigma=1.0),
    'FBM (H=0.3)': FBMModel(H=0.3, sigma=1.0),
    'FGN (H=0.8)': FGNModel(H=0.8, sigma=1.0)
}

for model_name, model in models.items():
    print(f"Generating {model_name} data...")
    data = model.generate(1000, seed=42)

    # Run benchmark
    benchmark = ComprehensiveBenchmark()
    results = benchmark.run_comprehensive_benchmark(
        data_length=1000,
        n_runs=5
    )

    print(f"Completed benchmark for {model_name}")

# Get comprehensive analytics summary
print("\n=== ANALYTICS SUMMARY ===")
summary = get_analytics_summary()
print(summary)

# Create dashboard for detailed analysis
dashboard = AnalyticsDashboard()

# Generate specific reports
print("\n=== USAGE REPORT ===")
usage_report = dashboard.generate_usage_report()
print(usage_report)

print("\n=== PERFORMANCE REPORT ===")
performance_report = dashboard.generate_performance_report()
print(performance_report)

print("\n=== RELIABILITY REPORT ===")
reliability_report = dashboard.generate_reliability_report()
print(reliability_report)

Usage Tracking with Decorators

from lrdbenchmark import FBMModel
from lrdbenchmark.analytics import track_usage

@track_usage
def analyze_fbm_data(H=0.7, length=1000):
    """Analyze FBM data with given parameters."""
    model = FBMModel(H=H, sigma=1.0)
    data = model.generate(length, seed=42)

    # Perform analysis
    return data.mean(), data.std()

# Function calls will be automatically tracked
mean_val, std_val = analyze_fbm_data(H=0.8, length=2000)
mean_val2, std_val2 = analyze_fbm_data(H=0.6, length=1000)
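
For readers curious how a tracking decorator of this kind can work, the following self-contained sketch records one entry per call, including failures. `track_calls` is a hypothetical stand-in that writes to a plain list; it is not the implementation of `track_usage`, which is applied without arguments and stores events through the analytics system.

```python
import functools
import time


def track_calls(log):
    """Build a decorator that appends one record per call to `log`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            success, error_message = True, None
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                success, error_message = False, str(exc)
                raise  # re-raise so callers still see the failure
            finally:
                log.append({
                    "name": func.__name__,
                    "parameters": dict(kwargs),
                    "execution_time": time.perf_counter() - start,
                    "success": success,
                    "error_message": error_message,
                })
        return wrapper
    return decorator


events = []


@track_calls(events)
def add(a, b=0):
    return a + b


@track_calls(events)
def fail():
    raise ValueError("bad input")


result = add(2, b=3)
print(result, events[0]["success"])
```

Recording in `finally` ensures that failed calls are logged with `success=False` before the exception propagates.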

Performance Monitoring

from lrdbenchmark import ComprehensiveBenchmark
from lrdbenchmark.analytics import monitor_performance

@monitor_performance
def run_benchmark_analysis():
    """Run comprehensive benchmark analysis."""
    benchmark = ComprehensiveBenchmark()
    results = benchmark.run_comprehensive_benchmark(
        data_length=1000,
        n_runs=10
    )
    return results

# Performance will be automatically monitored
results = run_benchmark_analysis()

# Get performance summary
from lrdbenchmark.analytics import PerformanceMonitor
monitor = PerformanceMonitor()
perf_summary = monitor.get_performance_summary()
print(f"Average execution time: {perf_summary.avg_execution_time:.2f}s")

Error Tracking

from lrdbenchmark import ComprehensiveBenchmark
from lrdbenchmark.analytics import track_errors

@track_errors
def run_estimator_analysis():
    """Run estimator analysis with error tracking."""
    benchmark = ComprehensiveBenchmark()

    try:
        results = benchmark.run_comprehensive_benchmark(
            data_length=1000,
            n_runs=5
        )
        return results
    except Exception:
        # Errors will be automatically tracked; re-raise with the original traceback
        raise

# Run analysis
try:
    results = run_estimator_analysis()
except Exception as e:
    print(f"Analysis failed: {e}")

# Get error summary
from lrdbenchmark.analytics import ErrorAnalyzer
error_analyzer = ErrorAnalyzer()
error_summary = error_analyzer.get_error_summary()
print(f"Total errors: {error_summary.total_errors}")

Workflow Tracking

from lrdbenchmark import FBMModel, ComprehensiveBenchmark
from lrdbenchmark.analytics import track_workflow

@track_workflow
def complete_analysis_workflow():
    """Complete analysis workflow with tracking."""
    # Step 1: Data generation
    model = FBMModel(H=0.7, sigma=1.0)
    data = model.generate(1000, seed=42)

    # Step 2: Benchmark execution
    benchmark = ComprehensiveBenchmark()
    results = benchmark.run_comprehensive_benchmark(
        data_length=1000,
        n_runs=5
    )

    # Step 3: Results analysis
    summary = results.get_summary()

    return summary

# Workflow will be automatically tracked
summary = complete_analysis_workflow()

# Get workflow summary
from lrdbenchmark.analytics import WorkflowAnalyzer
workflow_analyzer = WorkflowAnalyzer()
workflow_summary = workflow_analyzer.get_workflow_summary()
print(f"Workflows completed: {workflow_summary.total_workflows}")

Advanced Analytics Dashboard

from lrdbenchmark.analytics.dashboard import AnalyticsDashboard

# Create analytics dashboard
dashboard = AnalyticsDashboard()

# Generate comprehensive analytics report
report = dashboard.get_comprehensive_summary()
print("=== COMPREHENSIVE ANALYTICS REPORT ===")
print(report)

# Generate specific reports
usage_report = dashboard.generate_usage_report()
performance_report = dashboard.generate_performance_report()
reliability_report = dashboard.generate_reliability_report()
workflow_report = dashboard.generate_workflow_report()

print("\n=== USAGE REPORT ===")
print(usage_report)

print("\n=== PERFORMANCE REPORT ===")
print(performance_report)

print("\n=== RELIABILITY REPORT ===")
print(reliability_report)

print("\n=== WORKFLOW REPORT ===")
print(workflow_report)

Custom Analytics Configuration

from lrdbenchmark.analytics import (
    UsageTracker, PerformanceMonitor, ErrorAnalyzer, WorkflowAnalyzer
)

# Create analytics components that share a custom storage directory
storage = '~/.lrdbench/custom_analytics'

usage_tracker = UsageTracker(
    storage_path=storage,
    enable_tracking=True,
    privacy_mode=True
)
performance_monitor = PerformanceMonitor(storage_path=storage)
error_analyzer = ErrorAnalyzer(storage_path=storage)
workflow_analyzer = WorkflowAnalyzer(storage_path=storage)

# Use custom components
usage_tracker.track_estimator_usage(
    estimator_name='dfa',
    parameters={'min_scale': 4, 'max_scale': 100},
    execution_time=1.23,
    success=True
)

# start_monitoring returns the session ID that stop_monitoring expects
session_id = performance_monitor.start_monitoring(
    estimator_name='dfa',
    data_length=1000,
    parameters={'min_scale': '4', 'max_scale': '100'}
)
# ... your code here ...
performance_monitor.stop_monitoring(session_id)

# record_error identifies the failing estimator and its parameters
error_analyzer.record_error(
    estimator_name='dfa',
    error_message='Invalid parameter value',
    parameters={'H': '1.5'},
    data_length=1000
)

Data Export and Visualization

from lrdbenchmark.analytics.dashboard import AnalyticsDashboard
import pandas as pd
import matplotlib.pyplot as plt

dashboard = AnalyticsDashboard()

# Export analytics data
analytics_data = dashboard.export_analytics_data()

# Convert to pandas DataFrame
df = pd.DataFrame(analytics_data['usage_events'])

# Create visualizations
plt.figure(figsize=(12, 8))

# Usage by estimator
plt.subplot(2, 2, 1)
estimator_counts = df['estimator_name'].value_counts()
estimator_counts.plot(kind='bar')
plt.title('Usage by Estimator')
plt.xticks(rotation=45)

# Execution time distribution
plt.subplot(2, 2, 2)
plt.hist(df['execution_time'], bins=20, alpha=0.7)
plt.title('Execution Time Distribution')
plt.xlabel('Time (seconds)')

# Success rate over time
plt.subplot(2, 2, 3)
df['date'] = pd.to_datetime(df['timestamp']).dt.date
success_rate = df.groupby('date')['success'].mean()
success_rate.plot(kind='line')
plt.title('Success Rate Over Time')
plt.ylabel('Success Rate')

# Parameter usage heatmap
plt.subplot(2, 2, 4)
# Create heatmap of parameter usage
plt.title('Parameter Usage Heatmap')

plt.tight_layout()
plt.show()

Real-time Analytics Monitoring

from lrdbenchmark.analytics.dashboard import AnalyticsDashboard
import time
import threading

dashboard = AnalyticsDashboard()

def monitor_analytics():
    """Monitor analytics in real-time."""
    while True:
        summary = dashboard.get_comprehensive_summary()
        print("\n" + "="*50)
        print("REAL-TIME ANALYTICS UPDATE")
        print("="*50)
        print(summary)
        time.sleep(60)  # Update every minute

# Start monitoring in background
monitor_thread = threading.Thread(target=monitor_analytics, daemon=True)
monitor_thread.start()

# Your analysis code here
from lrdbenchmark import FBMModel, ComprehensiveBenchmark

for i in range(5):
    model = FBMModel(H=0.5 + i*0.1, sigma=1.0)
    data = model.generate(1000, seed=i)

    benchmark = ComprehensiveBenchmark()
    results = benchmark.run_comprehensive_benchmark(
        data_length=1000,
        n_runs=2
    )

    time.sleep(30)  # Wait between runs

Analytics Integration with Benchmarks

from lrdbenchmark import ComprehensiveBenchmark
from lrdbenchmark.analytics import enable_analytics
from lrdbenchmark.analytics.dashboard import AnalyticsDashboard

# Enable analytics
enable_analytics()

# Create benchmark with analytics integration
benchmark = ComprehensiveBenchmark()

# Run benchmark with analytics tracking
results = benchmark.run_comprehensive_benchmark(
    data_length=1000,
    n_runs=10,
    enable_analytics=True  # Enable analytics tracking
)

# Get analytics dashboard
dashboard = AnalyticsDashboard()

# Generate a report covering all analytics sections
comprehensive_report = dashboard.generate_comprehensive_report(days=30)

print("=== COMPREHENSIVE ANALYTICS REPORT ===")
print(comprehensive_report)

Best Practices

  1. Enable Early: Enable analytics at the start of your analysis

  2. Use Decorators: Use the provided decorators for automatic tracking

  3. Monitor Performance: Track execution times for optimization

  4. Error Handling: Always track errors for debugging

  5. Workflow Analysis: Track complete workflows for optimization

  6. Regular Reports: Generate regular analytics reports

  7. Data Export: Export analytics data for external analysis

  8. Privacy: Be mindful of sensitive data in analytics

Configuration Options

Analytics Configuration

from lrdbenchmark.analytics import AnalyticsConfig

# Configure analytics system
config = AnalyticsConfig(
    # Usage tracking
    track_user_id=True,
    track_parameters=True,
    track_timing=True,

    # Performance monitoring
    track_memory=True,
    track_cpu=True,
    track_gpu=True,

    # Error analysis
    categorize_errors=True,
    track_stack_traces=True,
    generate_recommendations=True,

    # Workflow analysis
    track_step_dependencies=True,
    analyze_patterns=True,
    generate_optimizations=True,

    # Data retention
    max_events=10000,
    retention_days=30,

    # Privacy
    anonymize_user_ids=True,
    sanitize_parameters=True
)

# Apply configuration
from lrdbenchmark.analytics import configure_analytics
configure_analytics(config)

Privacy and Security

from lrdbenchmark.analytics import UsageTracker

# Create privacy-aware usage tracker
usage_tracker = UsageTracker(
    enable_tracking=True,
    privacy_mode=True  # Enable privacy-preserving features
)

# Track usage with privacy protection
usage_tracker.track_estimator_usage(
    estimator_name='dfa',
    parameters={'min_scale': 4, 'max_scale': 100},  # Will be sanitized
    execution_time=1.23,
    success=True
)

Note

The analytics system is designed to be privacy-aware and can be configured to protect sensitive information while still providing valuable insights.

Warning

When using analytics in production environments, ensure compliance with data protection regulations and implement appropriate privacy controls.