Stats Module

Overview

At a Glance

Purpose:: Modern, domain-driven statistics collection for simulations
Location:: fusion/stats/
Key Files:: collector.py
Depends On:: fusion.domain (Request, AllocationResult, SimulationConfig)
Used By:: New simulation engine (v6.x+)

The stats module provides StatsCollector, a modern statistics aggregation class designed to work with domain objects (Request, AllocationResult) and produce metrics compatible with analysis tools.

Important

This module is part of the v6.x refactoring effort to consolidate statistics collection into a clean, domain-driven design. It will eventually replace the legacy statistics classes (SimStats and StatsProps) scattered across the codebase.

Understanding the Statistics Landscape

Warning

This is confusing - and we know it!

FUSION has four different statistics-related components that evolved over time. This section explains what each one does and outlines the migration plan.

The Statistics Confusion Matrix

Statistics Components in FUSION
Component	Location	Purpose	Status	When to Use
`StatsCollector`	`fusion/stats/`	Modern domain-driven collection	New (v6.x)	New code, orchestrator path
`SimStats`	`fusion/core/metrics.py`	Legacy comprehensive statistics	Legacy (v5.x)	Legacy simulation engine
`StatsProps`	`fusion/core/properties.py`	Legacy statistics container	Legacy (v5.x)	Legacy data structures
`GroomingStatistics`	`fusion/reporting/statistics.py`	Grooming-specific metrics	Active	Traffic grooming features

Visual Comparison

LEGACY ARCHITECTURE (v5.x)
==========================

SimulationEngine
      |
      +---> SimStats (fusion/core/metrics.py)
      |     - 2000+ lines, monolithic
      |     - Tightly coupled to engine
      |     - Computes during simulation
      |
      +---> StatsProps (fusion/core/properties.py)
            - Data container class
            - Stores computed statistics
            - Passed between components


NEW ARCHITECTURE (v6.x+)
========================

SimulationEngine + SDNOrchestrator
      |
      +---> StatsCollector (fusion/stats/collector.py)
            - ~800 lines, focused
            - Consumes domain objects
            - Clean separation of concerns
            - Exports to multiple formats

Why So Many Statistics Classes?

Historical Evolution:

v4.x: StatsProps was a simple data container in properties.py
v5.x: SimStats in metrics.py grew to handle all computation, becoming a 2000+ line monolith tightly coupled to the simulation engine
v6.0 (current): StatsCollector introduced as part of domain-driven refactoring, designed to consume domain objects (Request, AllocationResult) cleanly. GroomingStatistics added to reporting/ for specialized traffic grooming metrics.

The Plan:

Note

Migration Roadmap (Future Releases)

StatsCollector becomes the primary statistics interface
SimStats deprecated, functionality migrated to StatsCollector
StatsProps deprecated, replaced by StatsCollector export methods
GroomingStatistics remains for grooming-specific features

Detailed Comparison

Feature Comparison
Feature	StatsCollector (new)	SimStats (legacy)	Notes
Lines of code	~800	2000+	New design is more focused
Input	Domain objects (Request, Result)	Raw dicts, engine internals	Domain objects are cleaner
Output	Multiple export formats	Single dict format	New supports comparison format
Coupling	Loosely coupled	Tightly coupled to engine	New can be used standalone
Grooming stats	Basic tracking	Embedded in class	Separate GroomingStatistics class
SNR tracking	List of values + aggregates	Complex per-link tracking	Simplified in new design
Protection stats	Switchover tracking	Limited	New has better protection support

Which One Should I Use?

Use StatsCollector when:

Writing new code using SDNOrchestrator
Working with domain objects (Request, AllocationResult)
Need clean export to comparison format
Building new analysis pipelines

Use SimStats when:

Maintaining legacy simulation engine code
Working with code that expects legacy format
Need detailed per-link SNR tracking (not yet in StatsCollector)

Use GroomingStatistics when:

Working specifically with traffic grooming features
Need grooming rate, lightpath utilization, bandwidth savings

Module Architecture

StatsCollector Design

+-------------------+
| SimulationEngine  |
+-------------------+
         |
         | record_arrival(request, result)
         | record_release(request)
         v
+-------------------+
|  StatsCollector   |
+-------------------+
| - total_requests  |
| - blocked_requests|
| - snr_values[]    |
| - hop_counts[]    |
| - modulations{}   |
+-------------------+
         |
         +---> to_comparison_format()  --> Analysis scripts
         +---> to_legacy_stats_dict()  --> Legacy compatibility
         +---> blocking_probability    --> Real-time access

Data Flow

Request Processing:
-------------------

1. Request arrives
         |
         v
2. SDNOrchestrator.handle_arrival()
         |
         v
3. AllocationResult returned (success/blocked)
         |
         v
4. StatsCollector.record_arrival(request, result)
         |
         +---> Updates counters (total, success, blocked)
         +---> Tracks bandwidth (requested, allocated)
         +---> Records modulation used
         +---> Records SNR values
         +---> Records path metrics (hops, length)
         +---> Updates block reasons (if blocked)

5. Request departs
         |
         v
6. StatsCollector.record_release(request)
         |
         +---> (Future: utilization tracking)
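
The flow above can be sketched in a few lines. This is only an illustration, not the actual collector: `SketchRequest`, `SketchResult`, and `SketchCollector` are hypothetical stand-ins for Request, AllocationResult, and StatsCollector, and their field names (`success`, `modulation`, `hop_count`, `block_reason`, `bandwidth_gbps`) are assumptions. The sketch also assumes a successful request is allocated its full requested bandwidth.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SketchRequest:
    """Hypothetical stand-in for the domain Request."""
    request_id: int
    bandwidth_gbps: int


@dataclass(frozen=True)
class SketchResult:
    """Hypothetical stand-in for AllocationResult; field names are assumed."""
    success: bool
    modulation: Optional[str] = None
    hop_count: Optional[int] = None
    block_reason: Optional[str] = None


@dataclass
class SketchCollector:
    """Toy collector that mirrors step 4 of the data flow."""
    total_requests: int = 0
    successful_requests: int = 0
    blocked_requests: int = 0
    bandwidth_requested: int = 0
    bandwidth_allocated: int = 0
    block_reasons: dict[str, int] = field(default_factory=dict)
    modulations_used: dict[str, int] = field(default_factory=dict)
    hop_counts: list[int] = field(default_factory=list)

    def record_arrival(self, request: SketchRequest, result: SketchResult) -> None:
        # Every arrival counts, whether or not it was allocated.
        self.total_requests += 1
        self.bandwidth_requested += request.bandwidth_gbps
        if result.success:
            self.successful_requests += 1
            # Assumption: a successful request gets all of its bandwidth.
            self.bandwidth_allocated += request.bandwidth_gbps
            if result.modulation is not None:
                count = self.modulations_used.get(result.modulation, 0)
                self.modulations_used[result.modulation] = count + 1
            if result.hop_count is not None:
                self.hop_counts.append(result.hop_count)
        else:
            self.blocked_requests += 1
            reason = result.block_reason or "UNKNOWN"
            self.block_reasons[reason] = self.block_reasons.get(reason, 0) + 1


collector = SketchCollector()
collector.record_arrival(SketchRequest(1, 100), SketchResult(True, "QPSK", 3))
collector.record_arrival(SketchRequest(2, 200), SketchResult(False, block_reason="NO_PATH"))
print(collector.total_requests, collector.blocked_requests, collector.block_reasons)
```

Note that the recording method only reads from the request and result; it never mutates them.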

Key Data Structures

Request Counters:

total_requests: int = 0
successful_requests: int = 0
blocked_requests: int = 0

Tracking Dictionaries:

block_reasons: dict[str, int]      # "NO_PATH" -> 5, "CONGESTION" -> 12
modulations_used: dict[str, int]   # "QPSK" -> 100, "16-QAM" -> 50
core_usage: dict[int, int]         # 0 -> 45, 1 -> 38, 2 -> 42
band_usage: dict[str, int]         # "c" -> 80, "l" -> 20

Measurement Lists:

snr_values: list[float]       # [18.5, 19.2, 17.8, ...]
hop_counts: list[int]         # [3, 4, 2, 5, ...]
path_lengths_km: list[float]  # [450.0, 320.5, 680.2, ...]
xt_values: list[float]        # [-30.5, -28.0, ...]
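
Because the dictionaries and lists above are mutable, a dataclass has to create them per instance with `field(default_factory=...)`, which is the pattern this page uses when adding a new metric. The sketch below shows the effect on a hypothetical `CountersSketch` class that holds only a small subset of the fields; it is not the real StatsCollector.

```python
from dataclasses import dataclass, field


@dataclass
class CountersSketch:
    """Hypothetical subset of the fields listed above."""
    total_requests: int = 0
    block_reasons: dict[str, int] = field(default_factory=dict)
    snr_values: list[float] = field(default_factory=list)


a = CountersSketch()
b = CountersSketch()

# Each instance owns its containers, so recording into one
# collector never leaks into another.
a.snr_values.append(18.5)
a.block_reasons["NO_PATH"] = a.block_reasons.get("NO_PATH", 0) + 1

print(a.snr_values, b.snr_values)
```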

Components

collector.py

Purpose:: Centralized statistics collection for simulation runs
Key Class:: StatsCollector

Initialization:

from fusion.stats import StatsCollector
from fusion.domain.config import SimulationConfig

config = SimulationConfig(
    network_name="NSFNet",
    erlang=300.0,
    # ... (other fields omitted)
)
collector = StatsCollector(config)

Recording Events:

# After each request is processed
collector.record_arrival(request, allocation_result)

# After request departs
collector.record_release(request)

# Manual SNR/XT recording (if needed)
collector.record_snr(18.5)
collector.record_xt(-30.0)

Accessing Metrics:

# Computed properties (real-time)
bp = collector.blocking_probability      # 0.0 to 1.0
sr = collector.success_rate              # 1.0 - bp
avg_snr = collector.average_snr          # Mean SNR in dB
avg_hops = collector.average_hop_count   # Mean path hops

# Feature ratios
grooming_ratio = collector.grooming_ratio
slicing_ratio = collector.slicing_ratio
protection_ratio = collector.protection_ratio

Exporting Results:

# For analysis scripts (run_comparison.py compatible)
results = collector.to_comparison_format()

# For legacy code compatibility
legacy_dict = collector.to_legacy_stats_dict()

Merging (for parallel runs):

# Combine results from parallel simulation runs
main_collector.merge(worker_collector_1)
main_collector.merge(worker_collector_2)
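
This page does not spell out what merge() does internally. One plausible reading, sketched below on a hypothetical `MergeSketch` class, is that counters are summed, tally dictionaries are added key by key, and measurement lists are concatenated. Treat all of that as an assumption rather than the actual implementation.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class MergeSketch:
    """Hypothetical stand-in with a handful of collector fields."""
    total_requests: int = 0
    blocked_requests: int = 0
    block_reasons: dict[str, int] = field(default_factory=dict)
    snr_values: list[float] = field(default_factory=list)

    def merge(self, other: "MergeSketch") -> None:
        # Plain counters are additive.
        self.total_requests += other.total_requests
        self.blocked_requests += other.blocked_requests
        # Tallies are added per key; keys missing on one side count as 0.
        merged = Counter(self.block_reasons)
        merged.update(other.block_reasons)
        self.block_reasons = dict(merged)
        # Raw measurements are kept, so aggregates can be recomputed later.
        self.snr_values.extend(other.snr_values)


main = MergeSketch()
worker_1 = MergeSketch(100, 5, {"NO_PATH": 5}, [18.5, 19.2])
worker_2 = MergeSketch(300, 45, {"NO_PATH": 10, "CONGESTION": 35}, [17.8])
main.merge(worker_1)
main.merge(worker_2)

merged_bp = main.blocked_requests / main.total_requests
print(merged_bp)
```

Merging raw counts rather than averaging per-worker ratios keeps the result weighted by request volume: here the merged blocking probability is 50 / 400 = 0.125, whereas the unweighted mean of the two workers' probabilities (0.05 and 0.15) would be 0.10.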

Reset (between iterations):

# Clear all statistics for new iteration
collector.reset()

Computed Properties

Property	Description	Type
`blocking_probability`	blocked / total requests	float
`success_rate`	1.0 - blocking_probability	float
`average_snr`	Mean of snr_values	float
`min_snr`	Minimum SNR recorded	float
`max_snr`	Maximum SNR recorded	float
`grooming_ratio`	groomed / successful requests	float
`slicing_ratio`	sliced / successful requests	float
`protection_ratio`	protected / successful requests	float
`bandwidth_utilization`	allocated / requested bandwidth	float
`average_hop_count`	Mean of hop_counts	float
`average_path_length_km`	Mean of path_lengths_km	float
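
The ratios in the table use different denominators: blocking_probability divides by total requests, while the feature ratios divide by successful requests. The sketch below illustrates this on a hypothetical `RatioSketch` class. Returning 0.0 when the denominator is zero is an assumption, chosen to match the empty-list convention of the `average_new_metric` example in the Development Guide; the `groomed_requests` counter name is also assumed.

```python
from dataclasses import dataclass, field


@dataclass
class RatioSketch:
    """Hypothetical stand-in showing how the computed properties relate."""
    total_requests: int = 0
    successful_requests: int = 0
    blocked_requests: int = 0
    groomed_requests: int = 0
    snr_values: list[float] = field(default_factory=list)

    @property
    def blocking_probability(self) -> float:
        if self.total_requests == 0:
            return 0.0  # assumed convention for an empty collector
        return self.blocked_requests / self.total_requests

    @property
    def success_rate(self) -> float:
        return 1.0 - self.blocking_probability

    @property
    def grooming_ratio(self) -> float:
        # Denominator is successful requests, not total requests.
        if self.successful_requests == 0:
            return 0.0
        return self.groomed_requests / self.successful_requests

    @property
    def average_snr(self) -> float:
        if not self.snr_values:
            return 0.0
        return sum(self.snr_values) / len(self.snr_values)


stats = RatioSketch(
    total_requests=200,
    successful_requests=160,
    blocked_requests=40,
    groomed_requests=40,
    snr_values=[18.0, 19.0, 17.0],
)
print(stats.blocking_probability, stats.grooming_ratio, stats.average_snr)
```

One consequence of this convention is that an empty collector reports a success_rate of 1.0. Whether the real class behaves the same way is not stated on this page.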

Integration with Other Modules

Relationship to fusion/core/metrics.py

SimStats in fusion/core/metrics.py is the legacy statistics class:

fusion/core/metrics.py (LEGACY)
===============================
- SimStats class (2000+ lines)
- Tightly coupled to simulation engine
- Computes statistics during simulation
- Uses raw dicts and engine internals

fusion/stats/collector.py (NEW)
===============================
- StatsCollector class (~800 lines)
- Loosely coupled, domain-driven
- Records from domain objects
- Clean export to multiple formats

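Key Difference: SimStats computes statistics inside the simulation loop, using raw dicts and engine internals. StatsCollector only records the outcome of each request, from its Request and AllocationResult objects, after the request has been processed.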

Relationship to fusion/reporting/statistics.py

GroomingStatistics in fusion/reporting/statistics.py handles grooming-specific metrics:

StatsCollector (fusion/stats/)
==============================
- General simulation statistics
- Request counts, SNR, paths
- Used by simulation engine

GroomingStatistics (fusion/reporting/)
======================================
- Grooming-specific only
- Lightpath utilization, bandwidth savings
- Used when grooming is enabled

Why separate? Grooming statistics are specialized and only relevant when is_grooming_enabled=True. Keeping them separate avoids bloating StatsCollector.

Relationship to fusion/core/properties.py

StatsProps in fusion/core/properties.py is a legacy data container:

StatsProps (LEGACY)
===================
- Data container class
- Stores computed statistics
- Passed between components
- No computation logic

StatsCollector (NEW)
====================
- Active recording and computation
- to_legacy_stats_dict() for compatibility
- Will replace StatsProps over time

Development Guide

Adding New Metrics

1. Add the field to the StatsCollector dataclass:

# In collector.py
new_metric_values: list[float] = field(default_factory=list)

2. Update the recording method (_record_success or _record_block):

# In _record_success (use _record_block for metrics of blocked requests)
if result.new_metric is not None:
    self.new_metric_values.append(result.new_metric)

3. Add a computed property if needed:

@property
def average_new_metric(self) -> float:
    """Calculate the average of the new metric."""
    if not self.new_metric_values:
        return 0.0
    return sum(self.new_metric_values) / len(self.new_metric_values)

4. Add the metric to the export methods (to_comparison_format, to_legacy_stats_dict).
5. Update the reset() and merge() methods.
6. Add tests in fusion/tests/stats/test_collector.py.
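
Put together, the changes for a new metric might look like the sketch below. `ResultSketch` and `CollectorSketch` are hypothetical stand-ins, the exported key name is illustrative only, and the real reset() and merge() may handle more state than shown here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ResultSketch:
    """Hypothetical result object carrying the new metric."""
    success: bool
    new_metric: Optional[float] = None


@dataclass
class CollectorSketch:
    """Hypothetical stand-in; not the real StatsCollector."""
    successful_requests: int = 0
    # New field: a per-instance list of recorded values.
    new_metric_values: list[float] = field(default_factory=list)

    def _record_success(self, result: ResultSketch) -> None:
        self.successful_requests += 1
        # Recording: skip results that do not carry the metric.
        if result.new_metric is not None:
            self.new_metric_values.append(result.new_metric)

    @property
    def average_new_metric(self) -> float:
        if not self.new_metric_values:
            return 0.0
        return sum(self.new_metric_values) / len(self.new_metric_values)

    def to_comparison_format(self) -> dict[str, float]:
        # Export: the key name here is an assumption.
        return {"average_new_metric": self.average_new_metric}

    def reset(self) -> None:
        self.successful_requests = 0
        self.new_metric_values.clear()

    def merge(self, other: "CollectorSketch") -> None:
        self.successful_requests += other.successful_requests
        self.new_metric_values.extend(other.new_metric_values)


first = CollectorSketch()
first._record_success(ResultSketch(True, 2.0))
first._record_success(ResultSketch(True))  # metric absent, only counted

second = CollectorSketch()
second._record_success(ResultSketch(True, 4.0))

first.merge(second)
exported = first.to_comparison_format()
print(exported)
```

Forgetting the reset() or merge() update is the easiest mistake to make here, because the metric still looks correct in a single-iteration, single-process run.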

Best Practices

Use domain objects: Record from Request/AllocationResult, not raw dicts
Keep it focused: General stats here, specialized stats elsewhere
Immutable inputs: Don’t modify request or result in recording methods
Thread safety: StatsCollector is NOT thread-safe; use one per process
Reset between iterations: Call reset() before each new iteration
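
The reset practice is easy to get wrong in a multi-iteration run, because metrics must be read before the next reset discards them. A minimal sketch of the loop shape, using a hypothetical `IterationSketch` stand-in and made-up blocked counts per iteration:

```python
from dataclasses import dataclass


@dataclass
class IterationSketch:
    """Hypothetical stand-in with just enough state to show the loop."""
    total_requests: int = 0
    blocked_requests: int = 0

    def record(self, blocked: bool) -> None:
        self.total_requests += 1
        if blocked:
            self.blocked_requests += 1

    @property
    def blocking_probability(self) -> float:
        if self.total_requests == 0:
            return 0.0
        return self.blocked_requests / self.total_requests

    def reset(self) -> None:
        self.total_requests = 0
        self.blocked_requests = 0


# Made-up number of blocked requests in each of three iterations.
blocked_per_iteration = [100, 50, 25]
requests_per_iteration = 1000

collector = IterationSketch()
per_iteration = []

for n_blocked in blocked_per_iteration:
    collector.reset()  # start each iteration from a clean slate
    for i in range(requests_per_iteration):
        collector.record(blocked=i < n_blocked)
    # Read the metric before the next reset discards it.
    per_iteration.append(collector.blocking_probability)

print(per_iteration)
```

Without the reset() call, the second value would be 150 / 2000 = 0.075 instead of 0.05, because counts from the first iteration would carry over.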

Testing

Test Location:: fusion/tests/stats/
Run Tests:: pytest fusion/tests/stats/ -v

# Run stats module tests
pytest fusion/tests/stats/ -v

# Run with coverage
pytest --cov=fusion.stats fusion/tests/stats/
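
Tests for the collector tend to follow a simple shape: build a collector, record or set some state, and assert on a computed property. The sketch below uses a hypothetical `CollectorStandIn` class so that it runs without FUSION installed; real tests would import StatsCollector from fusion.stats and build domain objects instead. The zero-request behaviour asserted in the first test is an assumption about the real class.

```python
from dataclasses import dataclass, field


@dataclass
class CollectorStandIn:
    """Hypothetical stand-in so the example runs without FUSION installed."""
    total_requests: int = 0
    blocked_requests: int = 0
    snr_values: list[float] = field(default_factory=list)

    @property
    def blocking_probability(self) -> float:
        if self.total_requests == 0:
            return 0.0
        return self.blocked_requests / self.total_requests

    def reset(self) -> None:
        self.total_requests = 0
        self.blocked_requests = 0
        self.snr_values.clear()


def test_blocking_probability_is_zero_without_requests() -> None:
    # Assumed convention: no requests means no blocking.
    assert CollectorStandIn().blocking_probability == 0.0


def test_blocking_probability_counts_blocked_over_total() -> None:
    collector = CollectorStandIn(total_requests=4, blocked_requests=1)
    assert collector.blocking_probability == 0.25


def test_reset_clears_counters_and_lists() -> None:
    collector = CollectorStandIn(
        total_requests=4, blocked_requests=1, snr_values=[18.5]
    )
    collector.reset()
    # Dataclass equality compares every field with a fresh instance.
    assert collector == CollectorStandIn()
```

pytest collects functions whose names start with `test_`, so saving this in a file named like `test_collector.py` is enough for the commands above to pick it up.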

Migration Guide

Migrating from SimStats to StatsCollector

Before (Legacy):

from fusion.core.metrics import SimStats

stats = SimStats(engine_props, sim_info, stats_props)
# ... simulation runs ...
stats.iter_update(request_data, sdn_data, network_spectrum)
blocking = stats.get_blocking_probability()

After (New):

from fusion.stats import StatsCollector
from fusion.domain.config import SimulationConfig

config = SimulationConfig.from_engine_props(engine_props)
collector = StatsCollector(config)

# After each request
collector.record_arrival(request, result)

# Access metrics
blocking = collector.blocking_probability

# Export for analysis
results = collector.to_comparison_format()