RL Visualization Plugin

Note

Status: BETA

This module is in BETA and under active development. Its API may change in future releases.

At a Glance

Purpose: Plugin extension providing RL-specific visualization capabilities
Location: fusion/modules/rl/visualization/
Key Class: RLVisualizationPlugin
Relationship: Extends the fusion/visualization/ core system via its plugin interface

Understanding the Architecture

This module is a plugin, not a standalone visualization system. To understand how it works, start with FUSION’s visualization architecture.

The Core Visualization System

FUSION has a centralized visualization system at fusion/visualization/ built using Domain-Driven Design (DDD) principles:

fusion/visualization/                    <-- Core system
|-- domain/                              <-- Domain entities, value objects
|   |-- entities/metric.py              <-- MetricDefinition
|   `-- strategies/processing_strategies.py
|-- infrastructure/
|   |-- renderers/base_renderer.py      <-- BaseRenderer interface
|   `-- processors/                     <-- Data processors
|-- plugins/
|   |-- base_plugin.py                  <-- Plugin interface (BasePlugin)
|   `-- plugin_registry.py              <-- Discovery and loading
`-- application/
    `-- use_cases/generate_plot.py      <-- Main entry point

This core system handles:

Data loading and version adaptation
Metric aggregation and statistics
Plot rendering infrastructure
Caching and performance optimization

The RL Plugin Extension

This module extends the core system by implementing the BasePlugin interface:

fusion/modules/rl/visualization/         <-- This module (plugin)
|-- rl_plugin.py                        <-- Implements BasePlugin
|-- rl_metrics.py                       <-- RL-specific MetricDefinitions
|-- rl_plots.py                         <-- RL-specific BaseRenderer subclasses
`-- rl_processors.py                    <-- RL-specific processing strategies

Key Point: This module does not work independently. It registers its components with the core visualization system, which then handles the actual plot generation.

How Plugin Loading Works

1. Core system starts
       |
       v
2. Plugin registry discovers plugins
       |
       v
3. registry.load_plugin("rl") called
       |
       v
4. RLVisualizationPlugin instantiated
       |
       v
5. Plugin registers:
   - Metrics (episode_reward, q_values, etc.)
   - Plot types (reward_learning_curve, etc.)
   - Processors (RewardProcessingStrategy, etc.)
       |
       v
6. User calls generate_plot(plot_type="reward_learning_curve")
       |
       v
7. Core system uses registered RL components to render plot
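
The flow above can be sketched with a toy registry. This is only an illustration of the register-then-dispatch pattern: ToyPlugin, ToyRegistry, and the generate_plot method on the registry are hypothetical names, not FUSION's actual BasePlugin or plugin registry classes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class ToyPlugin:
    """Hypothetical stand-in for a BasePlugin implementation."""

    name = "rl"

    def plot_types(self) -> Dict[str, Callable[[dict], str]]:
        # Map each plot type name to a callable that "renders" it.
        return {
            "reward_learning_curve": lambda cfg: (
                f"rendered reward_learning_curve -> {cfg['output_path']}"
            ),
        }


@dataclass
class ToyRegistry:
    """Hypothetical registry: discovers plugins, loads them, dispatches plots."""

    discovered: Dict[str, type] = field(default_factory=dict)
    renderers: Dict[str, Callable[[dict], str]] = field(default_factory=dict)

    def discover_plugins(self) -> None:
        # Step 2: a real registry would scan packages; here it is hard-coded.
        self.discovered[ToyPlugin.name] = ToyPlugin

    def load_plugin(self, name: str) -> None:
        # Steps 3-5: instantiate the plugin and copy its components in.
        plugin = self.discovered[name]()
        self.renderers.update(plugin.plot_types())

    def get_available_plot_types(self) -> List[str]:
        return sorted(self.renderers)

    def generate_plot(self, plot_type: str, output_path: str) -> str:
        # Steps 6-7: look up the registered renderer and call it.
        if plot_type not in self.renderers:
            raise KeyError(f"unknown plot type: {plot_type!r}")
        return self.renderers[plot_type]({"output_path": output_path})


registry = ToyRegistry()
registry.discover_plugins()
registry.load_plugin("rl")
message = registry.generate_plot("reward_learning_curve", "plots/learning_curve.png")
```

The point of the pattern is that the plugin never renders anything on its own initiative: it only hands callables to the registry, and the core decides when to invoke them.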

Usage

Loading the Plugin

The RL visualization plugin must be loaded before use:

from fusion.visualization.plugins import get_global_registry

# Discover and load plugins
registry = get_global_registry()
registry.discover_plugins()
registry.load_plugin("rl")

# Now RL plot types are available
print(registry.get_available_plot_types())
# ['blocking_probability', 'reward_learning_curve', 'q_value_heatmap', ...]
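
Because RL plot types only exist after the plugin is loaded, it can help to fail early with a clear message. The helper below is a hypothetical convenience, not part of FUSION; it assumes only that the registry exposes get_available_plot_types() as shown above. FakeRegistry exists solely so the sketch runs without FUSION installed.

```python
def require_plot_type(registry, plot_type: str) -> None:
    """Raise a clear error if `plot_type` has not been registered yet."""
    available = registry.get_available_plot_types()
    if plot_type not in available:
        raise RuntimeError(
            f"Plot type {plot_type!r} is not registered; "
            f"available types: {sorted(available)}. "
            'Was registry.load_plugin("rl") called?'
        )


class FakeRegistry:
    """Stand-in registry with only a core plot type, i.e. no RL plugin loaded."""

    def get_available_plot_types(self):
        return ["blocking_probability"]


# Passes silently: the core plot type is registered.
require_plot_type(FakeRegistry(), "blocking_probability")
```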

Generating RL Plots

Once loaded, use the standard visualization API:

from fusion.visualization.application.use_cases.generate_plot import generate_plot

# Generate a reward learning curve
result = generate_plot(
    config_path="my_rl_experiment.yml",
    plot_type="reward_learning_curve",
    output_path="plots/learning_curve.png",
)

# Generate a convergence analysis
result = generate_plot(
    config_path="my_rl_experiment.yml",
    plot_type="convergence_plot",
    output_path="plots/convergence.png",
)

Available Plot Types

The RL plugin registers four specialized plot types:

reward_learning_curve

Learning curve showing episode rewards over training with smoothing and confidence intervals.

# Default configuration
{
    "window_size": 100,        # Smoothing window
    "confidence_level": 0.95,  # CI level
    "show_ci": True,           # Show confidence bands
}

Output: Line plot with mean reward and shaded confidence intervals.

q_value_heatmap

Heatmap visualization of Q-values across states and actions.

# Default configuration
{
    "colormap": "viridis",
    "annotate": False,
}

Output: Heatmap showing Q-value magnitude by state/action.

convergence_plot

Training convergence analysis showing when metrics stabilize.

# Default configuration
{
    "window_size": 100,
    "threshold": 0.01,          # Relative change threshold
    "show_convergence_point": True,
}

Output: Two-panel plot with metric progression and convergence statistics.

rl_dashboard

Comprehensive multi-metric dashboard for RL training analysis.

# Default configuration
{
    "layout": "3x2",
    "window_size": 100,
}

Output: 6-panel dashboard with rewards, losses, entropy, and Q-values.

Registered Metrics

The plugin registers these RL-specific metrics:

Metric Name	Data Type	Description
`episode_reward`	FLOAT	Total reward accumulated in an episode
`episode_reward_mean`	FLOAT	Moving average of episode rewards
`td_error`	ARRAY	Temporal difference prediction errors
`q_values`	ARRAY	Action-value function estimates
`policy_entropy`	FLOAT	Entropy of the policy distribution
`policy_loss`	FLOAT	Policy gradient loss
`value_loss`	FLOAT	Value function loss
`epsilon`	FLOAT	Epsilon-greedy exploration rate
`learning_rate`	FLOAT	Current learning rate
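
As a rough illustration of how rows like these can be held as data, the sketch below models three of them with a small dataclass. ToyMetric and ToyDataType are hypothetical; the fields of the real MetricDefinition in fusion/visualization/domain/entities/metric.py may differ.

```python
from dataclasses import dataclass
from enum import Enum


class ToyDataType(Enum):
    """Hypothetical data type tags matching the table's second column."""

    FLOAT = "float"
    ARRAY = "array"


@dataclass(frozen=True)
class ToyMetric:
    """Hypothetical metric record: name, data type, description."""

    name: str
    data_type: ToyDataType
    description: str


TOY_RL_METRICS = [
    ToyMetric("episode_reward", ToyDataType.FLOAT, "Total reward accumulated in an episode"),
    ToyMetric("td_error", ToyDataType.ARRAY, "Temporal difference prediction errors"),
    ToyMetric("q_values", ToyDataType.ARRAY, "Action-value function estimates"),
]

# Index by name so a renderer or processor can look a metric up directly.
metrics_by_name = {metric.name: metric for metric in TOY_RL_METRICS}

# Array-valued metrics typically need different handling from scalar ones.
array_metrics = [m.name for m in TOY_RL_METRICS if m.data_type is ToyDataType.ARRAY]
```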

Processing Strategies

The plugin provides three specialized processing strategies:

RewardProcessingStrategy

Processes episode rewards with smoothing and statistical aggregation.

Features:

Moving average smoothing via scipy.ndimage.uniform_filter1d
Confidence interval calculation
Multi-seed aggregation

from fusion.modules.rl.visualization.rl_processors import RewardProcessingStrategy

processor = RewardProcessingStrategy(
    window_size=100,
    confidence_level=0.95,
)
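
To make the three features concrete, here is a dependency-free sketch of smoothing followed by multi-seed aggregation. It is not the plugin's implementation: the plugin smooths with scipy.ndimage.uniform_filter1d, whereas this sketch uses a simple trailing average, and the normal-approximation confidence interval is an assumption made for illustration. The function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev
from typing import List, Sequence, Tuple


def moving_average(values: Sequence[float], window_size: int) -> List[float]:
    """Trailing moving average; the window shrinks at the start of the series."""
    smoothed = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window_size:
            # Drop the value that just left the window.
            running_sum -= values[i - window_size]
        smoothed.append(running_sum / min(i + 1, window_size))
    return smoothed


def aggregate_seeds(
    runs: Sequence[Sequence[float]], confidence_level: float = 0.95
) -> Tuple[List[float], List[float]]:
    """Per-episode mean across seeds and CI half-width (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2.0)
    means, half_widths = [], []
    for episode_rewards in zip(*runs):  # one tuple per episode, one entry per seed
        means.append(mean(episode_rewards))
        if len(episode_rewards) > 1:
            standard_error = stdev(episode_rewards) / math.sqrt(len(episode_rewards))
            half_widths.append(z * standard_error)
        else:
            half_widths.append(0.0)  # a single seed gives no spread estimate
    return means, half_widths


# Three seeds, four episodes each (made-up numbers for illustration).
runs = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [3.0, 4.0, 5.0, 6.0],
]
smoothed_runs = [moving_average(run, window_size=2) for run in runs]
mean_curve, ci_half_width = aggregate_seeds(smoothed_runs)
```

The shaded band in a learning curve would then span mean_curve minus ci_half_width to mean_curve plus ci_half_width at each episode. With only a handful of seeds, a Student t quantile would give wider and more honest bands than the normal quantile used here.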

QValueProcessingStrategy

Processes Q-value data for heatmap visualization.

Features:

Action-wise aggregation
Normalization for visualization
State grouping
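
Two of these features, action-wise aggregation and normalization, can be sketched in plain Python. This is an illustrative assumption about what such processing might look like, using min-max scaling on a small state-by-action table; the function names are hypothetical and the plugin's actual normalization scheme may differ.

```python
from typing import List, Sequence


def action_means(q_table: Sequence[Sequence[float]]) -> List[float]:
    """Mean Q-value per action (column), averaged over all states (rows)."""
    n_states = len(q_table)
    return [sum(column) / n_states for column in zip(*q_table)]


def min_max_normalize(q_table: Sequence[Sequence[float]]) -> List[List[float]]:
    """Rescale all Q-values to [0, 1] so one colormap range fits the heatmap."""
    flat = [q for row in q_table for q in row]
    low, high = min(flat), max(flat)
    if high == low:
        # A constant table has no contrast to show; map everything to 0.
        return [[0.0 for _ in row] for row in q_table]
    return [[(q - low) / (high - low) for q in row] for row in q_table]


# Rows are states, columns are actions (made-up values).
q_table = [
    [0.0, 2.0],
    [4.0, 8.0],
]
per_action = action_means(q_table)       # [2.0, 5.0]
normalized = min_max_normalize(q_table)  # [[0.0, 0.25], [0.5, 1.0]]
```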

ConvergenceDetectionStrategy

Detects when training metrics have converged.

Algorithm:

Compute the mean of the metric within each sliding window
Compare the means of adjacent windows
Flag convergence when the relative change between them falls below the threshold

from fusion.modules.rl.visualization.rl_processors import ConvergenceDetectionStrategy

detector = ConvergenceDetectionStrategy(
    window_size=100,
    threshold=0.01,  # 1% relative change
)
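
The algorithm can be sketched in a few lines. This is a minimal interpretation under stated assumptions, not the code in rl_processors.py: the windows are taken to be adjacent and non-overlapping, the window slides one step at a time, and the convergence point is reported as the start of the later window. The function name is hypothetical.

```python
from typing import Optional, Sequence


def find_convergence_index(
    values: Sequence[float], window_size: int, threshold: float
) -> Optional[int]:
    """Return the first index at which two adjacent windows differ by less than
    `threshold` in relative terms, or None if that never happens."""
    eps = 1e-12  # avoids division by zero when the previous mean is 0
    for start in range(0, len(values) - 2 * window_size + 1):
        previous = values[start : start + window_size]
        current = values[start + window_size : start + 2 * window_size]
        prev_mean = sum(previous) / window_size
        curr_mean = sum(current) / window_size
        relative_change = abs(curr_mean - prev_mean) / max(abs(prev_mean), eps)
        if relative_change < threshold:
            # Report the start of the later window as the convergence point.
            return start + window_size
    return None


# Made-up series: rises for six steps, then stays flat at 10.
series = [1.0, 2.0, 4.0, 6.0, 8.0, 9.0] + [10.0] * 6
convergence_index = find_convergence_index(series, window_size=2, threshold=0.01)
# convergence_index is 8: the first pair of adjacent windows that are both flat
```

A single quiet pair of windows can be a plateau rather than true convergence, so a stricter variant might require the condition to hold for several consecutive window pairs.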

Custom Renderers

The plugin provides four renderer classes extending BaseRenderer:

Renderer	Description
`RewardLearningCurveRenderer`	Line plots with confidence bands for reward curves
`QValueHeatmapRenderer`	Seaborn heatmaps for Q-value visualization
`ConvergencePlotRenderer`	Dual-panel convergence analysis plots
`MultiMetricDashboardRenderer`	6-panel comprehensive training dashboard

All renderers support PNG, PDF, SVG, and JPG output formats.

File Reference

fusion/modules/rl/visualization/
|-- __init__.py          # Exports RLVisualizationPlugin
|-- rl_plugin.py         # Main plugin class
|-- rl_metrics.py        # Metric definitions
|-- rl_plots.py          # Plot renderers
`-- rl_processors.py     # Processing strategies

Public API:

from fusion.modules.rl.visualization import RLVisualizationPlugin

# Or access components directly
from fusion.modules.rl.visualization.rl_metrics import get_rl_metrics
from fusion.modules.rl.visualization.rl_plots import (
    RewardLearningCurveRenderer,
    QValueHeatmapRenderer,
    ConvergencePlotRenderer,
    MultiMetricDashboardRenderer,
)
from fusion.modules.rl.visualization.rl_processors import (
    RewardProcessingStrategy,
    QValueProcessingStrategy,
    ConvergenceDetectionStrategy,
)

Dependencies

This plugin requires additional packages beyond the core visualization system:

scipy: For smoothing filters (uniform_filter1d)
seaborn: For heatmap visualization

These are checked at plugin load time via _check_dependencies().
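
A load-time check of this kind can be sketched with the standard library. The code below is an assumption about what such a check might do, not the body of _check_dependencies(): it probes for each package with importlib.util.find_spec and reports every missing one in a single error. Both function names are hypothetical.

```python
import importlib.util
from typing import Iterable, List


def find_missing_packages(package_names: Iterable[str]) -> List[str]:
    """Return the top-level package names that cannot be found for import."""
    return [
        name
        for name in package_names
        if importlib.util.find_spec(name) is None
    ]


def check_dependencies(required: Iterable[str] = ("scipy", "seaborn")) -> None:
    """Raise ImportError listing every missing package at once."""
    missing = find_missing_packages(required)
    if missing:
        raise ImportError(
            "RL visualization plugin is missing required packages: "
            + ", ".join(missing)
        )
```

Probing with find_spec avoids importing heavy packages just to test for their presence, and listing all missing packages together saves the user from fixing them one failed load at a time.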