.. _rl-visualization:

==============================
RL Visualization Plugin
==============================

.. note::

   **Status: BETA**

   This module is currently in **BETA** and is actively being developed.
   The API may evolve in future releases.

.. admonition:: At a Glance
   :class: tip

   :Purpose: Plugin extension providing RL-specific visualization capabilities
   :Location: ``fusion/modules/rl/visualization/``
   :Key Class: ``RLVisualizationPlugin``
   :Relationship: Extends ``fusion/visualization/`` core system via plugin interface

Understanding the Architecture
==============================

**This module is a plugin**, not a standalone visualization system. To understand
how it works, you need to understand FUSION's visualization architecture.

The Core Visualization System
-----------------------------

FUSION has a centralized visualization system at ``fusion/visualization/`` built
using Domain-Driven Design (DDD) principles:

.. code-block:: text

   fusion/visualization/                    <-- Core system
   |-- domain/                              <-- Domain entities, value objects
   |   |-- entities/metric.py               <-- MetricDefinition
   |   `-- strategies/processing_strategies.py
   |-- infrastructure/
   |   |-- renderers/base_renderer.py       <-- BaseRenderer interface
   |   `-- processors/                      <-- Data processors
   |-- plugins/
   |   |-- base_plugin.py                   <-- Plugin interface (BasePlugin)
   |   `-- plugin_registry.py               <-- Discovery and loading
   `-- application/
       `-- use_cases/generate_plot.py       <-- Main entry point

This core system handles:

- Data loading and version adaptation
- Metric aggregation and statistics
- Plot rendering infrastructure
- Caching and performance optimization

The RL Plugin Extension
-----------------------

This module extends the core system by implementing the ``BasePlugin`` interface:

.. code-block:: text

   fusion/modules/rl/visualization/         <-- This module (plugin)
   |-- rl_plugin.py                         <-- Implements BasePlugin
   |-- rl_metrics.py                        <-- RL-specific MetricDefinitions
   |-- rl_plots.py                          <-- RL-specific BaseRenderer subclasses
   `-- rl_processors.py                     <-- RL-specific processing strategies

**Key Point**: This module does not work independently. It registers its
components with the core visualization system, which then handles the actual
plot generation.

How Plugin Loading Works
------------------------

.. code-block:: text

   1. Core system starts
            |
            v
   2. Plugin registry discovers plugins
            |
            v
   3. registry.load_plugin("rl") called
            |
            v
   4. RLVisualizationPlugin instantiated
            |
            v
   5. Plugin registers:
      - Metrics (episode_reward, q_values, etc.)
      - Plot types (reward_learning_curve, etc.)
      - Processors (RewardProcessingStrategy, etc.)
            |
            v
   6. User calls generate_plot(plot_type="reward_learning_curve")
            |
            v
   7. Core system uses registered RL components to render plot

Usage
=====

Loading the Plugin
------------------

The RL visualization plugin must be loaded before use:

.. code-block:: python

   from fusion.visualization.plugins import get_global_registry

   # Discover and load plugins
   registry = get_global_registry()
   registry.discover_plugins()
   registry.load_plugin("rl")

   # Now RL plot types are available
   print(registry.get_available_plot_types())
   # ['blocking_probability', 'reward_learning_curve', 'q_value_heatmap', ...]

Generating RL Plots
-------------------

Once loaded, use the standard visualization API:

.. code-block:: python

   from fusion.visualization.application.use_cases.generate_plot import generate_plot

   # Generate a reward learning curve
   result = generate_plot(
       config_path="my_rl_experiment.yml",
       plot_type="reward_learning_curve",
       output_path="plots/learning_curve.png",
   )

   # Generate a convergence analysis
   result = generate_plot(
       config_path="my_rl_experiment.yml",
       plot_type="convergence_plot",
       output_path="plots/convergence.png",
   )

Available Plot Types
====================

The RL plugin registers four specialized plot types:

reward_learning_curve
---------------------

Learning curve showing episode rewards over training with smoothing and
confidence intervals.

.. code-block:: python

   # Default configuration
   {
       "window_size": 100,        # Smoothing window
       "confidence_level": 0.95,  # CI level
       "show_ci": True,           # Show confidence bands
   }

**Output**: Line plot with mean reward and shaded confidence intervals.

q_value_heatmap
---------------

Heatmap visualization of Q-values across states and actions.

.. code-block:: python

   # Default configuration
   {
       "colormap": "viridis",
       "annotate": False,
   }

**Output**: Heatmap showing Q-value magnitude by state/action.

convergence_plot
----------------

Training convergence analysis showing when metrics stabilize.

.. code-block:: python

   # Default configuration
   {
       "window_size": 100,
       "threshold": 0.01,               # Relative change threshold
       "show_convergence_point": True,
   }

**Output**: Two-panel plot with metric progression and convergence statistics.

rl_dashboard
------------

Comprehensive multi-metric dashboard for RL training analysis.

.. code-block:: python

   # Default configuration
   {
       "layout": "3x2",
       "window_size": 100,
   }

**Output**: 6-panel dashboard with rewards, losses, entropy, and Q-values.

Registered Metrics
==================

The plugin registers these RL-specific metrics:

.. list-table::
   :header-rows: 1
   :widths: 25 20 55

   * - Metric Name
     - Data Type
     - Description
   * - ``episode_reward``
     - FLOAT
     - Total reward accumulated in an episode
   * - ``episode_reward_mean``
     - FLOAT
     - Moving average of episode rewards
   * - ``td_error``
     - ARRAY
     - Temporal difference prediction errors
   * - ``q_values``
     - ARRAY
     - Action-value function estimates
   * - ``policy_entropy``
     - FLOAT
     - Entropy of the policy distribution
   * - ``policy_loss``
     - FLOAT
     - Policy gradient loss
   * - ``value_loss``
     - FLOAT
     - Value function loss
   * - ``epsilon``
     - FLOAT
     - Epsilon-greedy exploration rate
   * - ``learning_rate``
     - FLOAT
     - Current learning rate

Processing Strategies
=====================

The plugin provides three specialized processing strategies:

RewardProcessingStrategy
------------------------

Processes episode rewards with smoothing and statistical aggregation.

**Features:**

- Moving average smoothing via ``scipy.ndimage.uniform_filter1d``
- Confidence interval calculation
- Multi-seed aggregation

.. code-block:: python

   from fusion.modules.rl.visualization.rl_processors import RewardProcessingStrategy

   processor = RewardProcessingStrategy(
       window_size=100,
       confidence_level=0.95,
   )

QValueProcessingStrategy
------------------------

Processes Q-value data for heatmap visualization.

**Features:**

- Action-wise aggregation
- Normalization for visualization
- State grouping

ConvergenceDetectionStrategy
----------------------------

Detects when training metrics have converged.

**Algorithm:**

1. Compute mean in sliding window
2. Compare adjacent windows
3. Flag convergence when relative change < threshold

.. code-block:: python

   from fusion.modules.rl.visualization.rl_processors import ConvergenceDetectionStrategy

   detector = ConvergenceDetectionStrategy(
       window_size=100,
       threshold=0.01,  # 1% relative change
   )

Custom Renderers
================

The plugin provides four renderer classes extending ``BaseRenderer``:

.. list-table::
   :header-rows: 1
   :widths: 35 65

   * - Renderer
     - Description
   * - ``RewardLearningCurveRenderer``
     - Line plots with confidence bands for reward curves
   * - ``QValueHeatmapRenderer``
     - Seaborn heatmaps for Q-value visualization
   * - ``ConvergencePlotRenderer``
     - Dual-panel convergence analysis plots
   * - ``MultiMetricDashboardRenderer``
     - 6-panel comprehensive training dashboard

All renderers support PNG, PDF, SVG, and JPG output formats.

File Reference
==============

.. code-block:: text

   fusion/modules/rl/visualization/
   |-- __init__.py          # Exports RLVisualizationPlugin
   |-- rl_plugin.py         # Main plugin class
   |-- rl_metrics.py        # Metric definitions
   |-- rl_plots.py          # Plot renderers
   `-- rl_processors.py     # Processing strategies

**Public API:**

.. code-block:: python

   from fusion.modules.rl.visualization import RLVisualizationPlugin

   # Or access components directly
   from fusion.modules.rl.visualization.rl_metrics import get_rl_metrics
   from fusion.modules.rl.visualization.rl_plots import (
       RewardLearningCurveRenderer,
       QValueHeatmapRenderer,
       ConvergencePlotRenderer,
       MultiMetricDashboardRenderer,
   )
   from fusion.modules.rl.visualization.rl_processors import (
       RewardProcessingStrategy,
       QValueProcessingStrategy,
       ConvergenceDetectionStrategy,
   )

Dependencies
============

This plugin requires additional packages beyond the core visualization system:

- **scipy**: For smoothing filters (``uniform_filter1d``)
- **seaborn**: For heatmap visualization

These are checked at plugin load time via ``_check_dependencies()``.

Related Documentation
=====================

- :ref:`rl-module` - Parent RL module documentation
- :ref:`rl-utils` - Custom callbacks for training metrics collection

.. seealso::

   - Core visualization: ``fusion/visualization/README.md``
   - Plugin interface: ``fusion/visualization/plugins/base_plugin.py``
   - Matplotlib Documentation
   - Seaborn Documentation
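
Appendix: Convergence Detection Sketch
======================================

The three-step algorithm listed under ``ConvergenceDetectionStrategy`` (windowed
mean, adjacent-window comparison, relative-change threshold) can be illustrated
with a minimal, standard-library-only sketch. The function name
``detect_convergence`` and the choice of non-overlapping windows are assumptions
made for illustration; the plugin's actual implementation may differ.

.. code-block:: python

   from statistics import fmean


   def detect_convergence(values, window_size=100, threshold=0.01):
       """Return the index where the series first looks converged, or None.

       Hypothetical helper (not part of the FUSION API): compares the means
       of adjacent, non-overlapping windows and returns the start index of
       the first window whose mean differs from the previous window's mean
       by less than ``threshold`` in relative terms.
       """
       if window_size <= 0:
           raise ValueError("window_size must be positive")
       last_start = len(values) - window_size
       for start in range(window_size, last_start + 1, window_size):
           previous = fmean(values[start - window_size:start])
           current = fmean(values[start:start + window_size])
           scale = max(abs(previous), 1e-12)  # guard against division by zero
           if abs(current - previous) / scale < threshold:
               return start
       return None


   # Synthetic reward curve: rises linearly for 300 episodes, then stays at 1.0
   rewards = [min(i, 300) / 300 for i in range(1000)]
   print(detect_convergence(rewards, window_size=100, threshold=0.01))  # prints 400

The reported index is the start of the first window that matches its
predecessor, so with ``window_size=100`` a plateau beginning at episode 300 is
flagged at episode 400, one full window later.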