Comparison Data Helper

Helper class for comparison data processing.

This module provides the ComparisonDataHelper class, whose static methods process comparison data without creating circular dependencies. PipelineData uses it to provide access to comparison data.

class mdxplain.pipeline.helper.comparison_data_helper.ComparisonDataHelper

Static helper class for comparison data processing.

Provides static methods to process comparison data by combining ComparisonData metadata with PipelineData’s get_selected_data() method. Avoids circular dependencies by being part of the pipeline module.
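
The dependency structure described above can be illustrated with a minimal sketch. Every name below (ToyComparisonData, ToyPipelineData, ToyComparisonHelper, select_rows) is a hypothetical stand-in and not part of mdxplain; in particular, select_rows only mimics the role of get_selected_data(), whose real signature is not shown in this section. The point is that neither data class refers to the other, and the stateless helper receives both as arguments.

```python
from typing import Dict, List, Tuple


class ToyComparisonData:
    """Hypothetical metadata container: sub-comparison name -> {label: frame indices}."""

    def __init__(self, sub_comparisons: Dict[str, Dict[int, List[int]]]) -> None:
        self.sub_comparisons = sub_comparisons


class ToyPipelineData:
    """Hypothetical data holder: stores feature rows, knows nothing about comparisons."""

    def __init__(self, rows: List[List[float]]) -> None:
        self.rows = rows

    def select_rows(self, frame_indices: List[int]) -> List[List[float]]:
        # Simplified stand-in for a data-selection method.
        return [self.rows[i] for i in frame_indices]


class ToyComparisonHelper:
    """Stateless helper: both objects are passed in, so neither imports the other."""

    @staticmethod
    def get_sub_comparison_data(
        pipeline_data: ToyPipelineData,
        comparison_data: ToyComparisonData,
        name: str,
    ) -> Tuple[List[List[float]], List[int]]:
        if name not in comparison_data.sub_comparisons:
            raise ValueError(f"Sub-comparison '{name}' not found")
        X: List[List[float]] = []
        y: List[int] = []
        # Visit labels in sorted order so the output is deterministic.
        for label, frames in sorted(comparison_data.sub_comparisons[name].items()):
            X.extend(pipeline_data.select_rows(frames))
            y.extend([label] * len(frames))
        return X, y


comparison = ToyComparisonData({"folded_vs_rest": {1: [0, 2], 0: [1, 3]}})
pipeline = ToyPipelineData([[0.1, 0.2], [0.9, 0.8], [0.15, 0.25], [0.85, 0.75]])
X, y = ToyComparisonHelper.get_sub_comparison_data(
    pipeline, comparison, "folded_vs_rest"
)
```

Keeping the join logic in a separate static method means the metadata class and the data class can live in different modules without importing each other.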

Examples

>>> # Used internally by PipelineData
>>> X, y = ComparisonDataHelper.get_sub_comparison_data(
...     pipeline_data, comparison_data, "folded_vs_rest"
... )

static get_sub_comparison_data(pipeline_data: PipelineData, comparison_data: ComparisonData, sub_comparison_name: str) -> Tuple[np.ndarray, np.ndarray]

Get X (features) and y (labels) for a specific sub-comparison.

This method combines ComparisonData metadata with PipelineData’s data processing capabilities to create ML-ready datasets.

Parameters

pipeline_data : PipelineData

Pipeline data object containing feature and data selections

comparison_data : ComparisonData

Comparison metadata container

sub_comparison_name : str

Name of the sub-comparison to process

Returns

Tuple[np.ndarray, np.ndarray]

Tuple of (X, y), where X is the feature matrix and y is the label array

Raises

ValueError

If the sub-comparison is not found or required data is missing

Examples

>>> X, y = ComparisonDataHelper.get_sub_comparison_data(
...     pipeline_data, comparison_data, "folded_vs_rest"
... )
>>> print(f"Features shape: {X.shape}")
>>> print(f"Labels shape: {y.shape}")
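
To make the shapes of X and y concrete, here is a small NumPy-only sketch. It assumes that a sub-comparison named like "folded_vs_rest" labels one group of frames 1 and all remaining frames 0; that reading, the function name one_vs_rest_xy, and the toy feature matrix are illustrative assumptions, not mdxplain's actual implementation.

```python
import numpy as np


def one_vs_rest_xy(features, positive_frames):
    """Hypothetical helper: label the given frames 1 and every other frame 0."""
    n_frames = features.shape[0]
    positive_frames = np.asarray(positive_frames, dtype=int)
    if positive_frames.size == 0:
        raise ValueError("No frames given for the positive group")
    if positive_frames.min() < 0 or positive_frames.max() >= n_frames:
        raise ValueError("Frame index out of range")
    y = np.zeros(n_frames, dtype=int)
    y[positive_frames] = 1  # integer-array indexing marks the positive group
    return features, y


# Toy data: 6 frames with 3 features each (made-up values).
features = np.arange(18, dtype=float).reshape(6, 3)
folded_frames = [0, 1, 4]

X, y = one_vs_rest_xy(features, folded_frames)
print(f"Features shape: {X.shape}")  # Features shape: (6, 3)
print(f"Labels shape: {y.shape}")    # Labels shape: (6,)
```

X has one row per frame and one column per feature, and y has one label per row of X, which is the layout most ML estimators expect.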