Comparison Manager

Comparison manager for creating data comparisons.

This module provides the ComparisonManager class that creates comparisons between different data selections for further analysis. It supports various comparison modes and automatically generates appropriate sub-comparisons.

class mdxplain.comparison.manager.comparison_manager.ComparisonManager

Manager for creating and managing data comparisons.

This class provides methods to create comparisons between different data selections (created by DataSelectorManager) for further analysis. It supports various comparison modes and automatically generates the appropriate sub-comparisons.

Supported modes:

  • Binary: Simple A vs B comparison

  • Pairwise: All possible pairs from multiple selectors

  • One-vs-rest: Each selector vs all others combined

  • Multiclass: All selectors as separate classes
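As a rough illustration of how each mode could expand into sub-comparisons, the sketch below enumerates them with plain Python. The helper expand_mode, the naming scheme ("a_vs_b", "a_vs_rest"), and the rule that binary mode takes exactly two selectors are assumptions made for this example; they are not taken from the mdxplain source and the library may generate and name its sub-comparisons differently.

```python
from itertools import combinations
from typing import List, Tuple

# Hypothetical representation, not part of mdxplain: a sub-comparison is
# (name, groups), where each group lists the selectors forming one class.
SubComparison = Tuple[str, List[List[str]]]


def expand_mode(mode: str, selectors: List[str]) -> List[SubComparison]:
    """Expand a comparison mode into illustrative sub-comparisons."""
    if mode == "binary":
        # Assumption: binary means exactly one A-vs-B pair.
        if len(selectors) != 2:
            raise ValueError("binary mode needs exactly two selectors")
        a, b = selectors
        return [(f"{a}_vs_{b}", [[a], [b]])]
    if mode == "pairwise":
        # Every unordered pair of selectors becomes its own sub-comparison.
        return [(f"{a}_vs_{b}", [[a], [b]]) for a, b in combinations(selectors, 2)]
    if mode == "one_vs_rest":
        # Each selector is compared against all remaining selectors combined.
        return [
            (f"{s}_vs_rest", [[s], [o for o in selectors if o != s]])
            for s in selectors
        ]
    if mode == "multiclass":
        # One sub-comparison in which every selector is a separate class.
        return [("multiclass", [[s] for s in selectors])]
    raise ValueError(f"unknown mode: {mode!r}")


clusters = ["c0", "c1", "c2"]
for mode in ("pairwise", "one_vs_rest", "multiclass"):
    names = [name for name, _ in expand_mode(mode, clusters)]
    print(mode, names)
```

With three selectors, pairwise and one-vs-rest each yield three sub-comparisons, while multiclass yields a single one with three classes.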

Examples

Pipeline mode (automatic injection):

>>> pipeline = PipelineManager()
>>> pipeline.comparison.create_comparison(
...     "folded_vs_unfolded", "binary", "key_features",
...     ["folded_frames", "unfolded_frames"]
... )

Standalone mode:

>>> pipeline_data = PipelineData()
>>> manager = ComparisonManager()
>>> manager.create_comparison(
...     pipeline_data, "folded_vs_unfolded", "binary", "key_features",
...     ["folded_frames", "unfolded_frames"]
... )

__init__() -> None

Initialize the comparison manager.

The ComparisonManager creates and manages comparison metadata. Actual data processing is done via the get_comparison_data() method, which uses PipelineData for memmap-safe operations.

Parameters

None

No parameters required for initialization

Returns

None

Initializes ComparisonManager instance

create_comparison(pipeline_data: PipelineData, name: str, mode: str, feature_selector: str, data_selectors: List[str] | None = None, data_selector_groups: str | List[str] | None = None) -> None

Create a new comparison with specified mode and selectors.

Warning

When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.

Pipeline mode:

>>> pipeline = PipelineManager()
>>> pipeline.comparison.create_comparison("folded_vs_unfolded", "binary", "key_features", ["folded_frames", "unfolded_frames"])  # NO pipeline_data parameter

Standalone mode:

>>> pipeline_data = PipelineData()
>>> manager = ComparisonManager()
>>> manager.create_comparison(pipeline_data, "folded_vs_unfolded", "binary", "key_features", ["folded_frames", "unfolded_frames"])  # WITH pipeline_data parameter

Parameters

pipeline_data : PipelineData

Pipeline data object to store the comparison

name : str

Name for the new comparison

mode : str

Comparison mode: "binary", "pairwise", "one_vs_rest", "multiclass"

feature_selector : str

Name of the feature selector to use (defines columns)

data_selectors : List[str], optional

Names of individual data selectors to compare

data_selector_groups : str or List[str], optional

Name(s) of data selector groups to use

Returns

None

Creates ComparisonData in pipeline_data

Raises

ValueError

If the comparison already exists, the mode is invalid, or a selector is not found
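Because creating a comparison under an existing name raises ValueError, a caller that wants to overwrite one can check list_comparisons and call remove_comparison first. The sketch below shows that pattern using the standalone call signatures documented on this page. The helper replace_comparison and the class _StubManager are hypothetical: the stub is a minimal in-memory stand-in, with a plain dict playing the role of PipelineData, so that the example runs without mdxplain installed.

```python
from typing import List


def replace_comparison(manager, pipeline_data, name: str, mode: str,
                       feature_selector: str, data_selectors: List[str]) -> None:
    """Create a comparison, first removing any existing one with the same name."""
    if name in manager.list_comparisons(pipeline_data):
        manager.remove_comparison(pipeline_data, name)
    manager.create_comparison(
        pipeline_data, name, mode, feature_selector,
        data_selectors=data_selectors,
    )


class _StubManager:
    """Hypothetical stand-in for ComparisonManager, for illustration only."""

    def list_comparisons(self, pipeline_data):
        return list(pipeline_data)

    def remove_comparison(self, pipeline_data, name):
        del pipeline_data[name]

    def create_comparison(self, pipeline_data, name, mode, feature_selector,
                          data_selectors=None, data_selector_groups=None):
        # Mirrors the documented behaviour: duplicate names are rejected.
        if name in pipeline_data:
            raise ValueError(f"comparison {name!r} already exists")
        pipeline_data[name] = (mode, feature_selector, data_selectors)


store = {}  # stands in for PipelineData in this sketch
stub = _StubManager()
selectors = ["folded_frames", "unfolded_frames"]
replace_comparison(stub, store, "folded_vs_unfolded", "binary", "key_features", selectors)
# The second call would raise ValueError without the remove step.
replace_comparison(stub, store, "folded_vs_unfolded", "pairwise", "key_features", selectors)
print(store["folded_vs_unfolded"][0])
```

With a real ComparisonManager and PipelineData, the same helper should work unchanged, provided the methods behave as documented here.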

Examples

>>> # Binary comparison
>>> manager.create_comparison(
...     pipeline_data, "folded_vs_unfolded", "binary", "key_features",
...     data_selectors=["folded_frames", "unfolded_frames"]
... )
>>> # Using groups
>>> manager.create_comparison(
...     pipeline_data, "cluster_comp", "one_vs_rest", "features",
...     data_selector_groups="clusters"
... )

list_comparisons(pipeline_data: PipelineData) -> List[str]

List all available comparisons.

Parameters

pipeline_data : PipelineData

Pipeline data object

Returns

List[str]

List of comparison names

Examples

>>> comparisons = manager.list_comparisons(pipeline_data)
>>> print(f"Available comparisons: {comparisons}")

get_comparison_info(pipeline_data: PipelineData, name: str) -> Dict[str, Any]

Get information about a comparison.

Parameters

pipeline_data : PipelineData

Pipeline data object

name : str

Name of the comparison

Returns

Dict[str, Any]

Dictionary with comparison information

Examples

>>> info = manager.get_comparison_info(pipeline_data, "conformations")
>>> print(f"Mode: {info['mode']}")
>>> print(f"Sub-comparisons: {info['sub_comparison_names']}")

remove_comparison(pipeline_data: PipelineData, name: str) -> None

Remove a comparison.

Parameters

pipeline_data : PipelineData

Pipeline data object

name : str

Name of the comparison to remove

Returns

None

Removes the comparison from pipeline_data

Examples

>>> manager.remove_comparison(pipeline_data, "old_comparison")

save(pipeline_data: PipelineData, save_path: str) -> None

Save all comparison data to a single file.

Warning

When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.

Pipeline mode:

>>> pipeline = PipelineManager()
>>> pipeline.comparison.save('comparison.npy')  # NO pipeline_data parameter

Standalone mode:

>>> pipeline_data = PipelineData()
>>> manager = ComparisonManager()
>>> manager.save(pipeline_data, 'comparison.npy')  # pipeline_data required

Parameters

pipeline_data : PipelineData

Pipeline data container with comparison data

save_path : str

Path to the single file in which all comparison data is saved

Returns

None

Saves all comparison data to the specified file

Examples

>>> manager.save(pipeline_data, 'comparison.npy')

load(pipeline_data: PipelineData, load_path: str) -> None

Load all comparison data from a single file.

Warning

When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.

Pipeline mode:

>>> pipeline = PipelineManager()
>>> pipeline.comparison.load('comparison.npy')  # NO pipeline_data parameter

Standalone mode:

>>> pipeline_data = PipelineData()
>>> manager = ComparisonManager()
>>> manager.load(pipeline_data, 'comparison.npy')  # pipeline_data required

Parameters

pipeline_data : PipelineData

Pipeline data container to load comparison data into

load_path : str

Path to saved comparison data file

Returns

None

Loads all comparison data from the specified file

Examples

>>> manager.load(pipeline_data, 'comparison.npy')

print_info(pipeline_data: PipelineData) -> None

Print comparison information.

Warning

When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.

Pipeline mode:

>>> pipeline = PipelineManager()
>>> pipeline.comparison.print_info()  # NO pipeline_data parameter

Standalone mode:

>>> pipeline_data = PipelineData()
>>> manager = ComparisonManager()
>>> manager.print_info(pipeline_data)  # pipeline_data required

Parameters

pipeline_data : PipelineData

Pipeline data container with comparison data

Returns

None

Prints comparison information to console

Examples

>>> manager.print_info(pipeline_data)