Comparison Manager
Comparison manager for creating data comparisons.
This module provides the ComparisonManager class that creates comparisons between different data selections for further analysis. It supports various comparison modes and automatically generates appropriate sub-comparisons.
- class mdxplain.comparison.manager.comparison_manager.ComparisonManager
Manager for creating and managing data comparisons.
This class provides methods to create comparisons between different data selections (created by DataSelectorManager) for further analysis. It supports various comparison modes and automatically generates the appropriate sub-comparisons.
Supported modes:
Binary: Simple A vs B comparison
Pairwise: All possible pairs from multiple selectors
One-vs-rest: Each selector vs all others combined
Multiclass: All selectors as separate classes
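The way each mode could expand a list of selectors into sub-comparisons can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the helper plan_sub_comparisons, the generated names such as "a_vs_b", and the treatment of multiclass as a single sub-comparison with one class per selector are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List


def plan_sub_comparisons(mode: str, selectors: List[str]) -> Dict[str, List[List[str]]]:
    """Hypothetical helper (not part of mdxplain).

    Maps each sub-comparison name to a list of classes, where every class
    is the list of selector names pooled into it.
    """
    if mode == "binary":
        # Assumption: binary mode takes exactly two selectors, A and B.
        if len(selectors) != 2:
            raise ValueError("binary mode needs exactly two selectors")
        a, b = selectors
        return {f"{a}_vs_{b}": [[a], [b]]}
    if mode == "pairwise":
        # Every unordered pair of selectors becomes its own sub-comparison.
        return {f"{a}_vs_{b}": [[a], [b]] for a, b in combinations(selectors, 2)}
    if mode == "one_vs_rest":
        # Each selector is compared against all remaining selectors combined.
        return {
            f"{s}_vs_rest": [[s], [other for other in selectors if other != s]]
            for s in selectors
        }
    if mode == "multiclass":
        # Assumption: a single sub-comparison with one class per selector.
        return {"all_classes": [[s] for s in selectors]}
    raise ValueError(f"unknown mode: {mode!r}")


plan = plan_sub_comparisons("pairwise", ["c0", "c1", "c2"])
print(list(plan))
```

For n selectors, this sketch yields n(n-1)/2 pairwise sub-comparisons and n one-vs-rest sub-comparisons, which is worth keeping in mind when a selector group is large.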
Examples
Pipeline mode (automatic injection):
>>> pipeline = PipelineManager()
>>> pipeline.comparison.create_comparison(
...     "folded_vs_unfolded", "binary", "key_features",
...     ["folded_frames", "unfolded_frames"]
... )
Standalone mode:
>>> pipeline_data = PipelineData()
>>> manager = ComparisonManager()
>>> manager.create_comparison(
...     pipeline_data, "folded_vs_unfolded", "binary", "key_features",
...     ["folded_frames", "unfolded_frames"]
... )
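The difference between the two calling conventions can be illustrated with a small proxy that prepends a shared data object to every method call. This is only a sketch of the general pattern, assuming nothing about how PipelineManager is actually implemented; InjectingProxy and DemoManager are hypothetical names used purely for illustration.

```python
from typing import Any


class InjectingProxy:
    """Hypothetical wrapper: forwards method calls and prepends a shared data object."""

    def __init__(self, manager: Any, pipeline_data: Any) -> None:
        self._manager = manager
        self._pipeline_data = pipeline_data

    def __getattr__(self, name: str) -> Any:
        # Only reached for attributes not defined on the proxy itself.
        attribute = getattr(self._manager, name)
        if not callable(attribute):
            return attribute

        def call(*args: Any, **kwargs: Any) -> Any:
            # pipeline_data is supplied here, so callers must not pass it.
            return attribute(self._pipeline_data, *args, **kwargs)

        return call


class DemoManager:
    """Stand-in for a manager whose methods take pipeline_data first."""

    def create_comparison(self, pipeline_data: dict, name: str, mode: str) -> None:
        pipeline_data[name] = mode


shared_data: dict = {}
proxy = InjectingProxy(DemoManager(), shared_data)
proxy.create_comparison("folded_vs_unfolded", "binary")  # no pipeline_data argument
print(shared_data)
```

In this sketch, passing the data object explicitly through the proxy shifts every positional argument by one and fails with a TypeError, which is the kind of mistake the pipeline-mode convention is meant to avoid.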
- __init__() -> None
Initialize the comparison manager.
The ComparisonManager creates and manages comparison metadata. The actual data processing is done by the get_comparison_data() method, which uses PipelineData for memmap-safe operations.
Parameters
- None
No parameters required for initialization
Returns
- None
Initializes ComparisonManager instance
- create_comparison(pipeline_data: PipelineData, name: str, mode: str, feature_selector: str, data_selectors: List[str] | None = None, data_selector_groups: str | List[str] | None = None) -> None
Create a new comparison with specified mode and selectors.
Warning
When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.
Pipeline mode:
>>> pipeline = PipelineManager()
>>> pipeline.comparison.create_comparison(
...     "folded_vs_unfolded", "binary", "key_features",
...     ["folded_frames", "unfolded_frames"]
... )  # NO pipeline_data parameter
Standalone mode:
>>> pipeline_data = PipelineData()
>>> manager = ComparisonManager()
>>> manager.create_comparison(
...     pipeline_data, "folded_vs_unfolded", "binary", "key_features",
...     ["folded_frames", "unfolded_frames"]
... )  # WITH pipeline_data parameter
Parameters
- pipeline_data : PipelineData
Pipeline data object to store the comparison
- name : str
Name for the new comparison
- mode : str
Comparison mode: "binary", "pairwise", "one_vs_rest", or "multiclass"
- feature_selector : str
Name of the feature selector to use (defines columns)
- data_selectors : List[str], optional
Names of individual data selectors to compare
- data_selector_groups : str or List[str], optional
Name(s) of data selector groups to use
Returns
- None
Creates ComparisonData in pipeline_data
Raises
- ValueError
If the comparison already exists, the mode is invalid, or a selector is not found
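The three documented error cases can be sketched as a pre-flight check. This is a hypothetical stand-alone function, not the library's validation code: validate_comparison_request, its arguments, and the error messages are assumptions, and only the mode names come from the parameter description above.

```python
from typing import List, Optional, Set

# Mode names as listed in the parameter description.
VALID_MODES = ("binary", "pairwise", "one_vs_rest", "multiclass")


def validate_comparison_request(
    existing: Set[str],
    known_selectors: Set[str],
    name: str,
    mode: str,
    data_selectors: Optional[List[str]] = None,
) -> None:
    """Hypothetical check mirroring the documented ValueError cases."""
    if name in existing:
        raise ValueError(f"Comparison '{name}' already exists")
    if mode not in VALID_MODES:
        raise ValueError(f"Invalid mode '{mode}'; expected one of {VALID_MODES}")
    missing = [s for s in (data_selectors or []) if s not in known_selectors]
    if missing:
        raise ValueError(f"Data selectors not found: {missing}")


try:
    validate_comparison_request(
        existing={"folded_vs_unfolded"},
        known_selectors={"folded_frames", "unfolded_frames"},
        name="folded_vs_unfolded",
        mode="binary",
        data_selectors=["folded_frames", "unfolded_frames"],
    )
except ValueError as error:
    print(f"Rejected: {error}")
```

Callers who want to overwrite an existing comparison would, under this reading, first call remove_comparison() and then create it again.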
Examples
>>> # Binary comparison
>>> manager.create_comparison(
...     pipeline_data, "folded_vs_unfolded", "binary", "key_features",
...     data_selectors=["folded_frames", "unfolded_frames"]
... )
>>> # Using groups
>>> manager.create_comparison(
...     pipeline_data, "cluster_comp", "one_vs_rest", "features",
...     data_selector_groups="clusters"
... )
- list_comparisons(pipeline_data: PipelineData) -> List[str]
List all available comparisons.
Parameters
- pipeline_data : PipelineData
Pipeline data object
Returns
- List[str]
List of comparison names
Examples
>>> comparisons = manager.list_comparisons(pipeline_data)
>>> print(f"Available comparisons: {comparisons}")
- get_comparison_info(pipeline_data: PipelineData, name: str) -> Dict[str, Any]
Get information about a comparison.
Parameters
- pipeline_data : PipelineData
Pipeline data object
- name : str
Name of the comparison
Returns
- Dict[str, Any]
Dictionary with comparison information
Examples
>>> info = manager.get_comparison_info(pipeline_data, "conformations")
>>> print(f"Mode: {info['mode']}")
>>> print(f"Sub-comparisons: {info['sub_comparison_names']}")
- remove_comparison(pipeline_data: PipelineData, name: str) -> None
Remove a comparison.
Parameters
- pipeline_data : PipelineData
Pipeline data object
- name : str
Name of the comparison to remove
Returns
- None
Removes the comparison from pipeline_data
Examples
>>> manager.remove_comparison(pipeline_data, "old_comparison")
- save(pipeline_data: PipelineData, save_path: str) -> None
Save all comparison data to a single file.
Warning
When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.
Pipeline mode:
>>> pipeline = PipelineManager()
>>> pipeline.comparison.save('comparison.npy')  # NO pipeline_data parameter
Standalone mode:
>>> pipeline_data = PipelineData()
>>> manager = ComparisonManager()
>>> manager.save(pipeline_data, 'comparison.npy')  # pipeline_data required
Parameters
- pipeline_data : PipelineData
Pipeline data container with comparison data
- save_path : str
Path of the single file in which all comparison data is saved
Returns
- None
Saves all comparison data to the specified file
Examples
>>> manager.save(pipeline_data, 'comparison.npy')
- load(pipeline_data: PipelineData, load_path: str) -> None
Load all comparison data from a single file.
Warning
When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.
Pipeline mode:
>>> pipeline = PipelineManager()
>>> pipeline.comparison.load('comparison.npy')  # NO pipeline_data parameter
Standalone mode:
>>> pipeline_data = PipelineData()
>>> manager = ComparisonManager()
>>> manager.load(pipeline_data, 'comparison.npy')  # pipeline_data required
Parameters
- pipeline_data : PipelineData
Pipeline data container to load comparison data into
- load_path : str
Path to saved comparison data file
Returns
- None
Loads all comparison data from the specified file
Examples
>>> manager.load(pipeline_data, 'comparison.npy')
- print_info(pipeline_data: PipelineData) -> None
Print comparison information.
Warning
When using PipelineManager, do NOT provide the pipeline_data parameter. The PipelineManager automatically injects this parameter.
Pipeline mode:
>>> pipeline = PipelineManager()
>>> pipeline.comparison.print_info()  # NO pipeline_data parameter
Standalone mode:
>>> pipeline_data = PipelineData()
>>> manager = ComparisonManager()
>>> manager.print_info(pipeline_data)  # pipeline_data required
Parameters
- pipeline_data : PipelineData
Pipeline data container with comparison data
Returns
- None
Prints comparison information to console
Examples
>>> manager.print_info(pipeline_data)