Comparison Entities
Comparison data entity for storing comparison configurations.
This module contains the ComparisonData class that stores comparison configurations including sub-comparisons, feature selectors, and data selectors for ML analysis.
- class mdxplain.comparison.entities.comparison_data.ComparisonData(name: str, mode: str, feature_selector: str, data_selectors: List[str] | None = None)
Data entity for storing comparison configurations for ML analysis.
Stores comparison configurations that define how to create ML-ready datasets from different data selections. A single ComparisonData can contain multiple sub-comparisons (e.g., in one-vs-rest mode).
Attributes
- name : str
Name identifier for this comparison
- mode : str
Comparison mode: "binary", "pairwise", "one_vs_rest", "multiclass"
- feature_selector : str
Name of the feature selector to use for columns
- data_selectors : List[str]
Names of data selectors involved in the comparison
- sub_comparisons : List[Dict[str, Any]]
List of sub-comparisons with their configurations
Examples
Binary comparison:
>>> comp_data = ComparisonData("folded_vs_unfolded", "binary", "key_features")
>>> comp_data.add_sub_comparison("folded_vs_unfolded", ["folded"], ["unfolded"])
One-vs-rest comparison:
>>> comp_data = ComparisonData("conformations", "one_vs_rest", "all_features")
>>> # This will contain multiple sub-comparisons automatically
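The one-vs-rest case can be illustrated with a minimal sketch of how one sub-comparison per data selector might be derived. The helper `one_vs_rest_sub_comparisons` and the `"name"` and `"group2_selectors"` keys are hypothetical and not part of mdxplain; only `"group1_selectors"` and the `<selector>_vs_rest` naming appear in the examples on this page.

```python
from typing import Any, Dict, List


def one_vs_rest_sub_comparisons(data_selectors: List[str]) -> List[Dict[str, Any]]:
    """Build one sub-comparison per selector: that selector against all others.

    Hypothetical helper for illustration; not an mdxplain function.
    """
    sub_comparisons = []
    for target in data_selectors:
        # Every selector except the target forms the "rest" group.
        rest = [name for name in data_selectors if name != target]
        sub_comparisons.append(
            {
                "name": f"{target}_vs_rest",  # assumed key; naming as in print_info()
                "group1_selectors": [target],
                "group2_selectors": rest,  # assumed key
            }
        )
    return sub_comparisons


subs = one_vs_rest_sub_comparisons(["folded", "intermediate", "unfolded"])
print([sub["name"] for sub in subs])
```

With three data selectors this yields three sub-comparisons, which is consistent with the counts shown in the print_info() example.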
- __init__(name: str, mode: str, feature_selector: str, data_selectors: List[str] | None = None)
Initialize comparison data with basic configuration.
ComparisonData is a pure metadata container that stores configuration for ML comparisons without processing actual data. Data processing is handled by ComparisonManager.
Parameters
- name : str
Name identifier for this comparison
- mode : str
Comparison mode: "binary", "pairwise", "one_vs_rest", "multiclass"
- feature_selector : str
Name of the feature selector to use for columns
- data_selectors : List[str], optional
Names of data selectors involved in the comparison
Returns
- None
Initializes ComparisonData with given configuration
Examples
>>> comp_data = ComparisonData(
...     "systems_comparison", "pairwise", "important_features",
...     ["system_A", "system_B", "system_C"]
... )
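As a rough illustration of what a "pairwise" comparison over three data selectors could expand to, the sketch below enumerates every unordered pair. This assumes that pairwise means all pairs of selectors, which this page does not state; `pairwise_groups` is a hypothetical helper, not an mdxplain function.

```python
from itertools import combinations
from typing import List, Tuple


def pairwise_groups(data_selectors: List[str]) -> List[Tuple[str, str]]:
    """Return every unordered pair of selectors, in input order.

    Hypothetical helper; assumes "pairwise" means all selector pairs.
    """
    return list(combinations(data_selectors, 2))


pairs = pairwise_groups(["system_A", "system_B", "system_C"])
for first, second in pairs:
    print(f"{first}_vs_{second}")
```

Under that assumption, n data selectors produce n * (n - 1) / 2 pairs, so the three systems above give three pairs.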
- add_sub_comparison(sub_name: str, group1_selectors: List[str], group2_selectors: List[str], labels: Tuple[int, int] | None = None) -> None
Add a sub-comparison to this comparison.
Parameters
- sub_name : str
Name of the sub-comparison
- group1_selectors : List[str]
Data selector names for group 1
- group2_selectors : List[str]
Data selector names for group 2
- labels : Tuple[int, int], optional
Label values for (group1, group2). Defaults to (0, 1)
Returns
- None
Adds sub-comparison to the list
Examples
>>> comp_data.add_sub_comparison(
...     "folded_vs_unfolded", ["folded"], ["unfolded"], (0, 1)
... )
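The default-label behaviour described above can be sketched with a small stand-in class. `ComparisonSketch` is hypothetical and only mirrors the documented signature; the dictionary keys other than `"group1_selectors"` are assumptions, and the real class may store sub-comparisons differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class ComparisonSketch:
    """Hypothetical stand-in for illustration; not the mdxplain class."""

    name: str
    mode: str
    feature_selector: str
    sub_comparisons: List[Dict[str, Any]] = field(default_factory=list)

    def add_sub_comparison(
        self,
        sub_name: str,
        group1_selectors: List[str],
        group2_selectors: List[str],
        labels: Optional[Tuple[int, int]] = None,
    ) -> None:
        # Fall back to the documented default of (0, 1) when labels is omitted.
        if labels is None:
            labels = (0, 1)
        self.sub_comparisons.append(
            {
                "name": sub_name,  # assumed key
                "group1_selectors": list(group1_selectors),
                "group2_selectors": list(group2_selectors),  # assumed key
                "labels": labels,  # assumed key
            }
        )


sketch = ComparisonSketch("folded_vs_unfolded", "binary", "key_features")
sketch.add_sub_comparison("folded_vs_unfolded", ["folded"], ["unfolded"])
print(sketch.sub_comparisons[0]["labels"])
```

Using `None` as the sentinel rather than `(0, 1)` directly in the signature matches the documented `labels: Tuple[int, int] | None = None` parameter.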
- get_sub_comparison(sub_name: str) -> Dict[str, Any] | None
Get a specific sub-comparison by name.
Parameters
- sub_name : str
Name of the sub-comparison to retrieve
Returns
- Dict[str, Any] or None
Sub-comparison dictionary or None if not found
Examples
>>> sub_comp = comp_data.get_sub_comparison("folded_vs_rest")
>>> if sub_comp:
...     print(f"Group 1: {sub_comp['group1_selectors']}")
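Because the method returns None for an unknown name, callers need to check the result before indexing into it. The lookup can be sketched as a linear search over a list of dictionaries; `find_sub_comparison` and the `"name"` key are hypothetical and chosen for illustration only.

```python
from typing import Any, Dict, List, Optional


def find_sub_comparison(
    sub_comparisons: List[Dict[str, Any]], sub_name: str
) -> Optional[Dict[str, Any]]:
    """Return the first sub-comparison whose name matches, else None.

    Hypothetical helper; assumes each entry stores its name under "name".
    """
    for sub in sub_comparisons:
        if sub.get("name") == sub_name:
            return sub
    return None


stored = [
    {"name": "folded_vs_rest", "group1_selectors": ["folded"]},
    {"name": "unfolded_vs_rest", "group1_selectors": ["unfolded"]},
]

found = find_sub_comparison(stored, "folded_vs_rest")
missing = find_sub_comparison(stored, "not_defined")

if found is not None:
    print(f"Group 1: {found['group1_selectors']}")
print(missing is None)
```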
- list_sub_comparisons() -> List[str]
List names of all sub-comparisons.
Returns
- List[str]
List of sub-comparison names
Examples
>>> names = comp_data.list_sub_comparisons()
>>> print(f"Available sub-comparisons: {names}")
- get_comparison_info() -> Dict[str, Any]
Get summary information about this comparison.
Returns
- Dict[str, Any]
Dictionary with comparison summary information
Examples
>>> info = comp_data.get_comparison_info()
>>> print(f"Mode: {info['mode']}")
>>> print(f"Sub-comparisons: {info['n_sub_comparisons']}")
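A summary dictionary of this kind could be assembled as in the sketch below. Only the `'mode'` and `'n_sub_comparisons'` keys are confirmed by the example above; the function `summarize_comparison` and the remaining keys are assumptions made for illustration.

```python
from typing import Any, Dict, List


def summarize_comparison(
    name: str,
    mode: str,
    feature_selector: str,
    data_selectors: List[str],
    sub_comparisons: List[Dict[str, Any]],
) -> Dict[str, Any]:
    """Collect comparison metadata into a flat summary dictionary.

    Hypothetical helper; only "mode" and "n_sub_comparisons" are documented keys.
    """
    return {
        "name": name,  # assumed key
        "mode": mode,
        "feature_selector": feature_selector,  # assumed key
        "n_data_selectors": len(data_selectors),  # assumed key
        "n_sub_comparisons": len(sub_comparisons),
    }


info = summarize_comparison(
    "conformations",
    "one_vs_rest",
    "all_features",
    ["folded", "intermediate", "unfolded"],
    [{"name": "folded_vs_rest"}, {"name": "intermediate_vs_rest"}],
)
print(f"Mode: {info['mode']}")
print(f"Sub-comparisons: {info['n_sub_comparisons']}")
```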
- save(save_path: str) -> None
Save ComparisonData object to disk.
Parameters
- save_path : str
Path to which the ComparisonData object is saved
Returns
- None
Saves the ComparisonData object to the specified path
Examples
>>> comparison_data.save('analysis_results/folded_analysis.pkl')
- load(load_path: str) -> None
Load ComparisonData object from disk.
Parameters
- load_path : str
Path to the saved ComparisonData file
Returns
- None
Loads the ComparisonData object from the specified path
Examples
>>> comparison_data.load('analysis_results/folded_analysis.pkl')
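Since load() returns None, it presumably restores state into the existing object rather than returning a new one. The sketch below shows one way such an in-place round trip could work using the standard `pickle` module. The use of pickle is an assumption suggested only by the `.pkl` extension in the examples, and `PersistableSketch` is a hypothetical stand-in, not the mdxplain implementation.

```python
import os
import pickle
import tempfile


class PersistableSketch:
    """Hypothetical stand-in showing an in-place save/load round trip."""

    def __init__(self, name: str = "") -> None:
        self.name = name
        self.sub_comparisons = []

    def save(self, save_path: str) -> None:
        # Serialize only the attribute dictionary (assumed storage format).
        with open(save_path, "wb") as handle:
            pickle.dump(self.__dict__, handle)

    def load(self, load_path: str) -> None:
        # Overwrite this object's attributes with the stored ones; returns None.
        with open(load_path, "rb") as handle:
            self.__dict__.update(pickle.load(handle))


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "folded_analysis.pkl")

    original = PersistableSketch("folded_analysis")
    original.sub_comparisons.append({"name": "folded_vs_rest"})
    original.save(path)

    restored = PersistableSketch()
    restored.load(path)

print(restored.name)
```

If the real format is pickle-based, files should only be loaded from trusted sources, because unpickling can execute arbitrary code.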
- print_info() -> None
Print comprehensive comparison information.
Parameters
None
Returns
- None
Prints comparison information to console
Examples
>>> comparison_data.print_info()
=== ComparisonData ===
Name: folded_analysis
Comparison Mode: one_vs_rest
Feature Selector: key_features
Data Selectors: 3 (folded, intermediate, unfolded)
Sub-Comparisons: 3 (folded_vs_rest, intermediate_vs_rest, unfolded_vs_rest)
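A report in this layout could be produced by a formatter along the following lines. `format_comparison_info` is a hypothetical helper that reproduces the output shown in the example; it is not taken from the mdxplain source.

```python
from typing import List


def format_comparison_info(
    name: str,
    mode: str,
    feature_selector: str,
    data_selectors: List[str],
    sub_names: List[str],
) -> str:
    """Format comparison metadata as a multi-line report (hypothetical helper)."""
    lines = [
        "=== ComparisonData ===",
        f"Name: {name}",
        f"Comparison Mode: {mode}",
        f"Feature Selector: {feature_selector}",
        # Counts are followed by the comma-separated names in parentheses.
        f"Data Selectors: {len(data_selectors)} ({', '.join(data_selectors)})",
        f"Sub-Comparisons: {len(sub_names)} ({', '.join(sub_names)})",
    ]
    return "\n".join(lines)


report = format_comparison_info(
    "folded_analysis",
    "one_vs_rest",
    "key_features",
    ["folded", "intermediate", "unfolded"],
    ["folded_vs_rest", "intermediate_vs_rest", "unfolded_vs_rest"],
)
print(report)
```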