Comparison Entities

Comparison data entity for storing comparison configurations.

This module contains the ComparisonData class that stores comparison configurations including sub-comparisons, feature selectors, and data selectors for ML analysis.

class mdxplain.comparison.entities.comparison_data.ComparisonData(name: str, mode: str, feature_selector: str, data_selectors: List[str] | None = None)

Data entity for storing comparison configurations for ML analysis.

Stores comparison configurations that define how to create ML-ready datasets from different data selections. A single ComparisonData can contain multiple sub-comparisons (e.g., in one-vs-rest mode).

Attributes

name : str

Name identifier for this comparison

mode : str

Comparison mode: "binary", "pairwise", "one_vs_rest", or "multiclass"

feature_selector : str

Name of the feature selector to use for columns

data_selectors : List[str]

Names of data selectors involved in the comparison

sub_comparisons : List[Dict[str, Any]]

List of sub-comparisons with their configurations

Examples

Binary comparison:

>>> comp_data = ComparisonData("folded_vs_unfolded", "binary", "key_features")
>>> comp_data.add_sub_comparison("folded_vs_unfolded", ["folded"], ["unfolded"])

One-vs-rest comparison:

>>> comp_data = ComparisonData("conformations", "one_vs_rest", "all_features")
>>> # This will contain multiple sub-comparisons automatically
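
To illustrate what "multiple sub-comparisons" could mean in one-vs-rest mode, here is a minimal sketch in plain Python. It does not use mdxplain; the helper `one_vs_rest_groups` is hypothetical, and the `<name>_vs_rest` naming is inferred from the `print_info()` example output on this page rather than from the library source.

```python
from typing import Dict, List, Tuple


def one_vs_rest_groups(
    selectors: List[str],
) -> Dict[str, Tuple[List[str], List[str]]]:
    """Map '<name>_vs_rest' to (group1, group2) selector lists.

    Hypothetical helper: each selector is compared against all others.
    """
    groups = {}
    for target in selectors:
        # Group 2 is every selector except the current target.
        rest = [s for s in selectors if s != target]
        groups[f"{target}_vs_rest"] = ([target], rest)
    return groups


groups = one_vs_rest_groups(["folded", "intermediate", "unfolded"])
for sub_name, (group1, group2) in groups.items():
    print(sub_name, group1, group2)
```

Each resulting entry has the same shape as the arguments of `add_sub_comparison` (a name plus two lists of data selector names).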

__init__(name: str, mode: str, feature_selector: str, data_selectors: List[str] | None = None)

Initialize comparison data with basic configuration.

ComparisonData is a pure metadata container that stores configuration for ML comparisons without processing actual data. Data processing is handled by ComparisonManager.

Parameters

name : str

Name identifier for this comparison

mode : str

Comparison mode: "binary", "pairwise", "one_vs_rest", or "multiclass"

feature_selector : str

Name of the feature selector to use for columns

data_selectors : List[str], optional

Names of data selectors involved in the comparison

Returns

None

Initializes ComparisonData with given configuration

Examples

>>> comp_data = ComparisonData(
...     "systems_comparison", "pairwise", "important_features",
...     ["system_A", "system_B", "system_C"]
... )
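
As a rough illustration of how a pairwise comparison over three data selectors might break down, the sketch below enumerates every unordered pair with `itertools.combinations`. This is an assumption about what "pairwise" means here; the `<a>_vs_<b>` naming is illustrative and not taken from the library.

```python
from itertools import combinations

selectors = ["system_A", "system_B", "system_C"]

# Assumption: "pairwise" means one sub-comparison per unordered pair.
# Each tuple mirrors the (sub_name, group1_selectors, group2_selectors)
# arguments of add_sub_comparison.
pairs = [
    (f"{a}_vs_{b}", [a], [b])
    for a, b in combinations(selectors, 2)
]

for sub_name, group1, group2 in pairs:
    print(sub_name, group1, group2)
```

With three selectors this yields three pairs; in general, n selectors give n(n-1)/2.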

add_sub_comparison(sub_name: str, group1_selectors: List[str], group2_selectors: List[str], labels: Tuple[int, int] | None = None) -> None

Add a sub-comparison to this comparison.

Parameters

sub_name : str

Name of the sub-comparison

group1_selectors : List[str]

Data selector names for group 1

group2_selectors : List[str]

Data selector names for group 2

labels : Tuple[int, int], optional

Label values for (group1, group2). Defaults to (0, 1)

Returns

None

Adds sub-comparison to the list

Examples

>>> comp_data.add_sub_comparison(
...     "folded_vs_unfolded", ["folded"], ["unfolded"], (0, 1)
... )
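
The sketch below shows one way a sub-comparison record with the documented `(0, 1)` label default could be built. The function `make_sub_comparison` is hypothetical. Only the `group1_selectors` key appears in the examples on this page; the other keys (`name`, `group2_selectors`, `labels`) are assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional, Tuple


def make_sub_comparison(
    sub_name: str,
    group1_selectors: List[str],
    group2_selectors: List[str],
    labels: Optional[Tuple[int, int]] = None,
) -> Dict[str, Any]:
    """Build a sub-comparison record (hypothetical layout)."""
    # Resolve the default inside the function so that None
    # consistently means "use (0, 1)".
    if labels is None:
        labels = (0, 1)
    return {
        "name": sub_name,
        # Copy the lists so later caller-side edits do not leak in.
        "group1_selectors": list(group1_selectors),
        "group2_selectors": list(group2_selectors),
        "labels": labels,
    }


default_labels = make_sub_comparison("folded_vs_unfolded", ["folded"], ["unfolded"])
custom_labels = make_sub_comparison("a_vs_b", ["a"], ["b"], labels=(1, 2))
print(default_labels["labels"], custom_labels["labels"])
```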

get_sub_comparison(sub_name: str) -> Dict[str, Any] | None

Get a specific sub-comparison by name.

Parameters

sub_name : str

Name of the sub-comparison to retrieve

Returns

Dict[str, Any] or None

Sub-comparison dictionary or None if not found

Examples

>>> sub_comp = comp_data.get_sub_comparison("folded_vs_rest")
>>> if sub_comp:
...     print(f"Group 1: {sub_comp['group1_selectors']}")
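
The "dictionary or None" contract can be sketched as a lookup by name over a list of dictionaries. The helper `find_sub_comparison` and the `name` key are assumptions made for illustration, not the library's implementation.

```python
from typing import Any, Dict, List, Optional


def find_sub_comparison(
    sub_comparisons: List[Dict[str, Any]], sub_name: str
) -> Optional[Dict[str, Any]]:
    """Return the first record whose name matches, else None."""
    for sub in sub_comparisons:
        if sub.get("name") == sub_name:
            return sub
    return None


subs = [
    {
        "name": "folded_vs_rest",
        "group1_selectors": ["folded"],
        "group2_selectors": ["intermediate", "unfolded"],
    },
]

hit = find_sub_comparison(subs, "folded_vs_rest")
miss = find_sub_comparison(subs, "missing")

# Comparing against None is stricter than a truthiness check,
# which would also skip an empty dictionary.
if hit is not None:
    print(f"Group 1: {hit['group1_selectors']}")
```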

list_sub_comparisons() -> List[str]

List names of all sub-comparisons.

Returns

List[str]

List of sub-comparison names

Examples

>>> names = comp_data.list_sub_comparisons()
>>> print(f"Available sub-comparisons: {names}")

get_comparison_info() -> Dict[str, Any]

Get summary information about this comparison.

Returns

Dict[str, Any]

Dictionary with comparison summary information

Examples

>>> info = comp_data.get_comparison_info()
>>> print(f"Mode: {info['mode']}")
>>> print(f"Sub-comparisons: {info['n_sub_comparisons']}")

save(save_path: str) -> None

Save ComparisonData object to disk.

Parameters

save_path : str

Path where the ComparisonData object is saved

Returns

None

Saves the ComparisonData object to the specified path

Examples

>>> comparison_data.save('analysis_results/folded_analysis.pkl')

load(load_path: str) -> None

Load ComparisonData object from disk.

Parameters

load_path : str

Path to the saved ComparisonData file

Returns

None

Loads the ComparisonData object from the specified path

Examples

>>> comparison_data.load('analysis_results/folded_analysis.pkl')
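
Because `load` returns None, it presumably restores state into the object it is called on instead of returning a new one. The sketch below shows that pattern with a hypothetical stand-in class, `ConfigHolder`, and the standard `pickle` module. Pickle is an assumption suggested only by the `.pkl` extension in the examples; the actual on-disk format is not documented here.

```python
import os
import pickle
import tempfile


class ConfigHolder:
    """Hypothetical stand-in, not the mdxplain ComparisonData class."""

    def __init__(self, name: str = "", mode: str = "") -> None:
        self.name = name
        self.mode = mode
        self.sub_comparisons = []

    def save(self, save_path: str) -> None:
        # Store the attribute dictionary only, not the instance.
        with open(save_path, "wb") as fh:
            pickle.dump(self.__dict__, fh)

    def load(self, load_path: str) -> None:
        # Restore in place: overwrite this object's attributes.
        with open(load_path, "rb") as fh:
            state = pickle.load(fh)
        self.__dict__.update(state)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "folded_analysis.pkl")

    original = ConfigHolder("folded_analysis", "one_vs_rest")
    original.sub_comparisons.append({"name": "folded_vs_rest"})
    original.save(path)

    restored = ConfigHolder()
    restored.load(path)

print(restored.name, restored.mode, restored.sub_comparisons)
```

As with any pickle-based format, files should only be loaded from trusted sources, since unpickling can execute arbitrary code.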

print_info() -> None

Print comprehensive comparison information.

Parameters

None

Returns

None

Prints comparison information to console

Examples

>>> comparison_data.print_info()
=== ComparisonData ===
Name: folded_analysis
Comparison Mode: one_vs_rest
Feature Selector: key_features
Data Selectors: 3 (folded, intermediate, unfolded)
Sub-Comparisons: 3 (folded_vs_rest, intermediate_vs_rest, unfolded_vs_rest)