Discrete Feature Helper

Helper for discrete feature data preparation.

Provides data conversion for discrete features to prepare them for visualization by mapping categorical values to integer positions.

class mdxplain.plots.helper.discrete_feature_helper.DiscreteFeatureHelper

Examples

>>> # Convert character data to positions
>>> data = np.array(['H', 'E', 'C', 'H'])
>>> mapping = {'H': 0, 'E': 1, 'C': 2}
>>> positions = DiscreteFeatureHelper.prepare_discrete_data(data, mapping)
>>> print(positions)
[0 1 2 0]

static prepare_discrete_data(data: ndarray, value_to_position: Dict) -> ndarray

Convert discrete data to integer positions for plotting.

Maps categorical data values (integers or characters) to sequential integer positions suitable for visualization.

Parameters

data : np.ndarray

Original data (integers or character strings)

value_to_position : dict

Mapping from data values to plot positions

Returns

np.ndarray

Data converted to integer positions

Examples

>>> # Integer data (no conversion needed)
>>> data = np.array([0, 1, 2, 0, 1])
>>> mapping = {0: 0, 1: 1, 2: 2}
>>> positions = DiscreteFeatureHelper.prepare_discrete_data(data, mapping)
>>> print(positions)
[0 1 2 0 1]
>>> # Character data (needs conversion)
>>> data = np.array(['H', 'E', 'C', 'H', 'E'])
>>> mapping = {'H': 0, 'E': 1, 'C': 2}
>>> positions = DiscreteFeatureHelper.prepare_discrete_data(data, mapping)
>>> print(positions)
[0 1 2 0 1]

Notes

Values not present in value_to_position are returned as -1.
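The handling of unmapped values described in the note can be illustrated with a small stand-alone sketch. The function map_to_positions and its missing parameter are hypothetical names used only for illustration; this is not the mdxplain implementation, just one way to get the documented behavior with a dictionary lookup.

```python
import numpy as np


def map_to_positions(data, value_to_position, missing=-1):
    """Hypothetical stand-in for the documented mapping behavior."""
    arr = np.asarray(data)
    # tolist() converts NumPy scalars to plain Python values so that
    # they match the dictionary keys.
    mapped = [value_to_position.get(v, missing) for v in arr.ravel().tolist()]
    return np.array(mapped, dtype=int).reshape(arr.shape)


data = np.array(['H', 'E', 'X', 'H'])  # 'X' has no entry in the mapping
mapping = {'H': 0, 'E': 1, 'C': 2}
positions = map_to_positions(data, mapping)
print(positions.tolist())  # [0, 1, -1, 0]
```

Because -1 is not a valid plot position, callers would typically filter or mask these entries before plotting.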

static build_axis_config(selector_data: Dict[str, ndarray] | None = None, viz: Dict[str, Any] | None = None, long_labels: bool = False, x_padding: float = 0.3, fallback_from_data: bool = False) -> Dict[str, Any]

Build axis configuration for discrete plotting.

Parameters

selector_data : Dict[str, np.ndarray], optional

Selector values used for optional data-driven fallback.

viz : Dict[str, Any], optional

Visualization metadata; may include a tick_labels entry.

long_labels : bool, default=False

If True, uses long tick labels from metadata when available.

x_padding : float, default=0.3

Horizontal padding applied to x-limits.

fallback_from_data : bool, default=False

If True and the metadata has no tick labels, positions are built from the unique values observed in selector_data. If False, a binary fallback is used instead.

Returns

Dict[str, Any]

Axis configuration with keys:

  • positions: np.ndarray

  • value_to_position: Dict[Any, int]

  • tick_labels: List[str]

  • xlim: tuple(float, float)
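As a rough illustration of the data-driven fallback and the returned keys, the sketch below builds a configuration from the unique values in selector_data. The function sketch_axis_config is hypothetical, and two details are assumptions rather than documented behavior: values are ordered by np.unique, and xlim is taken as the first and last position widened by x_padding.

```python
import numpy as np


def sketch_axis_config(selector_data, x_padding=0.3):
    """Hypothetical sketch of a data-driven axis configuration."""
    # Pool all selector arrays and collect the sorted unique values.
    pooled = np.concatenate([np.asarray(v).ravel() for v in selector_data.values()])
    values = np.unique(pooled).tolist()

    positions = np.arange(len(values))
    return {
        "positions": positions,
        "value_to_position": {v: i for i, v in enumerate(values)},
        "tick_labels": [str(v) for v in values],
        # Assumed convention: pad the first and last position by x_padding.
        "xlim": (-x_padding, len(values) - 1 + x_padding),
    }


config = sketch_axis_config({
    "a": np.array(['H', 'E', 'H']),
    "b": np.array(['C', 'E']),
})
print(config["tick_labels"])        # ['C', 'E', 'H']
print(config["value_to_position"])  # {'C': 0, 'E': 1, 'H': 2}
```

The resulting value_to_position dictionary has the shape expected by prepare_discrete_data and calculate_discrete_probabilities.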

static calculate_discrete_probabilities(data: ndarray, value_to_position: Dict[Any, int], n_positions: int) -> ndarray

Convert discrete samples into per-position probabilities.

Parameters

data : np.ndarray

Raw discrete samples for one selector.

value_to_position : Dict[Any, int]

Mapping from raw discrete values to axis position indices.

n_positions : int

Total number of discrete axis positions.

Returns

np.ndarray

Probability vector of length n_positions.

Notes

Samples that cannot be mapped to a valid position are ignored.
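The counting step can be sketched as follows. The function sketch_probabilities is a hypothetical stand-in, not the library code. It assumes that ignored samples are also left out of the denominator, so the probabilities of the mapped samples sum to 1; the documentation does not state which normalization is used.

```python
import numpy as np


def sketch_probabilities(data, value_to_position, n_positions):
    """Hypothetical sketch: count mapped samples per position, then normalize."""
    counts = np.zeros(n_positions, dtype=float)
    for value in np.asarray(data).ravel().tolist():
        pos = value_to_position.get(value)
        # Skip samples without a mapping or with an out-of-range position.
        if pos is not None and 0 <= pos < n_positions:
            counts[pos] += 1
    total = counts.sum()
    # Assumption: normalize over mapped samples only; return zeros if none.
    return counts / total if total > 0 else counts


samples = np.array(['H', 'E', 'H', 'X'])  # 'X' is unmapped and ignored
mapping = {'H': 0, 'E': 1, 'C': 2}
probs = sketch_probabilities(samples, mapping, n_positions=3)
print(probs)
```

Here 'H' occurs twice and 'E' once among the three mapped samples, so the vector holds 2/3, 1/3 and 0.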