Discrete Feature Helper
Helper for discrete feature data preparation.
Provides data conversion for discrete features to prepare them for visualization by mapping categorical values to integer positions.
- class mdxplain.plots.helper.discrete_feature_helper.DiscreteFeatureHelper
Examples
>>> import numpy as np
>>> from mdxplain.plots.helper.discrete_feature_helper import DiscreteFeatureHelper
>>> # Convert character data to positions
>>> data = np.array(['H', 'E', 'C', 'H'])
>>> mapping = {'H': 0, 'E': 1, 'C': 2}
>>> positions = DiscreteFeatureHelper.prepare_discrete_data(data, mapping)
>>> print(positions)
[0 1 2 0]
- static prepare_discrete_data(data: ndarray, value_to_position: Dict) -> ndarray
Convert discrete data to integer positions for plotting.
Maps categorical data values (integers or characters) to sequential integer positions suitable for visualization.
Parameters
- data : np.ndarray
Original data (integers or character strings)
- value_to_position : dict
Mapping from data values to plot positions
Returns
- np.ndarray
Data converted to integer positions
Examples
>>> import numpy as np
>>> from mdxplain.plots.helper.discrete_feature_helper import DiscreteFeatureHelper
>>> # Integer data (no conversion needed)
>>> data = np.array([0, 1, 2, 0, 1])
>>> mapping = {0: 0, 1: 1, 2: 2}
>>> positions = DiscreteFeatureHelper.prepare_discrete_data(data, mapping)
>>> print(positions)
[0 1 2 0 1]

>>> # Character data (needs conversion)
>>> data = np.array(['H', 'E', 'C', 'H', 'E'])
>>> mapping = {'H': 0, 'E': 1, 'C': 2}
>>> positions = DiscreteFeatureHelper.prepare_discrete_data(data, mapping)
>>> print(positions)
[0 1 2 0 1]
Notes
Values not present in value_to_position are returned as -1.
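The mapping rule described here, including the -1 sentinel for unmapped values, can be illustrated with a small standalone sketch. The function map_to_positions is a hypothetical stand-in written for this example; it is not the library's implementation and only mirrors the documented behavior.

```python
import numpy as np

def map_to_positions(data, value_to_position):
    """Hypothetical stand-in: look up each value, using -1 when it is unmapped."""
    # tolist() converts NumPy scalars to plain Python values for dict lookup.
    return np.array(
        [value_to_position.get(value, -1) for value in np.asarray(data).tolist()],
        dtype=int,
    )

data = np.array(['H', 'E', 'X', 'C'])   # 'X' has no entry in the mapping
mapping = {'H': 0, 'E': 1, 'C': 2}
positions = map_to_positions(data, mapping)
print(positions.tolist())  # [0, 1, -1, 2]
```

Callers that plot the result may want to mask out entries equal to -1 first, since they do not correspond to any axis position.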
- static build_axis_config(selector_data: Dict[str, ndarray] | None = None, viz: Dict[str, Any] | None = None, long_labels: bool = False, x_padding: float = 0.3, fallback_from_data: bool = False) -> Dict[str, Any]
Build axis configuration for discrete plotting.
Parameters
- selector_data : Dict[str, np.ndarray], optional
Selector values used for optional data-driven fallback.
- viz : Dict[str, Any], optional
Visualization metadata; may contain a tick_labels entry.
- long_labels : bool, default=False
If True, uses long tick labels from metadata when available.
- x_padding : float, default=0.3
Horizontal padding applied to x-limits.
- fallback_from_data : bool, default=False
If True and the metadata has no tick labels, build positions from the unique values observed in selector_data. If False, use a binary fallback instead.
Returns
- Dict[str, Any]
Axis configuration with keys:
positions: np.ndarray
value_to_position: Dict[Any, int]
tick_labels: List[str]
xlim: tuple(float, float)
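To make the shape of the returned dictionary concrete, the following is a rough sketch of how the data-driven fallback (fallback_from_data=True with no tick labels in the metadata) might assemble these four keys. Everything here is an assumption for illustration: the function name axis_config_from_data, the sorted ordering of unique values, the use of str() for tick labels, and the symmetric padding of xlim are not confirmed by the documentation above.

```python
import numpy as np

def axis_config_from_data(selector_data, x_padding=0.3):
    """Hypothetical sketch of a data-driven axis config; not the library's code."""
    # Pool the observed values of all selectors (assumes at least one
    # non-empty selector and a common value type across selectors).
    observed = np.concatenate(
        [np.asarray(values).ravel() for values in selector_data.values()]
    )
    unique_values = np.unique(observed).tolist()  # sorted unique values
    positions = np.arange(len(unique_values))
    return {
        "positions": positions,
        "value_to_position": {v: i for i, v in enumerate(unique_values)},
        "tick_labels": [str(v) for v in unique_values],
        # Assumed convention: pad the first and last position by x_padding.
        "xlim": (float(positions[0]) - x_padding, float(positions[-1]) + x_padding),
    }

config = axis_config_from_data(
    {"selector_a": np.array(['H', 'E']), "selector_b": np.array(['C', 'H'])}
)
print(config["tick_labels"])        # ['C', 'E', 'H']
print(config["value_to_position"])  # {'C': 0, 'E': 1, 'H': 2}
```

The value_to_position entry of such a configuration is what prepare_discrete_data and calculate_discrete_probabilities take as their mapping argument.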
- static calculate_discrete_probabilities(data: ndarray, value_to_position: Dict[Any, int], n_positions: int) -> ndarray
Convert discrete samples into per-position probabilities.
Parameters
- data : np.ndarray
Raw discrete samples for one selector.
- value_to_position : Dict[Any, int]
Mapping from raw discrete values to axis position indices.
- n_positions : int
Total number of discrete axis positions.
Returns
- np.ndarray
Probability vector of length n_positions.
Notes
Samples that cannot be mapped to a valid position are ignored.
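One way to read this description is as a count-and-normalize step over the mapped samples. The sketch below uses a hypothetical function, discrete_probabilities, and assumes that ignored samples are also excluded from the normalization, so the result sums to 1 whenever at least one sample maps to a valid position. The documentation does not state which denominator the library uses.

```python
import numpy as np

def discrete_probabilities(data, value_to_position, n_positions):
    """Hypothetical sketch: count mapped samples per position, then normalize."""
    counts = np.zeros(n_positions, dtype=float)
    for value in np.asarray(data).tolist():
        position = value_to_position.get(value, -1)
        # Skip samples with no mapping or with an out-of-range position.
        if 0 <= position < n_positions:
            counts[position] += 1
    total = counts.sum()
    # Assumption: normalize by the number of mapped samples only;
    # return all zeros when nothing could be mapped.
    return counts / total if total > 0 else counts

samples = np.array(['H', 'E', 'H', 'X'])   # 'X' cannot be mapped
probs = discrete_probabilities(samples, {'H': 0, 'E': 1, 'C': 2}, 3)
print(probs.tolist())
```

With these inputs, 'H' accounts for two of the three mapped samples and 'E' for one, while 'C' is never observed and gets probability zero.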