autoqild.mi_estimators.mi_estimator_classification

Base class for classification-based MI estimators, providing a framework for estimating MI in supervised learning.

Classes

ClassificationMIEstimator(n_classes, n_features)

Class to estimate Mutual Information (MI) using a classification model.

class autoqild.mi_estimators.mi_estimator_classification.ClassificationMIEstimator(n_classes, n_features, base_estimator=<class 'sklearn.ensemble._forest.RandomForestClassifier'>, learner_params={}, random_state=None, **kwargs)

Bases: MIEstimatorBase

Class to estimate Mutual Information (MI) using a classification model.

This class leverages a classification model, such as RandomForestClassifier, to estimate the Mutual Information between input features and class labels using various metrics, including log-loss and softmax probabilities. It extends the MIEstimatorBase class, inheriting its basic structure and functionalities.

Parameters:
  • n_classes (int) – Number of classes in the classification data samples.

  • n_features (int) – Number of features or dimensionality of the inputs of the classification data samples.

  • base_estimator (class, optional, default=sklearn.ensemble.RandomForestClassifier) – Classifier class used as the base estimator.

  • learner_params (dict) – Parameters passed to the base estimator.

  • random_state (int or object, optional, default=None) – Random state for reproducibility.

  • **kwargs (dict, optional) – Additional keyword arguments passed to the base learner RandomForestClassifier.

random_state

Random state instance for reproducibility.

Type:

RandomState instance

logger

Logger instance for logging information.

Type:

logging.Logger

base_estimator

Base estimator used for classification.

Type:

sklearn.ensemble.RandomForestClassifier

learner_params

Parameters passed to the base estimator.

Type:

dict

base_learner

The instantiated base learner.

Type:

object
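The relationship between base_estimator, learner_params and base_learner can be illustrated with a minimal sketch. It assumes that the wrapper stores the classifier class and builds base_learner by calling that class with learner_params when fit is invoked; this page does not confirm when or how the instantiation happens. MajorityClassifier and WrapperSketch are hypothetical names used only for illustration and are not part of autoqild or scikit-learn.

```python
from collections import Counter


class MajorityClassifier:
    """Hypothetical stand-in for a scikit-learn style classifier."""

    def __init__(self, fallback_label=0):
        self.fallback_label = fallback_label
        self.majority_ = None

    def fit(self, X, y):
        # Remember the most frequent training label.
        if y:
            self.majority_ = Counter(y).most_common(1)[0][0]
        else:
            self.majority_ = self.fallback_label
        return self

    def predict(self, X):
        # Predict the remembered label for every sample.
        return [self.majority_ for _ in X]


class WrapperSketch:
    """Illustrative wrapper only; not the autoqild implementation."""

    def __init__(self, n_classes, n_features,
                 base_estimator=MajorityClassifier, learner_params=None):
        self.n_classes = n_classes
        self.n_features = n_features
        self.base_estimator = base_estimator  # the class, not an instance
        # Copy the dict so a shared default is never mutated.
        self.learner_params = dict(learner_params or {})
        self.base_learner = None

    def fit(self, X, y, **kwd):
        # Assumed step: instantiate the class with the stored parameters.
        self.base_learner = self.base_estimator(**self.learner_params)
        self.base_learner.fit(X, y, **kwd)
        return self

    def predict(self, X):
        # Delegate prediction to the instantiated learner.
        return self.base_learner.predict(X)


model = WrapperSketch(n_classes=2, n_features=2,
                      learner_params={"fallback_label": 1})
model.fit([[0, 0], [1, 1], [1, 0]], [1, 1, 0])
predictions = model.predict([[0, 1], [1, 1]])
```

Keeping the class and its parameters separate from the instance means a fresh learner can be built for every fit, which is convenient when the same configuration is refitted on several data splits.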

fit(X, y, **kwd):

Fit the classification model to the data.

predict(X, verbose=0):

Predict class labels for samples in X.

score(X, y, sample_weight=None, verbose=0):

Return the accuracy score of the model on the given test data and labels.

predict_proba(X, verbose=0):

Predict class probabilities for samples in X.

decision_function(X, verbose=0):

Predict confidence scores for the samples in X; these scores may coincide with the predicted class probabilities.

estimate_mi(X, y, method=LOG_LOSS_MI_ESTIMATION, **kwargs):

Estimate Mutual Information using the specified method.

decision_function(X, verbose=0)

Predict confidence scores for the samples in X; these scores may coincide with the predicted class probabilities.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input samples.

  • verbose (int, optional, default=0) – Verbosity level.

Returns:

scores – Predicted confidence scores.

Return type:

array-like of shape (n_samples, n_classes)

estimate_mi(X, y, method='LogLossMI', **kwargs)

Estimate Mutual Information using the specified method.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input data.

  • y (array-like of shape (n_samples,)) – Target labels.

  • method (str, optional, default=`LogLossMI`) –

    The method to use for mutual information estimation. Options include:

    • MidPointMI: Estimate MI using the Mid-point method.

    • LogLossMI: Estimate MI using Log-Loss method.

    • LogLossMIIsotonicRegression: Estimate MI using Log-Loss method with Isotonic Regression.

    • LogLossMIPlattScaling: Estimate MI using Log-Loss method with Platt Scaling.

    • LogLossMIBetaCalibration: Estimate MI using Log-Loss method with Beta Calibration.

    • LogLossMITemperatureScaling: Estimate MI using Log-Loss method with Temperature Scaling.

    • LogLossMIHistogramBinning: Estimate MI using Log-Loss method with Histogram Binning.

    • PCSoftmaxMI: Estimate MI using Softmax probabilities.

  • **kwargs (dict, optional) – Additional keyword arguments passed to the estimation methods.

Returns:

mutual_information – Mean of the MI estimates obtained on the cross-validation splits.

Return type:

float
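As a rough illustration of a log-loss based estimate, one common approach approximates the conditional entropy H(Y|X) by the classifier's mean log-loss and subtracts it from the empirical label entropy H(Y). The sketch below follows that idea and averages the result over two hand-made folds. It is an assumption about the general technique, not the code behind LogLossMI: the helper names, the clipping constant, the floor at zero, and the choice of bits as the unit are all illustrative.

```python
import math
from collections import Counter


def label_entropy(y):
    """Empirical Shannon entropy H(Y) of the labels, in nats."""
    n = len(y)
    return -sum((c / n) * math.log(c / n) for c in Counter(y).values())


def mean_log_loss(y, p_pred, eps=1e-15):
    """Average negative log-probability assigned to the true class, in nats."""
    total = 0.0
    for label, probs in zip(y, p_pred):
        # Clip the probability to avoid log(0).
        p = min(max(probs[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y)


def log_loss_mi(y, p_pred):
    """MI estimate in bits: H(Y) minus the log-loss, floored at zero."""
    mi_nats = label_entropy(y) - mean_log_loss(y, p_pred)
    return max(mi_nats, 0.0) / math.log(2)


# Two hypothetical cross-validation folds: (true labels, predicted probabilities).
folds = [
    ([0, 1, 0, 1], [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1], [0.1, 0.9]]),
    ([0, 0, 1, 1], [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]),
]
per_fold = [log_loss_mi(y, p) for y, p in folds]

# Final estimate: the mean over the folds.
mutual_information = sum(per_fold) / len(per_fold)
```

In the first fold the confident, correct predictions give an estimate close to the one bit carried by a balanced binary label, while the uninformative predictions in the second fold give an estimate of essentially zero.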

fit(X, y, **kwd)

Fit the classification model to the data.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training data.

  • y (array-like of shape (n_samples,)) – Target labels.

  • **kwd (dict, optional) – Additional keyword arguments passed to the fit method of the classifier.

Returns:

self – Fitted estimator.

Return type:

object

predict(X, verbose=0)

Predict class labels for samples in X.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input samples.

  • verbose (int, optional, default=0) – Verbosity level.

Returns:

y_pred – Predicted class labels.

Return type:

array-like of shape (n_samples,)

predict_proba(X, verbose=0)

Predict class probabilities for samples in X.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input samples.

  • verbose (int, optional, default=0) – Verbosity level.

Returns:

p_pred – Predicted class probabilities.

Return type:

array-like of shape (n_samples, n_classes)

score(X, y, sample_weight=None, verbose=0)

Return the accuracy score of the model on the given test data and labels.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Test samples.

  • y (array-like of shape (n_samples,)) – True labels for X.

  • sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.

  • verbose (int, optional, default=0) – Verbosity level.

Returns:

score – Mean accuracy of self.predict(X) w.r.t. y.

Return type:

float
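For reference, mean accuracy with optional sample weights can be computed as the weighted fraction of correct predictions. The helper below is a plain-Python sketch of that formula, written under the assumption that score follows the usual scikit-learn convention; mean_accuracy is a hypothetical name and not an autoqild function.

```python
def mean_accuracy(y_true, y_pred, sample_weight=None):
    """Weighted fraction of predictions that match the true labels."""
    if sample_weight is None:
        # Without weights every sample counts equally.
        sample_weight = [1.0] * len(y_true)
    hits = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p)
    return hits / sum(sample_weight)


y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]

# Three of four predictions are correct.
unweighted = mean_accuracy(y_true, y_pred)

# The misclassified sample carries weight 2, so correct weight is 3 out of 5.
weighted = mean_accuracy(y_true, y_pred, sample_weight=[1, 1, 2, 1])
```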