autoqild.mi_estimators.pc_softmax_estimator

MI estimator that uses probability-corrected softmax functions to assess the information content in classification scenarios.

Classes

PCSoftmaxMIEstimator(n_classes, n_features)

PCSoftmaxMIEstimator estimates Mutual Information (MI) using a neural network trained with a modified softmax function.

class autoqild.mi_estimators.pc_softmax_estimator.PCSoftmaxMIEstimator(n_classes, n_features, n_hidden=10, n_units=100, loss_function=NLLLoss(), optimizer_str='adam', learning_rate=0.001, reg_strength=0.001, is_pc_softmax=False, random_state=42)

Bases: MIEstimatorBase

PCSoftmaxMIEstimator estimates Mutual Information (MI) using a neural network trained with a modified softmax function.

This class uses a neural network to estimate the MI between input features and class labels. When is_pc_softmax is True, the network is trained with a probability-corrected (PC) softmax function that accounts for label proportions, which can help with imbalanced data.

Parameters:
  • n_classes (int) – Number of classes in the classification task.

  • n_features (int) – Number of features or dimensionality of the input data.

  • n_hidden (int, optional, default=10) – Number of hidden layers in the neural network.

  • n_units (int, optional, default=100) – Number of units in each hidden layer.

  • loss_function (torch.nn.Module, optional, default=torch.nn.NLLLoss()) – Loss function to be used during training.

  • optimizer_str ({RMSprop, sgd, adam, AdamW, Adagrad, Adamax, Adadelta}, optional, default='adam') –

    Optimizer type to use for training the neural network. Must be one of:

    • RMSprop: Root Mean Square Propagation, an adaptive learning rate method.

    • sgd: Stochastic Gradient Descent, a simple and widely-used optimizer.

    • adam: Adaptive Moment Estimation, combining momentum and RMSProp for better convergence.

    • AdamW: Adam with weight decay, an improved variant of Adam with better regularization.

    • Adagrad: Adaptive Gradient Algorithm, adjusting the learning rate based on feature frequency.

    • Adamax: Variant of Adam based on the infinity norm, more robust with sparse gradients.

    • Adadelta: An extension of Adagrad that seeks to reduce its aggressive learning rate decay.

  • learning_rate (float, optional, default=0.001) – Learning rate for the optimizer.

  • reg_strength (float, optional, default=0.001) – Regularization strength for the optimizer.

  • is_pc_softmax (bool, optional, default=False) – If True, use the custom softmax function that accounts for label proportions.

  • random_state (int, optional, default=42) – Seed for random number generation to ensure reproducibility.

logger

Logger for logging messages and errors.

Type:

logging.Logger

optimizer

Optimizer used for training the neural network.

Type:

torch.optim.Optimizer

class_net

Instance of the neural network used for classification.

Type:

ClassNet

dataset_properties

Proportions of each class in the dataset.

Type:

list

final_loss

Final loss value after training.

Type:

float

mi_val

Estimated mutual information after training.

Type:

float

device

Device used for computation (CPU or GPU).

Type:

torch.device

decision_function(X, verbose=0)

Compute the decision function, in the form of class probabilities, for the input samples.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Feature matrix.

  • verbose (int, optional, default=0) – Verbosity level.

Returns:

scores – Decision function values.

Return type:

array-like of shape (n_samples, n_classes)

estimate_mi(X, y, verbose=1, **kwargs)

Estimate the mutual information from the trained neural network, using the softmax and PC-softmax functions defined below.

\[I(X;Y) = H(Y) - H(Y|X)\]
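
As a rough illustration of the decomposition above, the following sketch computes a plug-in estimate: H(Y) comes from empirical label frequencies, and H(Y|X) is approximated by the average negative log-probability that a classifier assigns to the true label. The helper names `label_entropy` and `plug_in_mi` are hypothetical and are not part of autoqild; the library's own computation may differ.

```python
import math
from collections import Counter


def label_entropy(y):
    """Empirical entropy H(Y) in nats from a sequence of class labels."""
    n = len(y)
    return -sum((c / n) * math.log(c / n) for c in Counter(y).values())


def plug_in_mi(y, probs):
    """Plug-in estimate of I(X;Y) = H(Y) - H(Y|X) in nats.

    y     : true class indices, one per sample
    probs : per-sample lists of predicted class probabilities
    """
    # H(Y|X) is approximated by the mean negative log-probability of the true label.
    cond_entropy = -sum(math.log(p[k]) for k, p in zip(y, probs)) / len(y)
    # Clip at zero: MI is non-negative, but the plug-in difference can dip below it.
    return max(0.0, label_entropy(y) - cond_entropy)


# Toy data (illustrative only): balanced binary labels.
y = [0, 0, 1, 1]
confident = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
uninformed = [[0.5, 0.5]] * 4

mi_confident = plug_in_mi(y, confident)    # close to log(2) + log(0.9)
mi_uninformed = plug_in_mi(y, uninformed)  # close to 0
```

A classifier that only reproduces the class prior leaves H(Y|X) equal to H(Y), so the estimate is approximately zero; sharper correct predictions push it towards H(Y).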

Softmax Function:

\[S(z_k) = \frac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j}}\]

where:

  • \( z_k \) is the logit or raw score for class \( k \).

  • \( K \) is the total number of classes.

PC-Softmax Function:

\[S_{pc}(z_k) = \frac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j} \cdot p_j}\]

where:

  • \( z_k \) is the logit or raw score for class \( k \).

  • \( p_j = \frac{\text{counts}_j}{\text{total samples}} \) is the prior probability of class \( j \).
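
The two formulas above can be compared side by side with a small pure-Python sketch. The function names `softmax`, `pc_softmax` and `class_priors` are hypothetical helpers written for this illustration, not the library's internals, and the toy labels and logits are made up.

```python
import math
from collections import Counter


def softmax(logits):
    """Standard softmax: e^{z_k} / sum_j e^{z_j}."""
    m = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pc_softmax(logits, priors):
    """PC-softmax: e^{z_k} / sum_j (e^{z_j} * p_j)."""
    m = max(logits)  # the shift cancels between numerator and denominator
    exps = [math.exp(z - m) for z in logits]
    total = sum(e * p for e, p in zip(exps, priors))
    return [e / total for e in exps]


def class_priors(y, n_classes):
    """p_j = counts_j / total samples, for class indices 0..n_classes-1."""
    counts = Counter(y)
    return [counts.get(k, 0) / len(y) for k in range(n_classes)]


labels = [0, 0, 0, 1]                       # imbalanced toy labels
priors = class_priors(labels, n_classes=2)  # [0.75, 0.25]
logits = [1.0, 2.0]

s = softmax(logits)
s_pc = pc_softmax(logits, priors)
```

Note that the PC-softmax outputs do not sum to one. By construction it is the prior-weighted sum, \( \sum_k p_k S_{pc}(z_k) \), that equals one, and with uniform priors \( p_j = 1/K \) the PC-softmax reduces to \( K \) times the standard softmax.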

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input data.

  • y (array-like of shape (n_samples,)) – Target labels.

  • verbose (int, optional, default=1) – Verbosity level.

  • **kwargs (dict, optional) – Additional keyword arguments.

Returns:

mi_estimated – The estimated mutual information.

Return type:

float

fit(X, y, epochs=50, verbose=0, **kwd)

Fit the neural network to the data.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training data.

  • y (array-like of shape (n_samples,)) – Target labels.

  • epochs (int, optional, default=50) – Number of training epochs.

  • verbose (int, optional, default=0) – Verbosity level.

  • **kwd (dict, optional) – Additional keyword arguments.

Returns:

self – Fitted estimator.

Return type:

PCSoftmaxMIEstimator

predict(X, verbose=0)

Predict class labels for the input samples.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Feature matrix.

  • verbose (int, optional, default=0) – Verbosity level.

Returns:

y_pred – Predicted class labels.

Return type:

array-like of shape (n_samples,)

predict_proba(X, verbose=0)

Predict class probabilities for the input samples.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Feature matrix.

  • verbose (int, optional, default=0) – Verbosity level.

Returns:

p_pred – Predicted class probabilities.

Return type:

array-like of shape (n_samples, n_classes)

score(X, y, sample_weight=None, verbose=0)

Compute the score of the neural network.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Feature matrix.

  • y (array-like of shape (n_samples,)) – True labels for X.

  • sample_weight (array-like of shape (n_samples,), optional) – Sample weights.

  • verbose (int, optional, default=0) – Verbosity level.

Returns:

score – Negative loss of the model on the validation data.

Return type:

float
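
To make the sign convention of a "negative loss" score concrete, here is a minimal sketch that assumes the loss is a mean negative log-likelihood, in line with the default NLLLoss. The function `negative_nll_score` is hypothetical, and the way it applies `sample_weight` (a weighted average) is an assumption; the source does not document how the library uses the weights.

```python
import math


def negative_nll_score(y, probs, sample_weight=None):
    """Weighted mean log-probability of the true labels, i.e. the negative NLL.

    y             : true class indices, one per sample
    probs         : per-sample lists of predicted class probabilities
    sample_weight : optional per-sample weights (assumed to act as a weighted mean)
    """
    if sample_weight is None:
        sample_weight = [1.0] * len(y)
    weighted = sum(w * math.log(p[k]) for w, k, p in zip(sample_weight, y, probs))
    return weighted / sum(sample_weight)


# Toy predictions (illustrative only).
y = [0, 1]
probs = [[0.8, 0.2], [0.5, 0.5]]

score_uniform = negative_nll_score(y, probs)
score_weighted = negative_nll_score(y, probs, sample_weight=[1.0, 0.0])
```

Because the loss is negated, the score is at most zero and higher values indicate a better fit, which matches the usual "greater is better" convention for score methods.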