autoqild.mi_estimators.pytorch_utils

Utilities for running the PC-softmax and MINE MI estimators, such as loss functions and optimizers.

Functions

`get_mine_loss`(preds_xy, preds_xy_tilde, metric)	Calculate the MINE loss based on the specified metric.
`get_optimizer_and_parameters`(optimizer_str, ...)	Get the optimizer and its configuration parameters based on the specified optimizer string.
`init`(m)	Initialize the weights and biases of a neural network layer.
`log_mean_exp`(inputs[, dim, keepdim])	Compute the log of the mean of the exponentials of input elements.
`own_softmax`(x, label_proportions, device)	Custom softmax function that incorporates label proportions to handle imbalanced data.

autoqild.mi_estimators.pytorch_utils.get_mine_loss(preds_xy, preds_xy_tilde, metric)

Calculate the MINE loss based on the specified metric.

Parameters:

preds_xy (torch.Tensor) – Predictions for the joint distribution samples.
preds_xy_tilde (torch.Tensor) – Predictions for the product of marginals distribution samples.
metric ({donsker_varadhan, donsker_varadhan_softplus, fdivergence}) –
The divergence metric to use for the MINE loss. Options include:
- donsker_varadhan: Donsker-Varadhan representation of KL divergence.
- donsker_varadhan_softplus: Softplus version of the Donsker-Varadhan representation.
- fdivergence: f-divergence representation of mutual information.

Returns:

loss – Calculated MINE loss based on the specified metric.

Return type:

torch.Tensor

Raises:

ValueError – If the specified metric is not recognized.
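
As an illustration of what the donsker_varadhan and fdivergence options compute, here is a minimal sketch in plain Python (lists instead of torch.Tensor, so the arithmetic stays visible). The helper names dv_loss and fdiv_loss are hypothetical, and the sign convention (loss = negative lower bound on the mutual information) is an assumption; the library's exact expressions, including the softplus variant, may differ.

```python
import math

def dv_loss(t_joint, t_marginal):
    """Negative Donsker-Varadhan bound: -(E_joint[T] - log E_marginal[exp(T)])."""
    mean_joint = sum(t_joint) / len(t_joint)
    # Direct (non-stabilised) log-mean-exp, fine for small critic outputs.
    log_mean_exp = math.log(sum(math.exp(t) for t in t_marginal) / len(t_marginal))
    return -(mean_joint - log_mean_exp)

def fdiv_loss(t_joint, t_marginal):
    """Negative f-divergence bound: -(E_joint[T] - E_marginal[exp(T - 1)])."""
    mean_joint = sum(t_joint) / len(t_joint)
    mean_exp = sum(math.exp(t - 1.0) for t in t_marginal) / len(t_marginal)
    return -(mean_joint - mean_exp)

# Toy critic outputs (assumed values, not real model predictions).
scores_joint = [2.0, 2.0]      # plays the role of preds_xy
scores_marginal = [0.0, 0.0]   # plays the role of preds_xy_tilde

dv = dv_loss(scores_joint, scores_marginal)    # -(2 - log 1) = -2.0
fd = fdiv_loss(scores_joint, scores_marginal)  # -(2 - e^-1), about -1.632
```

For the same critic outputs the Donsker-Varadhan bound is at least as tight as the f-divergence bound, so dv is never larger than fd.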

autoqild.mi_estimators.pytorch_utils.get_optimizer_and_parameters(optimizer_str, learning_rate, reg_strength)

Get the optimizer and its configuration parameters based on the specified optimizer string.

Parameters:

optimizer_str ({RMSprop, sgd, adam, AdamW, Adagrad, Adamax, Adadelta}, default=adam) –
Optimizer type to use for training the neural network. Must be one of:
- RMSprop: Root Mean Square Propagation, an adaptive learning rate method.
- sgd: Stochastic Gradient Descent, a simple and widely-used optimizer.
- adam: Adaptive Moment Estimation, combining momentum and RMSProp for better convergence.
- AdamW: Adam with weight decay, an improved variant of Adam with better regularization.
- Adagrad: Adaptive Gradient Algorithm, adjusting the learning rate based on feature frequency.
- Adamax: Variant of Adam based on the infinity norm, more robust with sparse gradients.
- Adadelta: An extension of Adagrad that seeks to reduce its aggressive learning rate decay.
learning_rate (float) – The learning rate for the optimizer.
reg_strength (float) – The regularization strength (weight decay) for the optimizer.

Returns:

optimizer (torch.optim.Optimizer) – The optimizer class.
optimizer_config (dict) – The configuration parameters for the optimizer.

Raises:

ValueError – If the specified optimizer string is not recognized.
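
The returned pair is meant to be combined with a model's parameters. The sketch below shows one way such a lookup could work and how the result would be used. The function pick_optimizer, the OPTIMIZERS table, and the config keys lr and weight_decay are assumptions for illustration, not the library's actual implementation.

```python
import torch

# Hypothetical name-to-class table; the library's own mapping may differ.
OPTIMIZERS = {
    "RMSprop": torch.optim.RMSprop,
    "sgd": torch.optim.SGD,
    "adam": torch.optim.Adam,
    "AdamW": torch.optim.AdamW,
    "Adagrad": torch.optim.Adagrad,
    "Adamax": torch.optim.Adamax,
    "Adadelta": torch.optim.Adadelta,
}

def pick_optimizer(optimizer_str, learning_rate, reg_strength):
    """Return (optimizer class, keyword arguments) for the given name."""
    if optimizer_str not in OPTIMIZERS:
        raise ValueError(f"Unknown optimizer: {optimizer_str!r}")
    # Assumed config keys: every class above accepts lr and weight_decay.
    config = {"lr": learning_rate, "weight_decay": reg_strength}
    return OPTIMIZERS[optimizer_str], config

# Usage: the class is instantiated later, once the model exists.
model = torch.nn.Linear(4, 2)
optimizer_cls, optimizer_config = pick_optimizer("adam", 1e-3, 1e-4)
optimizer = optimizer_cls(model.parameters(), **optimizer_config)
```

Returning the class and its configuration separately, rather than a constructed optimizer, lets the caller build the network first and bind its parameters afterwards.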

autoqild.mi_estimators.pytorch_utils.init(m)

Initialize the weights and biases of a neural network layer.

Parameters:

m (torch.nn.Module) – The neural network layer to initialize.

Notes

This function initializes the weights of a linear layer using orthogonal initialization and sets the biases to zero.
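
A function with this signature is typically passed to torch.nn.Module.apply, which calls it on every submodule. The sketch below uses a hypothetical stand-in, init_linear, that follows the behaviour described in the note; how the library treats layers other than nn.Linear is not stated here, so the stand-in simply skips them.

```python
import torch
from torch import nn

def init_linear(m):
    """Hypothetical stand-in: orthogonal weights and zero biases for linear layers."""
    if isinstance(m, nn.Linear):
        nn.init.orthogonal_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

# apply() visits every submodule, so each Linear layer is re-initialised.
net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
net.apply(init_linear)

first_weight = net[0].weight.detach()  # shape (16, 8)
first_bias = net[0].bias.detach()      # shape (16,)
```

For a weight matrix with more rows than columns, orthogonal initialization yields orthonormal columns, so first_weight.T @ first_weight is close to the identity.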

autoqild.mi_estimators.pytorch_utils.log_mean_exp(inputs, dim=None, keepdim=False)

Compute the log of the mean of the exponentials of input elements.

Parameters:

inputs (torch.Tensor) – Input tensor.
dim (int or tuple of ints, optional) – The dimension or dimensions to reduce. If None, reduces all dimensions.
keepdim (bool, optional) – Whether the reduced dimension(s) are retained in the output tensor. Defaults to False.

Returns:

outputs – The logarithm of the mean of the exponentials of the input tensor.

Return type:

torch.Tensor
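
To show why a dedicated function is useful for this quantity, here is a minimal one-dimensional sketch in plain Python. The helper name log_mean_exp_1d is hypothetical, it ignores dim and keepdim, and the max-shift trick it uses is a common stabilisation technique, not necessarily what the library does internally.

```python
import math

def log_mean_exp_1d(values):
    """log(mean(exp(v))) for a flat list, shifted by the maximum for stability."""
    m = max(values)
    # exp(v - m) is at most 1, so the sum cannot overflow.
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))

small = log_mean_exp_1d([0.0, 0.0, 0.0])  # log(mean([1, 1, 1])) = 0.0
large = log_mean_exp_1d([1000.0, 1000.0])  # 1000.0; a naive math.exp(1000.0) overflows
```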

autoqild.mi_estimators.pytorch_utils.own_softmax(x, label_proportions, device)

Custom softmax function that incorporates label proportions to handle imbalanced data.

This function computes a modified softmax, where the exponentiated logits are weighted by the proportions of each class label. This can help in cases where class imbalance is significant, ensuring that the model accounts for the distribution of labels during prediction.

Parameters:

x (torch.Tensor) – The input tensor (logits) of shape (n_samples, n_classes).
label_proportions (list, numpy.ndarray, or torch.Tensor) – The proportion of each class in the dataset, given as a sequence or tensor of shape (n_classes,).
device (torch.device) – The device on which to perform the computation (e.g., cpu or cuda).

Returns:

The resulting tensor after applying the weighted softmax operation, of shape (n_samples, n_classes).

Return type:

torch.Tensor

Notes

This function first exponentiates the logits (x) and then multiplies them by the corresponding class proportions (label_proportions). The resulting tensor is normalized by the sum of the weighted exponentiated logits to produce a probability distribution across classes.
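
The steps in this note can be sketched for a single row of logits in plain Python. The helper name weighted_softmax_row is hypothetical, and the sketch leaves out the device handling and batching of the real function.

```python
import math

def weighted_softmax_row(logits, label_proportions):
    """Weight exp(logit) by the class proportion, then normalise to sum to 1."""
    weighted = [p * math.exp(z) for z, p in zip(logits, label_proportions)]
    total = sum(weighted)
    return [w / total for w in weighted]

# With equal logits the output reduces to the class proportions themselves.
row = weighted_softmax_row([0.0, 0.0], [0.9, 0.1])

# With uniform proportions the result matches an ordinary softmax.
uniform = weighted_softmax_row([1.0, 2.0], [0.5, 0.5])
plain = [math.exp(1.0) / (math.exp(1.0) + math.exp(2.0)),
         math.exp(2.0) / (math.exp(1.0) + math.exp(2.0))]
```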