autoqild.utilities.metrics
This module provides functions for computing metrics related to mutual information and classification performance: binary cross-entropy, upper and lower bounds on mutual information, the AUC score, false positive and false negative rates, and mutual information estimators based on log-loss and PC-Softmax.
Functions
| auc_score | Computes the AUC score for the given true labels and predicted probabilities. |
| bin_ce | Computes the binary cross-entropy for a given probability p_e. |
| false_negative_rate | Computes the false negative rate (FNR). |
| false_positive_rate | Computes the false positive rate (FPR). |
| fanos_adjusted_lower_bound | Computes the adjusted Fano's lower bound for mutual information. |
| fanos_lower_bound | Computes Fano's lower bound for mutual information. |
| | Computes the entropy of the true labels. |
| helmann_raviv_function | Computes the Hellman-Raviv function for a given error probability pe. |
| helmann_raviv_upper_bound | Computes the Hellman-Raviv upper bound for mutual information based on classification performance. |
| log_loss_estimation | Estimates mutual information by evaluating the log-loss of the predicted probabilities and entropy of outputs. |
| mid_point_mi | Computes the midpoint mutual information estimate by averaging the upper and lower bounds. |
| pc_softmax_estimation | Estimates the mutual information using predicted probabilities in the softmax and PC-Softmax functions. |
| | Removes rows containing NaN values from the predicted probabilities and true labels. |
| santhi_vardi_upper_bound | Computes the Santhi-Vardi upper bound for mutual information. |
- autoqild.utilities.metrics.auc_score(y_true, p_pred)
Computes the AUC score for the given true labels and predicted probabilities.
- Parameters:
y_true (ndarray) – True class labels.
p_pred (ndarray) – Predicted probabilities.
- Returns:
auc_roc – AUC score.
- Return type:
float
Notes
For multi-class scenarios, the AUC is computed using a one-vs-rest approach.
The method includes normalization as a fallback if issues arise during computation.
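The one-vs-rest idea mentioned in the notes can be illustrated with a minimal, standard-library sketch. The names `binary_auc_sketch` and `ovr_auc_sketch` are hypothetical, labels are assumed to be integer class indices into the columns of `p_pred`, and the per-class scores are macro-averaged; the library's own averaging and its normalization fallback are not reproduced here.

```python
import math


def binary_auc_sketch(scores, is_positive):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    if not pos or not neg:
        return float("nan")  # undefined when one of the two groups is empty
    # A tie between a positive and a negative score counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ovr_auc_sketch(y_true, p_pred):
    """Macro-average of per-class AUCs, each class scored against the rest."""
    n_classes = len(p_pred[0])
    aucs = []
    for k in range(n_classes):
        auc_k = binary_auc_sketch([row[k] for row in p_pred],
                                  [y == k for y in y_true])
        if not math.isnan(auc_k):  # skip classes absent from y_true
            aucs.append(auc_k)
    return sum(aucs) / len(aucs) if aucs else float("nan")
```

The pairwise count is quadratic in the number of samples, which keeps the sketch short; a rank-based computation would be preferable for large inputs.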
- autoqild.utilities.metrics.bin_ce(p_e)
Computes the binary cross-entropy for a given probability p_e.
- Parameters:
p_e (float) – Probability value for which binary cross-entropy is computed.
- Returns:
binary_cross_entropy – The binary cross-entropy value.
- Return type:
float
Notes
This function handles edge cases where p_e is 0 or 1 by adding or subtracting a small epsilon value to prevent division by zero errors.
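As an illustration of the edge-case handling described above, here is a minimal sketch of a binary entropy computation. The function name `binary_entropy_sketch`, the epsilon value, and the use of base-2 logarithms are assumptions made for this example, not details confirmed for `bin_ce`.

```python
import math


def binary_entropy_sketch(p_e, eps=1e-12):
    """Binary entropy of p_e in bits (base 2 and eps are assumptions)."""
    # Clip p_e into (0, 1) so that neither logarithm is evaluated at zero.
    p = min(max(p_e, eps), 1.0 - eps)
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)
```

With this clipping, inputs of exactly 0 or 1 yield a value very close to zero instead of raising an error.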
- autoqild.utilities.metrics.false_negative_rate(y_true, y_pred)
Computes the false negative rate (FNR).
- Parameters:
y_true (ndarray) – True binary labels.
y_pred (ndarray) – Predicted binary labels.
- Returns:
fnr – False negative rate.
- Return type:
float
Notes
FNR is calculated as the ratio of false negatives to the sum of false negatives and true positives.
- autoqild.utilities.metrics.false_positive_rate(y_true, y_pred)
Computes the false positive rate (FPR).
- Parameters:
y_true (ndarray) – True binary labels.
y_pred (ndarray) – Predicted binary labels.
- Returns:
fpr – False positive rate.
- Return type:
float
Notes
FPR is calculated as the ratio of false positives to the sum of false positives and true negatives.
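The FNR and FPR ratios can both be read off the four confusion-matrix counts. The sketch below is a plain-Python illustration; the name `error_rates_sketch`, the `positive` parameter, and the choice to return NaN when a denominator is zero are assumptions of this example.

```python
def error_rates_sketch(y_true, y_pred, positive=1):
    """Return (FNR, FPR) for binary labels, with `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    # FNR = FN / (FN + TP); FPR = FP / (FP + TN).
    fnr = fn / (fn + tp) if (fn + tp) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return fnr, fpr
```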
- autoqild.utilities.metrics.fanos_adjusted_lower_bound(y_true, y_pred)
Computes the adjusted Fano's lower bound for mutual information.
- Parameters:
y_true (ndarray) – True class labels.
y_pred (ndarray) – Predicted class labels.
- Returns:
fanos_adjusted_lb – Adjusted Fano's lower bound.
- Return type:
float
Notes
This adjusted bound accounts for binary cross-entropy and provides a refined lower bound estimate compared to the standard Fano's bound.
- autoqild.utilities.metrics.fanos_lower_bound(y_true, y_pred)
Computes Fano's lower bound for mutual information.
- Parameters:
y_true (ndarray) – True class labels.
y_pred (ndarray) – Predicted class labels.
- Returns:
fanos_lb – Fano's lower bound.
- Return type:
float
Notes
Fano's bound gives a lower bound on mutual information computed from the classification error rate and the number of classes.
- autoqild.utilities.metrics.helmann_raviv_function(n_classes, pe)
Computes the Hellman-Raviv function for a given error probability pe.
The Hellman-Raviv function is used to compute an upper bound on mutual information from classification error rates.
- Parameters:
n_classes (int) – The number of classes in the classification task.
pe (ndarray) – The error probability values for each sample.
- Returns:
hrf_values – The computed Hellman-Raviv function values.
- Return type:
ndarray
Notes
The function partitions the error probabilities into ranges based on the number of classes and computes the upper bound using a series of logarithmic transformations.
- autoqild.utilities.metrics.helmann_raviv_upper_bound(y_true, y_pred)
Computes the Hellman-Raviv upper bound for mutual information based on classification performance.
- Parameters:
y_true (ndarray) – True class labels.
y_pred (ndarray) – Predicted class labels.
- Returns:
hr_u – The Hellman-Raviv upper bound for mutual information.
- Return type:
float
Notes
The Hellman-Raviv bound is calculated as the difference between the logarithm of the number of classes and the computed Hellman-Raviv function for the error rate.
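The "log of the number of classes minus a function of the error rate" structure described above can be sketched with one known piecewise-linear lower bound on conditional entropy, which reduces to the classical 2·Pe for Pe ≤ 1/2. Whether the library uses exactly this form is an assumption; the names `hr_function_sketch` and `hr_upper_bound_sketch`, the scalar (rather than array) input, and the use of bits are choices made for this example.

```python
import math


def hr_function_sketch(n_classes, pe):
    """Piecewise-linear lower bound on H(Y|X) in bits for error rate pe.

    On the segment (k-1)/k <= pe <= k/(k+1) the value is
    a_k * (pe - (k-1)/k) + log2(k), with a_k = k * (k+1) * log2((k+1)/k).
    Requires n_classes >= 2.
    """
    # Error rates above (M-1)/M are clamped to the last segment's endpoint.
    pe = min(max(pe, 0.0), (n_classes - 1) / n_classes)
    # Locate the segment index k, limited to 1 .. M-1.
    k = min(max(int(1.0 / (1.0 - pe)), 1), n_classes - 1)
    a_k = k * (k + 1) * math.log2((k + 1) / k)
    return a_k * (pe - (k - 1) / k) + math.log2(k)


def hr_upper_bound_sketch(error_rate, n_classes):
    """Upper bound on I(X;Y): log2(M) minus the function above."""
    return math.log2(n_classes) - hr_function_sketch(n_classes, error_rate)
```

The segments join continuously, so a rounding error in choosing `k` at a breakpoint does not change the result noticeably.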
- autoqild.utilities.metrics.log_loss_estimation(y_true, y_pred)
Estimates mutual information by evaluating the log-loss of the predicted probabilities and entropy of outputs.
- Parameters:
y_true (ndarray) – True class labels.
y_pred (ndarray) – Predicted probabilities.
- Returns:
estimated_mi – Estimated mutual information.
- Return type:
float
Notes
The estimation is based on calculating the entropy H(Y) of the true labels and the average log-loss of the predictions.
NaN values in the input are removed before performing the estimation.
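The calculation described in the notes, H(Y) minus the average log-loss after dropping NaN rows, might look like the following sketch. The name `log_loss_mi_sketch`, the base-2 logarithm, the clipping constant `eps`, and the assumption that labels are integer indices into each probability row are all choices made for this example rather than details of the library.

```python
import math
from collections import Counter


def log_loss_mi_sketch(y_true, p_pred, eps=1e-12):
    """Estimate I(X;Y) in bits as H(Y) minus the mean log-loss."""
    # Drop samples whose probability row contains a NaN.
    rows = [(y, p) for y, p in zip(y_true, p_pred)
            if not any(math.isnan(v) for v in p)]
    n = len(rows)
    if n == 0:
        return float("nan")
    # Empirical entropy of the remaining true labels.
    counts = Counter(y for y, _ in rows)
    h_y = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # Mean negative log-probability assigned to the true class.
    log_loss = -sum(math.log2(max(p[y], eps)) for y, p in rows) / n
    return h_y - log_loss
```

Perfect predictions on balanced binary labels give 1 bit, and uninformative predictions of 0.5 for every class give 0 bits.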
- autoqild.utilities.metrics.mid_point_mi(y_true, y_pred)
Computes the midpoint mutual information estimate by averaging the upper and lower bounds.
- Parameters:
y_true (ndarray) – True class labels.
y_pred (ndarray) – Predicted class labels.
- Returns:
mid_point – Midpoint mutual information estimate.
- Return type:
float
Notes
This estimate is computed as the average of the Hellman-Raviv upper bound and Fano's lower bound.
The estimate is constrained to be non-negative by taking the maximum with zero.
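The averaging and clamping steps amount to a one-line computation. In this sketch the two bounds are passed in as already-computed numbers, and the name `mid_point_sketch` is hypothetical.

```python
def mid_point_sketch(upper_bound, lower_bound):
    """Average of an upper and a lower MI bound, floored at zero."""
    return max(0.0, (upper_bound + lower_bound) / 2.0)
```

For example, bounds of 1.0 and 0.5 give 0.75, while bounds of 0.2 and -1.0 average to a negative number and are therefore reported as 0.0.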
- autoqild.utilities.metrics.pc_softmax_estimation(y_true, p_pred)
Estimates the mutual information using predicted probabilities in the softmax and PC-Softmax functions.
The mutual information I(X; Y) is estimated using the formula:
\[I(X;Y) = H(Y) - H(Y|X)\]
where H(Y) is the entropy of the true labels and H(Y|X) is the conditional entropy estimated from the predicted probabilities.
Softmax Function:
\[S(z_k) = \frac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j}}\]
PC-Softmax Function:
\[S_{pc}(z_k) = \frac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j} \cdot p_j}\]
- Parameters:
y_true (ndarray) – True class labels.
p_pred (ndarray) – Predicted probabilities.
- Returns:
estimated_mi – Estimated mutual information.
- Return type:
float
Notes
The PC-Softmax estimation adjusts the softmax probabilities using class priors, which can improve the robustness of the MI estimate.
If the input contains NaN values, they are removed before performing the estimation.
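The two formulas given for this function can be written out directly. The sketch below works on a single vector of scores `z` and a vector of class priors, both assumptions of this example since the documented function takes predicted probabilities; the names `softmax_sketch` and `pc_softmax_sketch` are hypothetical, and the mutual information estimate itself is not reproduced.

```python
import math


def softmax_sketch(z):
    """S(z_k) = exp(z_k) / sum_j exp(z_j)."""
    m = max(z)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def pc_softmax_sketch(z, priors):
    """S_pc(z_k) = exp(z_k) / sum_j (exp(z_j) * p_j), with p_j the class priors."""
    m = max(z)  # the shift cancels between numerator and denominator
    exps = [math.exp(v - m) for v in z]
    denom = sum(e * p for e, p in zip(exps, priors))
    return [e / denom for e in exps]
```

PC-Softmax outputs are not probabilities and can exceed 1: with a uniform prior over K classes, each output equals K times the corresponding softmax value.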
- autoqild.utilities.metrics.santhi_vardi_upper_bound(y_true, y_pred)
Computes the Santhi-Vardi upper bound for mutual information.
- Parameters:
y_true (ndarray) – True class labels.
y_pred (ndarray) – Predicted class labels.
- Returns:
sv_u – The Santhi-Vardi upper bound.
- Return type:
float
Notes
The Santhi-Vardi bound is based on the classification error rate and gives an upper estimate of the mutual information, adjusted logarithmically based on the number of classes.
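One common way to turn this kind of error-rate bound into a mutual information bound uses the inequality H(Y|X) ≥ -log2(1 - Pe) together with a uniform class prior. The sketch below follows that route; it is an assumption that the library computes the bound this way, and the name `sv_upper_bound_sketch` and the use of bits are choices made for this example.

```python
import math


def sv_upper_bound_sketch(error_rate, n_classes):
    """I(X;Y) <= log2(M) + log2(1 - Pe), in bits, assuming a uniform prior."""
    if error_rate >= 1.0:
        return float("-inf")  # log2(0): the bound diverges downward
    return math.log2(n_classes) + math.log2(1.0 - error_rate)
```

With no errors the bound equals log2 of the number of classes, and for two classes at a 50% error rate it drops to zero.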