autoqild.automl.autogluon_classifier¶
AutoGluonClassifier is a wrapper for building, training, and evaluating an AutoML model using AutoGluon.
Classes
AutoGluonClassifier | A wrapper for building, training, and evaluating an AutoML model using AutoGluon.
- class autoqild.automl.autogluon_classifier.AutoGluonClassifier(n_features, n_classes, time_limit=1800, output_folder=None, eval_metric='accuracy', use_hyperparameters=True, delete_tmp_folder_after_terminate=True, auto_stack=True, remove_boosting_models=True, verbosity=6, random_state=None, **kwargs)[source]¶
Bases:
AutomlClassifier
AutoGluonClassifier is a wrapper for building, training, and evaluating an AutoML model using AutoGluon.
This class applies AutoGluon to automatic machine learning (AutoML) tasks, specifically classification problems. It handles hyperparameter tuning, model stacking, and model evaluation, and passes its configuration on to AutoGluon so that little setup is required.
- Parameters:
n_features (int) – Number of features or dimensionality of the input data.
n_classes (int) – Number of classes in the classification problem.
time_limit (int, optional) – Time limit for training the model, in seconds. Default is 1800.
output_folder (str, optional) – Path to the directory where the trained model and related files will be saved. Default is None.
eval_metric (str, optional) – Evaluation metric used to assess the performance of the model. Default is 'accuracy'.
use_hyperparameters (bool, optional) – Flag indicating whether to use predefined hyperparameters for model training. Default is True.
delete_tmp_folder_after_terminate (bool, optional) – Flag indicating whether to delete the temporary folder after model training is complete. Default is True.
auto_stack (bool, optional) – Flag indicating whether to use automatic stacking of models in AutoGluon. Default is True.
remove_boosting_models (bool, optional) – Flag indicating whether to exclude boosting models (like GBM, CAT, XGB) from the hyperparameters. Default is True.
verbosity (int, optional) – Level of verbosity for logging and output. Default is 6.
random_state (int or None, optional) – Seed for random number generation to ensure reproducibility. Default is None.
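The effect of remove_boosting_models on the hyperparameters can be pictured with a minimal sketch. It assumes the hyperparameters are a dictionary keyed by AutoGluon's short model names (GBM, CAT, XGB, RF, ...). The helper name `drop_boosting_models` and the example search space are hypothetical, and the wrapper's own logic may differ.

```python
# Illustrative sketch only: not the wrapper's actual implementation.
# The keys are AutoGluon's short names for tabular model families.
BOOSTING_KEYS = ("GBM", "CAT", "XGB")


def drop_boosting_models(hyperparameters, remove_boosting_models=True):
    """Return a copy of `hyperparameters` without boosting entries when requested."""
    if hyperparameters is None or not remove_boosting_models:
        return hyperparameters
    return {
        name: options
        for name, options in hyperparameters.items()
        if name not in BOOSTING_KEYS
    }


# Hypothetical search space; an empty dict stands for "use the model's defaults".
candidate_hyperparameters = {
    "GBM": {}, "CAT": {}, "XGB": {}, "RF": {}, "XT": {}, "KNN": {},
}
filtered = drop_boosting_models(candidate_hyperparameters)
print(sorted(filtered))
```

With the flag set to False, or with no hyperparameters at all, the input is returned unchanged.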
- logger¶
Logger object used for logging messages and errors.
- Type:
logging.Logger
- random_state¶
Random state instance for reproducibility.
- Type:
np.random.RandomState
- output_folder¶
Path to the directory where the trained model and related files will be saved.
- Type:
str
- delete_tmp_folder_after_terminate¶
Flag indicating whether to delete the temporary folder after model training is complete.
- Type:
bool
- hyperparameter_tune_kwargs¶
Dictionary containing options for hyperparameter tuning, including the scheduler and searcher.
- Type:
dict
- eval_metric¶
Evaluation metric used to assess the performance of the model.
- Type:
str
- use_hyperparameters¶
Flag indicating whether to use predefined hyperparameters for model training.
- Type:
bool
- verbosity¶
Level of verbosity for logging and output.
- Type:
int
- hyperparameters¶
Dictionary of hyperparameters used for model training. If use_hyperparameters is False, this is None.
- Type:
dict or None
- exclude_model_types¶
List of model types to exclude from the training process.
- Type:
list
- auto_stack¶
Flag indicating whether to use automatic stacking of models in AutoGluon.
- Type:
bool
- n_features¶
Number of features or dimensionality of the input data.
- Type:
int
- n_classes¶
Number of classes in the classification problem.
- Type:
int
- sample_weight¶
Method for determining sample weights during training. Default is auto_weight.
- Type:
str
- time_limit¶
Time limit for training the model, in seconds.
- Type:
int
- model¶
The AutoGluon model object, initialized after fitting.
- Type:
autogluon.tabular.TabularPredictor or None
- class_label¶
Name of the target label column.
- Type:
str
- columns¶
List of column names for the input DataFrame, including feature names and the class label.
- Type:
list
- leaderboard¶
DataFrame containing information about the models trained during the fitting process.
- Type:
pandas.DataFrame or None
Private Methods
- _is_fitted_¶
Property that indicates whether the model has already been fitted.
- Type:
bool
- convert_to_dataframe(X, y=None)[source]¶
Convert the input data to a DataFrame.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Feature matrix.
y (array-like of shape (n_samples,), optional) – Target vector.
- Returns:
df_data – DataFrame containing the input data.
- Return type:
pandas.DataFrame
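The conversion described above can be sketched without any dependency by building the column names and one record per sample. The column naming scheme (`feature_0`, `feature_1`, ...), the label name `class`, and the helper `to_records` are assumptions made for illustration. The actual method returns a pandas.DataFrame and may name its columns differently.

```python
def to_records(X, y=None, class_label="class"):
    """Build column names and row records from a feature matrix and optional targets."""
    n_features = len(X[0])
    # Hypothetical naming scheme for the feature columns.
    feature_columns = [f"feature_{i}" for i in range(n_features)]
    records = []
    for index, row in enumerate(X):
        if len(row) != n_features:
            raise ValueError("all rows must have the same number of features")
        record = dict(zip(feature_columns, row))
        if y is not None:
            record[class_label] = y[index]
        records.append(record)
    columns = feature_columns + ([class_label] if y is not None else [])
    return columns, records


columns, records = to_records([[1.0, 2.0], [3.0, 4.0]], y=[0, 1])
print(columns)
print(records[1])
```

The resulting pair could then be passed to `pandas.DataFrame(records, columns=columns)` to obtain a frame whose last column holds the class label.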
- decision_function(X, verbose=0)[source]¶
Compute the decision function in the form of class probabilities for the input samples.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Feature matrix.
verbose (int, optional, default=0) – Verbosity level.
- Returns:
decision – Decision function values.
- Return type:
array-like of shape (n_samples,)
- fit(X, y, **kwd)[source]¶
Fit the AutoGluon model to the training data.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Feature matrix.
y (array-like of shape (n_samples,)) – Target vector.
**kwd (dict, optional) – Additional keyword arguments.
- get_k_rank_model(k)[source]¶
Get the k-th ranked model from the leaderboard.
- Parameters:
k (int) – Rank of the model to retrieve.
- Returns:
model – The k-th ranked model.
- Return type:
autogluon.tabular.TabularPredictor
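Selecting a model by rank amounts to ordering the leaderboard by validation score and indexing into it. The sketch below represents the leaderboard as a list of dictionaries with `model` and `score_val` keys, mirroring two columns of AutoGluon's leaderboard. The scores are made-up toy values, the helper name is hypothetical, and treating k as zero-based is an assumption; the documentation above does not state which convention the method uses.

```python
def kth_ranked_model_name(leaderboard, k):
    """Return the name of the k-th best model (k=0 is the best) by validation score."""
    ranked = sorted(leaderboard, key=lambda row: row["score_val"], reverse=True)
    if not 0 <= k < len(ranked):
        raise IndexError(f"k={k} is out of range for {len(ranked)} models")
    return ranked[k]["model"]


# Toy leaderboard rows with invented scores, for illustration only.
toy_leaderboard = [
    {"model": "RandomForestGini", "score_val": 0.81},
    {"model": "WeightedEnsemble_L2", "score_val": 0.86},
    {"model": "KNeighborsUnif", "score_val": 0.74},
]
print(kth_ranked_model_name(toy_leaderboard, 0))
```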
- get_model(model_name)[source]¶
Get a model by its name from the leaderboard.
- Parameters:
model_name (str) – Name of the model to retrieve.
- Returns:
model – The specified model.
- Return type:
autogluon.tabular.TabularPredictor
- predict(X, verbose=0)[source]¶
Predict class labels for the input samples.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Feature matrix.
verbose (int, optional, default=0) – Verbosity level.
- Returns:
y_pred – Predicted class labels.
- Return type:
array-like of shape (n_samples,)
- predict_proba(X, verbose=0)[source]¶
Predict class probabilities for the input samples.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Feature matrix.
verbose (int, optional, default=0) – Verbosity level.
- Returns:
y_pred – Predicted class probabilities.
- Return type:
array-like of shape (n_samples, n_classes)
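The relation between a probability matrix of shape (n_samples, n_classes) and predicted labels of shape (n_samples,) can be illustrated as follows. The sketch assumes that the predicted label is the index of the largest probability in each row, which is the usual convention but is not confirmed by the documentation above. The helper name and the probabilities are hypothetical.

```python
def labels_from_probabilities(probabilities):
    """Pick, for each row, the class index with the highest probability."""
    labels = []
    for row in probabilities:
        best_class = max(range(len(row)), key=row.__getitem__)
        labels.append(best_class)
    return labels


# Toy probabilities for three samples and three classes; each row sums to 1.
toy_probabilities = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.25, 0.5, 0.25],
]
predicted = labels_from_probabilities(toy_probabilities)
print(predicted)
```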
- score(X, y, sample_weight=None, verbose=0)[source]¶
Compute the balanced accuracy score for the input samples.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Feature matrix.
y (array-like of shape (n_samples,)) – True labels.
sample_weight (array-like of shape (n_samples,), optional) – Sample weights.
verbose (int, optional, default=0) – Verbosity level.
- Returns:
score – Balanced accuracy score.
- Return type:
float
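Balanced accuracy is the mean of the per-class recalls, so each class contributes equally regardless of how many samples it has. The following dependency-free sketch of the metric is not the wrapper's code. The function name is hypothetical, and the handling of sample_weight (weighting both the correct counts and the class totals) is an assumption modeled on common practice.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred, sample_weight=None):
    """Mean of per-class recall, optionally weighting each sample."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    class_total = defaultdict(float)    # weighted number of samples per true class
    class_correct = defaultdict(float)  # weighted number of correct predictions per class
    for truth, prediction, weight in zip(y_true, y_pred, sample_weight):
        class_total[truth] += weight
        if truth == prediction:
            class_correct[truth] += weight
    # Classes whose total weight is zero are skipped to avoid division by zero.
    recalls = [
        class_correct[label] / total
        for label, total in class_total.items()
        if total > 0
    ]
    return sum(recalls) / len(recalls)


y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
print(balanced_accuracy(y_true, y_pred))
```

In this toy example the recall is 3/4 for class 0 and 1/2 for class 1, so the balanced accuracy is 0.625, whereas plain accuracy would be 4/6.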