autoqild.bayes_search.bayes_search_cv

Implements the main BayesSearchCV class, which orchestrates the Bayesian optimization process and extends the BayesSearchCV class from the scikit-optimize library.

Classes

BayesSearchCV(estimator, search_spaces[, ...])

BayesSearchCV is a custom implementation of Bayesian optimization-based hyperparameter tuning, extending the functionality of BayesSearchCV from the scikit-optimize library.

class autoqild.bayes_search.bayes_search_cv.BayesSearchCV(estimator, search_spaces, optimizer_kwargs=None, n_iter=50, scoring=None, fit_params=None, n_jobs=1, n_points=1, iid=True, refit=True, cv=None, verbose=0, pre_dispatch='2*n_jobs', random_state=None, error_score='raise', return_train_score=False, optimizers_file_path='results.pkl')

Bases: BayesSearchCV (from scikit-optimize)

BayesSearchCV is a custom implementation of Bayesian optimization-based hyperparameter tuning, extending the functionality of BayesSearchCV from the scikit-optimize library. This class facilitates efficient exploration of hyperparameter spaces to identify the best-performing model configurations.

In addition to the base class behavior, this implementation logs the optimization process, keeps track of optimizer states, and saves optimization progress to a file so that an interrupted search can be resumed.

logger

Logger instance used for logging the optimization process and any errors encountered.

Type: logging.Logger

optimizers_file_path

Path to the file where the optimizer states are saved, so that an interrupted optimization can resume where it left off.

Type: str
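The idea of saving optimizer states to a file and restoring them later can be illustrated with a minimal sketch. It assumes the states are picklable (the default file name results.pkl suggests pickle, but this is not confirmed here). The helpers save_optimizers and load_optimizers are hypothetical names introduced for illustration and are not part of the autoqild API.

```python
import os
import pickle
import tempfile


def save_optimizers(optimizers, path):
    """Hypothetical helper: write optimizer states to `path` atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first so that an interruption cannot
    # leave a half-written checkpoint behind.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(optimizers, handle)
    os.replace(tmp_path, path)


def load_optimizers(path, default=None):
    """Hypothetical helper: return saved states, or `default` if none are usable."""
    try:
        with open(path, "rb") as handle:
            return pickle.load(handle)
    except (FileNotFoundError, EOFError, pickle.UnpicklingError):
        return default


# Stand-in for real optimizer objects: any picklable value works here.
state = [{"x_iters": [[0.1], [0.7]], "y_iters": [0.42, 0.13]}]
checkpoint = os.path.join(tempfile.mkdtemp(), "results.pkl")
save_optimizers(state, checkpoint)
restored = load_optimizers(checkpoint)
```

Writing to a temporary file and renaming it is one way to keep the checkpoint readable even if the process is killed mid-save. Whether the class does this internally is not stated in this page.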

Parameters:

estimator (estimator object) – The object to use to fit the data.
search_spaces (dict, list of dict or list of tuple) – The search space for the hyperparameters.
optimizer_kwargs (dict, optional) – Additional arguments for the optimizer.
n_iter (int, default=50) – Number of parameter settings that are sampled.
scoring (string, callable or None, default=None) – A single string or a callable to evaluate the predictions on the test set.
fit_params (dict, optional) – Parameters to pass to the fit method of the estimator.
n_jobs (int, default=1) – Number of jobs to run in parallel.
n_points (int, default=1) – Number of parameter settings to sample in parallel.
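iid (boolean, default=True) – If True, the data is assumed to be identically distributed across folds, and the returned score is the average across folds weighted by the number of samples in each test set.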
refit (boolean, default=True) – Refit the best estimator with the entire dataset.
cv (int, cross-validation generator or an iterable, optional) – Determines the cross-validation splitting strategy.
verbose (int, default=0) – Controls the verbosity.
pre_dispatch (int or string, default='2*n_jobs') – Controls the number of jobs that get dispatched during parallel execution.
random_state (int, RandomState instance or None, optional) – Controls the randomness of the search; pass an int for reproducible results.
error_score ('raise' or numeric, default='raise') – Value to assign to the score if an error occurs while fitting the estimator. If set to 'raise', the error is raised instead.
return_train_score (boolean, default=False) – If False, the cv_results_ attribute will not include training scores.
optimizers_file_path (string, default='results.pkl') – Path to the file where the optimizer states are saved.

Private Methods

_step(search_space, optimizer, evaluate_candidates, n_points=1): Generates parameter combinations and evaluates them in parallel.
_run_search(evaluate_candidates): Runs the search process to find the best hyperparameters by iteratively evaluating different configurations based on the Bayesian optimization strategy.
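The interplay between the two methods can be sketched as an ask/tell loop: each step asks the optimizer for a batch of candidates, evaluates them, and reports the scores back, until n_iter settings have been tried. The sketch below is an illustration under stated assumptions, not the class's actual code. RandomOptimizer is a hypothetical stand-in that samples uniformly (it is not Bayesian), and step, run_search, and evaluate are illustrative names. It also assumes the optimizer minimizes, so scores are negated before being reported.

```python
import random


class RandomOptimizer:
    """Hypothetical stand-in with an ask/tell interface; samples uniformly."""

    def __init__(self, low, high, seed=0):
        self.low, self.high = low, high
        self.rng = random.Random(seed)
        self.x_iters, self.y_iters = [], []

    def ask(self, n_points=1):
        # Propose a batch of candidate values to evaluate next.
        return [self.rng.uniform(self.low, self.high) for _ in range(n_points)]

    def tell(self, xs, ys):
        # Record the evaluated candidates and their objective values.
        self.x_iters.extend(xs)
        self.y_iters.extend(ys)


def step(optimizer, evaluate_candidates, n_points=1):
    """Ask for a batch, evaluate it, and report the results back."""
    candidates = optimizer.ask(n_points=n_points)
    scores = evaluate_candidates(candidates)
    # Assumption: the optimizer minimizes, so higher-is-better scores are negated.
    optimizer.tell(candidates, [-s for s in scores])


def run_search(optimizer, evaluate_candidates, n_iter=10, n_points=1):
    """Repeat steps until n_iter candidates have been evaluated in total."""
    remaining = n_iter
    while remaining > 0:
        batch = min(remaining, n_points)  # the last batch may be smaller
        step(optimizer, evaluate_candidates, n_points=batch)
        remaining -= batch


def evaluate(candidates):
    # Toy score that peaks at x = 2; stands in for cross-validated scoring.
    return [-(x - 2.0) ** 2 for x in candidates]


opt = RandomOptimizer(0.0, 4.0, seed=0)
run_search(opt, evaluate, n_iter=10, n_points=4)
best_x = opt.x_iters[opt.y_iters.index(min(opt.y_iters))]
```

With n_iter=10 and n_points=4, this loop runs three steps with batches of 4, 4, and 2 candidates, so n_points affects how many candidates are evaluated per step but not the total number of evaluations.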