autoqild.dataset_readers

The dataset_readers package provides utilities for reading OpenML datasets and for generating synthetic data (both distance-based and traditional) for testing and evaluation.

Modules

open_ml_padding_dr

Reader for OpenML datasets applying padding strategies to analyze data leakage.

open_ml_timming_dr

Reader for OpenML datasets focusing on timing features for data leakage analysis.

synthetic_data_generator

Generates synthetic datasets and introduces noise by flipping a certain percentage of labels, for testing and evaluating machine learning models.

synthetic_data_generator_distance

Generates synthetic datasets and introduces noise by reducing the distance between the Gaussians of each class, simulating different distributions.

utils

Provides utility functions for dataset handling, operations, and preprocessing.
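
The two noise mechanisms described for the synthetic generators (label flipping and shrinking the distance between class Gaussians) can be illustrated with plain NumPy. This is a minimal sketch of the general idea only, not the package's API: the function names make_gaussian_classes and flip_labels, their parameters, and the choice of two unit-variance classes separated along the first feature are all assumptions made for illustration.

```python
import numpy as np


def make_gaussian_classes(n_per_class, n_features, distance, seed=0):
    """Hypothetical helper: two unit-variance Gaussian classes whose
    means lie `distance` apart along the first feature axis."""
    rng = np.random.default_rng(seed)
    mean0 = np.zeros(n_features)
    mean1 = np.zeros(n_features)
    mean1[0] = distance  # a smaller distance means more class overlap
    X = np.vstack([
        rng.normal(mean0, 1.0, size=(n_per_class, n_features)),
        rng.normal(mean1, 1.0, size=(n_per_class, n_features)),
    ])
    y = np.repeat([0, 1], n_per_class)
    return X, y


def flip_labels(y, flip_fraction, seed=0):
    """Hypothetical helper: copy binary labels `y` and flip a fixed
    fraction of them, chosen uniformly at random without replacement."""
    rng = np.random.default_rng(seed)
    y_noisy = y.copy()
    n_flip = int(round(flip_fraction * len(y)))
    idx = rng.choice(len(y), size=n_flip, replace=False)
    y_noisy[idx] = 1 - y_noisy[idx]  # 0 becomes 1 and 1 becomes 0
    return y_noisy


# Well-separated classes with 10% of the labels flipped.
X, y = make_gaussian_classes(500, 2, distance=3.0)
y_noisy = flip_labels(y, 0.1)

# The same classes with no label noise but heavy overlap.
X_close, y_close = make_gaussian_classes(500, 2, distance=0.5)
```

In this sketch, label flipping leaves the features untouched and corrupts only the targets, while reducing the distance keeps the labels correct but makes the classes harder to separate. The actual generators in this package may parameterize either kind of noise differently.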