Dataset Creation

class dipm.data.chemical_datasets.dataset_creation.DatasetCreation(config: DatasetCreationConfig)

Class for creating datasets.

post_process_fn

Function to apply to each system after loading.

Type: Callable[[ChemicalSystem], Any] | None
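
As an illustration of the expected shape, the sketch below defines a callable that matches `Callable[[ChemicalSystem], Any]`. The `ChemicalSystem` class here is a hypothetical stand-in with made-up fields (`atomic_numbers`, `energy`), not the library's class, and `shift_energy` and its reference energy are illustrative only.

```python
from dataclasses import dataclass, replace
from typing import Any


@dataclass(frozen=True)
class ChemicalSystem:
    """Hypothetical stand-in; the real dipm class and its fields may differ."""
    atomic_numbers: tuple[int, ...]
    energy: float


def shift_energy(system: ChemicalSystem) -> Any:
    """Example post-processing step: subtract an assumed reference energy."""
    reference_energy = 1.5  # illustrative constant, not taken from the library
    # Return a modified copy so the input system is left unchanged.
    return replace(system, energy=system.energy - reference_energy)


water = ChemicalSystem(atomic_numbers=(8, 1, 1), energy=-10.0)
processed = shift_energy(water)
```

A function of this shape could presumably be passed as `post_process_fn`; whether the library uses the return value or expects in-place modification is not stated here.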

__init__(config: DatasetCreationConfig)
create_datasets(post_process_fn: Callable[[ChemicalSystem], Any] | None = None) -> tuple[ConcatDataset, ConcatDataset | None, ConcatDataset | None]

Create the datasets from the config. Returns a 3-tuple of ConcatDataset objects, of which the second and third may be None.
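
Because the second and third elements of the returned tuple may be `None`, callers need to guard before using them. The following is a minimal sketch of that pattern, using plain lists in place of `ConcatDataset` and a hypothetical `split_sizes` helper (neither is part of the library). The train/validation/test ordering is an assumption based on the method names in this class.

```python
from typing import Optional, Sequence


def split_sizes(
    splits: tuple[Sequence, Optional[Sequence], Optional[Sequence]],
) -> dict[str, Optional[int]]:
    """Report the length of each split, keeping None for absent splits."""
    names = ("train", "valid", "test")  # assumed ordering of the tuple
    return {
        name: (len(split) if split is not None else None)
        for name, split in zip(names, splits)
    }


# Lists stand in for the datasets returned by create_datasets().
sizes = split_sizes(([1, 2, 3], [4], None))
```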

get_train_datasets(post_process_fn: Callable[[ChemicalSystem], Any] | None = None, load_exclusions: bool = True) -> list[Hdf5Dataset]

List of training datasets.

get_valid_datasets(post_process_fn: Callable[[ChemicalSystem], Any] | None = None, load_exclusions: bool = True) -> list[Hdf5Dataset]

List of validation datasets.

get_test_datasets(post_process_fn: Callable[[ChemicalSystem], Any] | None = None, load_exclusions: bool = True) -> list[Hdf5Dataset]

List of test datasets.

static filter_duplicates(datasets: list[Hdf5Dataset], datasets_to_exclude: list[Hdf5Dataset])

Filter out duplicate datasets.
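
The description does not say how duplicates are identified. One plausible reading is that any entry of `datasets` that also appears in `datasets_to_exclude` is dropped. The sketch below illustrates that reading only, comparing a hypothetical `path` attribute on a stand-in `FakeDataset` class; the library's actual criterion and return type may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeDataset:
    """Hypothetical stand-in for Hdf5Dataset, identified by its file path."""
    path: str


def filter_duplicates_sketch(
    datasets: list[FakeDataset],
    datasets_to_exclude: list[FakeDataset],
) -> list[FakeDataset]:
    """Keep datasets whose path is absent from the exclusion list (assumed rule)."""
    excluded_paths = {d.path for d in datasets_to_exclude}
    # Preserve the original order of the datasets that are kept.
    return [d for d in datasets if d.path not in excluded_paths]


train = [FakeDataset("a.h5"), FakeDataset("b.h5")]
valid = [FakeDataset("b.h5")]
kept = filter_duplicates_sketch(train, valid)
```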