fruit_project.utils.data
========================

.. py:module:: fruit_project.utils.data


Attributes
----------

.. autoapisummary::

   fruit_project.utils.data.rlimit


Functions
---------

.. autoapisummary::

   fruit_project.utils.data.download_dataset
   fruit_project.utils.data.make_datasets
   fruit_project.utils.data.get_sampler
   fruit_project.utils.data.make_dataloaders
   fruit_project.utils.data.get_labels_and_mappings
   fruit_project.utils.data.collate_fn
   fruit_project.utils.data.set_transforms


Module Contents
---------------

.. py:data:: rlimit

.. py:function:: download_dataset()

   Downloads the dataset from Kaggle using the Kaggle API.

   :raises RuntimeError: If the required environment variables for Kaggle API are not set.

   :returns: None


.. py:function:: make_datasets(cfg: omegaconf.DictConfig) -> Tuple[fruit_project.utils.datasets.det_dataset.DET_DS, fruit_project.utils.datasets.det_dataset.DET_DS, fruit_project.utils.datasets.det_dataset.DET_DS]

   Creates training, testing, and validation datasets.

   :param cfg: Configuration object containing dataset parameters.
   :type cfg: DictConfig

   :returns: The training, testing, and validation datasets.
   :rtype: Tuple[DET_DS, DET_DS, DET_DS]


.. py:function:: get_sampler(train_ds: fruit_project.utils.datasets.det_dataset.DET_DS, strat: str) -> torch.utils.data.WeightedRandomSampler

   Creates a WeightedRandomSampler for the training dataset.

   :param train_ds: The training dataset.
   :type train_ds: DET_DS
   :param strat: The strategy for weighting ('max' or 'mean').
   :type strat: str

   :returns: A sampler for the training dataset.
   :rtype: WeightedRandomSampler


.. py:function:: make_dataloaders(cfg: omegaconf.DictConfig, train_ds: fruit_project.utils.datasets.det_dataset.DET_DS, test_ds: fruit_project.utils.datasets.det_dataset.DET_DS, val_ds: fruit_project.utils.datasets.det_dataset.DET_DS, generator: torch.Generator, processor: transformers.AutoImageProcessor, transforms: albumentations.Compose) -> Tuple[torch.utils.data.DataLoader, torch.utils.data.DataLoader, torch.utils.data.DataLoader]

   Creates dataloaders for training, testing, and validation datasets.

   :param cfg: Configuration object containing dataloader parameters.
   :type cfg: DictConfig
   :param train_ds: The training dataset.
   :type train_ds: DET_DS
   :param test_ds: The testing dataset.
   :type test_ds: DET_DS
   :param val_ds: The validation dataset.
   :type val_ds: DET_DS
   :param generator: A PyTorch generator for reproducibility.
   :type generator: torch.Generator
   :param processor: Processor for image preprocessing.
   :type processor: AutoImageProcessor
   :param transforms: Transformations to apply to the datasets.
   :type transforms: Compose

   :returns: The training, testing, and validation dataloaders.
   :rtype: Tuple[DataLoader, DataLoader, DataLoader]


.. py:function:: get_labels_and_mappings(train_labels: List, test_labels: List) -> Tuple[List, Dict, Dict]

   Generates labels and mappings for class IDs and names.

   :param train_labels: List of labels from the training dataset.
   :type train_labels: List
   :param test_labels: List of labels from the testing dataset.
   :type test_labels: List

   :returns: A tuple containing:

             - labels (List): Sorted list of unique labels.
             - id2lbl (Dict): Mapping from class IDs to labels.
             - lbl2id (Dict): Mapping from labels to class IDs.
   :rtype: Tuple[List, Dict, Dict]


.. py:function:: collate_fn(batch: transformers.BatchEncoding, processor: transformers.AutoImageProcessor) -> Tuple[transformers.BatchEncoding, List]

   Collates a batch of data for the dataloader.

   :param batch: A batch of data containing images and targets.
   :type batch: BatchEncoding
   :param processor: Processor for image preprocessing.
   :type processor: AutoImageProcessor

   :returns: Processed batch and list of targets.
   :rtype: Tuple[BatchEncoding, List]


.. py:function:: set_transforms(train_dl: torch.utils.data.DataLoader, test_dl: torch.utils.data.DataLoader, val_dl: torch.utils.data.DataLoader, transforms: albumentations.Compose) -> Tuple[torch.utils.data.DataLoader, torch.utils.data.DataLoader, torch.utils.data.DataLoader]

   Sets transformations for the datasets in the dataloaders.

   :param train_dl: Training dataloader.
   :type train_dl: DataLoader
   :param test_dl: Testing dataloader.
   :type test_dl: DataLoader
   :param val_dl: Validation dataloader.
   :type val_dl: DataLoader
   :param transforms: Transformations to apply.
   :type transforms: Compose

   :returns: Updated dataloaders with transformations applied.
   :rtype: Tuple[DataLoader, DataLoader, DataLoader]
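

The return contract documented for ``get_labels_and_mappings`` (a sorted list of unique labels plus ``id2lbl`` and ``lbl2id`` mappings) can be illustrated with a minimal sketch. This is not the project's implementation: the function name ``get_labels_and_mappings_sketch`` is hypothetical, and the sketch assumes that both inputs are flat lists of class-name strings and that class IDs are assigned by position in the sorted list.

```python
from typing import Dict, List, Tuple


def get_labels_and_mappings_sketch(
    train_labels: List, test_labels: List
) -> Tuple[List, Dict, Dict]:
    """Hypothetical stand-in that mirrors the documented return values."""
    # Assumption: inputs are flat lists of class names; merge and deduplicate them.
    labels = sorted(set(train_labels) | set(test_labels))
    # Assumption: a class ID is the label's index in the sorted list.
    id2lbl = {idx: lbl for idx, lbl in enumerate(labels)}
    # Invert the mapping so labels can be looked up by name.
    lbl2id = {lbl: idx for idx, lbl in id2lbl.items()}
    return labels, id2lbl, lbl2id


labels, id2lbl, lbl2id = get_labels_and_mappings_sketch(
    ["banana", "apple"], ["apple", "cherry"]
)
print(labels)
print(id2lbl)
print(lbl2id)
```

Building the mappings from the union of both splits keeps the IDs consistent even when a class appears in only one of them. Whether the real function handles nested or per-image label lists is not stated in this reference.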