fruit_project.utils.datasets.det_dataset
========================================

.. py:module:: fruit_project.utils.datasets.det_dataset


Classes
-------

.. autoapisummary::

   fruit_project.utils.datasets.det_dataset.DET_DS


Functions
---------

.. autoapisummary::

   fruit_project.utils.datasets.det_dataset.format_for_hf_processor


Module Contents
---------------

.. py:class:: DET_DS(root_dir: str | None, split: str, config_file: str, transforms: albumentations.Compose | None = None, processor=None, normalize: bool = False)

   Bases: :py:obj:`torch.utils.data.Dataset`


   A custom dataset class for object detection tasks.

   This dataset class loads images and their corresponding labels from specified directories,
   applies transformations if provided, and returns the processed image along with target annotations.

   .. attribute:: root_dir

      The root directory containing the dataset.

      :type: pathlib.Path

   .. attribute:: split

      The dataset split (e.g., 'train', 'val', 'test').

      :type: str

   .. attribute:: config_file

      The path to the configuration file containing class names and folder structure.

      :type: str

   .. attribute:: transforms

      An Albumentations pipeline applied to the images and their annotations.

      :type: albumentations.Compose, optional

   .. attribute:: image_paths

      A list of valid image file paths.

      :type: list

   .. attribute:: labels

      A list of class names.

      :type: list

   .. attribute:: id2lbl

      A mapping from class IDs to class names.

      :type: dict

   .. attribute:: lbl2id

      A mapping from class names to class IDs.

      :type: dict

   The configuration file (YAML) should contain:

   - ``names``: List of class names.
   - ``folders`` (optional): Dictionary with keys ``images``, ``labels``, ``train``, ``val``, and ``test``
     specifying the folder names. Defaults to the standard names if not provided.
   - ``folders.structure`` (optional): Either ``type_first`` (default) for an ``images/train`` layout
     or ``split_first`` for a ``train/images`` layout.

   .. method:: __len__()

      Returns the number of valid images in the dataset.

   .. method:: __getitem__(idx)

      Returns the processed image and target annotations for the given index.
      

   :param root_dir: The root directory containing the dataset.
   :type root_dir: str | None
   :param split: The dataset split (e.g., 'train', 'val', 'test').
   :type split: str
   :param config_file: The path to the YAML configuration file containing class names and folder structure.
   :type config_file: str
   :param transforms: An Albumentations pipeline applied to the images and their annotations.
   :type transforms: albumentations.Compose, optional

   :raises FileNotFoundError: If the configuration file or label files are not found.
   :raises ValueError: If an image cannot be loaded or is invalid.


   .. py:attribute:: root_dir
      :type:  pathlib.Path


   .. py:attribute:: split


   .. py:attribute:: transforms
      :value: None


   .. py:attribute:: config_dir


   .. py:attribute:: processor
      :value: None


   .. py:attribute:: normalize
      :value: False


   .. py:attribute:: labels


   .. py:attribute:: id2lbl


   .. py:attribute:: lbl2id


   .. py:attribute:: image_paths
      :value: []


   .. py:attribute:: label_paths
      :value: []


   .. py:method:: __len__()

      :returns: The number of valid images in the dataset.
      :rtype: int


   .. py:method:: __getitem__(idx)

      Retrieves the processed image and target annotations for the given index.

      :param idx: The index of the image to retrieve.
      :type idx: int

      :returns: A tuple ``(img, target)``:

                - **img** (*numpy.ndarray*): The processed image.
                - **target** (*dict*): The target annotations, with keys:

                  - ``image_id`` (*int*): The index of the image.
                  - ``annotations`` (*list*): A list of dictionaries, each with a bounding box, category ID, area, and iscrowd flag.
                  - ``orig_size`` (*torch.Tensor*): The original size of the image (height, width).
      :rtype: tuple


   .. py:method:: get_raw_item(idx: int)

      Fetches a raw, untransformed image and its annotations.
      This is a helper method for multi-sample augmentations like Mosaic.
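
The folder layouts and label mappings described for ``DET_DS`` can be illustrated with a minimal sketch. This is not the class's actual implementation: the helper names ``resolve_split_dirs`` and ``build_label_maps``, the ``DEFAULT_FOLDERS`` values, and the example class names are all assumptions made for illustration. The config is shown as an already-parsed dictionary rather than a YAML file.

```python
from pathlib import Path

# Hypothetical defaults; the real default folder names are not shown in this reference.
DEFAULT_FOLDERS = {
    "images": "images",
    "labels": "labels",
    "train": "train",
    "val": "val",
    "test": "test",
    "structure": "type_first",
}


def resolve_split_dirs(root_dir, split, config):
    """Return (images_dir, labels_dir) for one split (illustrative helper)."""
    folders = {**DEFAULT_FOLDERS, **(config.get("folders") or {})}
    root = Path(root_dir)
    split_name = folders[split]
    if folders["structure"] == "split_first":
        # Layout: <root>/<split>/<images|labels>
        return (
            root / split_name / folders["images"],
            root / split_name / folders["labels"],
        )
    # "type_first" layout: <root>/<images|labels>/<split>
    return (
        root / folders["images"] / split_name,
        root / folders["labels"] / split_name,
    )


def build_label_maps(names):
    """Build the ID-to-name and name-to-ID mappings from the class list."""
    id2lbl = dict(enumerate(names))
    lbl2id = {name: idx for idx, name in id2lbl.items()}
    return id2lbl, lbl2id


# A parsed config with only the structure overridden (assumed example values).
config = {"names": ["apple", "banana"], "folders": {"structure": "split_first"}}
images_dir, labels_dir = resolve_split_dirs("data", "train", config)
id2lbl, lbl2id = build_label_maps(config["names"])
```

With ``split_first`` the training images resolve to ``data/train/images``; omitting the ``folders`` key falls back to the ``type_first`` layout, ``data/images/train``.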


.. py:function:: format_for_hf_processor(boxes, labels, idx)

   Converts bounding boxes and labels back into the annotation format expected by a Hugging Face (HF) processor.
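
The target layout is not spelled out here, so the following is only a sketch of what such a conversion might look like, modeled on the ``target`` dictionary documented for ``DET_DS.__getitem__``. It assumes boxes arrive as COCO-style ``[x, y, width, height]`` pixel values and that every object is marked ``iscrowd = 0``. The function name ``format_for_hf_processor_sketch`` is hypothetical and deliberately differs from the real function.

```python
def format_for_hf_processor_sketch(boxes, labels, idx):
    """Pack boxes and labels into a COCO-style target dict (illustrative only)."""
    annotations = []
    for box, label in zip(boxes, labels):
        # Assumption: each box is [x, y, width, height] in pixels.
        x, y, w, h = (float(v) for v in box)
        annotations.append(
            {
                "bbox": [x, y, w, h],
                "category_id": int(label),
                "area": w * h,
                "iscrowd": 0,  # assumption: no crowd regions
            }
        )
    return {"image_id": idx, "annotations": annotations}


target = format_for_hf_processor_sketch([[10, 20, 30, 40]], [1], idx=0)
```

Here ``target["annotations"]`` holds one dictionary per box, which matches the ``image_id`` and ``annotations`` keys listed in the ``__getitem__`` return value.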