opr.datasets package
Module for datasets.
opr.datasets.augmentations
Data augmentation pipelines.
Point cloud augmentations adopted from the repository: https://github.com/jac99/MinkLocMultimodal, MIT License
- class opr.datasets.augmentations.DefaultCloudSetTransform(train: bool = False)[source]
Bases: object
Default point cloud set augmentation pipeline.
- class opr.datasets.augmentations.DefaultCloudTransform(train: bool = False)[source]
Bases: object
Default point cloud augmentation pipeline.
- class opr.datasets.augmentations.DefaultHM3DImageTransform(train: bool = False, resize: tuple[int, int] | None = (288, 160))[source]
Bases: object
Default image augmentation pipeline.
- class opr.datasets.augmentations.DefaultImageTransform(train: bool = False, resize: Tuple[int, int] | None = None)[source]
Bases: object
Default image augmentation pipeline.
- class opr.datasets.augmentations.DefaultSemanticTransform(train: bool = False, resize: Tuple[int, int] | None = None)[source]
Bases: object
Default semantic mask augmentation pipeline.
- class opr.datasets.augmentations.OheHotTransform[source]
Bases: object
Rotate by one of the given angles.
- class opr.datasets.augmentations.OneHotSemanticTransform(train: bool = False, resize: Tuple[int, int] | None = None)[source]
Bases: object
One-Hot semantic mask augmentation pipeline.
- class opr.datasets.augmentations.RandomRotation(axis=None, max_theta=180, max_theta2=15)[source]
Bases: object
- class opr.datasets.augmentations.RemoveRandomBlock(p=0.5, scale=(0.02, 0.33), ratio=(0.3, 3.3))[source]
Bases: object
Randomly remove part of the point cloud. Similar to PyTorch RandomErasing, but operating on 3D point clouds. Erases a fronto-parallel cuboid. Instead of deleting points, the coordinates of removed points are set to (0, 0, 0) so that the number of points stays the same.
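The erasing behaviour described above can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the function name `remove_random_block_sketch` is hypothetical, the cuboid is assumed to be an x-y rectangle spanning the full z range, the `p` probability gate is left out, and the library itself may operate on tensors rather than arrays.

```python
import numpy as np


def remove_random_block_sketch(coords, rng, scale=(0.02, 0.33), ratio=(0.3, 3.3)):
    """Zero out the points that fall inside one randomly placed x-y rectangle."""
    coords = np.asarray(coords, dtype=float).copy()
    # Extent of the cloud in the x-y plane (assumption: z is not restricted).
    mins = coords[:, :2].min(axis=0)
    extent = coords[:, :2].max(axis=0) - mins
    # Sample the erased area fraction and a log-uniform aspect ratio.
    erase_area = rng.uniform(*scale) * extent[0] * extent[1]
    aspect = np.exp(rng.uniform(np.log(ratio[0]), np.log(ratio[1])))
    w = min(np.sqrt(erase_area / aspect), extent[0])
    h = min(np.sqrt(erase_area * aspect), extent[1])
    # Place the rectangle uniformly inside the bounding box.
    x0 = mins[0] + rng.uniform() * (extent[0] - w)
    y0 = mins[1] + rng.uniform() * (extent[1] - h)
    inside = (
        (coords[:, 0] >= x0) & (coords[:, 0] <= x0 + w)
        & (coords[:, 1] >= y0) & (coords[:, 1] <= y0 + h)
    )
    # Removed points are moved to the origin, so the point count is unchanged.
    coords[inside] = 0.0
    return coords, inside


rng = np.random.default_rng(0)
cloud = rng.uniform(-1.0, 1.0, size=(1000, 3))
erased, inside = remove_random_block_sketch(cloud, rng)
```

Keeping the array shape fixed means that clouds can still be stacked into a batch after augmentation.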
opr.datasets.base
Base dataset implementation.
- class opr.datasets.base.BasePlaceRecognitionDataset(dataset_root: str | Path, subset: Literal['train', 'val', 'test'], data_to_load: str | Tuple[str, ...], positive_threshold: float = 10.0, negative_threshold: float = 50.0, image_transform: Any | None = None, semantic_transform: Any | None = None, pointcloud_transform: Any | None = None, pointcloud_set_transform: Any | None = None)[source]
Bases: Dataset
Base class for track-based Place Recognition dataset.
- collate_fn(data_list: List[Dict[str, Tensor]]) Dict[str, Tensor][source]
Collate function for torch.utils.data.DataLoader.
- data_to_load: Tuple[str, ...]
- dataset_df: DataFrame
- dataset_root: Path
- property negatives_mask: Tensor
Boolean mask of negative samples for each element in the dataset.
- property nonnegative_index: List[Tensor]
List of indexes of non-negative samples for each element in the dataset.
- property positives_index: List[Tensor]
List of indexes of positive samples for each element in the dataset.
- property positives_mask: Tensor
Boolean mask of positive samples for each element in the dataset.
- subset: Literal['train', 'val', 'test']
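To make the relation between the two thresholds and the mask and index properties concrete, here is a small sketch. It assumes that the thresholds are Euclidean distances between pose coordinates and that an element is not counted as its own positive; both are assumptions, as is the helper name `build_masks_sketch`. NumPy arrays stand in for the `Tensor` objects the real properties return.

```python
import numpy as np


def build_masks_sketch(xy, positive_threshold=10.0, negative_threshold=50.0):
    """Return (positives_mask, negatives_mask) from pairwise pose distances."""
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    positives_mask = dist < positive_threshold
    np.fill_diagonal(positives_mask, False)  # assumption: exclude the element itself
    negatives_mask = dist > negative_threshold
    return positives_mask, negatives_mask


# Four poses along a line, at 0, 5, 30 and 100 distance units.
xy = np.array([[0.0, 0.0], [5.0, 0.0], [30.0, 0.0], [100.0, 0.0]])
positives_mask, negatives_mask = build_masks_sketch(xy)

# Index lists derived from the masks, one array per dataset element.
positives_index = [np.flatnonzero(row) for row in positives_mask]
nonnegative_index = [np.flatnonzero(~row) for row in negatives_mask]
```

Elements whose distance lies between the two thresholds are neither positive nor negative, which is why a separate non-negative index is useful when sampling.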
opr.datasets.dataloader_factory
Functions to create PyTorch DataLoaders for different datasets.
- opr.datasets.dataloader_factory.make_dataloaders(dataset_cfg: DictConfig, batch_sampler_cfg: DictConfig, num_workers: int = 0) Dict[str, DataLoader][source]
Function to create DataLoader objects from given dataset and sampler configs.
- Parameters:
dataset_cfg (DictConfig) – Dataset configuration.
batch_sampler_cfg (DictConfig) – Batch sampler configuration.
num_workers (int) – Number of workers for DataLoader. Defaults to 0.
- Returns:
Dictionary with DataLoaders.
- Return type:
Dict[str, DataLoader]
- opr.datasets.dataloader_factory.make_distributed_dataloaders(dataset_cfg: DictConfig, batch_sampler_cfg: DictConfig, num_workers: int = 0) Dict[str, DataLoader][source]
Function to create DataLoader objects from given dataset and sampler configs.
- Parameters:
dataset_cfg (DictConfig) – Dataset configuration.
batch_sampler_cfg (DictConfig) – Batch sampler configuration.
num_workers (int) – Number of workers for DataLoader. Defaults to 0.
- Returns:
Dictionary with DataLoaders.
- Return type:
Dict[str, DataLoader]
opr.datasets.hm3d
HM3D dataset implementation.
- class opr.datasets.hm3d.HM3DDataset(dataset_root: str | Path, subset: Literal['train', 'val', 'test'], data_to_load: str | tuple[str, ...], positive_threshold: float = 5.0, negative_threshold: float = 10.0, positive_iou_threshold: float = 0.1, pointcloud_quantization_size: float = 0.1, max_point_distance: float = 20.0, image_transform: Any | None = None, pointcloud_transform: Any | None = None, pointcloud_set_transform: Any | None = None)[source]
Bases: BasePlaceRecognitionDataset
HM3D dataset implementation.
opr.datasets.itlp
Custom ITLP-Campus dataset implementations.
- class opr.datasets.itlp.ITLPCampus(dataset_root: str | Path, subset: Literal['train', 'val', 'test'] | None = None, csv_file: str = 'track.csv', sensors: str | Tuple[str, ...] = ('front_cam', 'lidar'), mink_quantization_size: float | None = 0.5, max_point_distance: float | None = None, load_semantics: bool = False, exclude_dynamic_classes: bool = False, load_text_descriptions: bool = False, load_text_labels: bool = False, load_aruco_labels: bool = False, indoor: bool = False, positive_threshold: float = 10.0, negative_threshold: float = 50.0, image_transform=<opr.datasets.augmentations.DefaultImageTransform object>, semantic_transform=<opr.datasets.augmentations.DefaultSemanticTransform object>, late_image_transform=None, load_soc: bool = False, top_k_soc: int = 5, soc_coords_type: Literal['cylindrical_3d', 'cylindrical_2d', 'euclidean', 'spherical'] = 'cylindrical_3d', max_distance_soc: float = 50.0, sensors_cfg: OmegaConf | None = None, anno: OmegaConf | None = None, train_split: list | None = None, test_split: list | None = None)[source]
Bases: Dataset
ITLP Campus dataset implementation.
- anno: OmegaConf
- aruco_labels_subdir: str = 'aruco_labels'
- augment_coords_with_normal(coords: ndarray, mean: Tuple[float, float, float] = (0.0, 0.0, 0.0), std: Tuple[float, float, float] = (1.0, 1.0, 1.0)) ndarray[source]
Augment the coordinates with a random normal distribution.
- Parameters:
coords (np.ndarray) – The coordinates to be augmented.
mean (Tuple[float, float, float], optional) – The mean of the normal distribution. Defaults to (0.0, 0.0, 0.0).
std (Tuple[float, float, float], optional) – The standard deviation of the normal distribution. Defaults to (1.0, 1.0, 1.0).
- Returns:
The augmented coordinates.
- Return type:
np.ndarray
- augment_coords_with_rotation(coords: ndarray, angle_range: Tuple = (-3.141592653589793, 3.141592653589793)) ndarray[source]
Augment the coordinates with a random rotation. All objects are rotated by the same angle, drawn uniformly at random from angle_range.
- Parameters:
coords (np.ndarray) – The coordinates to be augmented.
angle_range (Tuple, optional) – The range of the random rotation angle. Defaults to (-np.pi, np.pi).
- Returns:
The augmented coordinates.
- Return type:
np.ndarray
- back_cam_aruco_labels_df: DataFrame | None
- back_cam_text_descriptions_df: DataFrame | None
- back_cam_text_labels_df: DataFrame | None
- cam_config: dict
- cloud_set_transform: DefaultCloudSetTransform
- clouds_subdir: str = 'lidar'
- collate_fn(data_list: List[Dict[str, Tensor]]) Dict[str, Tensor][source]
Pack input data list into batch.
- Parameters:
data_list (List[Dict[str, Tensor]]) – batch data list generated by DataLoader.
- Returns:
dictionary of batched data.
- Return type:
Dict[str, Tensor]
- dataset_df: DataFrame
- dataset_root: Path
- static download_data(out_dir: Path | str) None[source]
Download ITLP-Campus dataset tracks.
- Parameters:
out_dir (Union[Path, str]) – Output directory for downloaded tracks.
- front_cam_aruco_labels_df: DataFrame | None
- front_cam_text_descriptions_df: DataFrame | None
- front_cam_text_labels_df: DataFrame | None
- image_transform: DefaultImageTransform
- images_subdir: str = ''
- indoor: bool
- load_aruco_labels: bool
- load_semantics: bool
- load_soc: bool
- load_text_descriptions: bool
- load_text_labels: bool
- max_distance_soc: float
- property negatives_mask: Tensor
Boolean mask of negative samples for each element in the dataset.
- property nonnegative_index: List[Tensor]
List of indexes of non-negative samples for each element in the dataset.
- pointcloud_transform: DefaultCloudTransform
- property positives_index: List[Tensor]
List of indexes of positive samples for each element in the dataset.
- property positives_mask: Tensor
Boolean mask of positive samples for each element in the dataset.
- semantic_subdir: str = 'masks'
- sensors: Tuple[str, ...]
- sensors_cfg: OmegaConf
- soc_coords_type: Literal['cylindrical_3d', 'cylindrical_2d', 'euclidean', 'spherical'] = 'cylindrical_3d'
- subset: Literal['train', 'val', 'test']
- test_split: list = None
- text_descriptions_subdir: str = 'text_descriptions'
- text_labels_subdir: str = 'text_labels'
- top_k_soc: int
- train_split: list = None
- vis_dir: str = './vis/'
opr.datasets.nclt
NCLT dataset implementation.
- class opr.datasets.nclt.NCLTDataset(dataset_root: str | Path, subset: Literal['train', 'val', 'test'], data_to_load: str | Tuple[str, ...], positive_threshold: float = 10.0, negative_threshold: float = 50.0, images_dirname: str = 'images_small', masks_dirname: str = 'segmentation_masks_small', pointclouds_dirname: str = 'velodyne_data', use_minkowski: bool = True, pointcloud_quantization_size: float | Tuple[float, float, float] | None = 0.5, max_point_distance: float | None = None, normalize_point_cloud: bool = False, num_points_sample: int | None = None, spherical_coords: bool = False, use_intensity_values: bool = False, image_transform: Any | None = None, semantic_transform: Any | None = None, pointcloud_transform: Any | None = None, pointcloud_set_transform: Any | None = None, load_soc: bool = False, top_k_soc: int = 10, soc_coords_type: Literal['cylindrical_3d', 'cylindrical_2d', 'euclidean', 'spherical'] = 'euclidean', max_distance_soc: float = 50.0, anno: OmegaConf | None = None, exclude_dynamic: bool = False, dynamic_labels: list | None = None)[source]
Bases: BasePlaceRecognitionDataset
NCLT dataset implementation.
- augment_coords_with_normal(coords: ndarray, mean: Tuple[float, float, float] = (0.0, 0.0, 0.0), std: Tuple[float, float, float] = (1.0, 1.0, 1.0)) ndarray[source]
Augment the coordinates with a random normal distribution.
- Parameters:
coords (np.ndarray) – The coordinates to be augmented.
mean (Tuple[float, float, float]) – The mean of the normal distribution. Defaults to (0.0, 0.0, 0.0).
std (Tuple[float, float, float]) – The standard deviation of the normal distribution. Defaults to (1.0, 1.0, 1.0).
- Returns:
The augmented coordinates.
- Return type:
np.ndarray
- augment_coords_with_rotation(coords: ndarray, angle_range: Tuple = (-3.141592653589793, 3.141592653589793)) ndarray[source]
Augment the coordinates with a random rotation.
All objects are rotated by the same angle, drawn uniformly at random from angle_range.
- Parameters:
coords (np.ndarray) – The coordinates to be augmented.
angle_range (Tuple) – The range of the random rotation angle. Defaults to (-np.pi, np.pi).
- Returns:
The augmented coordinates.
- Return type:
np.ndarray
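The two coordinate augmentations documented for the dataset classes can be sketched as follows. This is an illustration only: it assumes the rotation is about the vertical (z) axis and that the noise is added independently per axis. The function names and the explicit `rng` argument are hypothetical and not part of the documented signatures.

```python
import numpy as np


def rotate_about_z_sketch(coords, rng, angle_range=(-np.pi, np.pi)):
    """Rotate every point by one shared angle drawn uniformly from angle_range."""
    theta = rng.uniform(*angle_range)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    # Works for any array whose last axis holds (x, y, z).
    return coords @ rot.T


def add_normal_noise_sketch(coords, rng, mean=(0.0, 0.0, 0.0), std=(1.0, 1.0, 1.0)):
    """Add per-axis Gaussian noise to the coordinates."""
    return coords + rng.normal(loc=mean, scale=std, size=coords.shape)


rng = np.random.default_rng(0)
coords = rng.uniform(-10.0, 10.0, size=(4, 5, 3))
rotated = rotate_about_z_sketch(coords, rng)
noisy = add_normal_noise_sketch(rotated, rng, std=(0.1, 0.1, 0.05))
```

Because a single angle is shared by all points, relative distances between objects are preserved by the rotation; only the added noise perturbs them.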
opr.datasets.oxford
PointNetVLAD Oxford RobotCar dataset implementation.
- class opr.datasets.oxford.OxfordDataset(dataset_root: str | Path, subset: Literal['train', 'val', 'test'], data_to_load: str | Tuple[str, ...], positive_threshold: float = 10.0, negative_threshold: float = 50.0, images_dirname: str = 'images_small', masks_dirname: str = 'segmentation_masks_small', pointclouds_dirname: str | None = None, pointcloud_quantization_size: float | Tuple[float, float, float] | None = 0.01, max_point_distance: float | None = None, spherical_coords: bool = False, image_transform: Any | None = None, semantic_transform: Any | None = None, pointcloud_transform: Any | None = None, pointcloud_set_transform: Any | None = None)[source]
Bases: BasePlaceRecognitionDataset
PointNetVLAD Oxford RobotCar dataset implementation.
opr.datasets.projection
Projection of pointcloud to camera image plane.
- class opr.datasets.projection.NCLTProjector(front=True)[source]
Bases: object
- adjust_points(projected_points, center_crop_size=(960, 768), resize_size=(320, 256), original_size=(1616, 1232))[source]
Adjust 3D LiDAR points projected onto the image plane to correspond to the new image after center cropping and resizing.
- Parameters:
projected_points (np.ndarray) – array of shape (N, 2) with the 2D points on the original image.
center_crop_size (Tuple) – (W_c, H_c), the width and height of the crop.
resize_size (Tuple) – (W_r, H_r), the width and height of the resized image.
original_size (Tuple) – (W, H), the width and height of the original image.
- Returns:
adjusted 2D points on the resized image, with shape (N, 2).
- Return type:
np.ndarray
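The crop-then-resize adjustment can be written as a shift followed by a scale. The sketch below assumes the crop is centered exactly and that points are given as (x, y) pixel coordinates; `adjust_points_sketch` is a hypothetical standalone function, not the method itself.

```python
import numpy as np


def adjust_points_sketch(
    projected_points,
    center_crop_size=(960, 768),
    resize_size=(320, 256),
    original_size=(1616, 1232),
):
    """Map pixel coordinates from the original image to the cropped, resized one."""
    w, h = original_size
    w_c, h_c = center_crop_size
    w_r, h_r = resize_size
    # Top-left corner of the centered crop in original-image pixels.
    offset = np.array([(w - w_c) / 2.0, (h - h_c) / 2.0])
    # Per-axis scale from crop size to resized size.
    scale = np.array([w_r / w_c, h_r / h_c])
    return (np.asarray(projected_points, dtype=float) - offset) * scale


points = np.array([[808.0, 616.0], [328.0, 232.0]])
adjusted = adjust_points_sketch(points)
```

With the default sizes, the center of the original image (808, 616) lands on the center of the resized image (160, 128), and the crop's top-left corner maps to (0, 0).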
- class opr.datasets.projection.Projector(cam_cfg: OmegaConf, lidar_cfg: OmegaConf)[source]
Bases: object
Class for projecting pointcloud to camera image plane.
- build_matrix(x: float, y: float, z: float, q: float) ndarray[source]
Build rotation matrix from quaternion.
- Parameters:
x (float) – x coordinate
y (float) – y coordinate
z (float) – z coordinate
q (float) – quaternion
- Returns:
rotation matrix
- Return type:
np.ndarray
- project_scan_to_camera(points: ndarray, return_mask: bool | None = True) Tuple[ndarray, ndarray, ndarray] | Tuple[ndarray, ndarray][source]
Project pointcloud to camera image plane.
- Parameters:
points (np.ndarray) – pointcloud to project
return_mask (bool, optional) – whether to return mask. Defaults to True.
- Returns:
if return_mask: (uv, depths, in_image) else: (uv, depths)
- Return type:
Union[Tuple[np.ndarray, np.ndarray, np.ndarray], Tuple[np.ndarray, np.ndarray]]
- Raises:
ValueError – if wrong shape of points array
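For orientation, a pinhole projection that returns the same kind of triple (uv, depths, in_image) might look like the sketch below. The intrinsic matrix `K`, the 4x4 lidar-to-camera transform `T_cam_lidar` and the `image_size` argument are hypothetical inputs chosen for the illustration; the real class reads its calibration from `cam_cfg` and `lidar_cfg`, and lens distortion is ignored here.

```python
import numpy as np


def project_points_sketch(points, K, T_cam_lidar, image_size):
    """Project (N, 3) lidar points to pixels with a plain pinhole model."""
    points = np.asarray(points, dtype=float)
    if points.ndim != 2 or points.shape[1] != 3:
        raise ValueError("points must have shape (N, 3)")
    # Move the points into the camera frame using homogeneous coordinates.
    homo = np.hstack([points, np.ones((len(points), 1))])
    cam = (T_cam_lidar @ homo.T).T[:, :3]
    depths = cam[:, 2]
    # Apply the intrinsics and divide by depth to get pixel coordinates.
    pix = (K @ cam.T).T
    uv = pix[:, :2] / pix[:, 2:3]
    w, h = image_size
    in_image = (
        (depths > 0)
        & (uv[:, 0] >= 0) & (uv[:, 0] < w)
        & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    )
    return uv, depths, in_image


K = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
T_cam_lidar = np.eye(4)
points = np.array([[0.0, 0.0, 2.0], [0.0, 0.0, -1.0], [10.0, 0.0, 1.0]])
uv, depths, in_image = project_points_sketch(points, K, T_cam_lidar, (100, 100))
```

The mask combines a positive-depth check with an image-bounds check, so points behind the camera are rejected even when their pixel coordinates happen to fall inside the frame.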
opr.datasets.soc_utils
Utility functions for Semantic-Object-Context modality.
- opr.datasets.soc_utils.cylindrical_to_euclidean(points: ndarray) ndarray[source]
Convert cylindrical coordinates to euclidean.
- Parameters:
points (np.ndarray) – array of cylindrical coordinates with shape (n, 3).
- Returns:
array of euclidean coordinates with shape (n, 3).
- Return type:
points (np.ndarray)
- opr.datasets.soc_utils.euclidean_to_cylindrical(points: ndarray, to_2d: bool = False) ndarray[source]
Convert euclidean coordinates to cylindrical.
- Parameters:
points (np.ndarray) – array of 3D coordinates with shape (n, 3).
to_2d (bool, optional) – whether to return 2D cylindrical coordinates. Defaults to False.
- Returns:
array of cylindrical coordinates with shape (n, 3) or (n, 2) if to_2d is True.
- Return type:
points (np.ndarray)
- opr.datasets.soc_utils.euclidean_to_spherical(points: ndarray) ndarray[source]
Convert euclidean coordinates to spherical.
- Parameters:
points (np.ndarray) – array of 3D coordinates with shape (n, 3).
- Returns:
array of spherical coordinates with shape (n, 3).
- Return type:
points (np.ndarray)
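The coordinate conversions above follow standard formulas, sketched below with NumPy. The column order, (rho, phi, z) for cylindrical and (r, theta, phi) for spherical with theta measured from the +z axis, is an assumption, since the reference does not state which convention the library uses. The `_sketch` suffix marks these as illustrative functions rather than the library's own.

```python
import numpy as np


def euclidean_to_cylindrical_sketch(points, to_2d=False):
    """(x, y, z) -> (rho, phi, z), or (rho, phi) when to_2d is True."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rho = np.hypot(x, y)
    phi = np.arctan2(y, x)
    cols = [rho, phi] if to_2d else [rho, phi, z]
    return np.stack(cols, axis=1)


def cylindrical_to_euclidean_sketch(points):
    """(rho, phi, z) -> (x, y, z)."""
    rho, phi, z = points[:, 0], points[:, 1], points[:, 2]
    return np.stack([rho * np.cos(phi), rho * np.sin(phi), z], axis=1)


def euclidean_to_spherical_sketch(points):
    """(x, y, z) -> (r, theta, phi), with theta the polar angle from +z."""
    points = np.asarray(points, dtype=float)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    # Guard against division by zero for points at the origin.
    cos_theta = np.divide(z, r, out=np.zeros_like(r), where=r > 0)
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    phi = np.arctan2(y, x)
    return np.stack([r, theta, phi], axis=1)


pts = np.array([[1.0, 0.0, 2.0], [0.0, 3.0, -1.0], [-2.0, -2.0, 0.5]])
cyl = euclidean_to_cylindrical_sketch(pts)
back = cylindrical_to_euclidean_sketch(cyl)
sph = euclidean_to_spherical_sketch(pts)
```

A round trip through the cylindrical form should reproduce the input up to floating-point error, which makes a convenient sanity check when switching `soc_coords_type`.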
- opr.datasets.soc_utils.generate_color_sequence(num_colors: int, palette: str | None = 'husl') list[source]
Generate color sequence.
- Parameters:
num_colors (int) – number of colors to generate
palette (str, optional) – palette to use. Defaults to “husl”.
- Returns:
list of colors in RGB format.
- Return type:
colors (list)
- opr.datasets.soc_utils.get_points_labels_by_mask(points: ndarray, mask: ndarray) ndarray[source]
Get point labels from semantic mask.
- Parameters:
points (np.ndarray) – array of 2D coordinates of projected points with shape (n, 2). Coordinates should match cam_resolution.
mask (np.ndarray) – semantic mask in opencv image format (ndarray)
- Returns:
point labels taken from the mask.
- Return type:
labels (np.ndarray)
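Looking up a label for each projected point amounts to indexing the mask at the point's pixel. The sketch below assumes points are stored as (u, v), that is (column, row), that coordinates are rounded to the nearest pixel, and that out-of-range values are clipped to the image border. These are illustrative choices, not documented behaviour, and the function name is hypothetical.

```python
import numpy as np


def get_points_labels_by_mask_sketch(points, mask):
    """Return mask values at the (u, v) pixel positions of the given points."""
    uv = np.rint(points).astype(int)
    # Clip so that points on or just past the border still index a valid pixel.
    u = np.clip(uv[:, 0], 0, mask.shape[1] - 1)
    v = np.clip(uv[:, 1], 0, mask.shape[0] - 1)
    # Image arrays are indexed as [row, column], hence [v, u].
    return mask[v, u]


mask = np.arange(12).reshape(3, 4)  # toy 3x4 "semantic mask"
points = np.array([[0.0, 0.0], [3.0, 2.0], [1.2, 0.9]])
labels = get_points_labels_by_mask_sketch(points, mask)
```

The index order is the usual pitfall here: a point at horizontal position u and vertical position v is read as `mask[v, u]`.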
- opr.datasets.soc_utils.instance_masks_to_objects(instance_masks: dict, points_2d: ndarray, point_labels: ndarray, points_3d: ndarray) dict[source]
Get objects from instance masks.
- Parameters:
instance_masks (dict) – dict of instances with keys as instance labels and values as instance masks.
points_2d (np.ndarray) – 2d points of pointcloud projected to image plane
point_labels (np.ndarray) – labels of points
points_3d (np.ndarray) – 3d points of pointcloud
- Returns:
dict of objects with keys as object labels and values as object properties.
- Return type:
objects (dict)
- opr.datasets.soc_utils.pack_objects(objects: dict, top_k: int, max_distance: float, special_classes: list) ndarray[source]
Pack objects into a single array.
- Parameters:
objects (dict) – dict of objects with keys as object labels and values as object properties.
top_k (int) – maximum number of each class objects to pack
max_distance (float) – maximum distance between objects
special_classes (list) – list of special classes to pack
- Returns:
array of packed objects with shape (N, K, 3), where N - number of classes, K - number of objects of each class, 3 - 3DoF coords.
- Return type:
packed_objects (np.ndarray)
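The (N, K, 3) output layout can be illustrated with a small packing sketch. Several things here are assumptions made only for the example: `objects` is taken to map a class label to a list of 3D centroids, `max_distance` is interpreted as a cutoff on distance from the sensor origin, the nearest objects are kept first, and unused slots are zero-padded. The real structure of the object properties and the exact filtering rule may differ.

```python
import numpy as np


def pack_objects_sketch(objects, top_k, max_distance, special_classes):
    """Pack up to top_k nearby centroids per class into an (N, K, 3) array."""
    packed = np.zeros((len(special_classes), top_k, 3))
    for i, cls in enumerate(special_classes):
        centroids = np.asarray(objects.get(cls, []), dtype=float).reshape(-1, 3)
        dist = np.linalg.norm(centroids, axis=1)
        # Keep objects within range, nearest first, at most top_k of them.
        order = np.argsort(dist)
        order = order[dist[order] <= max_distance][:top_k]
        packed[i, : len(order)] = centroids[order]
    return packed


objects = {
    1: [[1.0, 0.0, 0.0], [100.0, 0.0, 0.0], [0.5, 0.0, 0.0]],
    2: [],
}
packed = pack_objects_sketch(objects, top_k=2, max_distance=50.0, special_classes=[1, 2, 3])
```

A fixed-size array like this can be batched directly, at the cost of padding for classes with fewer than K visible objects.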
- opr.datasets.soc_utils.semantic_mask_to_instances(mask: ndarray, area_threshold: int | None = 10, labels_whitelist: list | None = None) dict[source]
Get instance labels from semantic mask.
Instances are defined as connected components of the same class. Connected components are found class-wise using the OpenCV connectedComponentsWithStats algorithm.
- Parameters:
mask (ndarray) – semantic mask in opencv image format (ndarray)
area_threshold (int, optional) – minimum area of instance to be considered. Defaults to 10.
labels_whitelist (list, optional) – list of labels to consider. Defaults to None.
- Returns:
dict of instances with keys as instance labels and values as instance masks.
- Return type:
instances (dict)