opr.models.place_recognition package
Module for Place Recognition models.
opr.models.place_recognition.apgem
Implementation of APGeM Image Model.
- class opr.models.place_recognition.apgem.APGeMModel(backbone: Literal['resnet18', 'resnet50', 'vgg16'] = 'resnet50')
Bases: ImageModel
APGeM: ‘Learning with Average Precision: Training Image Retrieval with a Listwise Loss’.
opr.models.place_recognition.base
Base meta-models for Place Recognition.
- class opr.models.place_recognition.base.CloudModel(backbone: Module, head: Module)
Bases: Module
Meta-model for lidar-based Place Recognition. Combines feature extraction backbone and head modules.
- forward(batch: Dict[str, Tensor]) → Dict[str, Tensor]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
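The note on calling the instance rather than forward() is easier to see with a concrete hook. The sketch below uses a small hypothetical class, TinyModule, as a dependency-free stand-in for torch.nn.Module. It mimics only forward-hook dispatch, and the batch key "x" and the output key "final_descriptor" are illustrative assumptions, not names confirmed by this documentation.

```python
from typing import Callable, Dict, List


class TinyModule:
    """Hypothetical stand-in for torch.nn.Module; models forward hooks only."""

    def __init__(self) -> None:
        self._forward_hooks: List[Callable] = []

    def register_forward_hook(self, hook: Callable) -> None:
        self._forward_hooks.append(hook)

    def forward(self, batch: Dict[str, float]) -> Dict[str, float]:
        # The "recipe": subclasses define the actual computation here.
        return {"final_descriptor": batch["x"] * 2.0}

    def __call__(self, batch: Dict[str, float]) -> Dict[str, float]:
        # Calling the instance runs forward() and then every registered hook.
        output = self.forward(batch)
        for hook in self._forward_hooks:
            hook(self, batch, output)
        return output


hook_calls: List[Dict[str, float]] = []
model = TinyModule()
model.register_forward_hook(lambda module, inputs, output: hook_calls.append(output))

via_call = model({"x": 1.5})             # forward() plus the hook
via_forward = model.forward({"x": 1.5})  # forward() only; the hook is skipped
```

Both calls return the same dictionary, but only the first one triggers the hook, which is why the models in this package should be invoked as model(batch).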
- class opr.models.place_recognition.base.ImageModel(backbone: Module, head: Module, fusion: Module | None = None, forward_type: str | None = 'fp32', onnx_model_path: str | None = None, engine_path: str | None = None)
Bases: Module
Meta-model for image-based Place Recognition. Combines feature extraction backbone and head modules.
- forward(batch: Dict[str, Tensor]) → Dict[str, Tensor]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class opr.models.place_recognition.base.LateFusionModel(image_module: ImageModel | None = None, semantic_module: SemanticModel | None = None, cloud_module: CloudModel | None = None, soc_module: Module | None = None, fusion_module: Module | None = None)
Bases: Module
Meta-model for multimodal Place Recognition architectures with late fusion.
- forward(batch: Dict[str, Tensor]) → Dict[str, Tensor]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
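As a rough illustration of what late fusion means here, each modality produces its own descriptor and the descriptors are combined only at the end. The sketch below is not the library's fusion_module. The helper fuse_descriptors and the modality keys are hypothetical, plain lists stand in for tensors, and the "add" mode is an assumption included for contrast with concatenation.

```python
from typing import Dict, List

Vector = List[float]


def fuse_descriptors(descriptors: Dict[str, Vector], fusion_type: str = "concat") -> Vector:
    """Combine per-modality descriptors into one descriptor (illustrative only)."""
    vectors = list(descriptors.values())
    if not vectors:
        raise ValueError("at least one modality descriptor is required")
    if fusion_type == "concat":
        # Output length is the sum of the individual descriptor lengths.
        return [value for vector in vectors for value in vector]
    if fusion_type == "add":
        # Element-wise sum; all descriptors must share one length.
        if len({len(vector) for vector in vectors}) != 1:
            raise ValueError("'add' fusion needs descriptors of equal length")
        return [sum(values) for values in zip(*vectors)]
    raise ValueError(f"unknown fusion_type: {fusion_type!r}")


# Hypothetical per-modality descriptors, as if produced by separate sub-models.
modalities = {"image": [1.0, 2.0], "cloud": [3.0, 4.0]}
fused_concat = fuse_descriptors(modalities, "concat")
fused_add = fuse_descriptors(modalities, "add")
```

Concatenation keeps every modality's features separate at the cost of a longer descriptor, while addition keeps the descriptor size fixed.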
- class opr.models.place_recognition.base.SemanticModel(backbone: Module, head: Module, fusion: Module | None = None, forward_type: str | None = 'fp32', onnx_model_path: str | None = None, engine_path: str | None = None)
Bases: ImageModel
Meta-model for semantic-based Place Recognition. Combines feature extraction backbone and head modules.
- forward(batch: Dict[str, Tensor]) → Dict[str, Tensor]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class opr.models.place_recognition.base.SequenceLateFusionModel(late_fusion_model: LateFusionModel, temporal_fusion_module: Module | None = None)
Bases: Module
Meta-model for sequence-based multimodal Place Recognition with late fusion.
- forward(batch: Dict[str, Tensor]) → Dict[str, Tensor]
Process a sequence of frames efficiently by reshaping to batch processing.
- Parameters:
batch – Dictionary containing sequence data with shape [B, S, …] where B is batch size and S is sequence length
- Returns:
Dictionary with the final descriptor after temporal fusion
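The reshaping described above can be sketched without tensors: a [B, S, …] input is flattened to [B*S, …], passed through the per-frame model in a single call, regrouped to [B, S, D], and fused over S. Everything named here (process_sequences, toy_frame_model) is hypothetical, and the mean over the sequence axis is only an assumed placeholder for whatever temporal_fusion_module does.

```python
from typing import Callable, List

Frame = List[float]       # stand-in for one frame's data
Descriptor = List[float]  # stand-in for one frame's descriptor


def process_sequences(
    batch: List[List[Frame]],
    frame_model: Callable[[List[Frame]], List[Descriptor]],
) -> List[Descriptor]:
    """Flatten [B, S, ...] to [B*S, ...], run the model once, then fuse over S."""
    batch_size = len(batch)
    seq_len = len(batch[0])
    # [B, S, ...] -> [B*S, ...]: every frame becomes an independent batch item.
    flat_frames = [frame for sequence in batch for frame in sequence]
    flat_descriptors = frame_model(flat_frames)
    # [B*S, D] -> [B, S, D]
    per_sequence = [
        flat_descriptors[b * seq_len:(b + 1) * seq_len] for b in range(batch_size)
    ]
    # Temporal fusion: a plain mean over S (an assumption for illustration).
    return [
        [sum(column) / seq_len for column in zip(*descriptors)]
        for descriptors in per_sequence
    ]


def toy_frame_model(frames: List[Frame]) -> List[Descriptor]:
    # Hypothetical per-frame model: descriptor = [sum, max] of the frame values.
    return [[sum(frame), max(frame)] for frame in frames]


batch = [
    [[1.0, 2.0], [3.0, 4.0]],  # sequence 0, S = 2
    [[0.0, 0.0], [2.0, 6.0]],  # sequence 1, S = 2
]
sequence_descriptors = process_sequences(batch, toy_frame_model)
```

Running the per-frame model once on B*S items, instead of S times on B items, is what makes the batch-style processing efficient.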
opr.models.place_recognition.cosplace
Implementation of CosPlace model.
- class opr.models.place_recognition.cosplace.CosPlaceModel(backbone: Literal['resnet18', 'resnet50', 'vgg16'] = 'resnet50', out_dim: int = 256)
Bases: ImageModel
CosPlace: Rethinking Visual Geo-localization for Large-Scale Applications.
opr.models.place_recognition.minkloc
Implementations of MinkLoc models.
- class opr.models.place_recognition.minkloc.MinkLoc3D(in_channels: int = 1, out_channels: int = 256, num_top_down: int = 1, conv0_kernel_size: int = 5, block: str = 'BasicBlock', layers: Tuple[int, ...] = (1, 1, 1), planes: Tuple[int, ...] = (32, 64, 64), pooling: str = 'gem')
Bases: CloudModel
MinkLoc3D: Point Cloud Based Large-Scale Place Recognition.
Paper: https://arxiv.org/abs/2011.04530
Code is adapted from the original repository: https://github.com/jac99/MinkLoc3Dv2, MIT License
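Several constructors in this package default to pooling='gem'. Assuming this refers to generalized-mean (GeM) pooling, the idea can be sketched on a flat list of activations: raise each value to the power p, average, and take the p-th root. The function name gem_pool and the defaults p=3.0 and eps=1e-6 are illustrative assumptions, not values taken from this library.

```python
from typing import Sequence


def gem_pool(values: Sequence[float], p: float = 3.0, eps: float = 1e-6) -> float:
    """Generalized-mean pooling of one feature channel (illustrative sketch)."""
    if not values:
        raise ValueError("cannot pool an empty sequence")
    # Clamp to a small positive value so fractional powers stay well defined.
    clamped = [max(value, eps) for value in values]
    return (sum(value ** p for value in clamped) / len(clamped)) ** (1.0 / p)


activations = [1.0, 2.0, 3.0, 4.0]
average = gem_pool(activations, p=1.0)    # p = 1 reduces to average pooling
near_max = gem_pool(activations, p=50.0)  # large p approaches max pooling
```

The exponent p therefore interpolates between average pooling and max pooling.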
- class opr.models.place_recognition.minkloc.MinkLoc3Dv2(in_channels: int = 1, out_channels: int = 256, num_top_down: int = 2, conv0_kernel_size: int = 5, block: str = 'ECABasicBlock', layers: Tuple[int, ...] = (1, 1, 1, 1), planes: Tuple[int, ...] = (64, 128, 64, 32), pooling: str = 'gem')
Bases: MinkLoc3D
Improving Point Cloud Based Place Recognition with Ranking-based Loss and Large Batch Training.
Paper: https://arxiv.org/abs/2203.00972
Code is adapted from the original repository: https://github.com/jac99/MinkLoc3Dv2, MIT License
- class opr.models.place_recognition.minkloc.MinkLocMultimodal(lidar_in_channels: int = 1, lidar_out_channels: int = 256, lidar_num_top_down: int = 2, lidar_conv0_kernel_size: int = 5, lidar_block: str = 'ECABasicBlock', lidar_layers: Tuple[int, ...] = (1, 1, 1, 1), lidar_planes: Tuple[int, ...] = (64, 128, 64, 32), lidar_pooling: str = 'gem', image_in_channels: int = 3, image_out_channels: int = 256, image_num_top_down: int = 0, image_pooling: str = 'gem', image_pretrained: bool = True, fusion_type: str = 'concat')
Bases: LateFusionModel
MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition.
Paper: https://arxiv.org/pdf/2104.05327.pdf
Code is adapted from the original repository: https://github.com/jac99/MinkLocMultimodal, MIT License
opr.models.place_recognition.netvlad
Implementation of NetVLAD model.
- class opr.models.place_recognition.netvlad.NetVLADModel(backbone: Literal['resnet18', 'resnet50', 'vgg16'] = 'resnet18', num_clusters: int = 64, normalize_input: bool = True, vladv2: bool = False)
Bases: ImageModel
NetVLAD: CNN architecture for weakly supervised place recognition.
Paper: https://arxiv.org/abs/1511.07247v3
Code is adapted from the repository: https://github.com/Nanne/pytorch-NetVlad
opr.models.place_recognition.overlaptransformer
Implementation of OverlapTransformer model.
- class opr.models.place_recognition.overlaptransformer.OverlapTransformer(height: int = 64, width: int = 900, channels: int = 1, norm_layer: Module | None = None, use_transformer: bool = True)
Bases: Module
OverlapTransformer: An Efficient and Yaw-Angle-Invariant Transformer Network for LiDAR-Based Place Recognition.
Paper: https://arxiv.org/abs/2203.03397
Adapted from original repository: https://github.com/haomo-ai/OverlapTransformer
- forward(batch: dict[str, Tensor]) → dict[str, Tensor]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- relu
MHSA num_layers=1 is suggested in our work.
opr.models.place_recognition.patchnetvlad
Implementation of PatchNetVLAD model.
- class opr.models.place_recognition.patchnetvlad.PatchNetVLAD(backbone: Literal['resnet18', 'resnet50', 'vgg16'] = 'vgg16', num_clusters: int = 64, normalize_input: bool = True, vladv2: bool = False, use_faiss: bool = True, patch_sizes: tuple[int] = (4,), strides: tuple[int] = (1,))
Bases: ImageModel
Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition.
Paper: https://arxiv.org/abs/2103.01486
Code is adapted from the original repository: https://github.com/QVPR/Patch-NetVLAD
- forward(batch: dict[str, Tensor]) → dict[str, Tensor]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- opr.models.place_recognition.patchnetvlad.get_integral_feature(feat_in: Tensor) → Tensor
Input and output have shape [N, D, H, W], where N is the batch size and D is the descriptor dimension. For VLAD, D = K x d, where K is the number of clusters and d is the original descriptor dimension.
- opr.models.place_recognition.patchnetvlad.get_square_regions_from_integral(feat_integral: Tensor, patch_size: int, patch_stride: int) → Tensor
Input has shape [N, D, H+1, W+1], where the extra 1 on each of the last two axes is zero padding. patch_size and patch_stride are single values, since only square regions are currently implemented.
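The two helpers above follow the integral-image (summed-area table) idea: once cumulative sums are stored, the sum over any square patch needs only four lookups. The sketch below shows this for a single channel using plain lists. The names integral_feature and square_region_means are hypothetical, and two details are assumptions rather than facts taken from this documentation: the zero padding is placed as the first row and column, and each patch sum is divided by the patch area to give a mean.

```python
from typing import List

Grid = List[List[float]]


def integral_feature(feat: Grid) -> Grid:
    """Build an (H+1) x (W+1) summed-area table with a zero first row and column."""
    height, width = len(feat), len(feat[0])
    integral = [[0.0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            integral[y + 1][x + 1] = (
                feat[y][x] + integral[y][x + 1] + integral[y + 1][x] - integral[y][x]
            )
    return integral


def square_region_means(integral: Grid, patch_size: int, patch_stride: int) -> Grid:
    """Mean over each patch_size x patch_size window, via four table lookups."""
    height, width = len(integral) - 1, len(integral[0]) - 1
    area = float(patch_size ** 2)
    regions: Grid = []
    for top in range(0, height - patch_size + 1, patch_stride):
        bottom = top + patch_size
        row = []
        for left in range(0, width - patch_size + 1, patch_stride):
            right = left + patch_size
            total = (
                integral[bottom][right] - integral[top][right]
                - integral[bottom][left] + integral[top][left]
            )
            row.append(total / area)
        regions.append(row)
    return regions


feature_map = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
table = integral_feature(feature_map)
patch_means = square_region_means(table, patch_size=2, patch_stride=1)
```

With the table precomputed, the cost per patch is constant regardless of patch_size, which matters when several patch sizes are evaluated over the same feature map.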
opr.models.place_recognition.pointnetvlad
Implementation of PointNetVLAD model.
- class opr.models.place_recognition.pointnetvlad.Flatten
Bases: Module
Flatten layer.
- forward(input: Tensor) → Tensor
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class opr.models.place_recognition.pointnetvlad.GatingContext(dim: int, add_batch_norm: bool = True)
Bases: Module
Gating context layer.
- forward(x: Tensor) → Tensor
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class opr.models.place_recognition.pointnetvlad.NetVLADLoupe(feature_size: int, max_samples: int, cluster_size: int, output_dim: int, gating: bool = True, add_batch_norm: bool = True, is_training: bool = True)
Bases: Module
NetVLAD aggregation layer with gating mechanism.
- forward(x: Tensor) → Tensor
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class opr.models.place_recognition.pointnetvlad.PointNetFeat(num_points: int = 2500, global_feat: bool = True, feature_transform: bool = False, max_pool: bool = True)
Bases: Module
PointNet feature extractor.
- forward(x: Tensor) → Tensor
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class opr.models.place_recognition.pointnetvlad.PointNetVLAD(num_points: int = 2500, global_feat: bool = True, feature_transform: bool = False, max_pool: bool = False, output_dim: int = 1024)
Bases: Module
PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition.
Paper: https://arxiv.org/abs/1804.03492
Original repository: https://github.com/mikacuy/pointnetvlad
Code is adapted from the repository: https://github.com/cattaneod/PointNetVlad-Pytorch
- forward(batch: dict[str, Tensor]) → dict[str, Tensor]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class opr.models.place_recognition.pointnetvlad.STN3d(num_points: int = 2500, k: int = 3, use_bn: bool = True)
Bases: Module
Spatial Transformer Network for 3D data.
- forward(x: Tensor) → Tensor
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
opr.models.place_recognition.resnet
ResNet image models for Place Recognition.
- class opr.models.place_recognition.resnet.ResNet18(in_channels: int = 3, out_channels: int = 256, num_top_down: int = 0, pooling: str = 'gem', pretrained: bool = True)
Bases: ImageModel
ResNet18 image model for Place Recognition.
- class opr.models.place_recognition.resnet.SemanticResNet18(in_channels: int = 1, out_channels: int = 256, num_top_down: int = 0, pooling: str = 'gem')
Bases: SemanticModel
ResNet18 semantic mask model for Place Recognition.
opr.models.place_recognition.soc
Semantic-Object-Context modality model.
- class opr.models.place_recognition.soc.SOCMLP(num_classes: int, num_objects: int, embeddings_size: int | None = 256)
Bases: SOCModel
Semantic-Object-Context modality model.
- class opr.models.place_recognition.soc.SOCMLPMixer(num_classes: int, num_objects: int, patch_size: int = 1, hidden_dim: int = 64, depth: int = 3, embeddings_size: int = 256)
Bases: SOCModel
Semantic-Object-Context modality model based on MLP-Mixer.
A kind of attention layer built on top of MLPs.
Original paper: https://arxiv.org/abs/2105.01601
Implementation: https://github.com/lucidrains/mlp-mixer-pytorch
- class opr.models.place_recognition.soc.SOCMLPMixerModel(model, forward_type='fp32')
Bases: Module
- forward(batch)
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
opr.models.place_recognition.svtnet
SVT-Net: Super Light-Weight Sparse Voxel Transformer for Large Scale Place Recognition.
- Citation:
Fan, Zhaoxin, et al. “Svt-net: Super light-weight sparse voxel transformer for large scale place recognition.” Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 36. No. 1. 2022.
Source: https://github.com/ZhenboSong/SVTNet
Paper: https://arxiv.org/abs/2105.00149
- class opr.models.place_recognition.svtnet.SVTNet(in_channels: int = 1, out_channels: int = 256, conv0_kernel_size: int = 5, block: str = 'ECABasicBlock', asvt: bool = True, csvt: bool = True, layers: tuple[int, ...] = (1, 1, 1), planes: tuple[int, ...] = (32, 64, 64), pooling: str = 'gem')
Bases: CloudModel
SVT-Net: Super Light-Weight Sparse Voxel Transformer for Large Scale Place Recognition.
- Citation:
Fan, Zhaoxin, et al. “Svt-net: Super light-weight sparse voxel transformer for large scale place recognition.” Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 36. No. 1. 2022.
Source: https://github.com/ZhenboSong/SVTNet
Paper: https://arxiv.org/abs/2105.00149