Table 18. Literature Analysis: Camera-LiDAR Fusion-based 3D Object Recognition Methods.
| Model | Detector Category | Environment | Scenario | Fusion Level | Advantage(s) | Limitation(s) |
|---|---|---|---|---|---|---|
| MV3D [73] | Two-stage | Outdoor | Multi-view feature fusion and 3D object proposal generation | Early, Late, Deep | Introduces a deep fusion scheme that combines region-wise features from the bird's-eye and front views, enabling interaction between modalities | Low LiDAR point density prevents detection of distant objects that the camera still captures; the BEV-based region proposal network limits recognition; detects cars only |
| BEVLFVC [74] | One-stage | Outdoor | Fusion of the LiDAR point cloud and camera-captured images within a CNN | Middle | Exploits and fuses the whole feature map, in contrast to previous fusion-based networks; generates high-quality proposals through fusion while the fast one-stage design keeps inference speed high | Lacks a strong LiDAR input representation; detects pedestrians only |
| D3PD [75] | Two-stage | Outdoor | 3D person detection in automotive scenes | Early, Late, Deep | Performs end-to-end learning on camera-LiDAR data and yields a high-level sensor data representation | Depends on ground-plane estimation to generate 3D anchor proposals |
| MVX-Net [76] | One-stage | Outdoor | Integration of RGB and point-cloud modalities | Early, Middle | Reduces false positives and false negatives through effective multi-modal fusion | Does not provide a multi-class detection network |
| SharedNet [77] | One-stage | Outdoor | LiDAR-camera 3D object detection with a single neural network for autonomous vehicles | Early, Middle | Achieves a good balance between accuracy and efficiency; reduces memory requirements and model training time | Slightly inferior performance on car detection |
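To make the fusion-level column concrete, the sketch below contrasts early, middle, and late fusion in a toy camera-LiDAR detector. It is a minimal illustration only: the encoders, the 7-channel output head, and the assumption that the LiDAR bird's-eye-view raster is pre-aligned with the camera image grid are all hypothetical simplifications, not the architectures of MV3D [73], MVX-Net [76], or the other surveyed networks.

```python
# Minimal PyTorch-style sketch of the fusion levels named in Table 18.
# All modules and tensor shapes are hypothetical placeholders.
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Stand-in 2D encoder for either a camera image or a LiDAR BEV raster."""
    def __init__(self, in_ch, out_ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(),
        )
    def forward(self, x):
        return self.net(x)

class FusionDetector(nn.Module):
    def __init__(self, level="middle"):
        super().__init__()
        self.level = level
        if level == "early":
            # Early fusion: concatenate raw inputs, run one shared encoder.
            self.encoder = TinyEncoder(in_ch=3 + 1)
            self.head = nn.Conv2d(32, 7, 1)   # e.g., per-cell box parameters
        elif level == "middle":
            # Middle fusion: separate encoders, fuse the feature maps.
            self.cam_enc = TinyEncoder(in_ch=3)
            self.lidar_enc = TinyEncoder(in_ch=1)
            self.head = nn.Conv2d(64, 7, 1)
        else:
            # Late fusion: independent detection branches, outputs combined.
            self.cam_enc, self.lidar_enc = TinyEncoder(3), TinyEncoder(1)
            self.cam_head = nn.Conv2d(32, 7, 1)
            self.lidar_head = nn.Conv2d(32, 7, 1)

    def forward(self, image, bev):
        # image: (B, 3, H, W) camera frame; bev: (B, 1, H, W) LiDAR BEV raster,
        # assumed here to be pre-aligned to a common grid for simplicity.
        if self.level == "early":
            return self.head(self.encoder(torch.cat([image, bev], dim=1)))
        if self.level == "middle":
            feats = torch.cat([self.cam_enc(image), self.lidar_enc(bev)], dim=1)
            return self.head(feats)
        # Late fusion: average the per-branch predictions.
        return 0.5 * (self.cam_head(self.cam_enc(image))
                      + self.lidar_head(self.lidar_enc(bev)))

image, bev = torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64)
for level in ("early", "middle", "late"):
    out = FusionDetector(level)(image, bev)
    print(level, tuple(out.shape))   # each prints (1, 7, 64, 64)
```

The deep fusion listed for MV3D [73] and D3PD [75] extends the middle-fusion case above by exchanging features between the two branches across multiple layers rather than fusing once.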