Table of Links

Abstract and I Introduction

II. Literature Review

III. Method of the Study

IV. Result and Discussion

V. Conclusion, Acknowledgment, and References

II. LITERATURE REVIEW

In recent years, object detection, particularly in applications related to fisheries and underwater environments, has seen remarkable progress.

A study [2] introduced a novel multitask model that combined the YOLOv5 architecture with a semantic segmentation head for real-time fish detection and segmentation. Experiments on the golden crucian carp dataset showed an object detection precision of 95.4% and a semantic segmentation accuracy of 98.5%. The model was also competitive on the PASCAL VOC 2007 dataset, achieving an object detection precision of 73.8% and a semantic segmentation accuracy of 84.3%. It was fast as well, reaching processing rates of 116.6 FPS and 120 FPS on an RTX3060. Adding the segmentation head extended YOLOv5's capabilities. The study followed a rigorous methodology, including model validation, selection, ablation experiments, and optimization, and compared its results with other models.

Researchers [3] addressed the challenge of detecting fish in dense groups and small underwater targets with the introduction of an enhanced CME-YOLOv5 network. This algorithm achieves a mean average precision (mAP@0.50) of 94.9%, surpassing the baseline YOLOv5 by 4.4 percentage points. Notably, it excels in detecting densely spaced fish and small targets, making it a promising tool for fishery resource investigation. The algorithm effectively mitigates issues related to missing detections in dense fish schools and enhances accuracy for small target fish with limited pixel and feature information. Moreover, it demonstrates proficiency in detecting small objects and effectively handling occluded and highly overlapping objects, outperforming YOLOv5 by detecting 49 additional objects and achieving a 24.6% increase in the detection ratio.
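To make the mAP@0.50 figure above concrete, the following is a minimal sketch of how average precision at an IoU threshold of 0.50 can be computed for a single class. It is an illustration under stated assumptions, not the evaluation code used in [3]: the function names `iou` and `average_precision`, the corner-format boxes, the greedy matching rule, and the all-point interpolation are all choices made here for brevity. mAP would then be the mean of this value over all classes.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(preds: List[Tuple[float, Box]],
                      truths: List[Box],
                      thr: float = 0.5) -> float:
    """AP for one class: area under the interpolated precision-recall curve.

    preds is a list of (confidence, box); truths is a list of ground-truth boxes.
    A prediction is a true positive if its best unmatched ground truth
    overlaps it with IoU >= thr (simplified greedy matching).
    """
    if not truths:
        return 0.0
    matched = set()
    tp_flags = []
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(truths):
            if j in matched:
                continue
            v = iou(box, gt)
            if v > best_iou:
                best_iou, best_j = v, j
        if best_iou >= thr:
            matched.add(best_j)
            tp_flags.append(True)
        else:
            tp_flags.append(False)

    # Running precision and recall after each prediction, in score order.
    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += int(flag)
        precisions.append(tp / k)
        recalls.append(tp / len(truths))

    # Interpolate: precision at a point is the max precision at or after it.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

The 0.50 threshold is lenient toward loosely localized boxes, which is one reason gains on small or densely packed targets show up clearly in this metric: a missed fish lowers recall directly, regardless of how well the remaining boxes are placed.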

Another study [4] focused on improving underwater object detection by evaluating several models, including EfficientDet, YOLOv5, YOLOv8, and Detectron2, on the "Brackish-Dataset" of annotated underwater images captured in Limfjorden. The project aimed to evaluate these models in terms of accuracy and inference time while proposing modifications to enhance EfficientDet's performance. The comparison showed that EfficientDet led with a mean Average Precision (mAP) of 98.56%, followed closely by YOLOv8 with 98.20% mAP, while YOLOv5 reached 97.6% mAP. Notably, the modified EfficientDet, which incorporated a modified BiSkFPN mechanism and adversarial noise handling, achieved an mAP of 98.63%, and adversarial learning also raised YOLOv5's accuracy to 98.04% mAP. The research also provided class activation map (CAM)-based explanations for the two models to promote explainability in black-box models.

To address inefficient and potentially damaging manual sorting methods, another study [5] introduced an automated, non-contact sorting approach for crayfish using an enhanced YOLOv5 algorithm. The proposed algorithm attained a mean Average Precision (mAP) of 98.8% while reducing image processing time to 2 ms, offering an accurate, efficient, and fast alternative to manual sorting. The shorter processing time raises the overall throughput of crayfish sorting operations.

A modified version of YOLOv5, called UTD-Yolov5, was introduced in another study [6] to identify the Crown of Thorns Starfish (COTS) in underwater images, a capability that matters for complex underwater operations. UTD-Yolov5 incorporates several modifications to the YOLOv5 network architecture, including a two-stage cascaded CSP backbone, a visual channel attention mechanism module, and a random anchor box similarity calculation method. Additionally, optimization methods such as Weighted Box Fusion (WBF) and iterative refinement are proposed to enhance network efficiency. Experiments on the CSIRO dataset showed that UTD-Yolov5 achieves an average accuracy of 78.54%, a significant improvement over the baseline.
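As a rough illustration of the box-fusion idea mentioned above, the sketch below merges overlapping detections by taking a confidence-weighted average of their coordinates instead of discarding all but the top-scoring box, as non-maximum suppression would. It is a simplified, single-class sketch and not the implementation used in [6]: the name `fuse_boxes`, the helper `_merge`, the 0.55 IoU threshold, and the use of the mean confidence as the fused score are assumptions made here.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]
Detection = Tuple[float, Box]  # (confidence, box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def _merge(cluster: List[Detection]) -> Detection:
    """Confidence-weighted average box; fused score is the mean confidence."""
    total = sum(score for score, _ in cluster)
    box = tuple(
        sum(score * b[k] for score, b in cluster) / total for k in range(4)
    )
    return (total / len(cluster), box)


def fuse_boxes(dets: List[Detection], iou_thr: float = 0.55) -> List[Detection]:
    """Cluster detections by IoU against the running fused boxes and merge them."""
    clusters: List[List[Detection]] = []
    fused: List[Detection] = []
    for det in sorted(dets, key=lambda d: d[0], reverse=True):
        # Find the fused box that overlaps this detection the most.
        best_i, best_v = -1, iou_thr
        for i, (_, fbox) in enumerate(fused):
            v = iou(det[1], fbox)
            if v > best_v:
                best_i, best_v = i, v
        if best_i < 0:
            # No sufficiently overlapping cluster: start a new one.
            clusters.append([det])
            fused.append(det)
        else:
            clusters[best_i].append(det)
            fused[best_i] = _merge(clusters[best_i])
    return fused
```

Because every overlapping box contributes to the final coordinates, this kind of fusion tends to be used when combining predictions from several models or augmented passes, where the individual boxes are each slightly off.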

Authors:

(1) Mahmudul Islam Masum, School of Computing and Information Sciences, Florida International University Miami, USA (mmasu004@fiu.edu);

(2) Arif Sarwat, Department of Electrical and Computer Engineering, Florida International University Miami, USA (asarwat@fiu.edu);

(3) Hugo Riggs, Department of Electrical and Computer Engineering, Florida International University Miami, USA (hrigg002@fiu.edu);

(4) Alicia Boymelgreen, Department of Mechanical and Materials Engineering, Florida International University Miami, USA (aboymelg@fiu.edu);

(5) Preyojon Dey, Department of Mechanical and Materials Engineering, Florida International University Miami, USA (pdey004@fiu.edu).

This paper is available on arXiv under the CC BY 4.0 Deed (Attribution 4.0 International) license.
