Authors:
(1) Mahmudul Islam Masum, School of Computing and Information Sciences, Florida International University, Miami, USA ([email protected]);
(2) Arif Sarwat, Department of Electrical and Computer Engineering, Florida International University, Miami, USA ([email protected]);
(3) Hugo Riggs, Department of Electrical and Computer Engineering, Florida International University, Miami, USA ([email protected]);
(4) Alicia Boymelgreen, Department of Mechanical and Materials Engineering, Florida International University, Miami, USA ([email protected]);
(5) Preyojon Dey, Department of Mechanical and Materials Engineering, Florida International University, Miami, USA ([email protected]).
Table of Links
V. Conclusion, Acknowledgment, and References
Abstract: This paper presents a comparative study of object detection using YOLOv5 and YOLOv8 for three distinct classes: artemia, cyst, and excrement. We analyze the performance of these models in terms of accuracy, precision, and recall. YOLOv5 often performed better in detecting Artemia and cysts, with excellent precision and accuracy; however, it faced notable challenges and limitations when detecting excrement. This suggests that YOLOv8 offers greater versatility and adaptability across detection tasks, while YOLOv5 may struggle in difficult situations and may need further fine-tuning or specialized training to enhance its performance. The results provide insights into the suitability of YOLOv5 and YOLOv8 for detecting objects in challenging marine environments, with implications for applications such as ecological research.
I. INTRODUCTION
Object detection technology has become a powerful tool in fisheries management and research, enabling automated identification and tracking of objects. This study focuses on the comparative analysis of two cutting-edge object detection models, YOLOv5 and YOLOv8. We analyzed their performance in detecting three important objects within aquatic environments: live Artemia (small zooplankton), cysts, and excrement. Accurate detection of these objects is essential for various fisheries-related tasks, as it helps us understand ecosystem dynamics, assess fish health, and optimize aquaculture operations.
We designed an experimental setup and procedure to conduct the study. The setup was crucial for allowing the free movement of Artemia (brine shrimp) within a controlled environment. Their high nutritional density and ease of culture make them a valuable resource for feeding marine larvae with small mouth gapes, and this contribution to the development and growth of marine organisms makes Artemia essential in aquaculture. The small size, rapid movement, and variable orientations of Artemia pose a unique challenge for object detection, which makes them an ideal test subject for evaluating the capabilities of YOLOv5 and YOLOv8 in dynamic fisheries environments. The data for our dataset were obtained during Artemia cyst hatching in various nanoparticle-saturated saltwater samples. The uptake was captured with microscopic image sampling, as the nanoparticles are fluorescent.
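For context, annotated images from a setup like this can be organized into the format that YOLO-family detectors expect. The sketch below is illustrative only: the directory layout and file names are assumptions, not the exact structure used in this study.

```python
# Illustrative sketch (assumed paths): writing a minimal Ultralytics-style
# dataset configuration for the three classes annotated in this study.
from pathlib import Path

data_yaml = """\
path: datasets/artemia    # dataset root directory (assumed)
train: images/train       # training image folder (assumed split)
val: images/val           # validation image folder (assumed split)

names:
  0: artemia
  1: cyst
  2: excrement
"""

Path("data.yaml").write_text(data_yaml)
```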
Capturing nanoparticle uptake in such environments is subject to a common set of challenges stemming from the limitations of microscopy. One well-known issue is that images captured in these scenarios are often out of focus and blurry. These challenges are particularly common when studying nanoparticle uptake in microfluidic systems, where small and dynamic subjects are involved. Nevertheless, microscopy remains advantageous for small-scale microfluidic studies.
In previous studies [1], researchers have shown that observing different components such as Artemia, cysts, and excrement within a microfluidic environment offers valuable insights into the metabolic processes of these organisms. The microfluidic platform provides a high level of control over the microenvironment and enables direct observation of morphological changes, allowing researchers to monitor and differentiate between various stages of the Artemia hatching process. By subjecting Artemia to different temperatures and salinities, significant alterations in the duration of hatching stages, metabolic rates, and hatchability are observed. For example, higher temperatures and moderate salinity were found to significantly enhance the metabolic resumption of dormant Artemia cysts, demonstrating the critical role that environmental factors play in influencing metabolic activity.
Object detection models such as YOLO (You Only Look Once) play a significant role in advancing the field of computer vision with their real-time capabilities and competitive accuracy. While YOLOv5 is the more established and optimized of the two, YOLOv8 further refines the architecture, aiming to enhance both accuracy and speed. By conducting a comparative study using this experimental setup, we aim to provide practical insights into how these models perform and their applicability to fisheries management.
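As a rough illustration of how such a comparison can be set up in practice, the following sketch trains and validates a YOLOv8 model with the Ultralytics Python API and reports mAP@0.5. The weights file, epoch count, and image size are illustrative assumptions rather than this study's exact settings, and the YOLOv5 baseline would typically be trained analogously from its own repository (e.g., via its train.py script) before the two validation reports are compared class by class.

```python
# Illustrative sketch (assumed hyperparameters, not the study's exact settings):
# train and validate a YOLOv8 model on the artemia/cyst/excrement dataset.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                              # pretrained YOLOv8 nano weights
model.train(data="data.yaml", epochs=100, imgsz=640)    # data.yaml defines the three classes

metrics = model.val()                                   # evaluate on the validation split
print(f"mAP@0.5      : {metrics.box.map50:.3f}")        # mean AP at IoU threshold 0.5
print(f"mAP@0.5:0.95 : {metrics.box.map:.3f}")          # COCO-style mAP over IoU 0.5-0.95
print("per-class AP :", metrics.box.maps)               # one AP value per class
```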
II. LITERATURE REVIEW
In recent years, the field of object detection, particularly in applications related to fisheries and underwater environments, has witnessed remarkable progress.
A study [2] introduced a novel multitask model that combined the YOLOv5 architecture with a semantic segmentation head for real-time fish detection and segmentation. Experiments conducted on the golden crucian carp dataset showed an impressive object detection precision of 95.4% and a semantic segmentation accuracy of 98.5%. The model also exhibited competitive performance on the PASCAL VOC 2007 dataset, achieving an object detection precision of 73.8% and a semantic segmentation accuracy of 84.3%. The model was also remarkably fast, achieving processing rates of 116.6 FPS and 120 FPS on an RTX 3060. The addition of the segmentation head enhanced YOLOv5's capabilities. The study followed a precise methodology, including model validation, selection, ablation experiments, and optimization, and compared its results with other models.
Researchers [3] addressed the challenge of detecting fish in dense groups and small underwater targets by introducing an enhanced CME-YOLOv5 network. This algorithm achieves a mean average precision (mAP@0.5) of 94.9%, surpassing the baseline YOLOv5 by 4.4 percentage points. Notably, it excels in detecting densely spaced fish and small targets, making it a promising tool for fishery resource investigation. The algorithm effectively mitigates missed detections in dense fish schools and enhances accuracy for small target fish with limited pixel and feature information. Moreover, it demonstrates proficiency in detecting small objects and effectively handles occluded and highly overlapping objects, outperforming YOLOv5 by detecting 49 additional objects and achieving a 24.6% increase in the detection ratio.
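For reference, the detection metrics cited throughout this review follow their standard definitions. Using the usual counts of true positives (TP), false positives (FP), and false negatives (FN):

```latex
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad
\mathrm{AP} = \int_{0}^{1} p(r)\,dr, \qquad
\mathrm{mAP} = \frac{1}{N}\sum_{c=1}^{N} \mathrm{AP}_c
```

Here p(r) is the precision-recall curve for a single class, N is the number of classes, and the "@0.5" suffix indicates that a prediction counts as a true positive only when its intersection-over-union (IoU) with a ground-truth box is at least 0.5.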
One study centered on improving underwater object detection by evaluating various models, including EfficientDet, YOLOv5, YOLOv8, and Detectron2, on the "Brackish-Dataset" of annotated underwater images captured in Limfjorden [4]. The project aimed to evaluate the efficiency of these models in terms of accuracy and inference time while proposing modifications to enhance EfficientDet's performance. The study conducted a thorough comparison of multiple object detection models, revealing that EfficientDet led with a mean Average Precision (mAP) of 98.56%, followed closely by YOLOv8 with 98.20% mAP; YOLOv5 secured a mAP of 97.6%. Notably, the modified EfficientDet, incorporating a modified BiSkFPN mechanism and adversarial noise handling, achieved an outstanding mAP of 98.63%, with adversarial learning also elevating YOLOv5's accuracy to 98.04% mAP. The research also provided class activation map (CAM) based explanations for the two models to promote explainability in black-box models.
To address inefficient and potentially damaging manual sorting methods, another study [5] introduced an automated, non-contact sorting approach for crayfish using an enhanced YOLOv5 algorithm. The proposed algorithm attains a mean Average Precision (mAP) of 98.8% while reducing image processing time to a mere 2 ms, offering an accurate, efficient, and fast alternative for crayfish sorting. The reduction in image processing time significantly enhances the overall speed of the algorithm, optimizing crayfish sorting operations.
A modified version of YOLOv5, called UTD-Yolov5, was designed and introduced in another study [6] to identify the Crown-of-Thorns Starfish (COTS) in underwater images, a capability crucial for complex underwater operations. UTD-Yolov5 incorporates several modifications to the YOLOv5 network architecture, including a two-stage cascaded CSP backbone, a visual channel attention mechanism module, and a random anchor box similarity calculation method. Additionally, optimization methods such as Weighted Box Fusion (WBF) and iterative refinement are proposed to enhance network efficiency. Experiments conducted on the CSIRO dataset revealed that UTD-Yolov5 achieves an average accuracy of 78.54%, a significant improvement over the baseline.
This paper is