OPTIMIZING SMALL OBJECT DETECTION IN SIXES LACROSSE: THE SYNERGISTIC EFFECT OF HIGH-RESOLUTION INPUT AND DATA AUGMENTATION

Author(s): ZHENG, M., YANG, Q., LUO, C., WANG, H., WANG, Q., Institution: BEIJING SPORT UNIVERSITY, Country: CHINA, Abstract-ID: 2018

INTRODUCTION:
With the inclusion of Sixes Lacrosse in the 2028 Los Angeles Olympics, demand for quantitative tactical analysis is growing rapidly (1). However, the sport poses significant challenges for computer vision, particularly for small object detection (3, 4). The ball occupies an extremely small portion of the frame (<0.1% of the screen area) and lacks textural detail. Furthermore, ball speeds of up to 100 mph induce severe motion blur (2), which, combined with complex occlusions, causes the ball's features to be lost during down-sampling in Convolutional Neural Networks. Consequently, general-purpose detection algorithms often fail. This study proposes a data-centric optimization framework based on YOLOv8 that aims to improve detection performance by combining higher input resolution with data augmentation. Specifically, we constructed a dedicated dataset to investigate how resolution and augmentation interact in preserving spatial features and enhancing robustness.
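The feature-loss argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the 20-pixel ball size is a hypothetical value (chosen so that the box stays below 0.1% of a 2560×1600 frame), and the resize is assumed to preserve aspect ratio so that the longer frame side matches the network input size. The strides 8, 16 and 32 are those of the standard YOLOv8 detection head.

```python
# Illustrative arithmetic: how many feature-map cells a small ball spans.
# ASSUMPTION: a square ball box of 20 px in the source frame (hypothetical).
FRAME_W, FRAME_H = 2560, 1600   # keyframe size used in this study
STRIDES = (8, 16, 32)           # standard YOLOv8 detection-head strides


def area_fraction(box_px: float) -> float:
    """Fraction of the frame area covered by a square box of side box_px."""
    return (box_px * box_px) / (FRAME_W * FRAME_H)


def footprint(box_px: float, imgsz: int) -> dict:
    """Box side length in feature-map cells at each stride.

    The frame is assumed to be resized so that its longer side equals imgsz.
    """
    scale = imgsz / max(FRAME_W, FRAME_H)
    resized_px = box_px * scale
    return {stride: resized_px / stride for stride in STRIDES}


if __name__ == "__main__":
    ball_px = 20.0  # hypothetical ball size in source pixels
    print(f"area fraction of frame: {area_fraction(ball_px):.6f}")
    for imgsz in (640, 1280, 2560):
        cells = footprint(ball_px, imgsz)
        summary = ", ".join(f"stride {s}: {c:.2f} cells" for s, c in cells.items())
        print(f"imgsz={imgsz}: {summary}")
```

Under these assumptions the ball covers less than one cell of the finest (stride 8) feature map at 640 px input, but more than two cells at 2560 px, which is consistent with the motivation for testing higher input resolutions.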
METHODS:
Data were sourced from official World Lacrosse Sixes broadcast footage. We manually curated 347 high-resolution keyframes (2560×1600 pixels) and annotated precise bounding boxes for three classes (Ball, Player, and Stick Head) in Roboflow. The dataset was split into a training set (316 images) and a validation set (31 images), both with a high density of annotated instances. To quantify the effects of resolution and augmentation, four groups of YOLOv8-based experiments were conducted: Exp A (baseline) used 640×640 input with basic augmentation; Exp B1 and B2 used medium (1280×1280) and high (2560×2560) input resolutions, respectively, to verify feature preservation; Exp C applied strong augmentation (Mosaic and Mixup) at the low baseline resolution; and Exp D combined high resolution (2560×2560) with strong augmentation to explore the joint effect of both strategies.
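The experiment grid described above could be expressed as follows. This is a minimal sketch, not the authors' training script: the augmentation values (Mosaic and Mixup switched off for "basic", `mosaic=1.0` and `mixup=0.1` for "strong"), the dataset file name `lacrosse.yaml`, the epoch count, and the `yolov8n.pt` weights are all assumptions made for illustration. The `imgsz`, `mosaic`, `mixup`, `data`, `epochs` and `name` arguments are standard Ultralytics training arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    name: str
    imgsz: int      # square network input size in pixels
    mosaic: float   # probability of Mosaic augmentation
    mixup: float    # probability of Mixup augmentation


# ASSUMPTION: concrete augmentation values are not given in the abstract.
BASIC = {"mosaic": 0.0, "mixup": 0.0}
STRONG = {"mosaic": 1.0, "mixup": 0.1}

EXPERIMENTS = [
    Experiment("A", 640, **BASIC),     # baseline
    Experiment("B1", 1280, **BASIC),   # medium resolution
    Experiment("B2", 2560, **BASIC),   # high resolution
    Experiment("C", 640, **STRONG),    # strong augmentation only
    Experiment("D", 2560, **STRONG),   # high resolution + strong augmentation
]


def train_kwargs(exp: Experiment, data_yaml: str = "lacrosse.yaml",
                 epochs: int = 100) -> dict:
    """Build keyword arguments for an Ultralytics train() call.

    data_yaml and epochs are hypothetical placeholders.
    """
    return {
        "data": data_yaml,
        "imgsz": exp.imgsz,
        "mosaic": exp.mosaic,
        "mixup": exp.mixup,
        "epochs": epochs,
        "name": f"exp_{exp.name}",
    }


def run_all(weights: str = "yolov8n.pt") -> None:
    """Train every configuration; requires the ultralytics package."""
    from ultralytics import YOLO  # lazy import: the grid is inspectable without it
    for exp in EXPERIMENTS:
        YOLO(weights).train(**train_kwargs(exp))


if __name__ == "__main__":
    # Print the grid instead of training, so the sketch runs anywhere.
    for exp in EXPERIMENTS:
        print(train_kwargs(exp))
```

Keeping the grid as data separates the two factors under study (input size and augmentation strength), so that Exp D differs from Exp B2 only in augmentation and from Exp C only in resolution.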
RESULTS:
The baseline (640 px) failed to detect the ball (Ball Recall 0.000). Increasing the resolution to 1280 px and 2560 px improved Ball Recall to 0.214 and 0.314, respectively, but reduced Precision (0.184 to 0.123) due to amplified background noise. The combined strategy (Exp D) achieved the best performance: Precision recovered to 0.197 and Recall increased to 0.443. Consequently, overall mAP@0.5 improved from 39.5% to 54.8%. Inference time was approximately 43.4 ms per frame (about 23 FPS), which is suitable for post-match analysis.
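The precision/recall trade-off reported above can be summarized with a single derived score. The sketch below computes the F1 score and converts latency to frames per second. F1 is not reported in this abstract; it is derived here purely for illustration, and the pairing of Precision 0.184 with Exp B1 and 0.123 with Exp B2 is an assumption based on the order in which the values are listed.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def fps_from_latency(latency_ms: float) -> float:
    """Frames per second for a given per-frame inference time in ms."""
    return 1000.0 / latency_ms


# Ball-class (precision, recall) values from the results.
# ASSUMPTION: 0.184 belongs to Exp B1 and 0.123 to Exp B2.
BALL_METRICS = {
    "B1 (1280 px)": (0.184, 0.214),
    "B2 (2560 px)": (0.123, 0.314),
    "D (2560 px + strong augmentation)": (0.197, 0.443),
}

if __name__ == "__main__":
    for name, (p, r) in BALL_METRICS.items():
        print(f"Exp {name}: F1 = {f1_score(p, r):.3f}")
    print(f"Throughput: {fps_from_latency(43.4):.1f} FPS")
```

Under that assumption, resolution alone raises recall while F1 stays roughly flat, whereas the combined strategy raises both precision and recall and therefore F1.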

CONCLUSION:
This study validates a combined strategy of high-resolution input and strong data augmentation. High resolution preserves pixel-level features, while augmentation enhances robustness by suppressing false positives. Although throughput is limited to about 23 FPS, the increase in Ball Recall from 0 to 0.443 demonstrates the strategy's efficacy for extremely small targets. Combining the two methods enables effective detection with limited samples, offering a reference for analytics in emerging sports (5).
REFERENCES:
[1] Weldon 2023/ [2] Akiyama 2019/ [3] Zhu 2026/ [4] Ouyang 2026/ [5] Stein 2018