BEYOND PLAYING POSITIONS: A NEW APPROACH TO CATEGORIZE SOCCER PLAYERS BASED ON MACHINE LEARNING

Author(s): DE HAAN, M., VAN DER ZWAARD, S., SANDERS, J., BEEK, P., JASPERS, R., Institution: VU AMSTERDAM, Country: NETHERLANDS, Abstract-ID: 2453

INTRODUCTION:
It is unknown whether categorizing soccer players by playing position is useful from a physiological perspective. This study examined the sprint, endurance, and match-specific running performance of players, assessed how well match-specific running performance aligns with playing positions, and evaluated the potential of unsupervised machine learning to identify players with similar match-specific running profiles.
METHODS:
Match-specific running data were collected from forty elite male soccer players over two seasons; 31 of these players also underwent exercise testing, consisting of a 20-m sprint and a treadmill test to measure maximal oxygen uptake. k-means clustering identified subgroups based on players’ match-specific running performance. Sprint, endurance, and match-specific running performance were compared between playing positions and between clusters. Both grouping methods were tested for their ability to identify subgroups with similar total distance (TD), low- (LIR), moderate- (MIR), and high-intensity running (HIR) distance, and sprint distance in matches.
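As an illustration of the clustering step, the following is a minimal sketch of k-means (Lloyd's algorithm) applied to standardized match-running features. It is not the authors' implementation: the player values, the choice of k = 2, the z-score standardization, and the function names `standardize` and `kmeans` are all assumptions made for the example.

```python
import random
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column so that no single distance metric dominates."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm: assign to nearest centroid, then update centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(rows, k)  # initialise with k distinct data points
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(row, centroids[j])) for row in rows
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [mean(c) for c in zip(*members)]
    return labels, centroids

# Hypothetical per-match averages in metres: [TD, LIR, MIR, HIR, sprint].
rows = [
    [9800, 7600, 1400, 500, 300],
    [9900, 7650, 1450, 520, 280],
    [10000, 7700, 1460, 540, 300],
    [11500, 8300, 2200, 850, 150],
    [11600, 8350, 2250, 860, 140],
    [11700, 8400, 2280, 880, 140],
]

labels, centroids = kmeans(standardize(rows), k=2)
print(labels)
```

In practice the number of clusters would need to be chosen and justified (for example with a silhouette or elbow criterion), and a library implementation with multiple restarts would be more robust than this single-start sketch.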
RESULTS:
Match-specific running performance differed between playing positions, although notable variation was observed within each playing position. Compared to grouping by playing position, clustering based on match-specific running performance yielded less variance within groups (TD: P = 0.049, LIR: P = 0.032, HIR: P = 0.033) and larger standardized differences between groups (LIR: P = 0.037, MIR: P = 0.041, HIR: P = 0.035, Sprint: P = 0.018). Moreover, 20-m sprint speed differed between the sprint cluster and the high-intensity endurance cluster (25.22 vs 23.75, P = 0.012), but not between playing positions.
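The two criteria reported above, within-group variance and standardized between-group differences, can be illustrated with a short sketch. The abstract does not state which statistics or tests were used, so the pooled within-group variance and the Cohen's d-style difference below are assumptions, as are the sprint distances, the group labels, and the function names.

```python
from statistics import mean, variance

def pooled_within_variance(values, groups):
    """Sum of squared deviations from each group's mean, divided by (n - number of groups)."""
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    ss = sum(sum((v - mean(vs)) ** 2 for v in vs) for vs in by_group.values())
    return ss / (len(values) - len(by_group))

def standardized_difference(values, groups, a, b):
    """Mean difference between groups a and b, scaled by their pooled standard deviation."""
    xa = [v for v, g in zip(values, groups) if g == a]
    xb = [v for v, g in zip(values, groups) if g == b]
    pooled_var = ((len(xa) - 1) * variance(xa) + (len(xb) - 1) * variance(xb)) / (
        len(xa) + len(xb) - 2
    )
    return (mean(xa) - mean(xb)) / pooled_var ** 0.5

# Hypothetical sprint distance per match (metres) for six players.
sprint = [300, 280, 300, 150, 140, 140]
position = ["DEF", "MID", "DEF", "MID", "DEF", "MID"]  # hypothetical playing positions
cluster = ["A", "A", "A", "B", "B", "B"]                # hypothetical cluster labels

var_position = pooled_within_variance(sprint, position)
var_cluster = pooled_within_variance(sprint, cluster)
d_position = standardized_difference(sprint, position, "DEF", "MID")
d_cluster = standardized_difference(sprint, cluster, "A", "B")

print(f"within-group variance: position={var_position:.1f}, cluster={var_cluster:.1f}")
print(f"standardized difference: position={d_position:.2f}, cluster={d_cluster:.2f}")
```

A grouping that captures running profiles well should show a lower within-group variance and a larger absolute standardized difference; whether such a contrast is statistically significant requires a formal test, which this sketch does not perform.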
CONCLUSION:
Using unsupervised machine learning to group soccer players based on match-specific running performance enhances the identification of player groups with distinct physical profiles. This data-driven approach supports performance evaluation and enables training to be optimized for groups of players with similar match-specific running performance.