Scientific Programme

Applied Sports Sciences

OP-AP08 - Machine Learning in Team Sports

Date: 02.07.2025, Time: 13:15 - 14:30, Session Room: Marina

Description

Chair

TBA

ECSS Paris 2023: OP-AP08

Speaker A

Xiangyu Ren
École normale supérieure de Rennes, le laboratoire Mouvement, sport, santé (M2S)
France
"GPS-Derived Metrics and Machine Learning Models for Injury Prediction in Professional Rugby Union players"

INTRODUCTION: In sports, injury prevention is a key factor for success. Although injuries are challenging to predict, new technologies and the application of data science can provide valuable insights [1]. This study aimed to predict injury risk among professional rugby union players using machine learning (ML) models, and to optimize injury prevention strategies by identifying effective models and position-specific risk factors. METHODS: Data from sixty-three professional rugby union players over three seasons were analyzed. Players were categorized as forwards or backs and further classified into five specific positions (tight five, back row, scrum-half, inside backs, outside backs). The dataset included GPS data and derived metrics such as total workload in the 1, 2, and 3 weeks prior to injury, acute-to-chronic workload ratio (ACWR) over different time windows, monotony, and strain. Five ML classification models, namely Logistic Regression, Naïve Bayes (NB), Support Vector Machine, Random Forest (RF), and eXtreme Gradient Boosting (XGBoost), were applied separately to each player position to assess injury prediction. RESULTS: For forwards and backs, NB excelled in recall and balanced accuracy, while RF and XGBoost led in F1 score and AUC. In the position-specific analysis, RF achieved the highest balanced accuracy (0.81) and F1 score (0.68) for tight five players, whereas XGBoost achieved the highest AUC (0.85). Model performance for scrum-halves was poor across all metrics. For inside and outside backs, NB and XGBoost produced competitive results, with NB leading in recall (0.69) and balanced accuracy (0.78) for outside backs. Feature importance analysis highlighted position-specific risk factors: for forwards, the most important features were sprint running (SR, >25 km·h⁻¹) monotony and high-speed running (HSR, 18-21 km·h⁻¹) strain, whereas for backs they were low-intensity accelerations and total distance. Tight five players were most influenced by SR monotony and the exponentially weighted moving average ACWR 3:14, whereas inside backs were most influenced by very high-speed running (VHSR, 21-25 km·h⁻¹) and accelerations >3 m·s⁻².
CONCLUSION: Our ML-based approach can effectively predict injuries, particularly when a combination of GPS-derived metrics is used. Key characteristics indicative of injury risk were also identified for players in different positions. These findings underscore the potential of ML to enhance injury prediction and inform tailored training strategies for athletes.
[1] Van Eetvelde, H., et al., Machine learning methods in sport injury prediction and prevention: a systematic review. Journal of Experimental Orthopaedics, 2021. 8(1): p. 27.
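
The derived workload metrics named in the abstract (monotony, strain, ACWR, and an exponentially weighted ACWR 3:14) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not give the formulas, so the sketch assumes commonly used definitions (monotony as mean daily load divided by its sample standard deviation, strain as total load times monotony, and an EWMA decay of 2 / (span + 1)). All function names and the example distances are hypothetical.

```python
from statistics import mean, stdev


def monotony(daily_loads):
    """Mean daily load divided by the standard deviation of daily load."""
    sd = stdev(daily_loads)  # sample SD is an assumption; conventions differ
    return mean(daily_loads) / sd if sd > 0 else float("inf")


def strain(daily_loads):
    """Total load over the window multiplied by monotony."""
    return sum(daily_loads) * monotony(daily_loads)


def rolling_acwr(daily_loads, acute_days=7, chronic_days=28):
    """Mean load of the last acute_days over mean load of the last chronic_days."""
    acute = mean(daily_loads[-acute_days:])
    chronic = mean(daily_loads[-chronic_days:])
    return acute / chronic if chronic > 0 else float("nan")


def ewma(daily_loads, span):
    """Exponentially weighted moving average, decay assumed to be 2 / (span + 1)."""
    decay = 2 / (span + 1)
    value = daily_loads[0]
    for load in daily_loads[1:]:
        value = decay * load + (1 - decay) * value
    return value


def ewma_acwr(daily_loads, acute_span=3, chronic_span=14):
    """EWMA-based ACWR; the 3:14 default mirrors the window named in the abstract."""
    chronic = ewma(daily_loads, chronic_span)
    return ewma(daily_loads, acute_span) / chronic if chronic > 0 else float("nan")


# Hypothetical daily sprint-running distances (m) for one player, oldest first.
sprint_distance = [120, 80, 0, 150, 95, 200, 0] * 4
last_week = sprint_distance[-7:]
print("SR monotony:", round(monotony(last_week), 2))
print("SR strain:", round(strain(last_week), 1))
print("ACWR 7:28:", round(rolling_acwr(sprint_distance), 2))
print("EWMA ACWR 3:14:", round(ewma_acwr(sprint_distance), 2))
```

Each metric would be computed per player and per GPS variable (for example SR, HSR, or total distance) before being passed to a classifier as a feature.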

Speaker B

XIAO LI
University of Lisbon, China Football College
Portugal
"A Comparative Analysis of Network Metrics in Men’s and Women’s FIFA World Cups: Evaluation of Contextual Influences on Performance"

INTRODUCTION: Despite significant developments in women's football, its analysis still relies excessively on parameters derived from men's football, leaving little attention paid to women players' performance and even less to performance disparities between genders. In this study, we compared women's and men's network metrics at both team and player levels according to contextual factors, aiming to determine how women's passing performance differs from men's. METHODS: We analyzed the 128 matches (2816 player observations) from the 2023 Women's World Cup and the 2022 Men's World Cup. A total of 11 network metrics at the team and player levels and five contextual factors were considered. Linear mixed models were used to identify differences between men's and women's passing performance. For the team- and player-level analyses, Team ranking, Quality of opposition, and Match stage were the independent variables, the network metrics were the dependent variables, and team was a random effect. For the positional and formation analyses, Gender was the independent variable, the network metrics were the dependent variables, and team was the random effect. RESULTS: In the team-level analysis, for Team strength, the Average Degree, Average Weighted Degree, and Density of the top 16 teams were significantly higher than those of the bottom 16 teams in both World Cups (p<0.001). For Quality of opposition, only the Women's World Cup showed significant differences when facing the top 16 teams (p<0.001). In the formation analysis, all variables showed a significant gender difference (p<0.05). In the player-level analysis, all variables differed significantly by Team strength (p<0.001). In the position-level analysis, gender differences emerged in most variables (p<0.05), except for Centre Forward.
CONCLUSION: The top 16 teams in both World Cups exhibited better team cooperation, suggesting a possession-based playing style (Average Weighted Degree, Average Degree, Density, Closeness Centrality, and Eigenvector Centrality increased). Women's teams' performance was less stable than men's when facing the top 16 teams: relative to the men's data, the gap between the women's bottom 16 and top 16 teams was larger and statistically significant. Better team cooperation and coordination emerged in the men's 1-4-4-2 and the women's 1-3-5-2 formations, in which Average Degree, Density, and Average Clustering Coefficient reached the highest levels among all formations. Except for Centre Forward (CF), most positions in the Men's World Cup showed better passing performance than in the Women's World Cup. Centre backs (CB) played a central role in team cooperation in both World Cups, with Weighted Out-Degree, Weighted In-Degree, Weighted Degree, and Out-Degree reaching the highest levels among all positions. This study provides a basis for the specific analysis of performance indicators, informing coaches about the specific demands of the women's game, with consequences for practice design.
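
To make the team-level network metrics concrete, the sketch below computes Density, Average Degree, and Average Weighted Degree for a directed passing network. It is an illustration under stated assumptions rather than the authors' procedure: the abstract does not define the metrics, so degree is taken here as in-degree plus out-degree, and density as the share of possible directed passer-to-receiver links that occur. The function name and the three-player example are hypothetical, and self-passes are assumed not to occur.

```python
def passing_network_metrics(passes):
    """Summarize a directed passing network.

    passes maps (passer, receiver) to the number of completed passes.
    """
    # Keep only links along which at least one pass was completed.
    links = {pair: count for pair, count in passes.items() if count > 0}
    players = {player for pair in links for player in pair}
    n_players = len(players)
    if n_players < 2:
        raise ValueError("need at least two connected players")
    n_links = len(links)
    total_passes = sum(links.values())
    return {
        # Share of the n * (n - 1) possible directed links that are present.
        "density": n_links / (n_players * (n_players - 1)),
        # Each link adds one out-degree and one in-degree (assumed convention).
        "average_degree": 2 * n_links / n_players,
        # Each pass counts once for the passer and once for the receiver.
        "average_weighted_degree": 2 * total_passes / n_players,
    }


# Hypothetical pass counts between three players in one match.
example = {("CB", "CM"): 5, ("CM", "CB"): 3, ("CM", "CF"): 2}
for name, value in passing_network_metrics(example).items():
    print(name, round(value, 3))
```

Software packages differ on whether average degree in a directed network counts each link once or twice, so values are only comparable when the same convention is used throughout.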

Speaker C

Carlo Simonelli
Università dell'Insubria, Department of Biotechnology and Life Sciences
Italy
"Prediction of next-day subjective fatigue in professional soccer players: a machine learning approach"

INTRODUCTION: Predicting the state of fatigue in soccer players is useful for designing training and optimizing performance. Therefore, the aim of this study was to explore, within a big data analytics framework, the most important predictors of daily fatigue in a group of professional soccer players using inexpensive and practical monitoring tools. METHODS: Six Italian third-division (Serie C) professional soccer teams took part in this study, comprising 171 players and more than 34,000 data points collected. Every morning, the players rated fatigue, sleep quality, muscle soreness, stress, and mood. After each training session or match, the session Rating of Perceived Exertion (sRPE) was obtained and multiplied by the session duration to calculate the Training Load (TL). Finally, some contextual factors (i.e., distance to the previous and next match) were also recorded. Four machine learning models were trained and tested to assess their ability to predict players' subjective next-day fatigue: i) Decision Tree Classifier (DTC); ii) XGBoost Classifier (XGB); iii) Random Forest Classifier (RFC); iv) Logistic Regression (LR). RESULTS: Machine learning models can accurately predict the players' subjective fatigue (accuracy 79-84%) using practical and inexpensive training monitoring tools. Specifically, in the prediction of next-day subjective fatigue, the most influential factor was the fatigue rating of the previous day. The mediation analysis showed a statistically significant direct influence of TL on next-day subjective fatigue. This relationship was strongly mediated by muscle soreness, sleep quality, and stress. The other perceived items did not mediate this relationship.
CONCLUSION: Sport scientists and coaches can use the big data analytics framework proposed in this paper to predict their players' fatigue status and to understand the influence of other perceptions, judgements, and TL characteristics, improving decision-making when designing a training plan. In particular, field experts could simulate the scheduled training program to anticipate and manage players' fatigue status, maximizing the training effect and players' readiness while reducing the potential drops in performance associated with daily fatigue in a real-world scenario. Finally, this approach can be very useful for practitioners in amateur and grassroots settings with a limited budget: collecting both the status and internal load measures is virtually free, and these data can be used by our model to assess fatigue.
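
The data preparation described in the methods can be sketched as below: Training Load as the perceived exertion rating multiplied by session duration, and each day's record paired with the following morning's fatigue rating. This is a hedged illustration only. The record layout, field names, and example values are hypothetical, days are assumed to be consecutive, and the "same as yesterday" baseline is a simple reference point that is not one of the four models in the abstract.

```python
def session_load(rpe, duration_min):
    """Training Load of one session: perceived exertion times duration (min)."""
    return rpe * duration_min


def daily_load(sessions):
    """Sum the Training Load over all (rpe, duration_min) sessions of a day."""
    return sum(session_load(rpe, minutes) for rpe, minutes in sessions)


def next_day_pairs(days):
    """Pair each day's record with the following day's fatigue rating."""
    return [(today, tomorrow["fatigue"]) for today, tomorrow in zip(days, days[1:])]


def persistence_accuracy(days):
    """Accuracy of predicting that tomorrow's fatigue equals today's."""
    pairs = next_day_pairs(days)
    hits = sum(1 for today, target in pairs if today["fatigue"] == target)
    return hits / len(pairs)


# Hypothetical consecutive daily records for one player.
days = [
    {"fatigue": 3, "soreness": 2, "load": daily_load([(6, 90)])},
    {"fatigue": 3, "soreness": 3, "load": daily_load([])},
    {"fatigue": 4, "soreness": 4, "load": daily_load([(7, 75), (4, 30)])},
    {"fatigue": 4, "soreness": 3, "load": daily_load([(5, 60)])},
]
for features, target in next_day_pairs(days):
    print(features, "-> next-day fatigue", target)
print("persistence baseline accuracy:", round(persistence_accuracy(days), 2))
```

Because the abstract reports the previous day's fatigue rating as the most influential predictor, comparing any trained classifier against such a persistence baseline is one way to check how much the additional features contribute.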
