MACHINE LEARNING IN MOTION: ENHANCING MOVEMENT COMPETENCE ASSESSMENTS USING WEARABLES AND NOVEL DATA SIMULATION TECHNIQUES

Author(s): SWAIN, T.1, MCNARRY, M.1, BARNES, C.1, SIMPSON, C.2, THOM, G.3, RYAN, L.1, MACKINTOSH, K.1, Institution: SWANSEA UNIVERSITY, Country: UNITED KINGDOM, Abstract-ID: 2447

INTRODUCTION:
Assessing movement quality in real-world settings remains challenging. Indeed, previous research has questioned the reliability and accuracy of directly quantifying motion characteristics (e.g. range of motion) using wearables (1, 2). Whilst machine-learning classification techniques offer promising alternatives, they are often limited by a reliance on small sample sizes, thereby impacting robustness and real-world transferability. The aims of this study were therefore two-fold: i) to generate a machine-learning algorithm to classify foundational motor skill competence, using the bodyweight squat as a representative movement; and ii) to develop a novel data simulation method to artificially boost the sample size and enhance classification accuracy.
METHODS:
Twenty-three participants (28.1 ± 7.3 years; 17 males) performed three sets of 10 repetitions of squats. Data were captured using three Polar Verity Sense magnetic, angular rate, and gravity (MARG) sensors on the chest and both ankles. Three United Kingdom Strength & Conditioning Association accredited coaches classified each repetition as ‘good’, ‘average’, or ‘poor’, based on pre-determined movement criteria, with the modal score for each repetition used in data labelling. To expand and balance the dataset, original data were augmented with simulated data generated from Weibull distributions or Gaussian mixture models, depending on the frequency distributions of the most informative features. A support vector machine (SVM) ensemble with a modal voting system was developed and trained first on the original data alone, then on the original data augmented with simulated data, for comparative analysis.
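The Weibull branch of the simulation step can be illustrated with a minimal sketch. It is not the study's implementation: it assumes a single, strictly positive feature of moderate magnitude per class, fits a two-parameter Weibull by maximum likelihood, and tops up each class to the size of the largest one. The function names (`fit_weibull`, `simulate_feature`, `balance_classes`) and the bisection bounds are hypothetical choices made for illustration, and the Gaussian mixture branch is not shown.

```python
import math
import random


def fit_weibull(values, lo=0.05, hi=50.0, iters=100):
    """Maximum-likelihood fit of a two-parameter Weibull; returns (shape, scale).

    Assumes strictly positive values that are not all identical.
    """
    logs = [math.log(v) for v in values]
    mean_log = sum(logs) / len(logs)

    def score(k):
        # Likelihood equation for the shape k; its root is the MLE.
        powered = [v ** k for v in values]
        weighted = sum(p * l for p, l in zip(powered, logs))
        return weighted / sum(powered) - 1.0 / k - mean_log

    # score() is strictly increasing in k, so bisection finds the unique root.
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    shape = (lo + hi) / 2.0
    scale = (sum(v ** shape for v in values) / len(values)) ** (1.0 / shape)
    return shape, scale


def simulate_feature(values, n_new, rng):
    """Draw n_new simulated values from a Weibull fitted to the observed ones."""
    shape, scale = fit_weibull(values)
    # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
    return [rng.weibullvariate(scale, shape) for _ in range(n_new)]


def balance_classes(feature_by_class, seed=0):
    """Top up every class with simulated values to match the largest class."""
    rng = random.Random(seed)
    target = max(len(v) for v in feature_by_class.values())
    balanced = {}
    for label, values in feature_by_class.items():
        extra = simulate_feature(values, target - len(values), rng)
        balanced[label] = list(values) + extra
    return balanced
```

In this sketch each feature is simulated independently, so any correlation between features is ignored; a multivariate approach would be needed to preserve it.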
RESULTS:
Using only the original data for training, overall SVM ensemble classification accuracy using a hold-out dataset was 40%. However, the model yielded 0% accuracy for the repetitions labelled as ‘good’ due to dataset imbalance. Data-boosting improved sensitivity, increasing ‘good’ accuracy to >95%. However, overall accuracy did not improve due to large ‘average’ class inaccuracies (10%).
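The relationship between class-level and overall accuracy reported above can be sketched as follows. Overall accuracy is the count-weighted mean of the per-class accuracies, so a large gain in one class can be offset by losses in another. The helper name `per_class_accuracy` and the toy labels are assumptions for illustration only; they are not the study's data or results.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Return ({label: fraction correctly classified}, overall accuracy)."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    per_class = {label: hits[label] / n for label, n in totals.items()}
    overall = sum(hits.values()) / len(true_labels)
    return per_class, overall


# Toy labels for illustration only; these are not the study's data.
truth = ["good"] * 2 + ["average"] * 5 + ["poor"] * 3
preds = ["good"] * 2 + ["poor"] * 4 + ["average"] + ["poor"] * 3
per_class, overall = per_class_accuracy(truth, preds)
```

With these toy labels, the ‘good’ and ‘poor’ classes are classified perfectly, yet overall accuracy is held down by the ‘average’ class, which contains the most repetitions.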
CONCLUSION:
This study introduces a novel data-simulation method for improving movement classification by addressing imbalances often inherent in small datasets. Data-boosting was effective, improving sensitivity and accuracy for ‘good’ and ‘poor’ squat repetitions, even with small training datasets. However, overall accuracy did not improve due to classification issues in the ‘average’ category. This underlines the challenges of using subjective scoring as the standard for algorithm training. Recognising the limitations of the current dataset, future research should refine data-labelling methods and seek larger, more balanced datasets to mitigate intermediate class ambiguity and further the pursuit of practical application.

References
1. T. A. Swain et al., Sports Medicine, 53, 2477–2504 (2023)
2. I. Poitras et al., Sensors, 19, 1555 (2019)