MARKER-LESS 3D HUMAN POSE ESTIMATION FOR ANALYZING LOWER LIMB MUSCLE FORCE DURING DEEP SQUAT

Author(s): LEE, Y.H., CHANG, J.2, LEE, D.3, LEE, H.1, Institution: DANKOOK UNIVERSITY, Country: KOREA, SOUTH, Abstract-ID: 824

INTRODUCTION:
Analyzing muscle force during weightlifting is crucial for injury prevention and performance enhancement, and understanding individual muscle forces can help prevent overloading and provide optimal training stimuli [1]. Marker-based methods using optoelectronic motion capture systems (OMS) and inertial measurement units (IMUs) are primarily used to analyze muscle loads. However, these methods require spacious laboratory environments and dedicated equipment, as well as time-consuming experimental setup and analysis [2]. Marker-less 3D human pose estimation (HPE) using deep learning is therefore advantageous for application in real-world environments, since it requires only one camera.
METHODS:
In this study, one healthy male (28 years, 173 cm, 73 kg) performed deep squats for 3 sets of 5 repetitions. Marker trajectories during the deep squats were obtained with an OMS (Vicon Motion Systems) and, simultaneously, a 2D camera recorded the movement for marker-less HPE analysis. The Jointformer model [4], pre-trained on the H3WB dataset [3], was used to lift the 2D camera data into 3D coordinates. Lower limb muscle forces (gluteus maximus, gluteus medius, rectus femoris, vastus lateralis, vastus intermedius, and vastus medialis) were calculated from the coordinates obtained through OMS and through HPE using the OpenSim full-body squat model, and the two sets of results were compared.
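As an illustration of the 2D input side of the lifting step, the sketch below maps pixel keypoints to a resolution-independent range before they are passed to a 2D-to-3D lifting model. This aspect-preserving normalization is a common convention in Human3.6M-style lifting pipelines; whether the study applied exactly this preprocessing is an assumption, and the function name and the 1920 x 1080 frame size are hypothetical.

```python
def normalize_keypoints_2d(keypoints, width, height):
    """Map pixel keypoints so that x spans [-1, 1] and y keeps the aspect ratio.

    keypoints: list of (x, y) pixel coordinates for one frame.
    width, height: image size in pixels (assumed known from the camera).
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    normalized = []
    for x, y in keypoints:
        # Both axes are scaled by the image width so that distances are not
        # distorted; y is then shifted so the image center maps to 0.
        nx = x / width * 2.0 - 1.0
        ny = y / width * 2.0 - height / width
        normalized.append((nx, ny))
    return normalized


# Hypothetical example: image center and top-left corner of a 1920 x 1080 frame.
frame = [(960.0, 540.0), (0.0, 0.0)]
normalized_frame = normalize_keypoints_2d(frame, 1920, 1080)
```

With this convention the image center maps to (0, 0), so the lifting model sees the same input scale regardless of camera resolution.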
RESULTS:
With OMS, the muscle forces were 355 N in the gluteus maximus, 91 N in the gluteus medius, 399 N in the rectus femoris, 150 N in the vastus intermedius, 78 N in the vastus lateralis, and 237 N in the vastus medialis. The HPE muscle forces were analyzed from the predicted 3D coordinates, and the root mean squared error (RMSE) between the ground-truth coordinates measured by 3D markers (OMS) and the 3D coordinates predicted from 2D by Jointformer showed only a slight difference.
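The RMSE comparison between OMS and HPE coordinates can be sketched as follows. This is a minimal example, assuming the two coordinate sets are already time-synchronized, expressed in the same reference frame and units, and matched point for point. It uses the RMSE of per-point Euclidean distances; the abstract does not state which RMSE variant was used. The function name and the sample values are hypothetical and are not data from the study.

```python
import math


def rmse_3d(ground_truth, predicted):
    """RMSE of Euclidean distances between matched 3D points.

    ground_truth, predicted: equal-length lists of (x, y, z) tuples,
    e.g. OMS marker-derived joint positions and HPE-predicted positions.
    """
    if len(ground_truth) != len(predicted) or not ground_truth:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error_sum = 0.0
    for (gx, gy, gz), (px, py, pz) in zip(ground_truth, predicted):
        # Squared Euclidean distance between the matched points.
        squared_error_sum += (gx - px) ** 2 + (gy - py) ** 2 + (gz - pz) ** 2
    return math.sqrt(squared_error_sum / len(ground_truth))


# Hypothetical coordinates in millimetres, for illustration only.
oms_points = [(0.0, 0.0, 0.0), (100.0, 100.0, 100.0)]
hpe_points = [(3.0, 4.0, 0.0), (100.0, 100.0, 100.0)]
error_mm = rmse_3d(oms_points, hpe_points)
```

Reporting the error in the same units as the marker data (for example millimetres) makes it easier to judge whether the difference is small relative to segment lengths.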
CONCLUSION:
The proposed method analyzes muscle force using marker-less techniques, enabling immediate provision of safety guidelines and training methods in the sports field. Moreover, it will serve as a foundation for developing models for real-time analysis with a single 2D camera in future research.
REFERENCES:
1. Kemler, E., Noteboom, L., & Beijsterveldt, A. (2022). Injuries sustained during fitness activities in the Netherlands: results of a retrospective study.
2. Van der Kruk, E., & Reijne, M. M. (2018). Accuracy of human motion capture systems for sport applications; state-of-the-art review. Eur J Sport Sci, 18(6), 806-819.
3. Zhu, Y., Samet, N., & Picard, D. (2023). H3WB: Human3.6M 3D WholeBody dataset and benchmark. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 20166-20177).
4. Lutz, S., Blythman, R., Ghosal, K., Moynihan, M., Simms, C., & Smolic, A. (2022, August). Jointformer: Single-frame lifting transformer with error prediction and refinement for 3D human pose estimation. In 2022 26th International Conference on Pattern Recognition (ICPR) (pp. 1156-1163). IEEE.