INTRODUCTION:
Energy expenditure (EE) assessment is essential for exercise prescription and metabolic health management. While wearable devices have become popular for EE monitoring, their accuracy varies substantially across manufacturers due to differences in sensor technologies and proprietary algorithms. Most validation studies have used fixed-speed protocols that do not reflect individual aerobic capacity differences, and sex-stratified analyses remain limited despite known metabolic differences between males and females.
METHODS:
This study systematically evaluated the accuracy of three commercial wearable devices (POLAR Vantage V3, GARMIN Forerunner 265, and VIVO Watch GT) for EE estimation across different running protocols, with particular emphasis on sex-specific differences. Thirty healthy adults (16 males, 14 females; age 18-33 years; body fat <32%) completed multiple running experiments: a fixed-speed protocol (6.7-12.0 km/h), a maximal speed test, and an individualized-speed protocol (35-70% of maximum speed). EE from the wearables was compared against indirect calorimetry (Metamax 3B-R2) as the criterion measure. Performance metrics included mean absolute percentage error (MAPE), accuracy (proportion of estimates within ±20% of the criterion), bias, and root mean square error (RMSE). All devices demonstrated high heart rate measurement accuracy (r=0.997 with ECG).
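The four performance metrics can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so several details are assumptions: the function name `ee_agreement` is hypothetical, bias is taken as device minus criterion (positive values mean overestimation), the ±20% band is treated as inclusive, and inputs are paired EE values in kcal/min.

```python
import math


def ee_agreement(device, criterion, tolerance=0.20):
    """Agreement metrics between device EE and criterion EE (paired values).

    Assumptions (not stated in the source): error = device - criterion,
    and an estimate counts as accurate when |error| <= tolerance * criterion.
    """
    if len(device) != len(criterion) or not device:
        raise ValueError("device and criterion must be non-empty and equal in length")
    if any(c <= 0 for c in criterion):
        raise ValueError("criterion EE must be positive to compute percentage error")

    n = len(device)
    errors = [d - c for d, c in zip(device, criterion)]

    # Mean absolute percentage error, in percent.
    mape = sum(100.0 * abs(e) / c for e, c in zip(errors, criterion)) / n

    # Share of estimates falling within the tolerance band, in percent.
    within = sum(1 for e, c in zip(errors, criterion) if abs(e) <= tolerance * c)
    accuracy = 100.0 * within / n

    # Mean signed error, same unit as the inputs.
    bias = sum(errors) / n

    # Root mean square error, same unit as the inputs.
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    return {"mape": mape, "accuracy": accuracy, "bias": bias, "rmse": rmse}


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    watch = [11.0, 9.0, 13.0, 10.0]
    calorimetry = [10.0, 10.0, 10.0, 10.0]
    print(ee_agreement(watch, calorimetry))
```

Reporting MAPE and accuracy alongside bias and RMSE separates relative from absolute error: a device can show a small bias in kcal/min while still missing the ±20% band for participants with low EE.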
RESULTS:
Device performance varied substantially by sex and intensity. Overall, all three devices systematically overestimated EE, with significantly lower accuracy in females than in males. POLAR excelled at rest (accuracy >50%; MAPE=9.1%) and under maximal speed conditions (male accuracy: 66.67%; MAPE: 16.50±16.63%). VIVO demonstrated superior performance during fixed-speed protocols (accuracy ≥71.43%; lowest MAPE and RMSE) and individualized protocols (male accuracy >93.3%, female >41.67%). GARMIN exhibited the largest errors across all conditions (male bias: 1.04-3.09 kcal/min; female accuracy: 21.43-25.00%; MAPE up to 38.38%). Longer exercise duration (10 vs 3 minutes) improved estimation accuracy for all devices. MAPE followed a U-shaped pattern with increasing intensity, suggesting better accuracy at low and high speeds than at moderate intensities.
CONCLUSION:
Current wearables show substantial sex-specific and intensity-dependent errors in EE estimation during running. VIVO performed best overall in males, while POLAR showed advantages at rest and maximal intensity. All devices demonstrated lower accuracy in females, likely reflecting sex-related differences in metabolism, body composition, and aerobic capacity. Sex-specific algorithm calibration is urgently needed, particularly for female users and moderate-intensity exercise. Users should interpret wearable-derived EE estimates with caution, accounting for sex and exercise characteristics. Manufacturers should optimize algorithms specifically for women and give sex greater weight in their models.