Factors associated with work-related musculoskeletal disorders using machine learning approaches: a systematic review

Muhammad Irfan Mohd Sallehhudin; Siti Munira Yasin; Mohamad Rodi Isa; Tajul Rosli Razak; Muhamad Syazni Mohamad Asraff; Nur Adilla Che Rameli; Muhammad Muaz Shahriman-Teruna; Muhammad Muzzammil Mohamad Salleh; Mohamad Zuhair Mohamed Yusoff; Muhammad Hariz Ammar Khebir

doi:10.35371/aoem.2026.38.e10


Review. Muhammad Irfan Mohd Sallehhudin¹, Siti Munira Yasin¹,*, Mohamad Rodi Isa¹, Tajul Rosli Razak², Muhamad Syazni Mohamad Asraff¹, Nur Adilla Che Rameli¹, Muhammad Muaz Shahriman-Teruna¹, Muhammad Muzzammil Mohamad Salleh¹, Mohamad Zuhair Mohamed Yusoff¹, Muhammad Hariz Ammar Khebir¹. Annals of Occupational and Environmental Medicine 2026;38:e10.
Published online: March 19, 2026

¹Department of Public Health Medicine, Faculty of Medicine, Universiti Teknologi MARA, Jalan Hospital, Sungai Buloh, Malaysia

²Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah Alam, Malaysia

*Corresponding author: Siti Munira Yasin, Department of Public Health Medicine, Faculty of Medicine, Universiti Teknologi MARA, Jalan Hospital, 47000 Sungai Buloh, Selangor, Malaysia. E-mail: smunira@uitm.edu.my

• Received: December 19, 2025 • Revised: March 6, 2026 • Accepted: March 12, 2026

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.



Abstract

Background
Work-related musculoskeletal disorders (WRMSDs) remain a major cause of occupational disability and productivity loss worldwide. Traditional statistical methods have identified numerous associated factors; however, they often struggle to capture complex non-linear relationships and interactions across multiple domains of risk. Machine learning (ML) offers an alternative analytical approach for modelling such multidimensional relationships.
Methods
Following the PRISMA 2020 guidelines (PROSPERO: CRD420250605234), literature searches were conducted in Web of Science, Scopus, and PubMed for studies published between 2020 and 2025. Eligible studies applied ML methods to identify factors associated with WRMSDs using cross-sectional study designs. Included studies were appraised using the Joanna Briggs Institute Critical Appraisal Checklist for analytical cross-sectional studies.
Results
Ten studies met the inclusion criteria, representing workers from healthcare, transport, manufacturing, and service sectors across Asia, Africa, and Europe. Frequently applied ML algorithms included random forest, support vector machine, and artificial neural networks, demonstrating strong internal discriminative performance (area under the receiver operating characteristic curve: 0.80–0.99), although the absence of external validation in several studies suggests a potential risk of overfitting. Commonly identified factors included age, sex, awkward posture, vibration exposure, prolonged working hours, stress, and burnout. Psychosocial factors, including post-traumatic stress disorder, job stress, and depression, were ranked among the most influential predictors within ML models.
Conclusions
ML models demonstrate strong capability in discriminating WRMSDs risk and identifying multidimensional risk factors compared with traditional statistical approaches. These models highlight complex interrelationships between ergonomic and psychosocial exposures. Future research should incorporate external validation, objective exposure measurements, and standardized ML reporting frameworks to enhance methodological transparency and generalizability.
Keywords: Artificial intelligence; Factor AND associated; Machine learning; Work AND related; Musculoskeletal diseases

BACKGROUND

Work-related musculoskeletal disorders (WRMSDs) are injuries or disorders of the soft tissue caused by exposure to risk factors at work.¹ WRMSDs are among the most common occupational health problems worldwide and contribute to worker disability, absenteeism, and productivity loss.² In regions such as Europe, the average prevalence of WRMSDs ranges from 42% to 60% across working sectors, with the food sector among the most affected.³ Prevalence is higher in low- and middle-income countries, most of which are in Asia and Africa, where it exceeds 70%.⁴ This burden has a significant impact on workers around the world. Productivity losses resulting from WRMSDs among individuals of working age across the European Union have been estimated at approximately 2% of the region’s gross domestic product (GDP), underscoring the substantial economic burden of these conditions on top of individual suffering.²

The multifactorial causation of WRMSDs is well documented and encompasses a combination of physical, psychological, biomechanical, organizational, workplace-related, and non-work-related factors; this multifactorial nature makes prevention and management very challenging.⁵ These factors have been found to be significant in numerous cross-sectional studies using traditional statistical methods, most commonly logistic regression.⁶⁻⁸ However, with advances in technology, other methods of analysis such as machine learning (ML) algorithms have become a dominant tool for identifying “predictors” because of their various advantages.⁹ In ML models, variables are conventionally described as predictors or features rather than associated factors because the primary objective of these models is risk prediction and classification: the model estimates the probability of an outcome in individuals by optimizing performance, discrimination, and calibration rather than establishing etiological relationships.¹⁰ However, to avoid implying temporal prediction in cross-sectional studies, the terms “factors” and “discriminative features” are used in this review.

Workplace data now extend beyond traditional sources to include ergonomic assessments, sensor data, and administrative records, and capturing the interactions among these variables calls for more advanced analytical approaches. ML therefore offers new opportunities to address these factors and their interactions with the outcome.

Traditional statistical approaches such as logistic regression have long been used to identify associated factors for WRMSDs. Although these models can accommodate nonlinear effects and interactions through techniques such as spline functions, interaction terms, or penalized regression, specifying these relationships becomes increasingly challenging in large and high-dimensional datasets.¹¹ In occupational health research, where numerous factors interact simultaneously, a priori specification of all possible interactions becomes impractical. In contrast, ML approaches offer greater flexibility for analysing high-dimensional data and detecting complex nonlinear relationships without strict parametric assumptions, making them a complementary methodological approach in epidemiological and occupational health research.¹²,¹³

Although ML is increasingly applied in occupational health research, a comprehensive synthesis of studies examining factors associated with WRMSDs using ML approaches remains limited, highlighting the need for systematic evidence synthesis.

In this systematic review, we aimed to quantitatively evaluate recent publications on factors associated with WRMSDs identified using ML approaches. The specific objectives were (1) to describe the population characteristics and prevalence of WRMSDs, (2) to describe factors associated with WRMSDs identified using ML approaches, and (3) to quantitatively evaluate ML performance.

METHODS

This systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analysis Protocols (PRISMA-P) criteria. The review protocol was registered with PROSPERO (registration ID CRD420250605234).

Eligibility criteria

This systematic review includes only observational studies. Other inclusion criteria were (1) studies aiming to identify factors associated with WRMSDs, (2) studies using ML to assess factors in cross-sectional designs, including primary or secondary data analysis, (3) recent publication (2020–2025), (4) English language, (5) open-access articles, and (6) free full text. The publication period covered 1 January 2020 to 30 September 2025.

Exclusion criteria were (1) randomized or non-randomized trials, (2) diagnostic studies, (3) studies using sensors or cameras to find risk factors for WRMSDs, (4) grey literature (theses, government reports) or unpublished work (pre-prints), and (5) editorials, letters, opinions, brief communications, and short reports.

Information sources

A systematic search was conducted to identify relevant studies from three electronic databases: Web of Science, Scopus, and PubMed. The search was performed using institutional access through the PERMATA portal provided by Perpustakaan Tun Abdul Razak, Universiti Teknologi MARA (UiTM).

Search strategy

A comprehensive literature search was conducted on 30 September 2025 to identify relevant previous studies examining factors of WRMSDs using ML approaches from the three databases mentioned. The search strategy was structured around three main concepts: (1) work-related musculoskeletal disorders, (2) associated factors or predictors, and (3) machine learning approaches.

Controlled vocabulary (e.g., MeSH terms) and free-text keywords were combined using Boolean operators (AND, OR) to maximize sensitivity and specificity. The general search string applied across the databases was:

("work-related musculoskeletal disorders" OR "occupational musculoskeletal disorders" OR WRMSDs OR WMSD OR "work-related MSD")

AND

("risk factors" OR predictors OR determinants OR "associated factors" OR "predictive features")

AND

("machine learning" OR "artificial intelligence" OR "deep learning" OR "neural network" OR "support vector machine" OR "random forest" OR "decision tree" OR "predictive model").
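The three-concept structure of this search string can be sketched in code. The snippet below is only an illustration of how synonyms are joined with OR within a concept and how the concepts are then joined with AND; the helper name `build_query` and the variable names are hypothetical and do not represent the tooling used in this review.

```python
# Illustrative sketch: reconstructs the general search string from its three concepts.
# `build_query`, `CONCEPTS`, and `QUERY` are hypothetical names used for illustration.

CONCEPTS = [
    # Concept 1: work-related musculoskeletal disorders
    ['"work-related musculoskeletal disorders"', '"occupational musculoskeletal disorders"',
     'WRMSDs', 'WMSD', '"work-related MSD"'],
    # Concept 2: associated factors or predictors
    ['"risk factors"', 'predictors', 'determinants',
     '"associated factors"', '"predictive features"'],
    # Concept 3: machine learning approaches
    ['"machine learning"', '"artificial intelligence"', '"deep learning"',
     '"neural network"', '"support vector machine"', '"random forest"',
     '"decision tree"', '"predictive model"'],
]


def build_query(concepts):
    """Join terms with OR inside each concept, then join the concepts with AND."""
    groups = ["(" + " OR ".join(terms) + ")" for terms in concepts]
    return " AND ".join(groups)


QUERY = build_query(CONCEPTS)
print(QUERY)
```

Keeping the concepts as separate lists makes it straightforward to adapt field tags or controlled vocabulary for each database while preserving the same Boolean logic.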

Study selection and screening

All retrieved articles were screened according to predefined eligibility criteria: publication year (2020–2025), open-access status, English language, and availability of free full text across the three selected databases. Records were exported into a single Microsoft Excel file for duplicate identification and removal, followed by independent title and abstract screening by four reviewers (MSMA, NACR, MMST, and MMMS). Studies with uncertain relevance were retained for full-text assessment to ensure conservative selection. Full texts were managed using EndNote 21, and eligibility was independently evaluated by all four reviewers, with discrepancies resolved through consensus. Data extraction was performed using a structured Excel-based form, and two additional reviewers (MZMY and MHAK) independently verified the extracted data for accuracy and completeness prior to analysis. The PRISMA flow diagram (Fig. 1) illustrates the study selection process.

Data analysis

We examined factors associated with WRMSDs identified through ML across diverse workplace settings. In the Results section, the term “prediction” refers to a model's discriminative ability, that is, its classification of workers into WRMSDs or non-WRMSDs groups based on current exposure profiles, rather than prediction of future injury incidence, because the included papers are cross-sectional studies. Discriminative ability is described using the area under the receiver operating characteristic curve (AUC-ROC) and using model explainability, including factors, feature selection, feature importance, and interpretation.
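To make the meaning of discriminative ability concrete, the AUC-ROC can be read as the probability that a randomly chosen worker with WRMSDs receives a higher model score than a randomly chosen worker without WRMSDs, with ties counted as one half. The following is a minimal sketch of that pairwise definition; the function name `auc_roc` and the example labels and scores are hypothetical and are not taken from any included study.

```python
# Minimal sketch of AUC-ROC using its pairwise (rank-based) definition.
# Labels and scores below are made-up illustrative values.

def auc_roc(labels, scores):
    """Fraction of (case, non-case) pairs in which the case has the higher score.

    Ties contribute 0.5. Labels are 1 (WRMSDs) or 0 (no WRMSDs).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
example_auc = auc_roc(example_labels, example_scores)  # 8 of 9 pairs correctly ordered
```

An AUC of 0.5 corresponds to chance-level ordering and 1.0 to perfect separation, which is why values computed on the same data used for training tend to be optimistic.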

Data extraction was methodologically challenging due to substantial heterogeneity in machine learning algorithms, performance metrics, and the limited use of external validation, which constrained cross-study comparability. Although many studies provided detailed descriptions of factors, standardized reporting of performance indicators was frequently absent. Extracted data were systematically organized to align with the review objectives and included: (1) study characteristics (year, country, setting, population, design); (2) population details (sample size, age distribution, work setting); (3) type and anatomical location of WRMSDs; (4) assessed risk factors, including reported statistical associations; (5) ML algorithms applied; and (6) principal outcomes and conclusions. Given the methodological heterogeneity, no meta-analysis or sensitivity analysis was undertaken. Effect synthesis was conducted narratively, guided by reported feature selection approaches and model explainability methods.

Risk of bias assessment

The risk of bias of each included study was independently assessed by four authors (MSMA, NACR, MMST, and MMMS) using Joanna Briggs Institute (JBI) Critical Appraisal Checklist for cross-sectional studies.¹⁴ Overall, each study was classified as having low, moderate, or high risk of bias based on domain-level judgments and an overall appraisal. We did not formally assess publication bias or small-study effects.

Quality assessments

The Effective Public Health Practice Project (EPHPP) Quality Assessment Tool for quantitative studies was employed to appraise the methodological robustness of the included research.¹⁵ This structured instrument evaluates key domains such as selection bias, study design, confounding, blinding, data collection methods and participant attrition providing a comprehensive assessment of study quality. Each study was rated as ‘strong,’ ‘moderate,’ or ‘weak’ within each domain according to established EPHPP criteria. Studies with no domains rated as ‘weak’ were categorised as high quality, those with one ‘weak’ domain as moderate quality, and those with two or more ‘weak’ domains as low quality. Four reviewers (MSMA, NACR, MMST, and MMMS) independently conducted the ratings, with any discrepancies resolved through consensus-based discussion.
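The global rating rule described above can be written as a short function. This is a sketch of the counting rule as stated in this section only; the function name and the example domain ratings are hypothetical and do not correspond to any included study.

```python
# Sketch of the global quality rule stated above: count 'weak' domain ratings.
# The example ratings are hypothetical and for illustration only.

def ephpp_global_quality(domain_ratings):
    """Return 'high', 'moderate', or 'low' quality from per-domain ratings.

    domain_ratings maps each EPHPP domain to 'strong', 'moderate', or 'weak'.
    """
    weak_count = sum(1 for rating in domain_ratings.values() if rating == "weak")
    if weak_count == 0:
        return "high"
    if weak_count == 1:
        return "moderate"
    return "low"


hypothetical_study = {
    "selection bias": "moderate",
    "study design": "weak",
    "confounding": "moderate",
    "blinding": "weak",
    "data collection methods": "strong",
    "participant attrition": "moderate",
}
hypothetical_quality = ephpp_global_quality(hypothetical_study)  # two weak domains
```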

Ethics statement

Ethical approval was not required for this study as it is a systematic review of previously published studies and did not involve the collection of new data from human participants or animals.

RESULTS

Search outcome

Records retrieved from Web of Science, Scopus, and PubMed were screened according to the predefined eligibility criteria. Following removal of 18 duplicates, 411 articles underwent title and abstract screening based on study relevance and inclusion–exclusion criteria. Nineteen studies were advanced to full-text review, where methodological quality, outcome reporting, and overall relevance were critically appraised. After final evaluation and consensus among reviewers, 10 studies met the eligibility criteria and were included in the review. Inter-rater reliability was not assessed.

Study synthesis

All ten included studies were cross-sectional. Nine used primary data collection, while one study, by Byeon¹⁶, used secondary data from a nationwide study. All studies employed clearly defined inclusion criteria and had similar WRMSDs outcomes. Some studies focused on specific body regions, while others described multi-site WRMSDs pain. The papers were compared on study location, sample size, target population, working sector, tools used to assess WRMSDs outcomes, main body region affected, prevalence of and factors for WRMSDs, division of data during the ML phase, best ML algorithm, and ML model performance.

Overview of studies included

A total of ten studies published between 2020 and 2025 were included in this review, covering three continents: Asia, Africa, and Europe. Four studies were from India¹⁷⁻²⁰, and one study each was from Spain²¹, China²², Tunisia²³, South Korea¹⁶, Bangladesh²⁴, and Iran²⁵. Sample sizes ranged from 56¹⁸ to 6,885 participants¹⁶, representing a broad range of occupational sectors including healthcare, manufacturing, transportation, mining and quarrying, services, and finance. The studies from Spain and China focused on healthcare workers, the study from Tunisia investigated sewing machine operators, the study from South Korea examined male office workers, and India contributed several studies involving transportation workers, mining operators, and drivers. The study from Iran assessed firefighters, while the study from Bangladesh examined bank office workers. Eight out of ten studies utilized the Nordic Musculoskeletal Questionnaire (NMQ) or its modified versions as the primary screening tool for identifying WRMSDs. The number of variables studied ranged from 6 to 59, with ergonomic and demographic factors forming the core dataset. Only one single-gender study, from South Korea, was included.

Prevalence of WRMSDs

Prevalence rates of WRMSDs varied markedly across occupations and countries, ranging from 17% among male office workers in South Korea¹⁶ to over 90% among surgeons in Spain²¹ and sewing machine operators in Tunisia²³. Four studies did not report the anatomical location of WRMSDs, and two focused only on specific regions (neck, shoulder, and lower back). Among studies covering all body regions, the most affected areas were the neck, lower back, shoulders, and knees. The location of WRMSDs corresponded to the job tasks of each working sector. For instance, both healthcare professionals and bus drivers demonstrated dominant lower back and shoulder involvement.¹⁷,²²

Factors associated with WRMSDs using ML

Across studies, several recurrent factors emerged irrespective of occupational category. These factors are summarized from the 10 studies and categorized into sociodemographic, workplace-related, biomechanical, and psychosocial factors for ease of reference. The factors are tabulated in Table 1 and Supplementary Table 1.

Sociodemographic factors

Sociodemographic factors identified were age, female gender, body mass index, height, underlying chronic illness, previous assessment of health status, history of musculoskeletal disease, tobacco consumption, involvement in physical activity, working experience, length of employment, educational level, sleep duration, marital status, multiple sick leave, and alcohol consumption.

Workplace related factors

Findings were categorized into four distinct sub-domains: Organizational and Task Demands, Ergonomic and Equipment Factors, Environmental Factors, and Recovery and Interpersonal Factors.

Organizational and task demands

Across the included studies, workplace-related determinants of WRMSDs reflected the intensity and structure of job tasks. Key factors included long working hours, prolonged work duration, increased work frequency, high job demand, inability to keep up with job rhythm, and fast-paced work design. Continuous computer usage and repetitive production tasks further contributed to sustained biomechanical loading with limited task variation.

Ergonomic and equipment factors

Several studies highlighted poor workstation ergonomics and poorly designed equipment as major contributors. These included suboptimal seating or cabin design, poor instrument ergonomics, and prolonged computer-based work, which promote static postures and repetitive upper-limb movements.

Environmental factors

Workplace environmental stressors such as loud noise and repeated physical movements (e.g., frequent entry and exit from vehicles or workspaces) were also associated with WRMSD risk.

Recovery and interpersonal factors

Limited opportunities for recovery, including inadequate rest periods, absence of work breaks, post-duty fatigue, and frequent interpersonal interactions, may further exacerbate cumulative musculoskeletal strain.

Biomechanical factors

Biomechanical factors comprised high task repetition, awkward posture, painful postures, prolonged static or sustained body posture, multiple or frequent changes of body posture during work, forceful exertion, and exposure to vibration.

Psychosocial factors

Psychosocial factors identified via ML were feeling tired after work, work-related stress, burnout, depression, and post-traumatic stress disorder.

Machine learning models

A variety of ML algorithms were employed across the 10 papers to predict WRMSDs risk. These included random forest (RF), support vector machine (SVM), gradient boosting (GBM/XGBoost), single hidden-layer neural network models (MLP), elastic net (ENet), artificial neural networks (ANN), decision tree (DT), Bayesian network, and logistic regression. RF was the most frequently used algorithm and among the better-performing algorithms in five studies. SVM, ANN, and gradient boosting machines (GBM) achieved satisfactory results in mining and healthcare settings. Bayesian network models were applied in psychosocial-focused studies to enhance interpretability. For validation, several studies applied cross-validation (commonly 10-fold) and data splits of 70%–80% for training and 20%–30% for testing; however, none of the studies performed external validation. According to Hanumegowda and Gnanasekaran¹⁷, a 70:30 training-to-testing split led to overfitting, whereas split ratios of 60:40, 80:20, and 90:10 gave good discriminative accuracy, although external validation is still needed.
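The two internal validation schemes mentioned above, a single training/testing split and k-fold cross-validation, can be illustrated by how they partition record indices. The sketch below uses only the Python standard library; the function names, the seed, and the sample size of 100 are assumptions for illustration and do not reproduce the procedure of any included study.

```python
import random

# Illustrative index partitioning for hold-out and k-fold validation.
# Function names, seed, and sample size are hypothetical.


def train_test_split_indices(n, test_fraction, seed=0):
    """Shuffle indices 0..n-1 and hold out a fraction for testing."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists; every record is tested exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


train_idx, test_idx = train_test_split_indices(100, test_fraction=0.30)
fold_list = list(k_fold_indices(100, k=10))
```

Both schemes reuse records from the same source population, so neither substitutes for external validation on an independent dataset.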

Model performance

Model performance was generally high across the 10 studies. AUC-ROC values ranged from 0.80 to 0.99, indicating satisfactory discriminative ability, although the risk of overfitting remains a substantial concern. RF models consistently achieved acceptable performance (AUC ≥ 0.85), while ANN achieved the highest AUC (0.996) among shuttle car operators; however, with a sample size of only 56, this result carries a very high risk of overfitting. Accuracy typically ranged from 0.78 to 1.00, with sensitivity and specificity values mostly above 0.75. Among the models evaluated, RF and SVM-based models consistently demonstrated superior or comparable performance within their specific datasets. For instance, in Luo et al.,²² SVM achieved the highest AUC (86.6%) and sensitivity (80.2%) for the shoulder musculoskeletal disorder model during training, while Byeon¹⁶ reported the robust and sparse twin SVM (RSTSVM) as the most accurate model (AUC: 0.84). Similarly, Kar et al.²⁰ identified RF as the best-performing model among dumper operators (AUC: 0.82). In contrast, Hanumegowda and Gnanasekaran¹⁷ achieved perfect training accuracy (100%) using DT and RF algorithms, but external validation was not reported, suggesting possible overfitting. ANN-based models in the mining sector¹⁸ also yielded high discriminative performance (AUC: 0.996; accuracy: 0.975); however, there is a high risk of overfitting because internal validation techniques are often insufficient in small samples. Without independent external validation, models can capture noise or sampling artefacts, which can lead to inflated performance estimates. This phenomenon has been widely documented in applied ML studies in epidemiology and occupational health.
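Why perfect training accuracy is not evidence of real discriminative ability can be shown with a small synthetic experiment. In the sketch below, the labels are assigned at random, so there is no true relationship to learn; a 1-nearest-neighbour classifier is used only because it memorises the training records, and it is not drawn from the included studies. The sample size of 56 and the 6 features mirror the smallest values reported in this review, but all data are simulated and every name is hypothetical.

```python
import random

# Synthetic demonstration: a memorising model scores perfectly on its own
# training data even when the labels are pure noise. All data are simulated.


def one_nn_predict(train_x, train_y, x):
    """Return the label of the closest training record (squared Euclidean distance)."""
    nearest = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return train_y[nearest]


def accuracy(train_x, train_y, xs, ys):
    """Share of records in (xs, ys) classified correctly by the 1-NN rule."""
    hits = sum(one_nn_predict(train_x, train_y, x) == y for x, y in zip(xs, ys))
    return hits / len(xs)


rng = random.Random(42)


def simulate(n, n_features=6):
    """Random features with labels that are independent of the features."""
    xs = [[rng.random() for _ in range(n_features)] for _ in range(n)]
    ys = [rng.randint(0, 1) for _ in range(n)]
    return xs, ys


train_x, train_y = simulate(56)    # small training sample
new_x, new_y = simulate(500)       # independent records from the same process

train_acc = accuracy(train_x, train_y, train_x, train_y)  # memorised, so 1.0
new_acc = accuracy(train_x, train_y, new_x, new_y)        # expected near chance
print(train_acc, new_acc)
```

The gap between the two accuracies is the kind of optimism that only evaluation on independent data can reveal.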

Studies incorporating validation or advanced model tuning achieved more stable metrics across folds. Only two studies, by Hanumegowda and Gnanasekaran¹⁷ and Luo et al.,²² used mean absolute error (MAE) and root mean square error (RMSE) to identify the model with the least prediction error. Hanumegowda and Gnanasekaran¹⁷ found DT and RF to be the best models, with MAE and RMSE below 0.01, while Luo et al.²² reported prediction errors below 1 for the SVM, MLP, and RF models for neck WRMSDs.
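For reference, MAE is the average absolute difference between observed and predicted values, and RMSE is the square root of the average squared difference, which weights large errors more heavily. A minimal sketch follows; the observed outcomes and predicted probabilities are made-up values for illustration only.

```python
import math

# Minimal sketch of MAE and RMSE. Example values are made up for illustration.


def mean_absolute_error(y_true, y_pred):
    """Average of |observed - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_square_error(y_true, y_pred):
    """Square root of the average of (observed - predicted)^2."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


observed = [1, 0, 1, 1, 0]                # 1 = WRMSDs reported, 0 = not reported
predicted = [0.9, 0.2, 0.7, 1.0, 0.1]     # hypothetical model probabilities

example_mae = mean_absolute_error(observed, predicted)      # about 0.14
example_rmse = root_mean_square_error(observed, predicted)  # about 0.173
```

Because RMSE is never smaller than MAE on the same predictions, a large gap between the two indicates that a few records carry most of the error.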

Despite encouraging discriminative outputs, methodological heterogeneity between studies remains evident, and no algorithm was superior to the others because datasets, methodology, risk of overfitting, and the absence of external validation were not standardized across studies. Most studies lacked external validation, variable ranking, or consistent reporting of performance metrics (AUC-ROC, F1-score, MAE, RMSE). This variation limits direct comparability and underscores the need for standardized ML reporting frameworks in WRMSDs research that reflect the WRMSDs theoretical framework and the strength of each study. Model performance by algorithm type across the 10 studies is summarized in Table 2, and Supplementary Table 2 provides a comprehensive comparison.

Methodological quality

The ten included studies were appraised using the JBI Critical Appraisal Checklist for Analytical Cross-Sectional Studies, encompassing eight domains related to methodological rigor, validity, confounding, and statistical analysis. Overall, the studies demonstrated acceptable quality, with most rated as having moderate risk of bias. Inclusion criteria, participant characteristics, and study settings were clearly described across studies, and exposure and outcome measurements were generally valid and reliable, commonly employing the NMQ.

The principal methodological limitations involved incomplete reporting and suboptimal adjustment for confounding variables, contributing to moderate risk ratings in these domains. Nevertheless, all studies applied appropriate statistical analyses, and none were excluded for critical methodological deficiencies. Overall, the risk of bias was judged to be low to moderate as shown in Table 3.

Additional concerns included selection bias from single-industry sampling, reliance on self-reported WRMSD outcomes and exposures (increasing recall and reporting bias), use of non-validated instruments in some cases, and inadequate control of relevant confounders such as prior injury and psychosocial factors. Small sample sizes, class imbalance, and absence of external validation further limited generalisability and increased overfitting risk. At the review level, potential publication bias, as well as language and availability bias, are acknowledged because of the restriction to English-language open-access studies.

Quality assessment system for EPHPP

The methodological quality of the ten included studies was assessed using the EPHPP Quality Assessment Tool for Quantitative Studies. Overall, the studies demonstrated common limitations inherent to observational occupational health research, with most receiving a global “weak” rating primarily due to their cross-sectional design. Nevertheless, several strengths were identified, including the use of validated measurement instruments and consideration of multiple confounders.

Selection bias varied across studies. Investigations involving specific occupational groups (e.g., machine operators, drivers, and firefighters) demonstrated stronger internal validity due to high participation rates, although external generalisability remained limited. In contrast, studies relying on volunteer or low-response samples (e.g., surgeons and bank employees) were rated weaker in this domain. Confounder control was generally moderate, as most studies adjusted for various factors; however, explicit justification and comprehensive measurement of confounders were often insufficient. Data collection methods were consistently strong, supported by validated tools such as the NMQ and standardized ergonomic assessments. Although global ratings were uniformly “weak,” the findings remain informative, particularly given the advanced analytical capacity of the ML approaches applied within these studies. Table 4 summarises the EPHPP assessment. The JBI tool focuses on how well each study was conducted as a cross-sectional study, while the EPHPP tool evaluates a study's overall strength of evidence for public health practice and inherently penalizes cross-sectional designs.

DISCUSSION

This systematic review reinforces the growing evidence that ML offers a powerful and pragmatic approach to identifying factors associated with WRMSDs. Across the included studies, ML models consistently achieved reliable discriminative performance for WRMSDs outcomes, supporting their suitability for interpreting complex, non-linear datasets with interactions between multiple known or unknown factors.²⁶ Recent work on occupational health and WRMSDs factors similarly demonstrates adequate performance of algorithms such as RF, SVM, and neural networks in classifying musculoskeletal risk, particularly when multiple exposures are modelled simultaneously.²⁷

Each ML algorithm has its own strengths and weaknesses, as described by Alzubi et al.,²⁸ which are not discussed in detail here. In this review, authors used different ML algorithms to predict the risk of WRMSDs; however, four models stood out for performance and consistency: RF, SVM, ANN, and GBM. These ML models are classified under supervised learning, whereby algorithms are trained using labelled outcome data to learn a mapping function between factors and outcomes; RF and GBM are ensemble tree-based approaches, with RF reducing variance through bootstrap aggregation and GBM sequentially minimising discriminative error via gradient-based optimisation, whereas SVM and ANN capture complex non-linear decision boundaries through margin maximisation and multilayer backpropagation, respectively.²⁸

However, there is no universally superior algorithm, and the comparative performance of the algorithms holds only within the specific methodology and datasets of the 10 studies. RF is widely valued for its robustness to outliers, ability to model nonlinear relationships, and resistance to overfitting, and it provides straightforward variable-importance measures that can enhance interpretability.²⁹,³⁰ SVMs remain effective for data with many variables, perform well in classification and regression tasks, and limit overfitting to achieve strong generalisation even with small sample sizes; however, their accuracy may be affected by outliers or missing values.³¹ ANNs serve as powerful universal function approximators capable of modelling complex nonlinear patterns; they are known for high discriminative accuracy and robustness to missing data but are sensitive to small sample sizes, prone to overfitting, and sometimes lack interpretability.³² GBM algorithms, including modern variants such as XGBoost, achieve high discriminative accuracy across diverse data types with missing values but carry a high risk of overfitting and are sensitive to hyperparameters.³³ Collectively, these algorithms provide complementary strengths, enabling precise risk stratification models in research.

Model performance results from this review should be interpreted carefully because the outcome was measured with a screening tool rather than confirmed cases of WRMSDs, which limits generalizability, and because results hold only within specific datasets analysed with different methodologies. This review cannot identify a single best ML algorithm for finding factors associated with WRMSDs, because sample size, number of variables studied, WRMSDs outcome, and external validation all differed across the included studies. In general, the best ML algorithm for identifying factors associated with WRMSDs should be chosen on the basis of MAE, RMSE, and discriminative performance in the researcher’s own dataset, together with external validation results. A notable observation in this review is the discrepancy between the JBI Checklist, which indicated moderate to low risk of bias, and the uniformly “weak” global ratings from the EPHPP Quality Assessment Tool. Because EPHPP strongly penalizes selection bias and inadequate confounder control, the high discriminative performance reported in this review should be interpreted cautiously: ML models may capture dataset biases rather than clinically meaningful relationships, potentially inflating their apparent utility in real-world application.

The patterns observed also align with existing evidence that WRMSDs arise from a complex interaction of physical load, biomechanical strain, and psychosocial stressors rather than from single exposures in isolation. Multiple previous studies using traditional statistical methods, where the primary aim is etiologic inference, consistently highlight similar determinants of musculoskeletal outcomes across sectors, which this review also found.⁷^,⁸^,³⁴ ML provides tangible advantages mainly in scenarios where the analytic goal is risk stratification or classification using complex, potentially non-linear exposure mixtures (e.g., posture × vibration × work–rest cycles × psychosocial stress), where interaction structures are not well known a priori, or where the number and correlation of factors complicate conventional modelling.³⁵ Therefore, ML should be interpreted as a complement to, or an enhancement of, traditional approaches. It can be valuable for screening-oriented decision support and exploratory pattern discovery, while conventional modelling remains essential for causal interpretation and policy inference.

Another advantage of ML is that it allows feature-importance-based ranking after interactions have been explored, which benefits occupational health by prioritizing interventions for the most critical factors (e.g., job stress or vibration dose) rather than simply reporting odds ratios as traditional statistical methods do.³⁶ This gives occupational-health practitioners and policymakers more accurate decision support; for example, identifying specific combinations of posture duration, workload intensity, and recovery deficit that substantially increase WRMSD risk enables targeted preventive action. In addition, a predictive risk score with risk stratification can be developed using ML to expand its use in screening.
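One way to obtain such a ranking is permutation importance, a model-agnostic technique chosen here purely for illustration; it is not claimed to be the method used by the included studies. The sketch below shuffles one feature at a time and records the resulting drop in accuracy. The toy model, feature names, and data are all hypothetical, and the toy model deliberately ignores its second feature, so that feature's importance is zero by construction.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, n_features, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances

def toy_model(row):
    """Hypothetical fitted model: flags risk when the stress score is 5 or more."""
    return int(row[0] >= 5)

# Hypothetical data: (job_stress_score, unrelated_noise) per worker.
feature_names = ["job_stress_score", "unrelated_noise"]
rows = [(8, 1.2), (7, 0.4), (6, 0.9), (9, 0.3),
        (2, 1.1), (3, 0.5), (1, 0.8), (4, 0.2)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

importances = permutation_importance(toy_model, rows, labels, n_features=2)
ranking = sorted(zip(feature_names, importances), key=lambda kv: kv[1], reverse=True)
for name, score in ranking:
    print(f"{name}: mean accuracy drop = {score:.3f}")
```

A feature whose shuffling barely changes accuracy contributes little to the model's predictions, whereas a large drop marks a candidate for prioritised intervention. Such rankings describe the fitted model, not causal effects.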

However, the main limitation of ML models is overfitting, which is consistently recognised as a major methodological risk and can be addressed using multiple complementary strategies. In this review, most studies employed k-fold cross-validation, data splitting, resampling techniques and, in several cases, additional testing on completely unseen datasets to ensure model stability and generalisability beyond the training sample.¹⁶^-¹⁹^,²²^,²⁵ In addition, one study used sparse SVM models to handle high-dimensional data, another used pruning to prevent overly complex DTs as a form of algorithmic regularisation, and another used hyperparameter tuning to prevent overfitting.¹⁶^-¹⁹ Feature-selection approaches, such as the Boruta algorithm and variable-importance ranking, were also widely used to reduce noise and limit models to the most relevant factors.²⁰^,²²^,²⁴ Collectively, these layered strategies demonstrate attempts to prevent overfitting; however, none of the studies externally validated their ML models against confirmed cases of WRMSDs to establish true discriminative capability.
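The k-fold cross-validation mentioned above can be sketched in plain Python as follows. This is an illustrative outline under stated assumptions, not the procedure of any included study: the single-feature threshold classifier, the sitting-hours data, and all function names are hypothetical.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx

def fit_threshold(xs, ys):
    """Toy learner: midpoint between the mean of positives and of negatives."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

# Hypothetical data: daily sitting hours and a screening-based symptom label.
sitting_hours = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

scores = []
for train_idx, test_idx in k_fold_indices(len(labels), k=5):
    # Fit on the training folds only, then score on the held-out fold.
    threshold = fit_threshold([sitting_hours[i] for i in train_idx],
                              [labels[i] for i in train_idx])
    correct = sum((sitting_hours[i] > threshold) == bool(labels[i]) for i in test_idx)
    scores.append(correct / len(test_idx))

mean_cv_accuracy = sum(scores) / len(scores)
print(f"Per-fold accuracy: {scores}, mean = {mean_cv_accuracy:.2f}")
```

Because every fold is drawn from the same sample, a good cross-validated score still reflects internal validity only. It does not replace evaluation on an independent dataset with confirmed cases.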

The high prevalence of neck, shoulder, and low back pain in the included studies is unsurprising, as it reflects a recurring risk profile arising from specific work patterns, whether in precision tasks such as surgery, driving, digital office work, or industrial operations.³⁷ Furthermore, newer research emphasises that unmanaged psychosocial risks (e.g., stress, burnout, poor support, job insecurity, poor job satisfaction) amplify physical strain, increasing both the onset and persistence of symptoms, which is consistent with the psychosocial factors detected by ML models in this review.³⁸^,³⁹ Thus, policymakers should consider psychosocial interventions in the prevention of occupational disease.

ML discriminative models for WRMSDs have evolved from theoretical frameworks into practical tools with substantial real-world applicability. Newer research has shown that, judging by reported model performance, ML algorithms can analyse the complex interactions underlying musculoskeletal disease with greater precision.²⁷^,⁴⁰ This analytic strength supports new proactive prevention strategies across workplace settings. For example, ML can now enhance occupational risk management by identifying hazardous postures, predicting cumulative strain, and detecting early injury risk using data from wearable sensors, video-based ergonomic assessments, and workplace monitoring systems that were not previously available.⁴¹^,⁴² These insights enable employers to implement targeted interventions and personalised care, such as optimised job rotation, improved workstation design, and personalised work–rest cycles, ultimately reducing WRMSD incidence and improving productivity. In clinical settings, ML tools can also be used to assess WRMSD severity, predict symptom progression, and tailor ergonomic interventions using clinical, biomechanical, or sensor-derived data.⁴³

Beyond the workplace, ML offers significant value for strengthening occupational health surveillance. ML can analyse large-scale administrative data, such as workers’ compensation claims, sector-wide ergonomic information, and emerging trends, to detect high-risk sectors and to improve return-to-work policy and resource allocation in national policy planning.⁴⁴^,⁴⁵ Such capabilities make surveillance systems more responsive, timely, and effective, complementing current national and policy-level administrative plans. Collectively, these applications show that ML in WRMSD research is not merely theoretical but a practical, scalable, and impactful tool for workplace prevention, national surveillance, and clinical prognosis.

This review also found that methodological and practical limitations remain. Many models were developed in single-occupation or single-country samples, often with modest or small sample sizes, raising concerns about overfitting and limited generalisability. In addition, the studies lacked standardization of ML models and of metrics or model performance reporting. Several studies relied solely on internal cross-validation without external validation or standardized objective biomechanical measurement, limiting confidence in real-world deployment of the models.⁴⁶ Additionally, self-reported symptoms and exposures, while convenient, introduce potential recall and reporting bias. These weaknesses mirror long-standing challenges in WRMSD epidemiology, underscoring that the path to improved outcomes lies not merely in adopting new analytical tools but in strengthening data quality and representativeness.

CONCLUSIONS

Going forward, ML-based WRMSD research should evolve towards transparent, explainable, and ethically grounded frameworks embedded within participatory ergonomics and occupational-health systems. Future studies should standardize and harmonize research methodology and adopt minimum reporting standards for ML performance, such as AUC-ROC, calibration, accuracy, sensitivity, and specificity of the best ML model. RF, SVM, ANN, and GBM models can be recommended for identifying factors associated with WRMSDs, as they have shown good discriminative performance; however, they must be fitted and validated on the specific dataset at hand. Future research on ML model design should also include prospective, multicentre studies with external validation against confirmed cases to reduce overfitting and improve generalisability.
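To make the suggested reporting items concrete, the sketch below computes accuracy, sensitivity, specificity, and AUC-ROC from a small set of hypothetical predictions. The AUC is obtained as the proportion of case/non-case pairs in which the case receives the higher score (ties count as one half). Calibration is not covered here, and the data, the 0.5 cut-off, and the function names are assumptions made for illustration.

```python
def classification_report(y_true, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed probability cut-off."""
    preds = [int(s >= threshold) for s in scores]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, preds))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, preds))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, preds))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, preds))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }

def auc_roc(y_true, scores):
    """AUC as the share of (case, non-case) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = WRMSD symptoms) and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

report = classification_report(y_true, scores)
auc = auc_roc(y_true, scores)
print(report, f"AUC = {auc:.4f}")
```

Reporting all four values together, rather than accuracy alone, shows whether a model misses cases (low sensitivity) or over-flags healthy workers (low specificity).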

Abbreviations

AUC

area under the receiver operating characteristic curve

ANN

artificial neural network

BMI

body mass index

DT

decision tree

EPHPP

Effective Public Health Practice Project

GBM

gradient boosting machine

JBI

Joanna Briggs Institute

MAE

mean absolute error

ML

machine learning

MLP

multilayer perceptron

NMQ

Nordic Musculoskeletal Questionnaire

PRISMA-P

Preferred Reporting Items for Systematic Reviews and Meta-Analysis Protocols

PROSPERO

International Prospective Register of Systematic Reviews

RF

random forest

RMSE

root mean square error

ROC

receiver operating characteristic

RSTSVM

robust and sparse twin support vector machine

SVM

support vector machine

WRMSD

work-related musculoskeletal disorder

NOTES

Competing interests

The authors declare that they have no competing interests.

Author contributions

Conceptualization: Sallehhudin MIM, Yasin SM, Isa MR, Razak TR. Data curation: Sallehhudin MIM, Yasin SM, Isa MR, Razak TR, Asraff MSM, Rameli NAC, Shahriman-Teruna MM, Salleh MMM, Yusoff MZM, Khebir MHA. Formal analysis: Sallehhudin MIM, Yasin SM, Isa MR, Razak TR. Investigation: Sallehhudin MIM, Yasin SM, Isa MR, Razak TR. Methodology: Sallehhudin MIM, Yasin SM, Isa MR, Razak TR. Project administration: Sallehhudin MIM, Yasin SM, Isa MR, Razak TR. Software: Sallehhudin MIM, Isa MR, Razak TR. Supervision: Yasin SM, Isa MR, Razak TR. Validation: Sallehhudin MIM, Yasin SM, Isa MR, Razak TR. Visualization: Sallehhudin MIM. Writing - original draft: Sallehhudin MIM. Writing - review & editing: Sallehhudin MIM, Yasin SM, Isa MR, Razak TR.

Acknowledgments

We would like to express our sincere gratitude to the Department of Public Health Medicine and the School of Computing Sciences, College of Computing, Informatics and Mathematics, Universiti Teknologi MARA (UiTM) for their valuable support throughout the study.

SUPPLEMENTARY MATERIAL

Supplementary Table 1.

Factors associated with WRMSDs using machine learning.

aoem-2026-38-e10_Supplementary-Table-1.pdf

Supplementary Table 2.

Comprehensive comparison of findings of studies included.

aoem-2026-38-e10_Supplementary-Table-2.pdf

Fig. 1.

PRISMA 2020 flow diagram of study selection. Records were identified through searches of Web of Science, Scopus, and PubMed databases. After removal of duplicates, records were screened by title and abstract, followed by full-text assessment for eligibility. Studies meeting the inclusion criteria were included in the final systematic review.

Table 1.

Factors associated with WRMSDs across studies

Study	Biomechanical	Psychosocial	Workplace related	Sociodemographic	Individual
Sanchez-Guillen et al.²¹ (2024)	Awkward posture, prolonged procedures, poor instrument ergonomics, screen malposition	-	High-frequency surgeries, long procedures without breaks	Age, sex, BMI, height	-
Luo et al.²² (2024)	Neck flexion/extension, wrist bending/twisting, repetitive movement, prolonged sitting, static posture	Work-related stress	Insufficient rest, repeated body turning, long sitting duration	Chronic diseases	Sick leave, poor self-rated health
Rmadi et al.²³ (2024)	Repetitive motion, constrained posture	-	Production rhythm difficulty, workstation design	Age > 25, job seniority	History of MSDs
Byeon¹⁶ (2024)	Repetitive arm movement, standing posture, painful posture, lifting	Job stress	Long working hours, computer/internet use, noise	Education level	Absenteeism
Hanumegowda and Gnanasekaran¹⁷ (2022)	Vibration, posture change, ingress/egress movement, seat ergonomics	-	Break frequency	Tobacco, fatigue, sleeping in bus	-
Shaikh and Mandal¹⁸ (2025)	Posture, vibration (RMS and VDV)	-	Shift duration	Age, BMI, experience	-
Ali et al.²⁴ (2020)	Prolonged sitting	-	>9-hour workdays	Age, long employment duration	Chronic illness, physical inactivity
Raza et al.¹⁹ (2024)	Posture (drivers), vibration	-	Work hours	Age	Sleeping duration
Khoshakhlagh et al.²⁵ (2024)	-	Job stress, PTSD, burnout, depression	-	-	-
Kar et al.²⁰ (2023)	Awkward posture	Job demand	Work design	Age, experience, marital status	Alcohol use, smoking

WRMSD: work-related musculoskeletal disorder; BMI: body mass index; MSD: musculoskeletal disorder; RMS: root mean square; VDV: vibration dose value; PTSD: post-traumatic stress disorder.

“-” indicates that the factor was not assessed or not reported in the respective study. Factors are grouped according to biomechanical, psychosocial, workplace-related, and sociodemographic or individual domains as reported by the original studies.

Table 2.

Summary of model performance by machine learning algorithm type

Algorithm type	Studies using this algorithm	Best model performance reported	Key notes/Strengths
Random forest (RF)	Sanchez-Guillen et al.²¹, Luo et al.²², Hanumegowda and Gnanasekaran¹⁷, Kar et al.²⁰, Ali et al.²⁴	Accuracy 0.60–0.786; AUC up to 0.822; 100% (training in bus drivers)	Consistent across sectors; strong accuracy; handles high-dimensional predictors well
Gradient boosting/XGBoost	Sanchez-Guillen et al.²¹, Luo et al.²², Byeon¹⁶, Kar et al.²⁰	Accuracy up to 84.9%; AUC up to 0.866	High discrimination power; strong with ergonomic datasets
Support vector machines (SVM)	Luo et al.²², Byeon¹⁶, Kar et al.²⁰	Accuracy up to 85.2% (RSTSVM); AUC up to 0.84	Excellent for high-dimensional datasets; strong generalization
Logistic regression	Kar et al.²⁰, Raza et al.¹⁹	Accuracy 0.63–0.69; AUC 0.65–0.74	Baseline model; lower performance vs other ML models
Artificial neural network (ANN)	Shaikh and Mandal¹⁸	Accuracy 0.975; recall 1.000; AUC 0.996	Top performer overall; best for nonlinear exposures
Bayesian network	Khoshakhlagh et al.²⁵	Accuracy 0.742; sensitivity 0.887; AUC 0.759	Best for causal pathway modelling
CART/decision tree	Rmadi et al.²³, Hanumegowda and Gnanasekaran¹⁷, Kar et al.²⁰	Accuracy up to 100% in training	High interpretability but risk of overfitting

Reported performance metrics are based on the original studies and are not directly comparable due to methodological heterogeneity.

AUC: area under the receiver operating characteristic curve; RSTSVM: robust and sparse twin support vector machine; CART: classification and regression tree.

Table 3.

JBI risk of bias assessment result

No.	Study	Year	Design	Q1	Q2	Q3	Q4	Q5	Q6	Q7	Q8	Overall risk of bias
1	Sanchez-Guillen et al.²¹ (2024)	2024	Cross-sectional	Yes	Yes	Yes	Yes	Partial	Partial	Yes	Yes	Moderate
2	Luo et al.²² (2024)	2024	Cross-sectional	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Low
3	Rmadi et al.²³ (2024)	2024	Cross-sectional	Yes	Yes	Yes	Yes	Partial	Partial	Yes	Yes	Moderate
4	Byeon¹⁶ (2024)	2024	Cross-sectional	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Low
5	Hanumegowda and Gnanasekaran¹⁷ (2022)	2022	Cross-sectional	Yes	Yes	Yes	Yes	Partial	Partial	Yes	Yes	Moderate
6	Shaikh and Mandal¹⁸ (2025)	2025	Cross-sectional	Yes	Yes	Yes	Yes	Partial	Partial	Yes	Yes	Moderate
7	Raza et al.¹⁹ (2024)	2024	Cross-sectional	Yes	Yes	Yes	Yes	Partial	Partial	Yes	Yes	Moderate
8	Khoshakhlagh et al.²⁵ (2024)	2024	Cross-sectional	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Low
9	Kar et al.²⁰ (2023)	2023	Cross-sectional	Yes	Yes	Yes	Yes	Partial	Partial	Yes	Yes	Moderate
10	Ali et al.²⁴ (2020)	2020	Cross-sectional	Yes	Yes	Yes	Yes	Yes	Partial	Yes	Yes	Low

Risk of bias categories were assigned based on the criteria met, consistent with PRISMA 2020 guidelines.

Q1: Clear inclusion criteria; Q2: Subjects and setting described in detail; Q3: Exposure measured validly and reliably; Q4: Standard criteria used for outcome measurement; Q5: Confounding factors identified; Q6: Strategies to deal with confounding; Q7: Outcomes measured validly and reliably; Q8: Appropriate statistical analysis.

Table 4.

Quality assessment result against the Effective Public Health Practice Project Quality Assessment Tool

No.	Study	Population/Setting	Selection bias^a	Study design^b	Confounders^c	Data collection^d	Withdrawals^e	Global rating^f
1	Sanchez-Guillen et al.²¹ (2024)	Surgeons (Spain)	Weak	Weak	Moderate	Strong	Weak	Weak
2	Luo et al.²² (2024)	Healthcare staff (China)	Moderate	Weak	Moderate	Strong	Moderate	Weak
3	Rmadi et al.²³ (2024)	Sewing operators (Tunisia)	Moderate–Strong	Weak	Moderate	Strong	Strong	Weak
4	Byeon¹⁶ (2024)	Office workers (Korea)	Moderate	Weak	Moderate	Strong	Moderate	Weak
5	Hanumegowda and Gnanasekaran¹⁷ (2022)	Bus drivers (India)	Moderate–Strong	Weak	Moderate	Strong	Strong	Weak
6	Shaikh and Mandal¹⁸ (2025)	Shuttle car operators (India)	Moderate–Strong	Weak	Moderate	Strong	Strong	Weak
7	Ali et al.²⁴ (2020)	Bank employees (Bangladesh)	Moderate	Weak	Moderate	Strong	Moderate	Weak
8	Raza et al.¹⁹ (2024)	Heavy vehicle drivers & office workers	Moderate	Weak	Moderate	Strong	Strong	Weak
9	Khoshakhlagh et al.²⁵ (2024)	Firefighters (Iran)	Moderate	Weak	Moderate	Strong	Strong	Weak
10	Kar et al.²⁰ (2023)	Dumper operators (India)	Moderate–Strong	Weak	Moderate	Strong	Strong	Weak

^aSelection bias: weak = volunteer or low response; moderate = reasonable participation; moderate–strong = exhaustive workplace sampling;

^bStudy design: all cross-sectional → rated weak under Effective Public Health Practice Project;

^cConfounders: most include ≥2 key confounders but not fully justified → moderate;

^dData collection: validated tools (Nordic Musculoskeletal Questionnaire, Karasek, post-traumatic stress disorder, stress scales) → Strong;

^eWithdrawals: strong = ≥80% participation; weak < 60%; moderate = unclear;

^fGlobal rating: all studies weak due to ≥2 weak domains (study design + blinding).
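The global rating rule in footnote f can be sketched as a simple count of weak component ratings, assuming the usual EPHPP convention that no weak ratings give Strong, one gives Moderate, and two or more give Weak. The example uses the Table 4 ratings for Luo et al.; the blinding rating is not shown as a column in Table 4 and is assumed to be Weak on the basis of footnote f. The function name and dictionary keys are hypothetical.

```python
def ephpp_global_rating(component_ratings):
    """Derive a global rating from component ratings by counting 'Weak' entries.

    Assumed convention: 0 weak -> Strong, 1 weak -> Moderate, 2 or more -> Weak.
    """
    weak_count = sum(1 for r in component_ratings.values() if r.strip().lower() == "weak")
    if weak_count == 0:
        return "Strong"
    if weak_count == 1:
        return "Moderate"
    return "Weak"

# Component ratings for Luo et al. as listed in Table 4.
luo_table4 = {
    "selection_bias": "Moderate",
    "study_design": "Weak",
    "confounders": "Moderate",
    "data_collection": "Strong",
    "withdrawals": "Moderate",
}

# Blinding is assumed Weak per footnote f; it is not a displayed column.
luo_with_blinding = dict(luo_table4, blinding="Weak")

print(ephpp_global_rating(luo_table4))         # Moderate (displayed columns only)
print(ephpp_global_rating(luo_with_blinding))  # Weak (with assumed blinding rating)
```

Under this rule the cross-sectional design alone yields one weak domain for every study, so any second weak domain is enough to produce the uniformly Weak global ratings in Table 4.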

REFERENCES

1. Greggi C, Visconti VV, Albanese M, Gasperini B, Chiavoghilefu A, Prezioso C, et al. Work-related musculoskeletal disorders: a systematic review and meta-analysis. J Clin Med 2024;13(13):3964.
2. Bevan S. Economic impact of musculoskeletal disorders (MSDs) on work in Europe. Best Pract Res Clin Rheumatol 2015;29(3):356–73.
3. Govaerts R, Tassignon B, Ghillebert J, Serrien B, De Bock S, Ampe T, et al. Prevalence and incidence of work-related musculoskeletal disorders in secondary industries of 21st century Europe: a systematic review and meta-analysis. BMC Musculoskelet Disord 2021;22(1):751.
4. Geto AK, Daba C, Desye B, Berihun G, Berhanu L. Prevalence of work-related musculoskeletal disorder and its associated factors among weavers in low- and middle-income countries: a systematic review and meta-analysis. BMJ Open 2025;15(8):e093124.
5. National Research Council. Work-Related Musculoskeletal Disorders: Report, Workshop Summary, and Workshop Papers. Washington, DC: National Academies Press; 1999.
6. Hazana Abdullah N, Aziati Abdul Hamid N, Wahab E, Shamsuddin A, Asmawi R. Work-related musculoskeletal disorder (WRMD) among production operators: studies of differences in age and gender. J Phys Conf Ser 2018;1049:012023.
7. Anwer S, Li H, Antwi-Afari MF, Wong AY. Associations between physical or psychosocial risk factors and work-related musculoskeletal disorders in construction workers based on literature in the last 20 years: a systematic review. Int J Ind Ergon 2021;83:103113.
8. Alie M, Abich Y, Demissie SF, Weldetsadik FK, Kassa T, Shiferaw KB, et al. Magnitude and possible risk factors of musculoskeletal disorders among street cleaners and solid waste workers: a cross-sectional study. BMC Musculoskelet Disord 2023;24(1):493.
9. Rahman Y, Dua P. A machine learning framework for predicting healthcare utilization and risk factors. Healthc Anal 2025;8:100411.
10. Ghasemi A, Hashtarkhani S, Schwartz DL, Shaban-Nejad A. Explainable artificial intelligence in breast cancer detection and risk prediction: a systematic scoping review. Cancer Innov 2024;3(5):e136.
11. Yan Y, Yang Z, Semenkovich TR, Kozower BD, Meyers BF, Nava RG, et al. Comparison of standard and penalized logistic regression in risk model development. JTCVS Open 2022;9:303–16.
12. Rajula HS, Verlato G, Manchia M, Antonucci N, Fanos V. Comparison of conventional statistical methods with machine learning in medicine: diagnosis, drug development, and treatment. Medicina (Kaunas) 2020;56(9):455.
13. Janse RJ, Abu-Hanna A, Vagliano I, Stel VS, Jager KJ, Tripepi G, et al. When the whole is greater than the sum of its parts: why machine learning and conventional statistics are complementary for predicting future health outcomes. Clin Kidney J 2025;18(4):sfaf059.
14. Joanna Briggs Institute. Joanna Briggs Institute Critical Appraisal Tools for Use in JBI Systematic Reviews: Checklist for Analytical Cross Sectional Studies. Adelaide, Australia: Joanna Briggs Institute; 2017.
15. Thomas BH, Ciliska D, Dobbins M, Micucci S. A process for systematically reviewing the literature: providing the research evidence for public health nursing interventions. Worldviews Evid Based Nurs 2004;1(3):176–84.
16. Byeon H. Predicting occupational musculoskeletal disorders in South Korean male office workers using a robust and sparse twin support vector machine. J Mens Health 2024;20:41–9.
17. Hanumegowda PK, Gnanasekaran S. Prediction of work-related risk factors among bus drivers using machine learning. Int J Environ Res Public Health 2022;19(22):15179.
18. Shaikh AM, Mandal BB. Predictive modeling of work-related musculoskeletal disorders among shuttle car operators using artificial neural network. J Mines Met Fuels 2025;73:65–74.
19. Raza M, Bhushan RK, Khan AA, Ali AM, Khamaj A, Alam MM. Prevalence of musculoskeletal disorders in heavy vehicle drivers and office workers: a comparative analysis using a machine learning approach. Healthcare (Basel) 2024;12(24):2560.
20. Kar MB, Aruna M, Kunar BM. Risk factors associated with work-related musculoskeletal disorders among dumper operators: a machine learning approach. Clin Epidemiol Glob Health 2023;24:101438.
21. Sanchez-Guillen L, Lozano-Quijada C, Soler-Silva A, Hernandez-Sanchez S, Barber X, Toledo-Marhuenda JV, et al. A calculator for musculoskeletal injuries prediction in surgeons: a machine learning approach. Surg Endosc 2024;38(11):6577–85.
22. Luo N, Xu X, Jiang B, Zhang Z, Huang J, Zhang X, et al. Explainable machine learning framework to predict the risk of work-related neck and shoulder musculoskeletal disorders among healthcare professionals. Front Public Health 2024;12:1414209.
23. Rmadi N, Sellami I, Feki A, Hammami KJ, Masmoudi ML, Hajjaji M. Exploring multisite musculoskeletal symptoms among sewing machine operators in a Tunisian leather and footwear industry using decision tree models. Clin Epidemiol Glob Health 2024;27:101575.
24. Ali M, Ahsan GU, Hossain A. Prevalence and associated occupational factors of low back pain among the bank employees in Dhaka City. J Occup Health 2020;62(1):e12131.
25. Khoshakhlagh AH, Sulaie SA, Yazdanirad S, Orr RM, Laal F. Relationships between job stress, post-traumatic stress and musculoskeletal symptoms in firefighters and the role of job burnout and depression mediators: a Bayesian network model. BMC Public Health 2024;24(1):468.
26. Fouad DM, Mahfouz MM, Mohamed MM, Elzanaty MY, Abd El-Hafeez T. Classification of musculoskeletal pain using machine learning. Sci Rep 2025;15(1):27158.
27. Chan VC, Ross GB, Clouthier AL, Fischer SL, Graham RB. The role of machine learning in the primary prevention of work-related musculoskeletal disorders: a scoping review. Appl Ergon 2022;98:103574.
28. Alzubi J, Nayyar A, Kumar A. Machine learning from theory to algorithms: an overview. J Phys Conf Ser 2018;1142:012012.
29. Breiman L. Random forests. Mach Learn 2001;45:5–32.
30. Hu J, Szymczak S. A review on longitudinal data analysis with random forest. Brief Bioinform 2023;24(2):bbad002.
31. Guido R, Ferrisi S, Lofaro D, Conforti D. An overview on the advancements of support vector machine models in healthcare applications: a review. Information 2024;15:235.
32. Shahid N, Rappon T, Berta W. Applications of artificial neural networks in health care organizational decision-making: a scoping review. PLoS One 2019;14(2):e0212356.
33. Natekin A, Knoll A. Gradient boosting machines, a tutorial. Front Neurorobot 2013;7:21.
34. Berg-Beckhoff G, Ostergaard H, Jepsen JR. Prevalence and predictors of musculoskeletal pain among Danish fishermen: results from a cross-sectional survey. J Occup Med Toxicol 2016;11:51.
35. Deshpande UU, Araujo SD, Deshpande S, Kangralkar V, Patil R, Chate RA, et al. A review of machine learning techniques for ergonomic risk assessment based on human pose estimation. Discov Artif Intell 2025;5:287.
36. Sohrabi MS, Khotanlou H, Heidarimoghadam R, Mohammadfam I, Babamiri M, Soltanian AR. Modeling the impact of ergonomic interventions and occupational factors on work-related musculoskeletal disorders in the neck of office workers with machine learning methods. J Res Health Sci 2024;24(3):e00623.
37. Qi LM, Ramalingam V. Prevalence of musculoskeletal disorders and associated risk factors among selected factory workers in Penang, Malaysia. INTI J 2019;2019:22.
38. Bezzina A, Austin E, Nguyen H, James C. Workplace psychosocial factors and their association with musculoskeletal disorders: a systematic review of longitudinal studies. Workplace Health Saf 2023;71(12):578–88.
39. Roquelaure Y. Musculoskeletal Disorders and Psychosocial Factors at Work. ETUI Research Paper, Report No. 142. Brussels, Belgium: ETUI; 2018.
40. Murugan S, Kumar SP, Kalaiarasi G, Saritha B, Rubini B. Comparison of machine learning models for injury prediction in athletes. In: 2025 International Conference on Visual Analytics and Data Visualization (ICVADV); 2025 Mar 4-6; Tirunelveli, India. Piscataway, NJ: Institute of Electrical and Electronics Engineers; 2025. p. 107–12.
41. Alenjareghi MJ, Sekkay F, Dadouchi C, Keivanpour S. Wearable sensors in Industry 4.0: preventing work-related musculoskeletal disorders. Sens Int 2026;7:100343.
42. Azimi NN, Hashim H. Vision-based real-time ergonomic detection in SMEs. Prog Eng Appl Technol 2025;6:152–62.
43. Zmudzki F, Smeets RJ. Machine learning clinical decision support for interdisciplinary multimodal chronic musculoskeletal pain treatment. Front Pain Res (Lausanne) 2023;4:1177070.
44. Obasi IC, Cheng P, Varianou-Mikellidou C, Dimopoulos C, Boustras G. Machine learning for occupational accident analysis: applications, challenges, and future directions. J Saf Sci Resil 2026;7:100250.
45. Vivian GA, Bauder RA, Khoshgoftaar TM. A comprehensive survey on machine learning for workplace injury analysis: risk prediction, return to work strategies, and demographic insights. J Big Data 2025;12:167.
46. Manoochehri S, Zamani M, Afshari M, Soltanian AR, Manoochehri Z. Evaluating the performance of different machine learning algorithms based on SMOTE in predicting musculoskeletal disorders in elementary school students. BMC Med Res Methodol 2025;25(1):227.

Figure & Data

REFERENCES

Citations

Citations to this article as recorded by

Cite

CITE: export Copy Download Format; Close

Download Citation

Download a citation file in RIS format that can be imported by all major citation management software, including EndNote, ProCite, RefWorks, and Reference Manager.

Format:

RIS — For EndNote, ProCite, RefWorks, and most other reference management software
BibTeX — For JabRef, BibDesk, and other BibTeX-specific software

Include:

Citation for the content below
Citation and abstract for the content below

Factors associated with work-related musculoskeletal disorders using machine learning approaches: a systematic review

Ann Occup Environ Med. 2026;38:e10 Published online March 19, 2026

DOI: https://doi.org/10.35371/aoem.2026.38.e10

XML Download

Figure

Factors associated with work-related musculoskeletal disorders using machine learning approaches: a systematic review

Fig. 1. PRISMA 2020 flow diagram of study selection. Records were identified through searches of Web of Science, Scopus, and PubMed databases. After removal of duplicates, records were screened by title and abstract, followed by fulltext assessment for eligibility. Studies meeting the inclusion criteria were included in the final systematic review.

Fig. 1.

Factors associated with work-related musculoskeletal disorders using machine learning approaches: a systematic review

Study	Biomechanical	Psychosocial	Workplace related	Sociodemographic	Individual
Sanchez-Guillen et al.21 (2024)	Awkward posture, prolonged procedures, poor instrument ergonomics, screen malposition	-	High-frequency surgeries, long procedures without breaks	Age, sex, BMI, height	-
Luo et al.22 (2024)	Neck flexion/extension, wrist bending/twisting, repetitive movement, prolonged sitting, static posture	Work-related stress	Insufficient rest, repeated body turning, long sitting duration	Chronic diseases	Sick leave, poor self-rated health
Rmadi et al.23 (2024)	Repetitive motion, constrained posture	-	Production rhythm difficulty, workstation design	Age > 25, job seniority	History of MSDs
Byeon16 (2024)	Repetitive arm movement, standing posture, painful posture, lifting	Job stress	Long working hours, computer/internet use, noise	Education level	Absenteeism
Hanumegowda and Gnanasekaran17 (2022)	Vibration, posture change, ingress/egress movement, seat ergonomics	-	Break frequency	Tobacco, fatigue, sleeping in bus	-
Shaikh and Mandal18 (2025)	Posture, vibration (RMS and VDV)	-	Shift duration	Age, BMI, experience	-
Ali et al.24 (2020)	Prolonged sitting	-	>9-hour workdays	Age, long employment duration	Chronic illness, physical inactivity
Raza et al.19 (2024)	Posture (drivers), vibration	-	Work hours	Age	Sleeping duration
Khoshakhlagh et al.25 (2024)	-	Job stress, PTSD, burnout, depression	-	-	-
Kar et al.20 (2023)	Awkward posture	Job demand	Work design	Age, experience, marital status	Alcohol use, smoking

Algorithm type	Studies using this algorithm	Best model performance reported	Key notes/Strengths
Random forest (RF)	Sanchez-Guillen et al.21, Luo et al.22, Hanumegowda and Gnanasekaran17, Kar et al.20, Ali et al.24	Accuracy 0.60–0.786; AUC up to 0.822; 100% (training in bus drivers)	Consistent across sectors; strong accuracy; handles high-dimensional predictors well
Gradient boosting/XGBoost	Sanchez-Guillen et al.21, Luo et al.22, Byeon16, Kar et al.20	Accuracy up to 84.9%; AUC up to 0.866	High discrimination power; strong with ergonomic datasets
Support vector machines (SVM)	Luo et al.22, Byeon16, Kar et al.20	Accuracy up to 85.2% (RSTSVM); AUC up to 0.84	Excellent for high-dimensional datasets; strong generalization
Logistic regression	Kar et al.20, Raza et al.19	Accuracy 0.63–0.69; AUC 0.65–0.74	Baseline model; lower performance vs other ML models
Artificial neural network (ANN)	Shaikh and Mandal18	Accuracy 0.975; recall 1.000; AUC 0.996	Top performer overall; best for nonlinear exposures
Bayesian network	Khoshakhlagh et al.25	Accuracy 0.742; sensitivity 0.887; AUC 0.759	Best for causal pathway modelling
CART/decision tree	Rmadi et al.23, Hanumegowda and Gnanasekaran17, Kar et al.20	Accuracy up to 100% in training	High interpretability but risk of overfitting

No.	Study	Year	Design	Q1	Q2	Q3	Q4	Q5	Q6	Q7	Q8	Overall risk of bias
1	Sanchez-Guillen et al.21 (2024)	2024	Cross-sectional	Yes	Yes	Yes	Yes	Partial	Partial	Yes	Yes	Moderate
2	Luo et al.22 (2024)	2024	Cross-sectional	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Low
3	Rmadi et al.23 (2024)	2024	Cross-sectional	Yes	Yes	Yes	Yes	Partial	Partial	Yes	Yes	Moderate
4	Byeon16 (2024)	2024	Cross-sectional	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Low
5	Hanumegowda and Gnanasekaran17 (2022)	2022	Cross-sectional	Yes	Yes	Yes	Yes	Partial	Partial	Yes	Yes	Moderate
6	Shaikh and Mandal18 (2025)	2025	Cross-sectional	Yes	Yes	Yes	Yes	Partial	Partial	Yes	Yes	Moderate
7	Raza et al.19 (2024)	2024	Cross-sectional	Yes	Yes	Yes	Yes	Partial	Partial	Yes	Yes	Moderate
8	Khoshakhlagh et al.25 (2024)	2024	Cross-sectional	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Low
9	Kar et al.20 (2023)	2023	Cross-sectional	Yes	Yes	Yes	Yes	Partial	Partial	Yes	Yes	Moderate
10	Ali et al.24 (2020)	2020	Cross-sectional	Yes	Yes	Yes	Yes	Yes	Partial	Yes	Yes	Low

No.	Study	Population/Setting	Selection bias^a	Study design^b	Confounders^c	Data collection^d	Withdrawals^e	Global rating^f
1	Sanchez-Guillen et al.21 (2024)	Surgeons (Spain)	Weak	Weak	Moderate	Strong	Weak	Weak
2	Luo et al.22 (2024)	Healthcare staff (China)	Moderate	Weak	Moderate	Strong	Moderate	Weak
3	Rmadi et al.23 (2024)	Sewing operators (Tunisia)	Moderate–Strong	Weak	Moderate	Strong	Strong	Weak
4	Byeon16 (2024)	Office workers (Korea)	Moderate	Weak	Moderate	Strong	Moderate	Weak
5	Hanumegowda and Gnanasekaran17 (2022)	Bus drivers (India)	Moderate–Strong	Weak	Moderate	Strong	Strong	Weak
6	Shaikh and Mandal18 (2025)	Shuttle car operators (India)	Moderate–Strong	Weak	Moderate	Strong	Strong	Weak
7	Ali et al.24 (2020)	Bank employees (Bangladesh)	Moderate	Weak	Moderate	Strong	Moderate	Weak
8	Raza et al.19 (2024)	Heavy vehicle drivers & office workers	Moderate	Weak	Moderate	Strong	Strong	Weak
9	Khoshakhlagh et al.25 (2024)	Firefighters (Iran)	Moderate	Weak	Moderate	Strong	Strong	Weak
10	Kar et al.20 (2023)	Dumper operators (India)	Moderate–Strong	Weak	Moderate	Strong	Strong	Weak

Table 1. Factors associated with WRMSDs across studies

WRMSD: work-related musculoskeletal disorder; BMI: body mass index; MSD: musculoskeletal disorder; RMS: root mean square; VDV: vibration dose value; PTSD: post-traumatic stress disorder.

Table 2. Summary model performance by machine learning algorithm type

Reported performance metrics are based on the original studies and are not directly comparable due to methodological heterogeneity.

AUC: area under the receiver operating characteristic curve; RSTSVM: robust and sparse twin support vector machine; CART: classification and regression tree.
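To make the metrics summarised in Table 2 concrete, the following minimal Python sketch computes accuracy, sensitivity (recall), and AUC from a small set of labels and predicted scores. The data, the 0.5 classification threshold, and the function names are hypothetical and are not taken from any of the reviewed studies. AUC is computed here with the pairwise (Mann-Whitney) formulation, which is one common way to obtain it.

```python
def accuracy(y_true, y_pred):
    """Proportion of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def sensitivity(y_true, y_pred):
    """True positives divided by all actual positives (also called recall)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auc(y_true, scores):
    """Probability that a random positive scores higher than a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = WRMSD present, 0 = WRMSD absent.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]          # illustrative model outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(round(accuracy(y_true, y_pred), 3))     # 0.667
print(round(sensitivity(y_true, y_pred), 3))  # 0.667
print(round(auc(y_true, scores), 3))          # 0.889
```

Accuracy and sensitivity depend on the chosen threshold, whereas AUC depends only on how the scores rank the cases, which is one reason the figures reported by different studies are not directly comparable.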

Table 3. JBI risk of bias assessment results

Risk of bias categories were assigned based on the criteria met, consistent with PRISMA 2020 guidelines.

Table 4. Quality assessment results using the Effective Public Health Practice Project Quality Assessment Tool

^a Selection bias: weak = volunteer sample or low response rate; moderate = reasonable participation; moderate–strong = exhaustive workplace sampling.

^b Study design: all studies were cross-sectional and were therefore rated weak under the Effective Public Health Practice Project tool.

^c Confounders: most studies included ≥2 key confounders but did not fully justify them, and were therefore rated moderate.

^d Data collection: validated tools were used (Nordic Musculoskeletal Questionnaire, Karasek, post-traumatic stress disorder, and stress scales), so this domain was rated strong.

^e Withdrawals: strong = ≥80% participation; weak = <60% participation; moderate = unclear.

^f Global rating: all studies were rated weak because each had ≥2 weak domains (study design and blinding).
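The global rating rule described in the last footnote can be illustrated with a short Python sketch. It assumes the standard Effective Public Health Practice Project convention, in which a study with no weak domain ratings is rated strong, one weak rating gives moderate, and two or more give weak. The function name `ephpp_global_rating` and the dictionary layout are illustrative assumptions. The blinding rating is taken from the footnote, because blinding is not shown as a column in Table 4.

```python
def ephpp_global_rating(domain_ratings):
    """Derive a global rating from per-domain ratings.

    domain_ratings maps a domain name to its rating string, for example
    "Weak", "Moderate", "Strong", or "Moderate–Strong".
    Only ratings that are exactly "weak" (case-insensitive) count as weak.
    """
    weak_count = sum(
        1 for rating in domain_ratings.values()
        if rating.strip().lower() == "weak"
    )
    if weak_count == 0:
        return "Strong"
    if weak_count == 1:
        return "Moderate"
    return "Weak"


# Domain ratings for Luo et al. (2024) as listed in Table 4.
# "Blinding" is an assumption based on the global rating footnote.
luo_2024 = {
    "Selection bias": "Moderate",
    "Study design": "Weak",
    "Confounders": "Moderate",
    "Blinding": "Weak",
    "Data collection": "Strong",
    "Withdrawals": "Moderate",
}

print(ephpp_global_rating(luo_2024))  # Weak
```

Under this rule, a cross-sectional design without blinding is sufficient on its own to produce a weak global rating, regardless of how the remaining domains are rated.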