Scientists have used machine learning to identify new predictors of post-menopausal breast cancer. Breast cancer is one of the most common types of cancer among women globally, with multiple predictors identified for the disease, including inherited genetic factors, reproductive factors, and lifestyle.
Previous studies differentiated between pre- and post-menopausal breast cancers. In a recent study, scientists combined several approaches to predict breast cancer in women, applying machine learning (ML) methods to large datasets to capture complex non-linear relationships.
The UK Biobank (UKB) presented a unique opportunity to adopt hypothesis-free approaches to identify novel predictors for breast cancer. The team also used polygenic risk scores (PRS), which summarize the combined effect of hundreds to thousands of genetic variants that genome-wide association studies (GWAS) have linked to a specific disease or trait.
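To make the idea of a polygenic risk score concrete, here is a minimal sketch in Python. It assumes the common additive formulation, in which a person's score is the sum of their risk-allele counts weighted by per-variant effect sizes from GWAS. The function name, the weights, and the allele counts are all hypothetical illustrations and are not taken from the study.

```python
def polygenic_risk_score(allele_counts, effect_sizes):
    """Return an additive PRS: sum of allele count times effect size per variant.

    allele_counts: risk-allele copies per variant (0, 1, or 2).
    effect_sizes: per-variant weights, e.g. log odds ratios from a GWAS.
    """
    if len(allele_counts) != len(effect_sizes):
        raise ValueError("allele_counts and effect_sizes must have equal length")
    return sum(count * weight for count, weight in zip(allele_counts, effect_sizes))


# Hypothetical values for three variants (illustration only, not study data).
example_effect_sizes = [0.12, -0.05, 0.30]
example_allele_counts = [2, 1, 0]
example_score = polygenic_risk_score(example_allele_counts, example_effect_sizes)
```

In practice a score like this spans far more variants and is usually standardized across the cohort before it enters a risk model.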
Incorporating PRS has previously added precision to existing coronary artery disease risk predictors. The recent study, published in Scientific Reports, used ML methods for feature selection followed by Cox models for risk prediction. It identified various risk factors, including five that had not been reported before. These factors included blood biomarkers, blood counts, urine biomarkers, and basal metabolic rate.
Plasma urea, associated with kidney function, was also among the identified risk factors. The XGBoost model selected a detailed body composition measure instead of body mass index (BMI), suggesting that precise body composition is an important predictor of breast cancer. Polygenic risk scores were also significant predictors of post-menopausal breast cancer.
The study identified five statistically significant novel associations with post-menopausal breast cancer. When these five features were added to the baseline Cox model, discrimination performance was maintained. External validation of the results is the next important step ahead of implementation in clinical practice.
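Discrimination describes how well a model ranks people who develop the disease sooner above those who develop it later or not at all. The article does not say which metric the study used; the sketch below assumes Harrell's concordance index, a common choice for Cox models. The function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk score.

    times: follow-up time for each person.
    events: 1 if the event was observed, 0 if the person was censored.
    risk_scores: model output, where higher means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when person i had the event strictly
            # before person j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (illustration only): the third person is censored at time 6.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so "maintained" discrimination would mean this number stayed roughly the same after the new features were added.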
In summary, these findings motivate further research on the use of more precise anthropometric measures to improve breast cancer prediction.