Understanding Prostate Cancer Risk Using Statistical and Machine Learning Approaches: A Comparative Methodological Analysis
Abstract
Background: Prostate cancer is one of the most common and lethal malignancies among men worldwide, making accurate risk prediction tools essential for early diagnosis and personalized care. This study aimed to compare the predictive ability of traditional binary logistic regression with machine learning (ML) algorithms, including support vector machines (SVM), K-nearest neighbors (KNN), chi-squared automatic interaction detection (CHAID), and C5.0, in identifying key risk factors and classifying prostate cancer status.
Materials and Methods: The study included 501 male participants (248 diagnosed cases and 253 controls) who completed a structured 20-item questionnaire covering demographic, clinical, and lifestyle characteristics.
Results: Age, smoking status, and family history of cancer consistently emerged as significant predictors across models. Additional indicators included blood in semen or urine, frequency of urination, and daily activity level. Logistic regression achieved the highest accuracy (92.2%), followed by CHAID (91.36%), SVM (89.92%), KNN (88.48%), and C5.0 (88%).
Conclusion: Logistic regression provided the best accuracy and interpretability for structured clinical data, while ML models offered complementary insights by identifying complex, nonlinear associations.
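The kind of comparison the abstract describes can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the three predictors (standardized age, smoking flag, family-history flag) and their coefficients are invented for illustration, the 70/30 split is an assumption, and only logistic regression and KNN are implemented, from scratch with the Python standard library. SVM, CHAID, and C5.0 are omitted, so the printed accuracies have no relation to the figures reported above.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))


def make_synthetic(n, rng):
    """Hypothetical data: [standardized age, smoker flag, family-history flag]."""
    rows = []
    for _ in range(n):
        age = rng.gauss(0.0, 1.0)
        smoker = 1.0 if rng.random() < 0.4 else 0.0
        family = 1.0 if rng.random() < 0.3 else 0.0
        # Invented coefficients, not estimates from the study.
        logit = -1.5 + 2.0 * age + 1.5 * smoker + 1.5 * family
        label = 1 if rng.random() < sigmoid(logit) else 0
        rows.append(([age, smoker, family], label))
    return rows


def fit_logistic(train, epochs=300, lr=0.5):
    """Binary logistic regression fitted by batch gradient descent."""
    n_features = len(train[0][0])
    w = [0.0] * (n_features + 1)  # last entry is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in train:
            z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
            err = sigmoid(z) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * g / len(train) for wj, g in zip(w, grad)]
    return w


def predict_logistic(w, x):
    z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
    return 1 if z >= 0.0 else 0


def predict_knn(train, x, k=7):
    """Majority vote among the k nearest training rows (Euclidean distance)."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def accuracy(predict, test):
    return sum(predict(x) == y for x, y in test) / len(test)


rng = random.Random(0)
data = make_synthetic(501, rng)  # same sample size as the study, synthetic rows
split = int(0.7 * len(data))     # assumed 70/30 train/test split
train, test = data[:split], data[split:]

weights = fit_logistic(train)
results = {
    "logistic": accuracy(lambda x: predict_logistic(weights, x), test),
    "knn": accuracy(lambda x: predict_knn(train, x), test),
}
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

Evaluating every model on the same held-out rows is what makes the accuracies comparable. The fitted `weights` can also be read directly as log-odds contributions per predictor, which is the interpretability advantage the conclusion attributes to logistic regression.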
Authors (4)
Selman Aktaş
Murat Kirişci
Muzaffer Akçay
Muhammet Çiçek
- Publication year: 2025
- Source database: DOAJ
- DOI: 10.4274/hamidiyemedj.galenos.2025.73745
- Access: Open Access