I. Bratko
Results for "machine learning"
Showing 20 of ~10,337,256 results · from CrossRef, DOAJ, arXiv, Semantic Scholar
H. Larochelle, Yoshua Bengio
Lipo Wang
Dongling Gu, Yi Feng, Hongyan Li et al.
Immunotherapy plays a crucial role in cancer treatment, but its efficacy varies among patients, with some showing suboptimal responses. Recent studies indicate that radiotherapy not only kills tumor cells locally but also induces immunogenic cell death and modulates the tumor immune microenvironment, acting like an “in situ vaccine.” This provides a strong biological basis for combining radiotherapy and immunotherapy. However, challenges remain, including individual variability in responses, complex treatment regimens, and overlapping toxicities. Artificial intelligence (AI), especially through machine learning, presents new solutions by processing high-dimensional multi-omics data. This article explores how AI enhances radiotherapy and immunotherapy combinations by optimizing synergistic effects, developing predictive biomarkers, and elucidating the regulatory mechanisms of radiotherapy on the immune microenvironment, while also discussing future directions for AI in oncology.
Junaid Muhammad, Mitra Ghergherehchi, Shiraz Ali et al.
Parkinson's disease (PD) is a neurodegenerative disorder characterized by motor and non-motor symptoms, including tremor, rigidity, and postural instability. Machine learning (ML) models have shown promise for the diagnosis of PD; however, many existing approaches do not explicitly address fairness and robustness. As a result, these models can produce biased outcomes across demographic groups and remain vulnerable to adversarial attacks. In this study, we used the Parkinson's Progression Markers Initiative (PPMI) cohort, which includes clinical and demographic information from 1,084 participants spanning diverse age, sex, and racial groups. Our study addresses the key challenge of developing robust and equitable ML models to diagnose the progression of PD. We evaluated the performance of two fairness-optimized classifiers, namely Random Forest (RF) and Decision Tree (DT). To evaluate model vulnerability, we applied adversarial techniques, specifically label leakage and data poisoning attacks, which simulate intentional or erroneous data alterations that can amplify biases and degrade accuracy. These adversarial manipulations substantially degraded model performance: DT accuracy declined by more than 10% between sensitive groups, and RF accuracy decreased by 20%. Moreover, under attack, fairness metrics deteriorated as well: Statistical Parity Difference (SPD), which measures the gap in positive-prediction rates across demographic groups, and Equal Opportunity Difference (EOD), which measures the gap in true-positive rates between groups, both worsened. This pattern suggests that adversarial perturbations increased bias and widened performance disparities across demographic groups. Our results demonstrated that adversarial attacks increased the incidence of false positives and false negatives, thereby lowering the accuracy and fairness of the PD diagnostic predictions.
These findings underscore the urgent need for robust and fairness-aware defenses in medical AI to mitigate racial, age, and gender disparities and ensure a reliable clinical decision-making process.
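The two fairness metrics named in this abstract are simple group-rate differences. A minimal sketch of how SPD and EOD are typically computed (toy binary predictions for two demographic groups, not the PPMI cohort):

```python
import numpy as np

def statistical_parity_difference(y_pred, group):
    """SPD: difference in positive-prediction rates between group 0 and group 1."""
    rate0 = y_pred[group == 0].mean()
    rate1 = y_pred[group == 1].mean()
    return rate0 - rate1

def equal_opportunity_difference(y_true, y_pred, group):
    """EOD: difference in true-positive rates between group 0 and group 1."""
    tpr = []
    for g in (0, 1):
        mask = (group == g) & (y_true == 1)   # actual positives in this group
        tpr.append(y_pred[mask].mean())
    return tpr[0] - tpr[1]

# Toy example: 8 cases, first 4 from group 0, last 4 from group 1.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 1, 0, 0, 1, 0, 1, 0])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])

spd = statistical_parity_difference(y_pred, group)      # 0.5 - 0.5 = 0.0
eod = equal_opportunity_difference(y_true, y_pred, group)  # 1.0 - 0.5 = 0.5
```

Values near zero indicate parity between groups; an adversarial attack that widens either gap is exactly the degradation the study reports.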
S. Thrun, L. Pratt
B. Scholkopf
Osita U. Omeje, Olanrewaju M. Bankole, Daniel E. Okojie et al.
Abstract Accurate fault location in transmission lines remains a critical challenge for modern power systems, particularly as networks become increasingly complex with the integration of renewable energy sources and smart grid technologies. Traditional fault location methods often struggle with high-impedance faults, non-homogeneous line parameters, and dynamic system conditions, leading to extended outage durations and reduced system reliability. This study addresses these challenges by developing an enhanced fault location method that combines conventional electromagnetic principles with advanced machine learning techniques. The methodology employs an approach that integrates modified impedance-based calculations with convolutional neural networks and machine learning regression. The method was validated using a modified IEEE 39-bus test system through simulations, incorporating several fault scenarios and system conditions. Testing utilised synchronised measurements from both transmission line ends, with data captured at 4096 samples per second. Results demonstrate significant improvements over existing techniques, achieving a 99.1% fault detection rate, 98.2% classification accuracy, and 1.2% mean absolute percentage error in location estimation. The method showed particular strength in challenging scenarios, reducing errors by 79.4% for high-impedance faults and maintaining accuracy under variable renewable generation conditions. The proposed method advances power system protection by providing a robust, adaptive solution suitable for modern grid requirements. Its implementation on conventional instrumentation facilitates practical adoption, offering improved reliability and reduced outage durations in real-world applications.
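For context on the impedance-based component that the paper modifies, the classical synchronized two-terminal relation can be sketched as follows (standard textbook notation, not taken from this paper):

```latex
% V_S, I_S and V_R, I_R are synchronized voltage/current phasors at the
% sending and receiving ends, Z_L is the total line impedance, and m is
% the per-unit fault distance from the sending end. Equating the
% fault-point voltage seen from both ends:
\[
  V_S - m Z_L I_S \;=\; V_R - (1 - m) Z_L I_R
\]
% and solving for the fault distance m:
\[
  m \;=\; \frac{V_S - V_R + Z_L I_R}{Z_L \,(I_S + I_R)}
\]
```

This lumped-parameter form degrades for high-impedance faults and non-homogeneous lines, which is precisely where the study's ML corrections claim the largest error reductions.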
Yanrong Chen, Zhiwen Shi, Anwar Eziz et al.
High-accuracy Digital Elevation Models (DEMs) are critical for hydrological and ecological applications in low-relief arid basins, yet Interferometric Synthetic Aperture Radar (InSAR)-derived DEMs suffer from significant altitudinal errors due to temporal decorrelation and phase unwrapping artifacts, particularly in flat terrains. To address these limitations, we developed a novel machine learning framework that synergizes Sentinel-1 InSAR, UAV photogrammetry, Sentinel-2 spectral indices, and ALOS topographic features to enhance DEM accuracy. The approach was validated in Northwest China’s Taitema Lake basin across 13 sample plots covering diverse arid surface types (dunes, wetlands, playas). Four algorithms – Random Forest (RF), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), and Polynomial Regression (PR) – were rigorously evaluated. Without topographic data, SVM achieved the highest accuracy (test-set R2 = 0.8564). Integrating terrain features with RF further improved performance (R2 = 0.8634, MAE = 1.0683 m), reducing errors from approximately [−10, 27] m to predominantly ±6 m. The RF-corrected DEM exhibited a 42.8% decrease in standard deviation (2.60 m → 1.49 m) and a substantial R2 increase (16.4% → 89.1%). Shapley Additive exPlanations (SHAP) interpretability analysis identified slope and near-infrared reflectance as dominant error-correction features. The corrected DEMs demonstrate enhanced terrain continuity, minimized elevation noise, and offer a scalable, efficient solution for InSAR post-processing in ecologically sensitive arid regions.
Mohit D. Gupta, Dixit Goyal, Shekhar Kunal et al.
Background: Risk stratification is an integral component of ST-segment-elevation myocardial infarction (STEMI) management practices. This study aimed to derive a machine learning (ML) model for risk stratification and identification of factors associated with in-hospital and 30-day mortality in patients with STEMI and compare it with the traditional TIMI score. Methods: This was a single-center prospective study wherein subjects >18 years with STEMI (n = 1700) were enrolled. Patients were divided into two groups: training (n = 1360) and validation (n = 340) datasets. Six ML algorithms (Extra Trees, Random Forest, Multilayer Perceptron, CatBoost, Logistic Regression and XGBoost) were used to train and tune the ML model and to determine the predictors of worse outcomes using feature selection. Additionally, the performance of the ML models for both in-hospital and 30-day outcomes was compared to that of the TIMI score. Results: Of the 1700 patients, 168 (9.88 %) had in-hospital mortality while 30-day mortality was reported in 210 (12.35 %) subjects. For in-hospital mortality, the Random Forest ML model (sensitivity: 80 %; specificity: 74 %; AUC: 80.83 %) outperformed the TIMI score (sensitivity: 70 %; specificity: 64 %; AUC: 70.7 %). Similarly, the Random Forest ML model (sensitivity: 81.63 %; specificity: 78.35 %; AUC: 78.29 %) performed better than the TIMI score (sensitivity: 63.26 %; specificity: 63.91 %; AUC: 63.59 %) for 30-day mortality. Key predictors of worse outcomes at 30 days included mitral regurgitation on presentation, smoking, cardiogenic shock, diabetes, ventricular septal rupture, Killip class, age, female gender, low blood pressure and low ejection fraction. Conclusions: The ML model outperformed the traditional regression-based TIMI score as a risk stratification tool in patients with STEMI.
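The AUC figures used to compare the ML models with the TIMI score can be computed from risk scores alone. A minimal rank-based (Mann-Whitney) sketch on toy scores (illustrative data, not the study's; ties are not handled):

```python
import numpy as np

def auc_score(y_true, scores):
    """AUC as the Mann-Whitney U statistic: the probability that a random
    positive case receives a higher risk score than a random negative case.
    Assumes no tied scores (a full implementation would average tied ranks)."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)   # rank 1 = lowest score
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    u = ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)

# Toy example: 2 deaths (label 1) and 2 survivors (label 0) with risk scores.
y_true = np.array([0, 0, 1, 1])
scores = np.array([0.10, 0.40, 0.35, 0.80])
auc = auc_score(y_true, scores)   # 0.75: one of four pos/neg pairs is misordered
```

An AUC of 0.808 (RF) versus 0.707 (TIMI) therefore means the ML model orders death/survival pairs correctly about 10 percentage points more often.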
Renyue Ji, Haisheng Wu, Hongli Lin et al.
Abstract Background Epidemiological research on the association between heavy metals and congestive heart failure (CHF) in individuals with abnormal glucose metabolism is scarce. The study addresses this research gap by examining the link between exposure to heavy metals and the odds of CHF in a population with dysregulated glucose metabolism. Method This cross-sectional study includes 7326 patients with diabetes and prediabetes from the National Health and Nutrition Examination Survey from 2011 to 2018. The exposure variables are five environmental heavy metals—cadmium (Cd), lead (Pb), mercury (Hg), selenium (Se), and manganese (Mn)—and the endpoint is CHF, determined via face-to-face interviews. Logistic regression, weighted quantile sum (WQS), and Bayesian kernel machine regression (BKMR) models were employed to investigate the association between exposure to mixtures of the five heavy metals and the odds of having CHF in individuals with diabetes and prediabetes. Result Multivariate logistic regression analysis shows that only blood Cd exhibited a significant linear positive correlation with CHF odds (OR: 1.26, 95% CI 1.07–1.47, p = 0.005), while the odds of CHF decreased significantly by 14% for each additional standard deviation of log10 Se (OR: 0.86, 95% CI 0.76–0.96, p = 0.009). The WQS index for the metal mixture only marginally increased the odds of CHF, by 1% (OR = 1.01, 95% CI 1.00–1.02, p = 0.032). BKMR analysis demonstrated a positive association between Cd levels and the odds of CHF, and an inverse relationship with Se levels, in patients with diabetes and prediabetes. However, no significant association was observed between the metal mixture and CHF. Conclusion This cross-sectional study demonstrates that increased Cd levels are associated with higher odds of CHF in patients with diabetes and prediabetes, whereas elevated blood Se levels significantly reduce these odds.
Nursena Baygin
<b>Background</b>: Pattern recognition and machine learning-based classification approaches are frequently used, especially in the health field. In this research, a new feature extraction model inspired by the melatonin hormone (sleep hormone) and named MelPat (melatonin pattern) has been developed. The developed model has been tested on an open access dataset. <b>Materials and Methods</b>: An open access sleep deprivation electroencephalography (EEG) dataset was tested to evaluate the MelPat method. There are two classes in the dataset. These are (a) sleep deprivation (SD) and (b) healthy control (HC) groups, respectively. In this study, EEG signals were divided into 15 s segments, thus obtaining 1377 SD and 1378 HC samples. In the next phase of the research, a new feature extraction model was proposed, and this model was named MelPat as it was inspired by the melatonin hormone. Additionally, the feature vector was expanded using the statistical moment approach. In the signal decomposition phase of the model, the Tunable Q-Wavelet Transform (TQWT) method was used. Thus, the signal was decomposed into sub-bands, and feature extraction was applied to each band. Neighborhood Component Analysis (NCA) and Chi2 methods were used together to reduce the dimension of the feature vector and select the most significant features. In this phase, the most significant features from both feature selection algorithms were combined, and the final feature vector was obtained. In the classification phase of the model, the Support Vector Machine (SVM) algorithm, which is a shallow classifier, was used. The dataset used in the research has 61 channels. Therefore, after obtaining channel-based results, the iterative majority voting (IMV) algorithm was applied to achieve higher classification performance and generalize the results, and the most accurate results were automatically selected. 
<b>Results</b>: With the proposed MelPat algorithm, a high classification accuracy of 97.71% was achieved on the open access sleep deprivation dataset. <b>Conclusions</b>: The obtained results show that the MelPat-based classification approach is highly effective on the dataset collected for SD detection. Moreover, the fact that the proposed method is inspired by melatonin, the sleep hormone, makes it conceptually appealing.
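The channel-combination step described in the Methods, iterative majority voting, is straightforward to sketch. The greedy variant below (toy channel predictions, not the 61-channel EEG data; the paper's exact IMV formulation may differ) ranks channels by individual accuracy and votes over growing odd-sized subsets:

```python
import numpy as np

def iterative_majority_voting(channel_preds, y_true):
    """Greedy IMV sketch: rank channels by individual accuracy, then
    majority-vote over the top-k channels for k = 1, 3, 5, ... and keep
    the best-performing k."""
    accs = (channel_preds == y_true).mean(axis=1)
    order = np.argsort(-accs)                          # best channels first
    best_acc, best_k = -1.0, 1
    for k in range(1, channel_preds.shape[0] + 1, 2):  # odd k avoids vote ties
        votes = channel_preds[order[:k]].sum(axis=0)
        voted = (votes > k / 2).astype(int)
        acc = (voted == y_true).mean()
        if acc > best_acc:
            best_acc, best_k = acc, k
    return best_k, float(best_acc)

# Toy data: 5 "channels" predicting 8 binary labels.
y_true = np.array([0, 1, 0, 1, 1, 0, 1, 0])
channel_preds = np.array([
    [1, 1, 0, 1, 1, 0, 1, 0],   # one error (position 0)
    [0, 1, 0, 1, 1, 0, 1, 1],   # one error (position 7)
    [0, 1, 1, 1, 1, 0, 1, 0],   # one error (position 2)
    [0, 0, 0, 1, 1, 1, 1, 0],   # two errors
    [1, 0, 0, 1, 0, 0, 1, 0],   # three errors
])
best_k, best_acc = iterative_majority_voting(channel_preds, y_true)
```

Because the top three channels err at different positions, voting over them corrects every individual mistake, which is exactly why IMV can lift channel-level accuracies to a higher combined figure.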
Thavavel Vaiyapuri, Zohra Sbai
Employee attrition is a persistent and significant problem across leading businesses globally. The issue not only reduces production but also impedes the ability of businesses to maintain continuity and plan strategically. Typically, employee attrition occurs when employees are dissatisfied with their work experiences. To address this issue effectively, proactive measures can be implemented to enhance employee retention through early identification and mitigation of the factors that contribute to perceived dissatisfaction in workplaces. In the current era of big data, people analytics has been widely adopted by human resource (HR) departments across various businesses with the aim of understanding distinct workforces and reducing the attrition rate. As a result, organizations are presently incorporating machine learning (ML) and artificial intelligence (AI) into HR practices to help decision-makers make better, well-informed decisions about their human resources. The application of ML has been confirmed to be an effective method for predicting employee attrition, and optimizing its hyperparameters can further improve prediction accuracy. Therefore, this study aimed to tune the hyperparameters of the boosting family of ML algorithms and develop a practical tool for employee attrition prediction through the adoption of Bayesian optimization (BO). Using the IBM HR Analytics dataset, the exploration compared the performance of six ensemble classifiers and identified categorical boosting (CB) as the superior model, which achieved the highest accuracy of 95.8% and an AUC of 0.98 with optimized hyperparameters, showing its comprehensiveness and reliability.
The comparison results showed how various boosting ML variants could be used to build a promising tool that is capable of accurately predicting employee attrition and enabling HR managers to enhance employee retention as well as satisfaction.
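Bayesian optimization as used for the hyperparameter tuning above can be illustrated in one dimension. The sketch below uses a Gaussian-process surrogate with an expected-improvement acquisition over a hypothetical "validation accuracy versus scaled hyperparameter" curve; the objective function, kernel length scale, initial design, and candidate grid are all illustrative assumptions, not the paper's setup:

```python
import math
import numpy as np

def rbf(a, b, length=0.15):
    """Squared-exponential kernel between two 1-D point sets."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, x_cand, jitter=1e-6):
    """Zero-mean GP posterior mean and std at candidate points."""
    K = rbf(x_obs, x_obs) + jitter * np.eye(len(x_obs))
    Ks = rbf(x_obs, x_cand)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y_obs
    var = 1.0 - np.einsum('ij,ik,kj->j', Ks, Kinv, Ks)  # diag of Ks' Kinv Ks
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mu, std, f_min, xi=0.01):
    """EI acquisition for minimization."""
    imp = f_min - mu - xi
    z = imp / std
    cdf = 0.5 * (1.0 + np.array([math.erf(v / math.sqrt(2)) for v in z]))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2 * math.pi)
    return imp * cdf + std * pdf

def val_accuracy(x):
    """Hypothetical accuracy vs. a hyperparameter scaled to [0, 1]; peaks at 0.3."""
    return math.exp(-(x - 0.3) ** 2 / 0.02)

x_obs = np.array([0.1, 0.5, 0.9])                    # initial design
y_obs = np.array([-val_accuracy(x) for x in x_obs])  # BO minimizes -accuracy
x_cand = np.linspace(0.0, 1.0, 101)

for _ in range(10):                                  # 10 BO iterations
    mu, std = gp_posterior(x_obs, y_obs, x_cand)
    x_next = x_cand[int(np.argmax(expected_improvement(mu, std, y_obs.min())))]
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, -val_accuracy(x_next))

best_x = float(x_obs[int(np.argmin(y_obs))])
best_acc = float(-y_obs.min())
```

The same surrogate-plus-acquisition loop scales to the multi-dimensional hyperparameter spaces of boosting models, spending far fewer training runs than grid search.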
Masahito Mochizuki, Yusuke Miyajima
The Berezinskii-Kosterlitz-Thouless (BKT) transition is a typical topological phase transition defined between binding and unbinding states of vortices and antivortices, which is not accompanied by spontaneous symmetry breaking. The BKT transition is known to be difficult to detect from thermodynamic quantities such as specific heat and magnetic susceptibility because of the absence of an anomaly in the free energy and significant finite-size effects. Therefore, methods based on statistical mechanics, which are commonly used to discuss phase transitions, cannot be easily applied to the BKT transition. In recent years, several attempts to detect the BKT transition using machine-learning methods based on image recognition techniques have been reported. However, the detection has turned out to be difficult even for machine-learning methods because of the absence of trivial order parameters and symmetry breaking. Most of the methods proposed so far require prior knowledge about the models and/or preprocessing of the input data for feature engineering, which is problematic in terms of general applicability. In this article, we review recent developments in machine-learning methods for detecting BKT transitions in several spin models. Specifically, we demonstrate the success of two new methods, the temperature-identification method and the phase-classification method, in detecting the BKT transitions in the q-state clock model and the XXZ model. This progress is expected to elevate machine-learning-based studies of spin models toward exploring new physics beyond simple benchmark tests.
Radford M. Neal
Umberto Michelucci
Hassan Shabani Mputu, Ahmed Abdel-Mawgood, Atsushi Shimada et al.
The demand for high-quality tomatoes to meet consumer and market standards, combined with large-scale production, has necessitated the development of inline quality grading, since manual grading is time-consuming, costly, and labor-intensive. This study introduces a novel approach for tomato quality sorting and grading. The method leverages pre-trained convolutional neural networks (CNNs) for feature extraction and traditional machine-learning algorithms for classification (a hybrid model). The single-board computer NVIDIA Jetson TX1 was used to create a tomato image dataset. Image preprocessing and fine-tuning techniques were applied to enable deep layers to learn and concentrate on complex and significant features. The extracted features were then classified using traditional machine learning algorithms, namely support vector machine (SVM), random forest (RF), and k-nearest neighbors (KNN) classifiers. Among the proposed hybrid models, the CNN-SVM method outperformed the other hybrid approaches, attaining an accuracy of 97.50% in the binary classification of tomatoes as healthy or rejected and 96.67% in the multiclass classification of tomatoes as ripe, unripe, or rejected when InceptionV3 was used as the feature extractor. When another (public) dataset was used, the proposed CNN-SVM hybrid model achieved an accuracy of 97.54% in categorizing tomatoes as ripe, unripe, old, or damaged, again outperforming the other hybrid models with InceptionV3 as the feature extractor. The accuracy, recall, precision, specificity, and F1-score of the best-performing hybrid model were evaluated.
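The hybrid pattern described here, a fixed feature extractor feeding a classical classifier, can be sketched generically. Below, a fixed random projection stands in for the pre-trained CNN and a nearest-centroid rule for the SVM; both are placeholders for illustration, not the paper's components, and the "images" are synthetic 64-dimensional vectors:

```python
import numpy as np

rng = np.random.default_rng(7)

# Stand-in "feature extractor": a fixed random projection plus ReLU
# (the paper uses a pre-trained CNN such as InceptionV3; this is a placeholder).
W = rng.normal(size=(64, 16))

def extract_features(images):
    """Map flattened 64-dim 'images' to 16-dim feature vectors."""
    return np.maximum(images @ W, 0.0)

# Stand-in classifier: nearest class centroid in feature space
# (the paper uses SVM/RF/KNN on the CNN features).
def fit_centroids(feats, labels):
    return {c: feats[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(centroids, feats):
    classes = np.array(sorted(centroids))
    C = np.stack([centroids[c] for c in classes])
    d = ((feats[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]

# Toy "images": two classes with different mean pixel intensity.
imgs = np.concatenate([rng.normal(0.0, 0.1, (30, 64)),
                       rng.normal(1.0, 0.1, (30, 64))])
labels = np.repeat([0, 1], 30)

feats = extract_features(imgs)
model = fit_centroids(feats, labels)
acc = (classify(model, feats) == labels).mean()
```

The appeal of the hybrid design is that only the cheap classifier is trained on the target data, while the expensive representation is reused frozen.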
Rong Shen, Shaoxiong Ming, Wei Qian et al.
Abstract Objectives To establish a predictive model for sepsis after percutaneous nephrolithotomy (PCNL) using machine learning to identify high-risk patients and enable early diagnosis and intervention by urologists. Methods A retrospective study including 694 patients who underwent PCNL was performed. A predictive model for sepsis using machine learning was constructed based on 22 preoperative and intraoperative parameters. Results Sepsis occurred in 45 of 694 patients, including 16 males (35.6%) and 29 females (64.4%). Data were randomly segregated into an 80% training set and a 20% validation set via 100-fold Monte Carlo cross-validation. The variables included in this study were highly independent. The model achieved good predictive power for postoperative sepsis (AUC = 0.89, 87.8% sensitivity, 86.9% specificity, and 87.4% accuracy). The top 10 variables that contributed to the model prediction were preoperative midstream urine bacterial culture, sex, days of preoperative antibiotic use, urinary nitrite, preoperative blood white blood cell (WBC), renal pyogenesis, staghorn stones, history of ipsilateral urologic surgery, cumulative stone diameters, and renal anatomic malformation. Conclusion Our predictive model is suitable for sepsis estimation after PCNL and could effectively reduce the incidence of sepsis through early intervention.
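The 100-fold Monte Carlo cross-validation mentioned above simply repeats random 80/20 splits and averages the test metric. A minimal sketch with a stand-in nearest-class-mean classifier (toy one-feature data, not the PCNL cohort):

```python
import numpy as np

def monte_carlo_cv(X, y, fit, predict, n_splits=100, test_frac=0.2, seed=42):
    """Repeated random train/test splits; returns mean test accuracy."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(test_frac * n))
    accs = []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        model = fit(X[train], y[train])
        accs.append((predict(model, X[test]) == y[test]).mean())
    return float(np.mean(accs))

# Stand-in "model": classify by the nearest class mean on a single feature.
def fit(X, y):
    return {c: X[y == c].mean() for c in np.unique(y)}

def predict(model, X):
    classes = np.array(sorted(model))
    means = np.array([model[c] for c in classes])
    return classes[np.abs(X[:, None] - means[None, :]).argmin(axis=1)]

# Toy separable data: 50 low-valued and 50 high-valued samples.
X = np.concatenate([np.linspace(0, 1, 50), np.linspace(2, 3, 50)])
y = np.concatenate([np.zeros(50, dtype=int), np.ones(50, dtype=int)])
mean_acc = monte_carlo_cv(X, y, fit, predict)
```

Unlike k-fold CV, the splits here are independent draws, so the number of repetitions (100 in the study) can be chosen freely to stabilize the estimate.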
Hamed Taherdoost
Amidst an unprecedented period of technological progress, the incorporation of digital platforms into diverse domains of life has become indispensable, fundamentally altering how governments, businesses, and individuals operate. Nevertheless, rapid digitization has also fueled the growth of cybercrime, which exploits weaknesses in interconnected systems. As society's dependence on digital communication, commerce, and information sharing has grown, malicious actors have exploited these platforms for hacking, identity theft, ransomware, and phishing attacks. This study examines 28 research papers focusing on intrusion detection systems (IDS), and phishing detection in particular, and on how quickly detections and responses in cybersecurity can be made. We investigate various approaches and quantitative measurements to understand the link between reaction time and detection time and emphasize the necessity of minimizing both for improved cybersecurity. The research focuses on reducing detection and reaction times, especially for phishing attempts. In smart grids and automotive control networks, faster attack detection is important, and machine learning can help. The study also stresses the necessity of improving protocols to address growing cyber risks while maintaining scalability, interoperability, and resilience. Although machine-learning-based techniques show promise for detection precision and reaction speed, obstacles remain in attaining real-time capabilities and adapting to constantly evolving threats.
To create effective defensive mechanisms against cyberattacks, future research topics include investigating innovative methodologies, integrating real-time threat intelligence, and encouraging collaboration.
Gabriel Pedroza
This work proposes a mathematical approach that (re)defines a property of Machine Learning models named stability and determines sufficient conditions to validate it. Machine Learning models are represented as functions, and the characteristics in scope depend upon the domain of the function, which allows us to adopt the theory of topological and metric spaces as a basis. Finally, this work provides some equivalences useful for proving and testing stability in Machine Learning models. The results suggest that whenever stability is aligned with the notion of function smoothness, the stability of Machine Learning models primarily depends upon certain topological, measurable properties of the classification sets within the ML model domain.
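A plausible formalization consistent with this abstract (generic notation, not necessarily the paper's): treat the model as a map between metric spaces and align stability with continuity, or with a Lipschitz bound in the uniform case:

```latex
% A model is a function f : (X, d_X) \to (Y, d_Y) between metric spaces.
% Pointwise stability at x \in X as (\varepsilon,\delta)-continuity:
\[
  \forall \varepsilon > 0 \;\exists \delta > 0 :\quad
  d_X(x, x') < \delta \;\Longrightarrow\; d_Y\bigl(f(x), f(x')\bigr) < \varepsilon
\]
% A stronger, uniform notion ties stability to smoothness via a
% Lipschitz constant L:
\[
  d_Y\bigl(f(x), f(x')\bigr) \;\le\; L \, d_X(x, x') \qquad \forall x, x' \in X
\]
```

Under such a reading, stability becomes a statement about how the model's classification sets partition the domain, which matches the abstract's claim that topological, measurable properties of those sets govern stability.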
Page 49 of 516,863