Results for "machine learning"

Showing 20 of ~2,797,278 results · from CrossRef, arXiv

CrossRef Open Access 2025
How does Quantum Machine Learning (QML) improve optimization in complex systems compared to classical machine learning algorithms?

Satish Pise

Abstract: Quantum Machine Learning (QML) represents a transformative paradigm that leverages quantum mechanical principles to enhance computational optimization in complex systems beyond the capabilities of classical machine learning algorithms. This paper investigates the optimization advantages of QML through a comprehensive analysis of quantum neural networks, quantum support vector machines, and hybrid quantum-classical architectures applied to complex system optimization problems. Our methodology employs variational quantum circuits with optimized feature encoding strategies and compares performance against classical baselines across multiple complexity scales. Experimental results demonstrate that QML algorithms achieve superior accuracy (95.1% for hybrid approaches vs. 89.5% for classical neural networks), faster convergence rates (10x improvement in optimization iterations), and exponential scalability advantages for large-scale problems (31.7x speedup for problems with 5000+ parameters). The findings reveal that quantum algorithms excel in exploring high-dimensional solution spaces through superposition and entanglement, enabling more efficient navigation of complex optimization landscapes. Key contributions include a novel hybrid quantum-classical optimization framework, systematic performance benchmarking across problem complexities, and identification of quantum advantage thresholds for practical deployment. Results indicate that QML provides significant computational benefits for complex system optimization, particularly in scenarios involving large parameter spaces, non-convex optimization landscapes, and real-time processing requirements.
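The variational quantum circuits the abstract mentions are trained by a classical optimizer acting on a quantum expectation value. A minimal single-qubit sketch (pure Python; the circuit, learning rate, and target are illustrative, not from the paper) using the parameter-shift rule:

```python
import math

def expectation_z(theta):
    """<Z> after applying RY(theta) to |0>: equals cos(theta)."""
    return math.cos(theta)

def parameter_shift_grad(theta, shift=math.pi / 2):
    """Parameter-shift rule: exact gradient of <Z> w.r.t. theta,
    obtained from just two circuit evaluations."""
    return 0.5 * (expectation_z(theta + shift) - expectation_z(theta - shift))

# Variational loop: a classical optimizer drives the circuit parameter
# to minimize <Z>, steering the qubit from |0> toward |1>.
theta, lr = 0.1, 0.4
for _ in range(100):
    theta -= lr * parameter_shift_grad(theta)

print(round(expectation_z(theta), 3))  # -1.0 (theta has converged to pi)
```

The parameter-shift rule yields exact gradients from circuit evaluations alone, which is what makes this hybrid quantum-classical loop trainable without backpropagating through quantum hardware.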

arXiv Open Access 2025
Adversarial Machine Learning Attacks on Financial Reporting via Maximum Violated Multi-Objective Attack

Edward Raff, Karen Kukla, Michel Benaroch et al.

Bad actors, primarily distressed firms, have the incentive and desire to manipulate their financial reports to hide their distress and derive personal gains. As attackers, these firms are motivated by potentially millions of dollars and the availability of many publicly disclosed and used financial modeling frameworks. Existing attack methods do not work on this data due to anti-correlated objectives that must both be satisfied for the attacker to succeed. We introduce Maximum Violated Multi-Objective (MVMO) attacks that adapt the attacker's search direction to find $20\times$ more satisfying attacks compared to standard attacks. The result is that in $\approx50\%$ of cases, a company could inflate their earnings by 100-200%, while simultaneously reducing their fraud scores by 15%. By working with lawyers and professional accountants, we ensure our threat model is realistic to how such frauds are performed in practice.
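The core idea, adapting the search direction toward whichever objective is currently most violated, can be sketched in a few lines (a toy gradient version with invented objectives loosely standing in for "inflate earnings" and "lower fraud score"; this is not the paper's actual attack):

```python
import numpy as np

def mvmo_step(x, objectives, grads, thresholds, lr=0.1):
    """One maximum-violated-objective step (toy sketch): among objectives
    still below their required threshold, follow the gradient of the
    single most violated one instead of a fixed weighted sum."""
    violations = [t - f(x) for f, t in zip(objectives, thresholds)]
    worst = int(np.argmax(violations))
    if violations[worst] <= 0:           # every objective is satisfied
        return x, True
    return x + lr * grads[worst](x), False

# Two anti-correlated toy objectives over x in R^2, both of which must
# exceed 1.0; improving one initially degrades the other.
objectives = [lambda x: x[0], lambda x: -x[0] + 2 * x[1]]
grads = [lambda x: np.array([1.0, 0.0]), lambda x: np.array([-1.0, 2.0])]
x, done = np.zeros(2), False
for _ in range(500):
    x, done = mvmo_step(x, objectives, grads, [1.0, 1.0])
    if done:
        break
print(done, np.round(x, 2))
```

Chasing only the most-violated objective lets the search satisfy anti-correlated goals that a fixed weighted sum would permanently trade off against each other.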

en cs.LG, stat.ML
arXiv Open Access 2025
Physics-consistent machine learning: output projection onto physical manifolds

Matilde Valente, Tiago C. Dias, Vasco Guerra et al.

Data-driven machine learning models often require extensive datasets, which can be costly or inaccessible, and their predictions may fail to comply with established physical laws. Current approaches for incorporating physical priors mitigate these issues by penalizing deviations from known physical laws, as in physics-informed neural networks, or by designing architectures that automatically satisfy specific invariants. However, penalization approaches do not guarantee compliance with physical constraints for unseen inputs, and invariant-based methods lack flexibility and generality. We propose a novel physics-consistent machine learning method that directly enforces compliance with physical principles by projecting model outputs onto the manifold defined by these laws. This procedure ensures that predictions inherently adhere to the chosen physical constraints, improving reliability and interpretability. Our method is demonstrated on two systems: a spring-mass system and a low-temperature reactive plasma. Compared to purely data-driven models, our approach significantly reduces errors in physical law compliance, enhances predictive accuracy of physical quantities, and outperforms alternatives when working with simpler models or limited datasets. The proposed projection-based technique is versatile and can function independently or in conjunction with existing physics-informed neural networks, offering a powerful, general, and scalable solution for developing fast and reliable surrogate models of complex physical systems, particularly in resource-constrained scenarios.
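For the spring-mass case, projecting a model output onto the physical manifold can be as simple as rescaling the predicted state back onto its constant-energy ellipse. A minimal sketch (a radial rescaling as a simple stand-in for the paper's projection; the predicted state and constants are invented):

```python
import math

def energy(x, v, k=1.0, m=1.0):
    """Total mechanical energy of a spring-mass system: kx^2/2 + mv^2/2."""
    return 0.5 * k * x * x + 0.5 * m * v * v

def project_to_energy(x, v, e_target, k=1.0, m=1.0):
    """Rescale a predicted state (x, v) so it lies exactly on the
    constant-energy manifold; a radial rescaling, simpler than a true
    orthogonal projection but enough to enforce the constraint."""
    s = math.sqrt(e_target / energy(x, v, k, m))
    return s * x, s * v

e0 = 0.5                      # the conserved energy of the true system
x_pred, v_pred = 0.72, 0.75   # hypothetical raw model output, violates e0
x_proj, v_proj = project_to_energy(x_pred, v_pred, e0)
print(round(energy(x_proj, v_proj), 10))  # 0.5 -- constraint holds exactly
```

Unlike a penalty term, the projection guarantees compliance for every input, including ones never seen in training, which is the paper's central argument.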

en cs.LG, cs.AI
arXiv Open Access 2024
Learning Decision Trees and Forests with Algorithmic Recourse

Kentaro Kanamori, Takuya Takagi, Ken Kobayashi et al.

This paper proposes a new algorithm for learning accurate tree-based models while ensuring the existence of recourse actions. Algorithmic Recourse (AR) aims to provide a recourse action for altering an undesired prediction result given by a model. Typical AR methods provide a reasonable action by solving an optimization task that minimizes the required effort among executable actions. In practice, however, such actions do not always exist for models optimized only for predictive performance. To alleviate this issue, we formulate the task of learning an accurate classification tree under the constraint of ensuring the existence of reasonable actions for as many instances as possible. Then, we propose an efficient top-down greedy algorithm by leveraging adversarial training techniques. We also show that our proposed algorithm can be applied to random forests, a popular framework for learning tree ensembles. Experimental results demonstrate that our method successfully provides reasonable actions to more instances than the baselines without significantly degrading accuracy or computational efficiency.
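The recourse objective, the cheapest executable action that flips an undesired prediction, can be illustrated on a toy decision list (the tree, features, and thresholds here are invented for illustration; the paper's contribution is learning trees so that such actions exist for most instances):

```python
def tree_predict(x, splits):
    """Tiny decision list standing in for a learned tree: the favorable
    leaf (1) is reached only if every feature clears its threshold."""
    return int(all(x[f] > t for f, t in splits))

def recourse_action(x, splits, eps=1e-6):
    """Minimal-effort recourse for the toy tree above: nudge each failing
    feature just past its threshold, leaving passing features untouched."""
    action = dict(x)
    for f, t in splits:
        if action[f] <= t:
            action[f] = t + eps
    return action

x = {"income": 30.0, "tenure": 2.0}           # invented applicant
splits = [("income", 50.0), ("tenure", 1.0)]  # invented thresholds
x_cf = recourse_action(x, splits)
print(tree_predict(x, splits), tree_predict(x_cf, splits))  # 0 1
```

For a real tree ensemble the action must jointly cross splits in many trees, which is why the paper casts recourse existence as a constraint during learning rather than a post-hoc search.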

en cs.LG, stat.ML
arXiv Open Access 2024
No $D_{\text{train}}$: Model-Agnostic Counterfactual Explanations Using Reinforcement Learning

Xiangyu Sun, Raquel Aoki, Kevin H. Wilson

Machine learning (ML) methods have experienced significant growth in the past decade, yet their practical application in high-impact real-world domains has been hindered by their opacity. When ML methods are responsible for making critical decisions, stakeholders often require insights into how to alter these decisions. Counterfactual explanations (CFEs) have emerged as a solution, offering interpretations of opaque ML models and providing a pathway to transition from one decision to another. However, most existing CFE methods require access to the model's training dataset, few methods can handle multivariate time-series, and no model-agnostic CFE method can handle multivariate time-series without a training dataset. These limitations can be formidable in many scenarios. In this paper, we present NTD-CFE, a novel model-agnostic CFE method based on reinforcement learning (RL) that generates CFEs when training datasets are unavailable. NTD-CFE is suitable for both static and multivariate time-series datasets with continuous and discrete features. NTD-CFE reduces the CFE search space from a multivariate time-series domain to a lower dimensional space and addresses the problem using RL. Users have the flexibility to specify non-actionable, immutable, and preferred features, as well as causal constraints. We demonstrate the performance of NTD-CFE against four baselines on several datasets and find that, despite not having access to a training dataset, NTD-CFE finds CFEs that make significantly fewer and significantly smaller changes to the input time-series. These properties make CFEs more actionable, as the magnitude of change required to alter an outcome is vastly reduced. The code is available in the supplementary material.
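The black-box setting, no training data and only query access to the model, can be sketched with a crude random search standing in for NTD-CFE's RL agent (the model, features, and step sizes are invented; a real method would also minimize the size of the change):

```python
import random

def blackbox_model(x):
    """Stand-in opaque classifier: approves when the feature sum exceeds 1."""
    return int(sum(x) > 1.0)

def counterfactual_search(x, model, target=1, immutable=(), steps=2000, seed=0):
    """Query-only counterfactual search: randomly perturb one mutable
    feature at a time (with a slight upward bias) until the black-box
    prediction flips to the target class."""
    rng = random.Random(seed)
    x = list(x)
    mutable = [i for i in range(len(x)) if i not in immutable]
    for _ in range(steps):
        if model(x) == target:
            return x
        x[rng.choice(mutable)] += rng.uniform(-0.05, 0.1)
    return None

# Feature 2 is declared immutable, so the search never touches it.
cf = counterfactual_search([0.2, 0.3, 0.1], blackbox_model, immutable=(2,))
print(cf is not None and blackbox_model(cf) == 1)  # True
```

Supporting immutable and non-actionable features, as in the `immutable` argument above, is what keeps the resulting counterfactuals actionable for the person receiving them.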

en cs.LG, stat.ME
arXiv Open Access 2023
On Responsible Machine Learning Datasets with Fairness, Privacy, and Regulatory Norms

Surbhi Mittal, Kartik Thakral, Richa Singh et al.

Artificial Intelligence (AI) has made its way into various scientific fields, providing astonishing improvements over existing algorithms for a wide variety of tasks. In recent years, there have been severe concerns over the trustworthiness of AI technologies. The scientific community has focused on the development of trustworthy AI algorithms. However, machine and deep learning algorithms, popular in the AI community today, depend heavily on the data used during their development. These learning algorithms identify patterns in the data, learning the behavioral objective. Any flaws in the data have the potential to translate directly into algorithms. In this study, we discuss the importance of Responsible Machine Learning Datasets and propose a framework to evaluate the datasets through a responsible rubric. While existing work focuses on the post-hoc evaluation of algorithms for their trustworthiness, we provide a framework that considers the data component separately to understand its role in the algorithm. We discuss responsible datasets through the lens of fairness, privacy, and regulatory compliance and provide recommendations for constructing future datasets. After surveying over 100 datasets, we use 60 datasets for analysis and demonstrate that none of these datasets is immune to issues of fairness, privacy preservation, and regulatory compliance. We provide modifications to the "datasheets for datasets" with important additions for improved dataset documentation. With governments around the world enacting data protection laws, the method for the creation of datasets in the scientific community requires revision. We believe this study is timely and relevant in today's era of AI.

en cs.LG, cs.CV
arXiv Open Access 2023
Fast Slate Policy Optimization: Going Beyond Plackett-Luce

Otmane Sakhi, David Rohde, Nicolas Chopin

An increasingly important building block of large scale machine learning systems is based on returning slates: ordered lists of items given a query. Applications of this technology include search, information retrieval, and recommender systems. When the action space is large, decision systems are restricted to a particular structure to complete online queries quickly. This paper addresses the optimization of these large scale decision systems given an arbitrary reward function. We cast this learning problem in a policy optimization framework and propose a new class of policies, born from a novel relaxation of decision functions. This results in a simple, yet efficient learning algorithm that scales to massive action spaces. We compare our method to the commonly adopted Plackett-Luce policy class and demonstrate the effectiveness of our approach on problems with action space sizes on the order of millions.
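The Plackett-Luce baseline the paper compares against samples an ordered slate without replacement; one standard way to draw from it is the Gumbel-top-k trick (the scores and slate size below are illustrative):

```python
import math
import random

def sample_slate(scores, k, rng=random):
    """Draw an ordered slate of k distinct items from a Plackett-Luce
    policy via the Gumbel-top-k trick: perturb each log-score with
    Gumbel noise and keep the k largest."""
    keys = {item: s - math.log(-math.log(rng.random()))
            for item, s in scores.items()}
    return sorted(keys, key=keys.get, reverse=True)[:k]

scores = {"a": 2.0, "b": 1.0, "c": 0.0, "d": -1.0}  # illustrative log-scores
slate = sample_slate(scores, k=2, rng=random.Random(42))
print(slate)  # e.g. ['a', 'b']; higher-scored items lead slates more often
```

The trick needs one noise draw and one sort per query, but gradient estimation through the sampled permutation is what becomes costly at massive action spaces, which motivates the paper's alternative policy class.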

en cs.LG, cs.IR
arXiv Open Access 2023
Optimal Transport Perturbations for Safe Reinforcement Learning with Robustness Guarantees

James Queeney, Erhan Can Ozcan, Ioannis Ch. Paschalidis et al.

Robustness and safety are critical for the trustworthy deployment of deep reinforcement learning. Real-world decision making applications require algorithms that can guarantee robust performance and safety in the presence of general environment disturbances, while making limited assumptions on the data collection process during training. In order to accomplish this goal, we introduce a safe reinforcement learning framework that incorporates robustness through the use of an optimal transport cost uncertainty set. We provide an efficient implementation based on applying Optimal Transport Perturbations to construct worst-case virtual state transitions, which does not impact data collection during training and does not require detailed simulator access. In experiments on continuous control tasks with safety constraints, our approach demonstrates robust performance while significantly improving safety at deployment time compared to standard safe reinforcement learning.

en cs.LG, cs.AI
arXiv Open Access 2023
Graph-based Multi-ODE Neural Networks for Spatio-Temporal Traffic Forecasting

Zibo Liu, Parshin Shojaee, Chandan K Reddy

There is a recent surge in the development of spatio-temporal forecasting models in the transportation domain. Long-range traffic forecasting, however, remains a challenging task due to the intricate and extensive spatio-temporal correlations observed in traffic networks. Current works primarily rely on road networks with graph structures and learn representations using graph neural networks (GNNs), but this approach suffers from the over-smoothing problem in deep architectures. To tackle this problem, recent methods introduced the combination of GNNs with residual connections or neural ordinary differential equations (ODEs). However, current graph ODE models face two key limitations in feature extraction: (1) they lean towards global temporal patterns, overlooking local patterns that are important for unexpected events; and (2) they lack dynamic semantic edges in their architectural design. In this paper, we propose a novel architecture called Graph-based Multi-ODE Neural Networks (GRAM-ODE) which is designed with multiple connective ODE-GNN modules to learn better representations by capturing different views of complex local and global dynamic spatio-temporal dependencies. We also add techniques such as shared weights and divergence constraints to the intermediate layers of distinct ODE-GNN modules to further improve their communication towards the forecasting task. Our extensive experiments on six real-world datasets demonstrate the superior performance of GRAM-ODE compared with state-of-the-art baselines as well as the contribution of different components to the overall performance. The code is available at https://github.com/zbliu98/GRAM-ODE
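The ODE-GNN building block behind such models treats message passing as continuous-time dynamics on the graph. A minimal sketch, Euler-integrating plain graph diffusion rather than the paper's learned multi-ODE modules:

```python
import numpy as np

def graph_laplacian(adj):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    return np.diag(adj.sum(axis=1)) - adj

def integrate_graph_ode(x0, laplacian, t=1.0, steps=100):
    """Euler-integrate the graph diffusion ODE dx/dt = -L x, the basic
    continuous-depth message-passing dynamic that graph ODE models use
    in place of a fixed stack of GNN layers."""
    x, dt = x0.astype(float), t / steps
    for _ in range(steps):
        x = x - dt * (laplacian @ x)
    return x

# 3-node path graph: node features diffuse toward their common mean (0).
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
x = integrate_graph_ode(np.array([1.0, 0.0, -1.0]), graph_laplacian(adj), t=5.0)
print(np.round(x, 3))  # close to [0, 0, 0]; the feature total is conserved
```

The long-time behavior, all features collapsing toward the mean, is exactly the over-smoothing the abstract describes; learned ODE dynamics and residual terms exist to counteract it.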

en cs.LG, cs.AI
arXiv Open Access 2022
Machine Learning-based EEG Applications and Markets

Weiqing Gu, Bohan Yang, Ryan Chang

This paper addresses both the various EEG applications and the current EEG market ecosystem propelled by machine learning. Increasingly available open medical and health datasets using EEG encourage data-driven research with a promise of improving neurology for patient care through knowledge discovery and machine learning data science algorithm development. This effort leads to various kinds of EEG developments and currently forms a new EEG market. This paper presents a comprehensive survey of the EEG market and covers six significant applications of EEG: diagnosis/screening, drug development, neuromarketing, daily health, metaverse, and age/disability assistance. The highlight of this survey is the comparison between the research field and the business market. Our survey points out the current limitations of EEG and indicates future directions of research and business opportunities for every EEG application listed above. Based on our survey, more research on machine learning-based EEG applications will lead to a more robust EEG-related market. More companies will use the research technology and apply it to real-life settings. As the EEG-related market grows, EEG-related devices will collect more EEG data, and more EEG data will be available for researchers to use in their studies, forming a virtuous cycle. Our market analysis indicates that research on the use of EEG data and machine learning in the six applications listed above points toward a clear trend in the growth and development of the EEG ecosystem and the machine learning world.

en cs.LG, stat.ML
arXiv Open Access 2022
PAC-Net: A Model Pruning Approach to Inductive Transfer Learning

Sanghoon Myung, In Huh, Wonik Jang et al.

Inductive transfer learning aims to learn from a small amount of training data for the target task by utilizing a pre-trained model from the source task. Most strategies that involve large-scale deep learning models adopt initialization with the pre-trained model and fine-tuning for the target task. However, when using over-parameterized models, we can often prune the model without sacrificing the accuracy of the source task. This motivates us to adopt model pruning for transfer learning with deep learning models. In this paper, we propose PAC-Net, a simple yet effective approach for transfer learning based on pruning. PAC-Net consists of three steps: Prune, Allocate, and Calibrate (PAC). The main idea behind these steps is to identify essential weights for the source task, fine-tune on the source task by updating the essential weights, and then calibrate on the target task by updating the remaining redundant weights. Under the various and extensive set of inductive transfer learning experiments, we show that our method achieves state-of-the-art performance by a large margin.
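The Prune-Allocate-Calibrate split can be sketched with plain weight masks (magnitude pruning plus a single masked gradient step; the shapes and gradient are made up, and no real source/target tasks are modeled):

```python
import numpy as np

def pac_split(weights, prune_ratio=0.5):
    """Prune step: magnitude-based split into an 'essential' mask (the
    weights kept for the source task) and its redundant complement."""
    k = int(weights.size * prune_ratio)
    threshold = np.sort(np.abs(weights).ravel())[k]
    essential = np.abs(weights) >= threshold
    return essential, ~essential

def masked_update(weights, grad, mask, lr=0.1):
    """Gradient step restricted to `mask`: allocate fine-tunes the
    essential weights on the source task, while calibrate updates only
    the redundant weights on the target task."""
    return weights - lr * grad * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
essential, redundant = pac_split(w)
g = rng.normal(size=(4, 4))   # hypothetical gradient from the target task
w_new = masked_update(w, g, redundant)  # calibrate: essential weights frozen
print(bool((w_new[essential] == w[essential]).all()))  # True
```

Because the two masks partition the parameters, target-task calibration cannot overwrite the source knowledge stored in the essential weights, which is the mechanism the paper exploits.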

en cs.LG, cs.AI
arXiv Open Access 2022
Nonlinear MCMC for Bayesian Machine Learning

James Vuckovic

We explore the application of a nonlinear MCMC technique first introduced in [1] to problems in Bayesian machine learning. We provide a convergence guarantee in total variation that uses novel results for long-time convergence and large-particle ("propagation of chaos") convergence. We apply this nonlinear MCMC technique to sampling problems including a Bayesian neural network on CIFAR10.

en stat.ML, cs.LG
arXiv Open Access 2021
Active Learning in Incomplete Label Multiple Instance Multiple Label Learning

Tam Nguyen, Raviv Raich

In multiple instance multiple label (MIML) learning, each sample, a bag, consists of multiple instances. To alleviate labeling complexity, each sample is associated with a set of bag-level labels, leaving instances within the bag unlabeled. This setting is more convenient and natural for representing complicated objects, which have multiple semantic meanings. Compared to single instance labeling, this approach allows for labeling larger datasets at an equivalent labeling cost. However, for sufficiently large datasets, labeling all bags may become prohibitively costly. Active learning uses an iterative labeling and retraining approach aiming to provide reasonable classification performance using a small number of labeled samples. To our knowledge, only a few works in the area of active learning in the MIML setting are available. These approaches can provide practical solutions to reduce labeling cost but their efficacy remains unclear. In this paper, we propose a novel bag-class pair based approach for active learning in the MIML setting. Due to the partial availability of bag-level labels, we focus on the incomplete-label MIML setting for the proposed active learning approach. Our approach is based on a discriminative graphical model with efficient and exact inference. For the query process, we adapt active learning criteria to the novel bag-class pair selection strategy. Additionally, we introduce an online stochastic gradient descent algorithm to provide an efficient model update after each query. Numerical experiments on benchmark datasets illustrate the robustness of the proposed approach.

en cs.LG
arXiv Open Access 2021
Predicting Visual Improvement after Macular Hole Surgery: a Cautionary Tale on Deep Learning with Very Limited Data

M. Godbout, A. Lachance, F. Antaki et al.

We investigate the potential of machine learning models for the prediction of visual improvement after macular hole surgery from preoperative data (retinal images and clinical features). Collecting our own data for the task, we end up with only 121 total samples, putting our work in the very limited data regime. We explore a variety of deep learning methods for limited data to train deep computer vision models, finding that all tested deep vision models are outperformed by a simple regression model on the clinical features. We believe this is compelling evidence of the extreme difficulty of using deep learning on very limited data.

en eess.IV, cs.CV
arXiv Open Access 2020
Estimating Uncertainty Intervals from Collaborating Networks

Tianhui Zhou, Yitong Li, Yuan Wu et al.

Effective decision making requires understanding the uncertainty inherent in a prediction. In regression, this uncertainty can be estimated by a variety of methods; however, many of these methods are laborious to tune, generate overconfident uncertainty intervals, or lack sharpness (give imprecise intervals). We address these challenges by proposing a novel method to capture predictive distributions in regression by defining two neural networks with two distinct loss functions. Specifically, one network approximates the cumulative distribution function, and the second network approximates its inverse. We refer to this method as Collaborating Networks (CN). Theoretical analysis demonstrates that a fixed point of the optimization is at the idealized solution, and that the method is asymptotically consistent to the ground truth distribution. Empirically, learning is straightforward and robust. We benchmark CN against several common approaches on two synthetic and six real-world datasets, including forecasting A1c values in diabetic patients from electronic health records, where uncertainty is critical. In the synthetic data, the proposed approach essentially matches ground truth. In the real-world datasets, CN improves results on many performance metrics, including log-likelihood estimates, mean absolute errors, coverage estimates, and prediction interval widths.
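CN's second network learns the inverse CDF, i.e. a quantile function. As a minimal related sketch (a one-parameter quantile fit via the pinball loss, not the paper's two-network method; the data and hyperparameters are invented):

```python
import random

def pinball_grad(pred, y, tau):
    """(Sub)gradient of the pinball/quantile loss w.r.t. the prediction;
    minimizing it drives `pred` toward the tau-quantile of y."""
    return -tau if y > pred else (1.0 - tau)

def fit_quantile(samples, tau, lr=0.01, epochs=200):
    """Fit a constant tau-quantile predictor by SGD -- a one-parameter
    stand-in for an inverse-CDF network evaluated at a fixed tau."""
    q = 0.0
    for _ in range(epochs):
        for y in samples:
            q -= lr * pinball_grad(q, y, tau)
    return q

rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(500)]
lo, hi = fit_quantile(data, 0.1), fit_quantile(data, 0.9)
inside = sum(lo <= y <= hi for y in data) / len(data)
print(round(lo, 2), round(hi, 2), round(inside, 2))  # roughly -1.3, 1.3, 0.8
```

Sweeping tau traces out the whole predictive distribution; CN additionally trains a second network for the forward CDF so the two can check each other, which is where its calibration guarantees come from.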

en stat.ML, cs.LG

Page 55 of 139,864