Machine Learning: Algorithms, Real-World Applications and Research Directions
Iqbal H. Sarker
In the current age of the Fourth Industrial Revolution (4IR or Industry 4.0), the digital world has a wealth of data, such as Internet of Things (IoT) data, cybersecurity data, mobile data, business data, social media data, health data, etc. To intelligently analyze these data and develop the corresponding smart and automated applications, knowledge of artificial intelligence (AI), particularly machine learning (ML), is the key. Various types of machine learning algorithms, such as supervised, unsupervised, semi-supervised, and reinforcement learning, exist in the area. Besides, deep learning, which is part of a broader family of machine learning methods, can intelligently analyze data on a large scale. In this paper, we present a comprehensive view of these machine learning algorithms that can be applied to enhance the intelligence and capabilities of an application. Thus, this study's key contribution is explaining the principles of different machine learning techniques and their applicability in various real-world application domains, such as cybersecurity systems, smart cities, healthcare, e-commerce, agriculture, and many more. We also highlight the challenges and potential research directions based on our study. Overall, this paper aims to serve as a reference point for both academia and industry professionals, as well as for decision-makers in various real-world situations and application areas, particularly from the technical point of view.
4210 citations
en
Computer Science, Medicine
Deep Reinforcement Learning for Autonomous Driving: A Survey
B. R. Kiran, Ibrahim Sobh, V. Talpaert
et al.
With the development of deep representation learning, the domain of reinforcement learning (RL) has become a powerful learning framework now capable of learning complex policies in high-dimensional environments. This review summarises deep reinforcement learning (DRL) algorithms and provides a taxonomy of automated driving tasks where (D)RL methods have been employed, while addressing key computational challenges in the real-world deployment of autonomous driving agents. It also delineates adjacent domains such as behavior cloning, imitation learning, and inverse reinforcement learning, which are related but are not classical RL algorithms. The role of simulators in training agents, as well as methods to validate, test, and robustify existing solutions in RL, is discussed.
2249 citations
en
Computer Science
Deep Clustering for Unsupervised Learning of Visual Features
Mathilde Caron, Piotr Bojanowski, Armand Joulin
et al.
Clustering is a class of unsupervised learning methods that has been extensively applied and studied in computer vision. Little work has been done to adapt it to the end-to-end training of visual features on large-scale datasets. In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features. DeepCluster iteratively groups the features with a standard clustering algorithm, k-means, and uses the subsequent assignments as supervision to update the weights of the network. We apply DeepCluster to the unsupervised training of convolutional neural networks on large datasets like ImageNet and YFCC100M. The resulting model outperforms the current state of the art by a significant margin on all the standard benchmarks.
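The alternation the abstract describes (cluster the current features with k-means, then use the assignments as supervision to update the network) can be sketched on toy one-dimensional data. This is a minimal illustration, not the paper's method: the single parameter `w` stands in for the convolutional network, and the data are synthetic.

```python
import random

random.seed(0)

# Toy 1-D "images": two well-separated groups of scalars.
data = [random.gauss(0.0, 0.3) for _ in range(20)] + \
       [random.gauss(5.0, 0.3) for _ in range(20)]

w = 1.7  # the entire "network": feature(x) = w * x

def kmeans_1d(feats, k=2, iters=10):
    """Plain k-means on 1-D features; returns (assignments, centres)."""
    centers = [min(feats), max(feats)]          # deterministic init
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: (f - centers[j]) ** 2)
                  for f in feats]
        for j in range(k):
            members = [f for f, a in zip(feats, assign) if a == j]
            if members:
                centers[j] = sum(members) / len(members)
    return assign, centers

for epoch in range(5):
    feats = [w * x for x in data]
    assign, centers = kmeans_1d(feats)   # step 1: cluster the features
    # step 2: treat the assignments as pseudo-labels and take one
    # gradient step pulling each feature toward its assigned centre
    grad = sum(2 * x * (w * x - centers[a])
               for x, a in zip(data, assign)) / len(data)
    w -= 0.01 * grad
```

Note that `w` shrinks slightly each epoch: naive alternation admits trivial solutions (collapsing the features), which the full method has to guard against.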
2254 citations
en
Computer Science
All-optical machine learning using diffractive deep neural networks
Xing Lin, Y. Rivenson, N. Yardimci
et al.
All-optical deep learning: Deep learning uses multilayered artificial neural networks to learn digitally from large datasets and then performs advanced identification and classification tasks. To date, these multilayered neural networks have been implemented on a computer. Lin et al. demonstrate all-optical machine learning that uses passive optical components that can be patterned and fabricated with 3D printing. Their hardware approach comprises stacked layers of diffractive optical elements, analogous to an artificial neural network, that can be trained to execute complex functions at the speed of light. (Science, this issue p. 1004)

Deep learning has been transforming our ability to execute advanced inference tasks using computers. Here we introduce a physical mechanism to perform machine learning by demonstrating an all-optical diffractive deep neural network (D2NN) architecture that can implement various functions following the deep learning-based design of passive diffractive layers that work collectively. We created 3D-printed D2NNs that implement classification of images of handwritten digits and fashion products, as well as the function of an imaging lens in the terahertz spectrum. Our all-optical deep learning framework can perform, at the speed of light, various complex functions that computer-based neural networks can execute; it will find applications in all-optical image analysis, feature detection, and object classification, and will also enable new camera designs and optical components that perform distinctive tasks using D2NNs.
2138 citations
en
Computer Science, Medicine
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto, D. Meger, Doina Precup
Many practical applications of reinforcement learning constrain agents to learn from a fixed batch of data which has already been gathered, without offering further possibility for data collection. In this paper, we demonstrate that due to errors introduced by extrapolation, standard off-policy deep reinforcement learning algorithms, such as DQN and DDPG, are incapable of learning with data uncorrelated to the distribution under the current policy, making them ineffective for this fixed batch setting. We introduce a novel class of off-policy algorithms, batch-constrained reinforcement learning, which restricts the action space in order to force the agent towards behaving close to on-policy with respect to a subset of the given data. We present the first continuous control deep reinforcement learning algorithm which can learn effectively from arbitrary, fixed batch data, and empirically demonstrate the quality of its behavior in several tasks.
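The batch-constrained idea (bootstrap only from actions the batch actually contains) is easiest to see in a tabular sketch. The paper's algorithm targets continuous control with a generative model of the batch; the toy MDP below is hypothetical and serves only to show the constrained update.

```python
from collections import defaultdict

# Hypothetical fixed batch of (state, action, reward, next_state)
# transitions; note that state 1 was only ever explored with action 'b'.
batch = [
    (0, 'a', 1.0, 1),
    (0, 'b', 0.0, 1),
    (1, 'b', 0.0, 0),
]

seen = defaultdict(set)            # actions actually present per state
for s, a, _, _ in batch:
    seen[s].add(a)

Q = defaultdict(float)
gamma, alpha = 0.9, 0.1
for _ in range(500):               # sweep the fixed batch repeatedly
    for s, a, r, s2 in batch:
        # batch-constrained max: bootstrap only from actions seen in s2,
        # never from unsupported pairs like (1, 'a')
        target = r + gamma * max(Q[(s2, b)] for b in seen[s2])
        Q[(s, a)] += alpha * (target - Q[(s, a)])
```

An unconstrained DQN-style update would instead maximize over all actions, bootstrapping from Q[(1, 'a')], a value supported by no data; that is the extrapolation error the paper identifies.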
1951 citations
en
Mathematics, Computer Science
Rainbow: Combining Improvements in Deep Reinforcement Learning
Matteo Hessel, Joseph Modayil, H. V. Hasselt
et al.
The deep reinforcement learning community has made several independent improvements to the DQN algorithm. However, it is unclear which of these extensions are complementary and can be fruitfully combined. This paper examines six extensions to the DQN algorithm and empirically studies their combination. Our experiments show that the combination provides state-of-the-art performance on the Atari 2600 benchmark, both in terms of data efficiency and final performance. We also provide results from a detailed ablation study that shows the contribution of each component to overall performance.
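One of the six combined extensions, double Q-learning, decouples action selection from action evaluation in the bootstrap target. A minimal sketch with made-up Q-value estimates:

```python
# Hypothetical Q-value estimates over three actions in one state.
q_online = [1.0, 2.0, 1.5]   # online network
q_target = [1.2, 0.5, 1.4]   # target network
r, gamma = 0.0, 0.99

# Standard DQN target: the target network both selects and evaluates,
# which biases the bootstrap value upward.
dqn_target = r + gamma * max(q_target)

# Double Q-learning: the online network selects the action,
# the target network evaluates it.
a_star = max(range(len(q_online)), key=q_online.__getitem__)
ddqn_target = r + gamma * q_target[a_star]
```

With these (made-up) estimates the double target is lower (0.495 vs. 1.386), illustrating how decoupling selection from evaluation tempers overestimation.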
2569 citations
en
Computer Science
Deep Learning for Tomato Diseases: Classification and Symptoms Visualization
Mohammed Brahimi, K. Boukhalfa, A. Moussaoui
794 citations
en
Computer Science
Deep learning in fluid dynamics
J. Kutz
781 citations
en
Computer Science
Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion
P. Vincent, H. Larochelle, Isabelle Lajoie
et al.
7482 citations
en
Mathematics, Computer Science
Ian Goodfellow, Yoshua Bengio, and Aaron Courville: Deep learning
J. Heaton
735 citations
en
Computer Science
A survey of deep learning-based network anomaly detection
Donghwoon Kwon, Hyunjoo Kim, Jinoh Kim
et al.
727 citations
en
Computer Science
Deep learning for computational chemistry
Garrett B. Goh, Nathan Oken Hodas, Abhinav Vishnu
The rise and fall of artificial neural networks is well documented in the scientific literature of both computer science and computational chemistry. Yet almost two decades later, we are now seeing a resurgence of interest in deep learning, a machine learning algorithm based on multilayer neural networks. Within the last few years, we have seen the transformative impact of deep learning in many domains, particularly in speech recognition and computer vision, to the extent that the majority of expert practitioners in those fields are now regularly eschewing prior established models in favor of deep learning models. In this review, we provide an introductory overview of the theory of deep neural networks and their unique properties that distinguish them from traditional machine learning algorithms used in cheminformatics. By providing an overview of the variety of emerging applications of deep neural networks, we highlight their ubiquity and broad applicability to a wide range of challenges in the field, including quantitative structure-activity relationships, virtual screening, protein structure prediction, quantum chemistry, materials design, and property prediction. In reviewing the performance of deep neural networks, we observed consistent outperformance against non-neural-network state-of-the-art models across disparate research topics, and deep neural network-based models often exceeded the "glass ceiling" expectations of their respective tasks. Coupled with the maturity of GPU-accelerated computing for training deep neural networks and the exponential growth of chemical data on which to train these networks, we anticipate that deep learning algorithms will be a valuable tool for computational chemistry. © 2017 Wiley Periodicals, Inc.
727 citations
en
Computer Science, Mathematics
State-of-the-Art Deep Learning: Evolving Machine Intelligence Toward Tomorrow’s Intelligent Network Traffic Control Systems
Z. Fadlullah, Fengxiao Tang, Bomin Mao
et al.
719 citations
en
Computer Science
Deep Learning for Image-Based Cassava Disease Detection
Amanda Ramcharan, Kelsee Baranowski, Peter McClowsky
et al.
Cassava is the third largest source of carbohydrates for human food in the world but is vulnerable to virus diseases, which threaten to destabilize food security in sub-Saharan Africa. Novel methods of cassava disease detection are needed to support the improved control that will prevent this crisis. Image recognition offers both a cost-effective and scalable technology for disease detection. New deep learning models offer an avenue for this technology to be easily deployed on mobile devices. Using a dataset of cassava disease images taken in the field in Tanzania, we applied transfer learning to train a deep convolutional neural network to identify three diseases and two types of pest damage (or lack thereof). The best trained model accuracies were 98% for brown leaf spot (BLS), 96% for red mite damage (RMD), 95% for green mite damage (GMD), 98% for cassava brown streak disease (CBSD), and 96% for cassava mosaic disease (CMD). The best model achieved an overall accuracy of 93% on data not used in the training process. Our results show that the transfer learning approach for image recognition of field images offers a fast, affordable, and easily deployable strategy for digital plant disease detection.
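The transfer-learning recipe (reuse a frozen pretrained feature extractor, train only a new classification head on the target data) can be sketched in miniature. Everything below is synthetic: a toy scalar "embedding" stands in for the pretrained CNN features, and the labels stand in for the disease classes.

```python
import math, random

random.seed(1)

# Stand-in "pretrained" feature extractor: frozen during transfer
# (in the paper this role is played by a pretrained deep CNN).
def features(x):
    return (x, x * x)              # fixed 2-d embedding of a scalar

# Hypothetical labelled target data: label 1 when |x| > 1, which is
# linearly separable in the frozen feature space (x*x > 1).
data = [(x, 1 if abs(x) > 1 else 0)
        for x in (random.uniform(-2, 2) for _ in range(200))]

# Transfer step: train only a logistic-regression "head" on top.
w1, w2, b = 0.0, 0.0, 0.0
lr = 0.5
for _ in range(300):
    for x, y in data:
        f1, f2 = features(x)
        z = max(-60.0, min(60.0, w1 * f1 + w2 * f2 + b))
        p = 1.0 / (1.0 + math.exp(-z))
        g = p - y                  # log-loss gradient wrt the logit
        w1 -= lr * g * f1
        w2 -= lr * g * f2
        b -= lr * g
```

Only the three head parameters are updated; the extractor never changes, which is what makes the approach cheap enough for mobile deployment.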
656 citations
en
Medicine, Computer Science
Deep Transfer Learning for Bearing Fault Diagnosis: A Systematic Review Since 2016
Xiaohan Chen, Rui Yang, Yihao Xue
et al.
The traditional deep learning-based bearing fault diagnosis approaches assume that the training and test data follow the same distribution. This assumption, however, is not always true for the bearing data collected in practical scenarios, leading to a significant decline in fault diagnosis performance. In order to satisfy this assumption, the transfer learning concept is introduced in deep learning by transferring the knowledge learned from other data or models. Due to the excellent capability of feature learning and domain transfer, deep transfer learning methods have gained widespread attention in bearing fault diagnosis in recent years. This paper presents a comprehensive review of the development of deep transfer learning-based bearing fault diagnosis approaches since 2016. In this review, a novel taxonomy of deep transfer learning-based bearing fault diagnosis methods is proposed from the perspective of target domain data properties divided by labels, machines, and faults. By covering the whole life cycle of deep transfer learning-based fault diagnosis and discussing the research challenges and opportunities, this review provides a systematic guideline for researchers and practitioners to efficiently identify suitable deep transfer learning models based on the actual problems encountered in bearing fault diagnosis.
448 citations
en
Computer Science
VAMPnets for deep learning of molecular kinetics
Andreas Mardt, Luca Pasquali, Hao Wu
et al.
There is an increasing demand for computing the relevant structures, equilibria, and long-timescale kinetics of biomolecular processes, such as protein-drug binding, from high-throughput molecular dynamics simulations. Current methods employ transformation of simulated coordinates into structural features, dimension reduction, clustering the dimension-reduced data, and estimation of a Markov state model or related model of the interconversion rates between molecular structures. This handcrafted approach demands a substantial amount of modeling expertise, as poor decisions at any step will lead to large modeling errors. Here we employ the variational approach for Markov processes (VAMP) to develop a deep learning framework for molecular kinetics using neural networks, dubbed VAMPnets. A VAMPnet encodes the entire mapping from molecular coordinates to Markov states, thus combining the whole data processing pipeline in a single end-to-end framework. Our method performs equally or better than state-of-the-art Markov modeling methods and provides easily interpretable few-state kinetic models.

Extracting kinetic models from high-throughput molecular dynamics (MD) simulations is laborious and prone to human error. Here the authors introduce a deep learning framework that automates the construction of Markov state models from MD simulation data.
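The final step of the handcrafted pipeline the abstract contrasts against, estimating a Markov state model from a discretised trajectory, reduces to counting and row-normalising transitions. A minimal sketch with a hypothetical trajectory:

```python
from collections import Counter

def estimate_msm(traj, n_states, lag=1):
    """Row-normalised transition matrix estimated at a given lag time."""
    counts = Counter(zip(traj[:-lag], traj[lag:]))
    P = []
    for i in range(n_states):
        row = [counts[(i, j)] for j in range(n_states)]
        s = sum(row)
        P.append([c / s if s else 0.0 for c in row])
    return P

# Hypothetical discretised trajectory: state indices of successive
# frames (in the classical pipeline these come from clustering the
# dimension-reduced MD coordinates; a VAMPnet learns this mapping).
traj = [0, 0, 1, 1, 1, 0, 2, 2, 1, 0, 0, 1]
P = estimate_msm(traj, 3)
# P[i][j] is the estimated probability of moving from state i to j
# in one lag step, e.g. P[0] == [0.4, 0.4, 0.2]
```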
645 citations
en
Computer Science, Medicine
What is the Effect of Importance Weighting in Deep Learning?
Jonathon Byrd, Zachary Chase Lipton
Importance-weighted risk minimization is a key ingredient in many machine learning algorithms for causal inference, domain adaptation, class imbalance, and off-policy reinforcement learning. While the effect of importance weighting is well-characterized for low-capacity misspecified models, little is known about how it impacts over-parameterized deep neural networks. This work is inspired by recent theoretical results showing that on (linearly) separable data, deep linear networks optimized by SGD learn weight-agnostic solutions, prompting us to ask: for realistic deep networks, for which many practical datasets are separable, what is the effect of importance weighting? We present the surprising finding that while importance weighting impacts models early in training, its effect diminishes over successive epochs. Moreover, while L2 regularization and batch normalization (but not dropout) restore some of the impact of importance weighting, they express the effect via (seemingly) the wrong abstraction: why should practitioners tweak the L2 regularization, and by how much, to produce the correct weighting effect? Our experiments confirm these findings across a range of architectures and datasets.
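Importance-weighted risk minimization simply scales each example's loss gradient by its weight. For a low-capacity model, the regime where the abstract says the effect is well characterized, the weighting visibly moves the decision boundary. A toy sketch on synthetic imbalanced data:

```python
import math, random

random.seed(2)

# Imbalanced toy data: 90% negatives centred at -1, 10% positives at +1.
data = [(random.gauss(-1, 0.5), 0) for _ in range(180)] + \
       [(random.gauss(+1, 0.5), 1) for _ in range(20)]

def fit(class_weight):
    """1-D logistic regression by SGD with per-class importance weights."""
    w, b, lr = 0.0, 0.0, 0.1
    for _ in range(200):
        for x, y in data:
            z = max(-60.0, min(60.0, w * x + b))
            p = 1.0 / (1.0 + math.exp(-z))
            g = class_weight[y] * (p - y)   # importance-weighted gradient
            w -= lr * g * x
            b -= lr * g
    return w, b

w_u, b_u = fit({0: 1.0, 1: 1.0})   # unweighted fit
w_w, b_w = fit({0: 1.0, 1: 9.0})   # rare class upweighted 9x
```

The decision boundary -b/w moves toward the majority class once the rare class is upweighted; the paper's finding is that for over-parameterized networks on separable data this effect, clear for a low-capacity model like this one, fades over training.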
522 citations
en
Computer Science, Mathematics
To understand deep learning we need to understand kernel learning
Mikhail Belkin, Siyuan Ma, Soumik Mandal
Generalization performance of classifiers in deep learning has recently become a subject of intense study. Deep models, typically over-parametrized, tend to fit the training data exactly. Despite this "overfitting", they perform well on test data, a phenomenon not yet fully understood. The first point of our paper is that strong performance of overfitted classifiers is not a unique feature of deep learning. Using six real-world and two synthetic datasets, we establish experimentally that kernel machines trained to have zero classification or near-zero regression error perform very well on test data, even when the labels are corrupted with a high level of noise. We proceed to give a lower bound on the norm of zero-loss solutions for smooth kernels, showing that it increases nearly exponentially with data size. We point out that this is difficult to reconcile with the existing generalization bounds. Moreover, none of the bounds produces non-trivial results for interpolating solutions. Second, we show experimentally that (non-smooth) Laplacian kernels easily fit random labels, a finding that parallels results for ReLU neural networks. In contrast, fitting noisy data requires many more epochs for smooth Gaussian kernels. Similar performance of overfitted Laplacian and Gaussian classifiers on test data suggests that generalization is tied to the properties of the kernel function rather than to the optimization process. Certain key phenomena of deep learning are manifested similarly in kernel methods in the modern "overfitted" regime. The combination of the experimental and theoretical results presented in this paper indicates a need for new theoretical ideas for understanding properties of classical kernel methods. We argue that progress on understanding deep learning will be difficult until more tractable "shallow" kernel methods are better understood.
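The interpolating regime the paper studies is easy to reproduce with a small kernel machine: a zero-training-error solution fits even corrupted labels exactly. A self-contained sketch with a Laplacian kernel on synthetic 1-D data (the tiny dataset and the corrupted label are made up for illustration):

```python
import math

def laplacian_kernel(x, z, gamma=1.0):
    return math.exp(-gamma * abs(x - z))

def solve(A, b):
    """Gaussian elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            M[r] = [mr - f * mi for mr, mi in zip(M[r], M[i])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j]
                              for j in range(i + 1, n))) / M[i][i]
    return x

# Tiny training set following y = x, with one corrupted label at x = 2.
X = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.0, 1.0, 9.0, 3.0, 4.0]

K = [[laplacian_kernel(a, b) for b in X] for a in X]
alpha = solve(K, y)                 # interpolating (zero-loss) solution

def predict(x):
    return sum(a * laplacian_kernel(x, z) for a, z in zip(alpha, X))
```

The fit reproduces every training label exactly, the corrupted one included, which is precisely the "overfitted" kernel regime whose test-time behavior the paper analyzes.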
442 citations
en
Mathematics, Computer Science
Artificial Intelligence in Healthcare: Advancing Innovation and Ethics to Foster Well-Being
Bayram B., Leventi N., Vodenicharova A.
et al.
Artificial intelligence (AI) is reshaping healthcare by enhancing diagnostic precision, treatment personalization, and overall patient care. By leveraging technologies such as machine learning, deep learning, natural language processing, and computer vision, AI enables faster and more accurate decision-making, supports drug discovery and development, and facilitates remote patient monitoring. Beyond improving clinical outcomes, AI also contributes to holistic well-being by addressing physical, mental, social, occupational, and environmental health. Wearable AI devices promote proactive health management, virtual assistants improve mental health accessibility, and predictive analytics enable early intervention for disease prevention. However, the integration of AI in healthcare presents challenges, including data privacy concerns, algorithmic bias, and the need for transparency and trust. Ensuring the responsible and equitable deployment of AI requires robust ethical guidelines, interdisciplinary collaboration, and policies that safeguard patient rights while maximizing the technology’s benefits. By exploring both the transformative potential and inherent challenges of AI, this paper aims to highlight the critical role of AI in shaping the future of healthcare and human well-being.
Exploring Student Expectations and Confidence in Learning Analytics
Hayk Asatryan, Basile Tousside, Janis Mohr
et al.
Learning Analytics (LA) is nowadays ubiquitous in many educational systems, providing the ability to collect and analyze student data in order to understand and optimize learning and the environments in which it occurs. On the other hand, the collection of data must comply with the growing demands of privacy legislation. In this paper, we use the Student Expectation of Learning Analytics Questionnaire (SELAQ) to analyze the expectations and confidence of students from different faculties regarding the processing of their data for Learning Analytics purposes. This allows us to identify four clusters of students through clustering algorithms: Enthusiasts, Realists, Cautious, and Indifferents. This structured analysis provides valuable insights into the acceptance and criticism of Learning Analytics among students.