Deep Learning: A Comprehensive Overview on Techniques, Taxonomy, Applications and Research Directions
Iqbal H. Sarker
Deep learning (DL), a branch of machine learning (ML) and artificial intelligence (AI), is now considered a core technology of today’s Fourth Industrial Revolution (4IR or Industry 4.0). Owing to its ability to learn from data, DL technology, which originated from artificial neural networks (ANNs), has become a hot topic in computing and is widely applied in areas such as healthcare, visual recognition, text analytics, cybersecurity, and many more. However, building an appropriate DL model is a challenging task because of the dynamic nature of, and variations in, real-world problems and data. Moreover, the lack of core understanding turns DL methods into black-box machines that hamper development at the standard level. This article presents a structured and comprehensive view of DL techniques, including a taxonomy that considers various types of real-world tasks, such as supervised or unsupervised. In our taxonomy, we take into account deep networks for supervised or discriminative learning, unsupervised or generative learning, as well as hybrid learning and relevant others. We also summarize real-world application areas where deep learning techniques can be used. Finally, we point out ten potential aspects for future-generation DL modeling with research directions. Overall, this article aims to draw a big picture of DL modeling that can be used as a reference guide for both academia and industry professionals.
2179 citations
en
Computer Science, Medicine
Shortcut learning in deep neural networks
Robert Geirhos, J. Jacobsen, Claudio Michaelis et al.
Deep learning has triggered the current rise of artificial intelligence and is the workhorse of today’s machine intelligence. Numerous success stories have rapidly spread all over science, industry and society, but its limitations have only recently come into focus. In this Perspective we seek to distil how many of deep learning’s failures can be seen as different symptoms of the same underlying problem: shortcut learning. Shortcuts are decision rules that perform well on standard benchmarks but fail to transfer to more challenging testing conditions, such as real-world scenarios. Related issues are known in comparative psychology, education and linguistics, suggesting that shortcut learning may be a common characteristic of learning systems, biological and artificial alike. Based on these observations, we develop a set of recommendations for model interpretation and benchmarking, highlighting recent advances in machine learning to improve robustness and transferability from the lab to real-world applications. Deep learning has resulted in impressive achievements, but under what circumstances does it fail, and why? The authors propose that its failures are a consequence of shortcut learning, a common characteristic across biological and artificial systems in which strategies that appear to have solved a problem fail unexpectedly under different circumstances.
2691 citations
en
Computer Science, Biology
The future of digital health with federated learning
Nicola Rieke, Jonny Hancox, Wenqi Li et al.
Data-driven machine learning (ML) has emerged as a promising approach for building accurate and robust statistical models from medical data, which is collected in huge volumes by modern healthcare systems. Existing medical data is not fully exploited by ML primarily because it sits in data silos and privacy concerns restrict access to this data. However, without access to sufficient data, ML will be prevented from reaching its full potential and, ultimately, from making the transition from research to clinical practice. This paper considers key factors contributing to this issue, explores how federated learning (FL) may provide a solution for the future of digital health and highlights the challenges and considerations that need to be addressed.
2563 citations
en
Computer Science, Medicine
Towards Federated Learning at Scale: System Design
Keith Bonawitz, Hubert Eichner, W. Grieskamp et al.
Federated Learning is a distributed machine learning approach which enables model training on a large corpus of decentralized data. We have built a scalable production system for Federated Learning in the domain of mobile devices, based on TensorFlow. In this paper, we describe the resulting high-level design, sketch some of the challenges and their solutions, and touch upon the open problems and future directions.
3096 citations
en
Computer Science, Mathematics
Deep Learning for Computer Vision: A Brief Review
A. Voulodimos, N. Doulamis, A. Doulamis et al.
Over the last years deep learning methods have been shown to outperform previous state-of-the-art machine learning techniques in several fields, with computer vision being one of the most prominent cases. This review paper provides a brief overview of some of the most significant deep learning schemes used in computer vision problems, that is, Convolutional Neural Networks, Deep Boltzmann Machines and Deep Belief Networks, and Stacked Denoising Autoencoders. A brief account of their history, structure, advantages, and limitations is given, followed by a description of their applications in various computer vision tasks, such as object detection, face recognition, action and activity recognition, and human pose estimation. Finally, a brief overview is given of future directions in designing deep learning schemes for computer vision problems and the challenges involved therein.
3220 citations
en
Computer Science, Medicine
Geometric Deep Learning: Going beyond Euclidean data
M. Bronstein, Joan Bruna, Yann LeCun et al.
Many scientific fields study data with an underlying structure that is non-Euclidean. Some examples include social networks in computational social sciences, sensor networks in communications, functional networks in brain imaging, regulatory networks in genetics, and meshed surfaces in computer graphics. In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions) and are natural targets for machine-learning techniques. In particular, we would like to use deep neural networks, which have recently proven to be powerful tools for a broad range of problems from computer vision, natural-language processing, and audio analysis. However, these tools have been most successful on data with an underlying Euclidean or grid-like structure and in cases where the invariances of these structures are built into networks used to model them.
3744 citations
en
Computer Science
Deep learning in neural networks: An overview
J. Schmidhuber
In recent years, deep artificial neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarizes relevant work, much of it from the previous millennium. Shallow and Deep Learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.
17359 citations
en
Computer Science, Medicine
Distributed GraphLab: A Framework for Machine Learning in the Cloud
Yucheng Low, Joseph Gonzalez, Aapo Kyrola et al.
While high-level data parallel frameworks, like MapReduce, simplify the design and implementation of large-scale data processing systems, they do not naturally or efficiently support many important data mining and machine learning algorithms and can lead to inefficient learning systems. To help fill this critical void, we introduced the GraphLab abstraction which naturally expresses asynchronous, dynamic, graph-parallel computation while ensuring data consistency and achieving a high degree of parallel performance in the shared-memory setting. In this paper, we extend the GraphLab framework to the substantially more challenging distributed setting while preserving strong data consistency guarantees. We develop graph based extensions to pipelined locking and data versioning to reduce network congestion and mitigate the effect of network latency. We also introduce fault tolerance to the GraphLab abstraction using the classic Chandy-Lamport snapshot algorithm and demonstrate how it can be easily implemented by exploiting the GraphLab abstraction itself. Finally, we evaluate our distributed implementation of the GraphLab abstraction on a large Amazon EC2 deployment and show 1-2 orders of magnitude performance gains over Hadoop-based implementations.
1083 citations
en
Computer Science
Machine Learning in Wireless Sensor Networks: Algorithms, Strategies, and Applications
M. Alsheikh, Shaowei Lin, D. Niyato et al.
Wireless sensor networks (WSNs) monitor dynamic environments that change rapidly over time. This dynamic behavior is either caused by external factors or initiated by the system designers themselves. To adapt to such conditions, sensor networks often adopt machine learning techniques to eliminate the need for unnecessary redesign. Machine learning also inspires many practical solutions that maximize resource utilization and prolong the lifespan of the network. In this paper, we present an extensive literature review over the period 2002-2013 of machine learning methods that were used to address common issues in WSNs. The advantages and disadvantages of each proposed algorithm are evaluated against the corresponding problem. We also provide a comparative guide to aid WSN designers in developing suitable machine learning solutions for their specific application challenges.
877 citations
en
Computer Science
Machine Learning for Aerial Image Labeling
Geoffrey E. Hinton, Volodymyr Mnih
878 citations
en
Computer Science, Engineering
Machine learning for medical diagnosis: history, state of the art and perspective
I. Kononenko
1602 citations
en
Medicine, Computer Science
Incremental and Decremental Support Vector Machine Learning
G. Cauwenberghs, T. Poggio
1419 citations
en
Mathematics, Computer Science
Multiagent Systems: A Survey from a Machine Learning Perspective
P. Stone, M. Veloso
1491 citations
en
Computer Science
Map-Reduce for Machine Learning on Multicore
Cheng-Tao Chu, Sang Kyun Kim, Yi-An Lin et al.
1286 citations
en
Computer Science
Communication Efficient Distributed Machine Learning with the Parameter Server
Mu Li, D. Andersen, Alex Smola et al.
695 citations
en
Computer Science
Machine Learning Methods for Attack Detection in the Smart Grid
M. Ozay, I. Esnaola, Fatos Tunay et al.
Attack detection problems in the smart grid are posed as statistical learning problems for different attack scenarios in which the measurements are observed in batch or online settings. In this approach, machine learning algorithms are used to classify measurements as being either secure or attacked. An attack detection framework is provided to exploit any available prior knowledge about the system and surmount constraints arising from the sparse structure of the problem in the proposed approach. Well-known batch and online learning algorithms (supervised and semisupervised) are employed with decision- and feature-level fusion to model the attack detection problem. The relationships between statistical and geometric properties of attack vectors employed in the attack scenarios and learning algorithms are analyzed to detect unobservable attacks using statistical learning methods. The proposed algorithms are examined on various IEEE test systems. Experimental analyses show that machine learning algorithms can detect attacks with performances higher than attack detection algorithms that employ state vector estimation methods in the proposed attack detection framework.
540 citations
en
Computer Science, Medicine
Scikit-learn: Machine Learning Without Learning the Machinery
G. Varoquaux, L. Buitinck, Gilles Louppe et al.
499 citations
en
Computer Science
Hyperparameter Search in Machine Learning
M. Claesen, B. Moor
We introduce the hyperparameter search problem in the field of machine learning and discuss its main challenges from an optimization perspective. Machine learning methods attempt to build models that capture some element of interest based on given data. Most common learning algorithms feature a set of hyperparameters that must be determined before training commences. The choice of hyperparameters can significantly affect the resulting model's performance, but determining good values can be complex; hence a disciplined, theoretically sound search strategy is essential.
474 citations
en
Computer Science, Mathematics
MLaaS: Machine Learning as a Service
Mauro Ribeiro, Katarina Grolinger, Miriam A. M. Capretz
414 citations
en
Computer Science
Machine Learning in Preclinical Development of Antiviral Peptide Candidates: A Review of the Current Landscape
Hannah Hargrove, Bei Tong, Amr Hussein Elkabanny et al.
In the field of antiviral peptide (AVP) design, one of the most prominent limiting factors is the time and material cost required to perform the initial screening of novel AVPs. In particular, traditional target identification as well as traditional preclinical screening of novel drug candidates can be a very lengthy and expensive process. In recent decades, target identification and initial screening of AVPs has been increasingly carried out using machine learning (ML). The use of ML to initially screen potential interactions reduces the financial cost and lengthy time scale of preclinical AVP development, allowing for candidate peptides to be identified and screened faster, at a lower cost to both manufacturer and consumer. Additionally, the use of ML in generating and screening AVP candidates allows a more diverse chemical space to be explored than high-throughput screening methodologies allow. In silico generation and validation of AVP candidates also limits researcher contact with high BSL-rated viruses, thereby increasing the safety and accessibility of AVP design. This review seeks to provide a broad overview of the current uses of ML in early-stage AVP design, and to shed some light on the future direction of the field.