Results for "deep learning"

Showing 20 of ~11041534 results · from CrossRef, DOAJ, Semantic Scholar

S2 Open Access 2019
A Survey of Deep Learning-Based Object Detection

L. Jiao, Fan Zhang, Fang Liu et al.

Object detection is one of the most important and challenging branches of computer vision. It has been widely applied in everyday life, for example in security monitoring and autonomous driving, with the purpose of locating instances of semantic objects of a certain class. With the rapid development of deep learning algorithms for detection tasks, the performance of object detectors has greatly improved. To understand the main development status of the object detection pipeline thoroughly, in this survey we first analyze the methods of existing typical detection models and describe the benchmark datasets. Afterwards and primarily, we provide a comprehensive overview of a variety of object detection methods in a systematic manner, covering the one-stage and two-stage detectors. Moreover, we list the traditional and new applications. Some representative branches of object detection are analyzed as well. Finally, we discuss the architecture of exploiting these object detection methods to build an effective and efficient system and point out a set of development trends to better follow the state-of-the-art algorithms and further research.

1113 citations en Computer Science
S2 Open Access 2018
Learning deep representations by mutual information estimation and maximization

R. Devon Hjelm, A. Fedorov, Samuel Lavoie-Marchildon et al.

This work investigates unsupervised learning of representations by maximizing mutual information between an input and the output of a deep neural network encoder. Importantly, we show that structure matters: incorporating knowledge about locality in the input into the objective can significantly improve a representation’s suitability for downstream tasks. We further control characteristics of the representation by matching to a prior distribution adversarially. Our method, which we call Deep InfoMax (DIM), outperforms a number of popular unsupervised learning methods and compares favorably with fully-supervised learning on several classification tasks with some standard architectures. DIM opens new avenues for unsupervised learning of representations and is an important step towards flexible formulations of representation learning objectives for specific end-goals.

2956 citations en Computer Science, Mathematics
S2 Open Access 2018
Deep Learning in Spiking Neural Networks

A. Tavanaei, M. Ghodrati, S. R. Kheradpisheh et al.

In recent years, deep learning has revolutionized the field of machine learning, for computer vision in particular. In this approach, a deep (multilayer) artificial neural network (ANN) is trained, most often in a supervised manner using backpropagation. Vast amounts of labeled training examples are required, but the resulting classification accuracy is truly impressive, sometimes outperforming humans. Neurons in an ANN are characterized by a single, static, continuous-valued activation. Yet biological neurons use discrete spikes to compute and transmit information, and the spike times, in addition to the spike rates, matter. Spiking neural networks (SNNs) are thus more biologically realistic than ANNs, and are arguably the only viable option if one wants to understand how the brain computes at the neuronal description level. The spikes of biological neurons are sparse in time and space, and event-driven. Combined with bio-plausible local learning rules, this makes it easier to build low-power, neuromorphic hardware for SNNs. However, training deep SNNs remains a challenge. Spiking neurons' transfer function is usually non-differentiable, which prevents using backpropagation. Here we review recent supervised and unsupervised methods to train deep SNNs, and compare them in terms of accuracy and computational cost. The emerging picture is that SNNs still lag behind ANNs in terms of accuracy, but the gap is decreasing, and can even vanish on some tasks, while SNNs typically require many fewer operations and are the better candidates to process spatio-temporal data.

1301 citations en Medicine, Computer Science
S2 Open Access 2017
Deep Learning for Hate Speech Detection in Tweets

Pinkesh Badjatiya, Shashank Gupta, Manish Gupta et al.

Hate speech detection on Twitter is critical for applications like controversial event extraction, building AI chatterbots, content recommendation, and sentiment analysis. We define this task as being able to classify a tweet as racist, sexist or neither. The complexity of the natural language constructs makes this task very challenging. We perform extensive experiments with multiple deep learning architectures to learn semantic word embeddings to handle this complexity. Our experiments on a benchmark dataset of 16K annotated tweets show that such deep learning methods outperform state-of-the-art char/word n-gram methods by ~18 F1 points.

1240 citations en Computer Science
S2 Open Access 2017
Deep Learning for Massive MIMO CSI Feedback

Chao-Kai Wen, Wan-Ting Shih, Shi Jin

In frequency division duplex mode, the downlink channel state information (CSI) should be sent to the base station through feedback links so that the potential gains of a massive multiple-input multiple-output can be exhibited. However, such a transmission is hindered by excessive feedback overhead. In this letter, we use deep learning technology to develop CsiNet, a novel CSI sensing and recovery mechanism that learns to effectively use channel structure from training samples. CsiNet learns a transformation from CSI to a near-optimal number of representations (or codewords) and an inverse transformation from codewords to CSI. We perform experiments to demonstrate that CsiNet can recover CSI with significantly improved reconstruction quality compared with existing compressive sensing (CS)-based methods. Even at excessively low compression regions where CS-based methods cannot work, CsiNet retains effective beamforming gain.

1056 citations en Computer Science, Mathematics
S2 Open Access 2017
Deep learning for universal linear embeddings of nonlinear dynamics

Bethany Lusch, J. Kutz, S. Brunton

Identifying coordinate transformations that make strongly nonlinear dynamics approximately linear has the potential to enable nonlinear prediction, estimation, and control using linear theory. The Koopman operator is a leading data-driven embedding, and its eigenfunctions provide intrinsic coordinates that globally linearize the dynamics. However, identifying and representing these eigenfunctions has proven challenging. This work leverages deep learning to discover representations of Koopman eigenfunctions from data. Our network is parsimonious and interpretable by construction, embedding the dynamics on a low-dimensional manifold. We identify nonlinear coordinates on which the dynamics are globally linear using a modified auto-encoder. We also generalize Koopman representations to include a ubiquitous class of systems with continuous spectra. Our framework parametrizes the continuous frequency using an auxiliary network, enabling a compact and efficient embedding, while connecting our models to decades of asymptotics. Thus, we benefit from the power of deep learning, while retaining the physical interpretability of Koopman embeddings. It is often advantageous to transform a strongly nonlinear system into a linear one in order to simplify its analysis for prediction and control. Here the authors combine dynamical systems with deep learning to identify these hard-to-find transformations.

1552 citations en Computer Science, Mathematics
S2 Open Access 2017
Supervised Speech Separation Based on Deep Learning: An Overview

Deliang Wang, Jitong Chen

Speech separation is the task of separating target speech from background interference. Traditionally, speech separation is studied as a signal processing problem. A more recent approach formulates speech separation as a supervised learning problem, where the discriminative patterns of speech, speakers, and background noise are learned from training data. Over the past decade, many supervised separation algorithms have been put forward. In particular, the recent introduction of deep learning to supervised speech separation has dramatically accelerated progress and boosted separation performance. This paper provides a comprehensive overview of the research on deep learning based supervised speech separation in the last several years. We first introduce the background of speech separation and the formulation of supervised separation. Then, we discuss three main components of supervised separation: learning machines, training targets, and acoustic features. Much of the overview is on separation algorithms where we review monaural methods, including speech enhancement (speech-nonspeech separation), speaker separation (multitalker separation), and speech dereverberation, as well as multimicrophone techniques. The important issue of generalization, unique to supervised learning, is discussed. This overview provides a historical perspective on how advances are made. In addition, we discuss a number of conceptual issues, including what constitutes the target source.

1553 citations en Computer Science, Medicine
S2 Open Access 2020
Deep Learning-based Human Pose Estimation: A Survey

Ce Zheng, Wenhan Wu, Taojiannan Yang et al.

Human pose estimation aims to locate the human body parts and build human body representation (e.g., body skeleton) from input data such as images and videos. It has drawn increasing attention during the past decade and has been utilized in a wide range of applications including human-computer interaction, motion analysis, augmented reality, and virtual reality. Although the recently developed deep learning-based solutions have achieved high performance in human pose estimation, there still remain challenges due to insufficient training data, depth ambiguities, and occlusion. The goal of this survey article is to provide a comprehensive review of recent deep learning-based solutions for both 2D and 3D pose estimation via a systematic analysis and comparison of these solutions based on their input data and inference procedures. More than 260 research papers since 2014 are covered in this survey. Furthermore, 2D and 3D human pose estimation datasets and evaluation metrics are included. Quantitative performance comparisons of the reviewed methods on popular datasets are summarized and discussed. Finally, the challenges involved, applications, and future research directions are concluded. A regularly updated project page is provided: https://github.com/zczcwh/DL-HPE.

880 citations en Computer Science
S2 Open Access 2019
Segmentation-based deep-learning approach for surface-defect detection

Domen Tabernik, Samo Sela, J. Skvarc et al.

Automated surface-anomaly detection using machine learning has become an interesting and promising area of research, with a very high and direct impact on the application domain of visual inspection. Deep-learning methods have become the most suitable approaches for this task. They allow the inspection system to learn to detect the surface anomaly by simply showing it a number of exemplar images. This paper presents a segmentation-based deep-learning architecture that is designed for the detection and segmentation of surface anomalies and is demonstrated on a specific domain of surface-crack detection. The design of the architecture enables the model to be trained using a small number of samples, which is an important requirement for practical applications. The proposed model is compared with the related deep-learning methods, including the state-of-the-art commercial software, showing that the proposed approach outperforms the related methods on the specific domain of surface-crack detection. The large number of experiments also sheds light on the required precision of the annotation, the number of required training samples, and the required computational cost. Experiments are performed on a newly created dataset based on a real-world quality control case and demonstrate that the proposed approach is able to learn from a small number of defective surfaces, using only approximately 25–30 defective training samples, instead of hundreds or thousands, which is usually the case in deep-learning applications. This makes the deep-learning method practical for use in industry where the number of available defective samples is limited. The dataset is also made publicly available to encourage the development and evaluation of new methods for surface-defect detection.

864 citations en Computer Science
S2 Open Access 2020
Introduction to Machine Learning, Neural Networks, and Deep Learning

Rene Y. Choi, Aaron S. Coyner, Jayashree Kalpathy-Cramer et al.

Purpose: To present an overview of current machine learning methods and their use in medical research, focusing on select machine learning techniques, best practices, and deep learning. Methods: A systematic literature search in PubMed was performed for articles pertinent to the topic of artificial intelligence methods used in medicine with an emphasis on ophthalmology. Results: A review of machine learning and deep learning methodology for the audience without an extensive technical computer programming background. Conclusions: Artificial intelligence has a promising future in medicine; however, many challenges remain. Translational Relevance: The aim of this review article is to provide the nontechnical readers a layman's explanation of the machine learning methods being used in medicine today. The goal is to provide the reader a better understanding of the potential and challenges of artificial intelligence within the field of medicine.

825 citations en Computer Science, Medicine
S2 Open Access 2020
Deep Learning for Sensor-based Human Activity Recognition

Kaixuan Chen, Dalin Zhang, Lina Yao et al.

The vast proliferation of sensor devices and Internet of Things enables the applications of sensor-based activity recognition. However, there exist substantial challenges that could influence the performance of the recognition system in practical scenarios. Recently, as deep learning has demonstrated its effectiveness in many areas, plenty of deep methods have been investigated to address the challenges in activity recognition. In this study, we present a survey of the state-of-the-art deep learning methods for sensor-based human activity recognition. We first introduce the multi-modality of the sensory data and provide information for public datasets that can be used for evaluation in different challenge tasks. We then propose a new taxonomy to structure the deep methods by challenges. Challenges and challenge-related deep methods are summarized and analyzed to form an overview of the current research progress. At the end of this work, we discuss the open issues and provide some insights for future directions.

810 citations en Computer Science
S2 Open Access 2019
Point-Voxel CNN for Efficient 3D Deep Learning

Zhijian Liu, Haotian Tang, Yujun Lin et al.

We present Point-Voxel CNN (PVCNN) for efficient, fast 3D deep learning. Previous work processes 3D data using either voxel-based or point-based NN models. However, both approaches are computationally inefficient. The computation cost and memory footprints of the voxel-based models grow cubically with the input resolution, making it memory-prohibitive to scale up the resolution. As for point-based networks, up to 80% of the time is wasted on structuring the irregular data, which have rather poor memory locality, not on the actual feature extraction. In this paper, we propose PVCNN, which represents the 3D input data in points to reduce the memory consumption, while performing the convolutions in voxels to largely reduce the irregular data access and improve the locality. Our PVCNN model is both memory and computation efficient. Evaluated on semantic and part segmentation datasets, it achieves much higher accuracy than the voxel-based baseline with 10x GPU memory reduction; it also outperforms the state-of-the-art point-based models with 7x measured speedup on average. Remarkably, the narrower version of PVCNN achieves 2x speedup over PointNet (an extremely efficient model) on part and scene segmentation benchmarks with much higher accuracy. We validate the general effectiveness of our PVCNN on 3D object detection: by replacing the primitives in Frustum PointNet with PVConv, it outperforms Frustum PointNet++ by 2.4% mAP on average with 1.5x measured speedup and GPU memory reduction.

818 citations en Computer Science
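
Note on the Point-Voxel CNN abstract: the statement that voxel-based memory grows cubically with input resolution can be checked with simple arithmetic. The sketch below is only an illustration and is not taken from the paper; the function names, the 64-channel feature width, and the float32 storage are assumptions chosen for the example.

```python
def voxel_count(resolution: int) -> int:
    """Number of cells in a dense cubic grid with `resolution` cells per side."""
    return resolution ** 3


def dense_grid_bytes(resolution: int, channels: int, bytes_per_value: int = 4) -> int:
    """Memory for one dense feature grid.

    Assumed layout (hypothetical): channels x r x r x r values,
    4 bytes each (float32).
    """
    return voxel_count(resolution) * channels * bytes_per_value


if __name__ == "__main__":
    # Doubling the resolution multiplies the voxel count, and so the memory, by 8.
    for r in (32, 64, 128):
        mib = dense_grid_bytes(r, channels=64) / 2**20
        print(f"resolution {r:>3}: {voxel_count(r):>9} voxels, {mib:8.1f} MiB")
```

Under these assumptions a single feature grid already costs eight times more memory each time the resolution doubles, which is the scaling problem the abstract describes.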
S2 Open Access 2020
Bayesian Deep Learning and a Probabilistic Perspective of Generalization

A. Wilson, Pavel Izmailov

The key distinguishing property of a Bayesian approach is marginalization, rather than using a single setting of weights. Bayesian marginalization can particularly improve the accuracy and calibration of modern deep neural networks, which are typically underspecified by the data, and can represent many compelling but different solutions. We show that deep ensembles provide an effective mechanism for approximate Bayesian marginalization, and propose a related approach that further improves the predictive distribution by marginalizing within basins of attraction, without significant overhead. We also investigate the prior over functions implied by a vague distribution over neural network weights, explaining the generalization properties of such models from a probabilistic perspective. From this perspective, we explain results that have been presented as mysterious and distinct to neural network generalization, such as the ability to fit images with random labels, and show that these results can be reproduced with Gaussian processes. We also show that Bayesian model averaging alleviates double descent, resulting in monotonic performance improvements with increased flexibility. Finally, we provide a Bayesian perspective on tempering for calibrating predictive distributions.

772 citations en Computer Science, Mathematics
S2 Open Access 2020
Normalized Loss Functions for Deep Learning with Noisy Labels

Xingjun Ma, Hanxun Huang, Yisen Wang et al.

Robust loss functions are essential for training accurate deep neural networks (DNNs) in the presence of noisy (incorrect) labels. It has been shown that the commonly used Cross Entropy (CE) loss is not robust to noisy labels. Whilst new loss functions have been designed, they are only partially robust. In this paper, we theoretically show by applying a simple normalization that: any loss can be made robust to noisy labels. However, in practice, simply being robust is not sufficient for a loss function to train accurate DNNs. By investigating several robust loss functions, we find that they suffer from a problem of underfitting. To address this, we propose a framework to build robust loss functions called Active Passive Loss (APL). APL combines two robust loss functions that mutually boost each other. Experiments on benchmark datasets demonstrate that the family of new loss functions created by our APL framework can consistently outperform state-of-the-art methods by large margins, especially under large noise rates such as 60% or 80% incorrect labels.

541 citations en Computer Science, Mathematics
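
Note on the normalized-loss abstract: the abstract says a simple normalization can make any loss robust but does not spell out the formula. The sketch below assumes one plausible form, dividing the loss for the given label by the sum of the loss over all possible labels, and applies it to cross entropy. All function names are hypothetical, and this should be read as an illustration rather than the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Standard cross entropy for one sample; assumes probs[label] > 0."""
    return -math.log(probs[label])


def normalized_cross_entropy(probs, label):
    """Assumed normalization: loss for `label` divided by the loss summed over all labels."""
    total = sum(cross_entropy(probs, k) for k in range(len(probs)))
    return cross_entropy(probs, label) / total


if __name__ == "__main__":
    probs = softmax([2.0, 0.5, -1.0])
    for k in range(len(probs)):
        print(k, round(cross_entropy(probs, k), 4), round(normalized_cross_entropy(probs, k), 4))
```

With this form, the normalized values for one prediction add up to 1 across all labels by construction, so the loss is bounded between 0 and 1 whatever the label, whereas plain cross entropy is unbounded.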
S2 Open Access 2020
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning

Zeyuan Allen-Zhu, Yuanzhi Li

We formally study how Ensemble of deep learning models can improve test accuracy, and how the superior performance of ensemble can be distilled into a single model using Knowledge Distillation. We consider the challenging case where the ensemble is simply an average of the outputs of a few independently trained neural networks with the SAME architecture, trained using the SAME algorithm on the SAME data set, and they only differ by the random seeds used in the initialization. We empirically show that ensemble/knowledge distillation in deep learning works very differently from traditional learning theory, especially differently from ensemble of random feature mappings or the neural-tangent-kernel feature mappings, and is potentially out of the scope of existing theorems. Thus, to properly understand ensemble and knowledge distillation in deep learning, we develop a theory showing that when data has a structure we refer to as "multi-view", then ensemble of independently trained neural networks can provably improve test accuracy, and such superior test accuracy can also be provably distilled into a single model by training a single model to match the output of the ensemble instead of the true label. Our result sheds light on how ensemble works in deep learning in a way that is completely different from traditional theorems, and how the "dark knowledge" is hidden in the outputs of the ensemble -- that can be used in knowledge distillation -- comparing to the true data labels. In the end, we prove that self-distillation can also be viewed as implicitly combining ensemble and knowledge distillation to improve test accuracy.

453 citations en Computer Science, Mathematics
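
Note on the ensemble and distillation abstract: the setup it describes, an ensemble that averages the outputs of a few independently seeded networks and a single model trained to match that average instead of the true label, can be sketched as follows. The logits are made-up numbers standing in for three hypothetical seeds, and cross entropy against the averaged output is an assumed choice of matching loss; the abstract does not name one.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_average(member_probs):
    """Average the class probabilities of the ensemble members."""
    n = len(member_probs)
    num_classes = len(member_probs[0])
    return [sum(p[k] for p in member_probs) / n for k in range(num_classes)]


def soft_cross_entropy(target, student):
    """Cross entropy of the student's output against a soft target distribution."""
    return -sum(t * math.log(s) for t, s in zip(target, student))


if __name__ == "__main__":
    # Made-up logits for one input from three hypothetical random seeds.
    members = [softmax(z) for z in ([2.0, 1.0, 0.1], [1.5, 1.6, 0.0], [2.2, 0.3, 0.4])]
    teacher = ensemble_average(members)
    student = softmax([1.0, 1.0, 0.0])
    print("ensemble output:", [round(p, 3) for p in teacher])
    print("distillation loss:", round(soft_cross_entropy(teacher, student), 4))
```

Unlike a one-hot label, the averaged output keeps non-zero probability on the other classes, and that is the extra signal a distillation loss of this kind passes on to the single model.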
S2 Open Access 2021
Autonomous Driving Architectures: Insights of Machine Learning and Deep Learning Algorithms

M. Bachute, Javed Subhedar

Research in autonomous driving is gaining momentum due to the inherent advantages of autonomous driving systems, the main advantage being the disassociation of the driver from the vehicle, which reduces human intervention. However, an autonomous driving system involves many subsystems that need to be integrated as a whole. Some of the tasks include Motion Planning, Vehicle Localization, Pedestrian Detection, Traffic Sign Detection, Road-marking Detection, Automated Parking, Vehicle Cybersecurity, and System Fault Diagnosis. This paper aims to give an overview of the various Machine Learning and Deep Learning algorithms used in autonomous driving architectures for these tasks. This paper surveys the technical aspects of Machine Learning and Deep Learning algorithms used for autonomous driving systems. The algorithms are compared on metrics such as mean Intersection over Union (mIoU), Average Precision (AP), missed detection rate, miss rate, False Positives Per Image (FPPI), and the average number of false frame detections. This study contributes a review of the Machine Learning and Deep Learning algorithms used for autonomous driving systems, organized by the different tasks of the system.

234 citations en Computer Science
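
Note on the metrics in the autonomous driving abstract: of the metrics listed, mIoU is the easiest to show in a few lines. The sketch below computes the overlap ratio for axis-aligned boxes given as (x1, y1, x2, y2) and averages it over prediction and ground-truth pairs. This is a simplification for illustration: in segmentation work mIoU is commonly averaged per class over pixels, and the abstract does not say which variant the surveyed papers use. The function names are hypothetical.

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Overlap width and height, clamped at zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def mean_iou(pairs):
    """Average IoU over (predicted_box, ground_truth_box) pairs."""
    return sum(box_iou(p, g) for p, g in pairs) / len(pairs)


if __name__ == "__main__":
    pairs = [((0, 0, 2, 2), (1, 1, 3, 3)), ((0, 0, 2, 2), (0, 0, 2, 2))]
    print(mean_iou(pairs))
```

The two 2x2 boxes in the first pair share a 1x1 overlap, so their IoU is 1/7, while the identical boxes in the second pair score 1.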

Page 13 of 552077