Results for "deep learning"

Showing 20 of ~11,053,884 results · from DOAJ, Semantic Scholar, CrossRef

S2 Open Access 2020
D4RL: Datasets for Deep Data-Driven Reinforcement Learning

Justin Fu, Aviral Kumar, Ofir Nachum et al.

The offline reinforcement learning (RL) problem, also known as batch RL, refers to the setting where a policy must be learned from a static dataset, without additional online data collection. This setting is compelling, as it potentially allows RL methods to take advantage of large, pre-collected datasets, much like how the rise of large datasets has fueled results in supervised learning in recent years. However, existing online RL benchmarks are not tailored towards the offline setting, making progress in offline RL difficult to measure. In this work, we introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL. Examples of such properties include: datasets generated via hand-designed controllers and human demonstrators, multi-objective datasets where an agent can perform different tasks in the same environment, and datasets consisting of mixtures of policies. To facilitate research, we release our benchmark tasks and datasets with a comprehensive evaluation of existing algorithms and an evaluation protocol together with an open-source codebase. We hope that our benchmark will focus research effort on methods that drive improvements not just on simulated tasks, but ultimately on the kinds of real-world problems where offline RL will have the largest impact.

1709 citations en Computer Science, Mathematics
S2 Open Access 2017
Enhanced Deep Residual Networks for Single Image Super-Resolution

Bee Lim, Sanghyun Son, Heewon Kim et al.

Recent research on super-resolution has progressed with the development of deep convolutional neural networks (DCNN). In particular, residual learning techniques exhibit improved performance. In this paper, we develop an enhanced deep super-resolution network (EDSR) with performance exceeding that of current state-of-the-art SR methods. The significant performance improvement of our model is due to optimization by removing unnecessary modules in conventional residual networks. The performance is further improved by expanding the model size while we stabilize the training procedure. We also propose a new multi-scale deep super-resolution system (MDSR) and training method, which can reconstruct high-resolution images of different upscaling factors in a single model. The proposed methods show superior performance over the state-of-the-art methods on benchmark datasets and prove their effectiveness by winning the NTIRE2017 Super-Resolution Challenge [26].

7047 citations en Computer Science
S2 Open Access 2017
Deep Learning for Medical Image Processing: Overview, Challenges and Future

M. I. Razzak, S. Naz, Ahmad Zaib

The health care sector is totally different from any other industry. It is a high priority sector and consumers expect the highest level of care and services regardless of cost. The health care sector has not achieved society’s expectations, even though the sector consumes a huge percentage of national budgets. Medical data are mostly interpreted by medical experts. Interpretation of images by human experts is limited by subjectivity, the complexity of the images, extensive variation between experts, and fatigue arising from heavy workloads. Following the success of deep learning in other real-world applications, it is seen as also providing exciting and accurate solutions for medical imaging, and is seen as a key method for future applications in the health care sector. In this chapter, we discuss state-of-the-art deep learning architecture and its optimization when used for medical image segmentation and classification. The chapter closes with a discussion of the challenges of deep learning methods with regard to medical imaging and open research issues.

1151 citations en Computer Science
S2 Open Access 2017
Deep Learning for Brain MRI Segmentation: State of the Art and Future Directions

Z. Akkus, A. Galimzianova, A. Hoogi et al.

Quantitative analysis of brain MRI is routine for many neurological diseases and conditions and relies on accurate segmentation of structures of interest. Deep learning-based segmentation approaches for brain MRI are gaining interest due to their self-learning and generalization ability over large amounts of data. As the deep learning architectures are becoming more mature, they gradually outperform previous state-of-the-art classical machine learning algorithms. This review aims to provide an overview of current deep learning-based segmentation approaches for quantitative brain MRI. First we review the current deep learning architectures used for segmentation of anatomical brain structures and brain lesions. Next, the performance, speed, and properties of deep learning approaches are summarized and discussed. Finally, we provide a critical assessment of the current state and identify likely future developments and trends.

999 citations en Computer Science, Medicine
S2 Open Access 2016
A Deep Learning Approach for Network Intrusion Detection System

A. Javaid, Quamar Niyaz, Weiqing Sun et al.

A Network Intrusion Detection System (NIDS) helps system administrators to detect network security breaches in their organizations. However, many challenges arise while developing a flexible and efficient NIDS for unforeseen and unpredictable attacks. We propose a deep learning based approach for developing such an efficient and flexible NIDS. We use Self-taught Learning (STL), a deep learning based technique, on NSL-KDD, a benchmark dataset for network intrusion. We present the performance of our approach and compare it with previous work. The compared metrics include accuracy, precision, recall, and F-measure.

1162 citations en Computer Science
S2 Open Access 2018
Demystifying Parallel and Distributed Deep Learning

Tal Ben-Nun, T. Hoefler

Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications. Accelerating their training is a major challenge and techniques range from distributed algorithms to low-level circuit design. In this survey, we describe the problem from a theoretical perspective, followed by approaches for its parallelization. We present trends in DNN architectures and the resulting implications on parallelization strategies. We then review and model the different types of concurrency in DNNs: from the single operator, through parallelism in network inference and training, to distributed deep learning. We discuss asynchronous stochastic optimization, distributed system architectures, communication schemes, and neural architecture search. Based on those approaches, we extrapolate potential directions for parallelism in deep learning.

783 citations en Computer Science
S2 Open Access 2018
On the information bottleneck theory of deep learning

Andrew M. Saxe, Yamini Bansal, Joel Dapello et al.

The practical successes of deep neural networks have not been matched by theoretical progress that satisfyingly explains their behavior. In this work, we study the information bottleneck (IB) theory of deep learning, which makes three specific claims: first, that deep networks undergo two distinct phases consisting of an initial fitting phase and a subsequent compression phase; second, that the compression phase is causally related to the excellent generalization performance of deep networks; and third, that the compression phase occurs due to the diffusion-like behavior of stochastic gradient descent. Here we show that none of these claims hold true in the general case, and instead reflect assumptions made to compute a finite mutual information metric in deterministic networks. When computed using simple binning, we demonstrate through a combination of analytical results and simulation that the information plane trajectory observed in prior work is predominantly a function of the neural nonlinearity employed: double-sided saturating nonlinearities like tanh yield a compression phase as neural activations enter the saturation regime, but linear activation functions and single-sided saturating nonlinearities like the widely used ReLU in fact do not. Moreover, we find that there is no evident causal connection between compression and generalization: networks that do not compress are still capable of generalization, and vice versa. Next, we show that the compression phase, when it exists, does not arise from stochasticity in training by demonstrating that we can replicate the IB findings using full batch gradient descent rather than stochastic gradient descent. Finally, we show that when an input domain consists of a subset of task-relevant and task-irrelevant information, hidden representations do compress the task-irrelevant information, although the overall information about the input may monotonically increase with training time, and that this compression happens concurrently with the fitting process rather than during a subsequent compression period.
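The "simple binning" estimator this abstract refers to can be sketched in a few lines of numpy: discretize continuous activations into equal-width bins and estimate mutual information from the joint histogram with the labels. The function name and `n_bins` default are illustrative choices, not taken from the paper.

```python
import numpy as np

def binned_mutual_information(activations, labels, n_bins=30):
    """Estimate I(T; Y) in bits by discretizing continuous activations
    into equal-width bins and forming a joint histogram with the labels."""
    edges = np.linspace(activations.min(), activations.max(), n_bins)
    t = np.digitize(activations, edges)            # bin index per sample
    n_classes = len(np.unique(labels))
    joint, _, _ = np.histogram2d(t, labels, bins=(n_bins + 2, n_classes))
    p_ty = joint / joint.sum()                     # joint distribution
    p_t = p_ty.sum(axis=1, keepdims=True)          # marginal over bins
    p_y = p_ty.sum(axis=0, keepdims=True)          # marginal over classes
    nz = p_ty > 0                                  # avoid log(0)
    return float(np.sum(p_ty[nz] * np.log2(p_ty[nz] / (p_t @ p_y)[nz])))
```

Because the estimate depends on the binning, it is finite even for deterministic networks, which is exactly the artifact the paper argues drives the apparent "compression" dynamics.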

669 citations en Physics, Computer Science
S2 Open Access 2018
Deep Learning-Based Channel Estimation

Mehran Soltani, V. Pourahmadi, A. Mirzaei et al.

In this letter, we present a deep learning algorithm for channel estimation in communication systems. We consider the time–frequency response of a fast fading communication channel as a 2D image. The aim is to find the unknown values of the channel response using some known values at the pilot locations. To this end, a general pipeline using deep image processing techniques, image super-resolution (SR), and image restoration (IR) is proposed. This scheme considers the pilot values, altogether, as a low-resolution image and uses an SR network cascaded with a denoising IR network to estimate the channel. Moreover, the implementation of the proposed pipeline is presented. The estimation error shows that the presented algorithm is comparable to the minimum mean square error (MMSE) with full knowledge of the channel statistics, and it is better than an approximation to linear MMSE. The results confirm that this pipeline can be used efficiently in channel estimation.
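The problem framing above (pilot responses as a low-resolution image of the full time-frequency grid) can be illustrated with a toy numpy sketch, using bilinear interpolation as a classical stand-in for the SR and denoising networks the letter proposes. The function names, grid sizes, and pilot layout are illustrative assumptions, not from the paper.

```python
import numpy as np

def bilinear_upscale(lr, out_shape):
    # Interpolate a low-resolution pilot "image" back to the full
    # time-frequency grid (the role the SR network plays in the scheme).
    h, w = lr.shape
    rows = np.linspace(0, h - 1, out_shape[0])
    cols = np.linspace(0, w - 1, out_shape[1])
    # interpolate along frequency (columns), then along time (rows)
    tmp = np.array([np.interp(cols, np.arange(w), row) for row in lr])
    return np.array([np.interp(rows, np.arange(h), tmp[:, j])
                     for j in range(out_shape[1])]).T

def estimate_channel(channel, pilot_rows, pilot_cols):
    # Sample the known responses at the pilot locations, then upscale.
    pilots = channel[np.ix_(pilot_rows, pilot_cols)]
    return bilinear_upscale(pilots, channel.shape)
```

The paper's pipeline replaces this interpolation with a learned SR network cascaded with a denoising IR network, which is what closes the gap to MMSE performance.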

656 citations en Computer Science, Mathematics
S2 Open Access 2019
Deep learning models for bankruptcy prediction using textual disclosures

Feng Mai, Shaonan Tian, Chihoon Lee et al.

This study introduces deep learning models for corporate bankruptcy forecasting using textual disclosures. Although textual data are common, they are rarely considered in financial decision support models. Deep learning uses layers of neural networks to extract features from textual data for prediction. We construct a comprehensive bankruptcy database of 11,827 U.S. public companies and show that deep learning models yield superior prediction performance in forecasting bankruptcy using textual disclosures. When textual data are used in conjunction with traditional accounting-based ratios and market-based variables, deep learning models can further improve the prediction accuracy. We also investigate the effectiveness of two deep learning architectures. Interestingly, our empirical results show that simpler models such as averaging embedding are more effective than convolutional neural networks. Our results provide the first large-sample evidence for the predictive power of textual disclosures.
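The "averaging embedding" model the abstract finds most effective is simple to sketch: represent a disclosure as the mean of its word vectors, which then feeds a standard classifier. The function name, vocabulary, and dimension below are illustrative, not from the paper.

```python
import numpy as np

def average_embedding(tokens, embeddings, dim):
    # Represent a document as the mean of its word vectors;
    # out-of-vocabulary tokens are simply skipped.
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)
```

Despite its simplicity, this bag-of-vectors representation discards word order yet, per the study, outperforms convolutional architectures on this task.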

386 citations en Computer Science
S2 Open Access 2019
A comprehensive study on deep learning bug characteristics

Md Johirul Islam, Giang Nguyen, Rangeet Pan et al.

Deep learning has gained substantial popularity in recent years. Developers mainly rely on libraries and tools to add deep learning capabilities to their software. What kinds of bugs are frequently found in such software? What are the root causes of such bugs? What impacts do such bugs have? Which stages of the deep learning pipeline are more bug-prone? Are there any antipatterns? Understanding such characteristics of bugs in deep learning software has the potential to foster the development of better deep learning platforms, debugging mechanisms, and development practices, and to encourage the development of analysis and verification frameworks. Therefore, we study 2716 high-quality posts from Stack Overflow and 500 bug fix commits from GitHub concerning five popular deep learning libraries (Caffe, Keras, TensorFlow, Theano, and Torch) to understand the types of bugs, root causes of bugs, impacts of bugs, and bug-prone stages of the deep learning pipeline, as well as whether there are some common antipatterns found in such buggy software. The key findings of our study include: data bugs and logic bugs are the most severe bug types in deep learning software, appearing more than 48% of the time; the major root causes of these bugs are Incorrect Model Parameter (IPS) and Structural Inefficiency (SI), showing up more than 43% of the time. We have also found that the bugs in the usage of deep learning libraries have some common antipatterns.

348 citations en Computer Science
DOAJ Open Access 2025
GBsim: A Robust GCN-BERT Approach for Cross-Architecture Binary Code Similarity Analysis

Jiang Du, Qiang Wei, Yisen Wang et al.

Recent advances in graph neural networks have transformed structural pattern learning in domains ranging from social network analysis to biomolecular modeling. Nevertheless, practical deployments in mission-critical scenarios such as binary code similarity detection face two fundamental obstacles: first, the inherent noise in graph construction processes, exemplified by incomplete control flow edges during binary function recovery; second, the substantial distribution discrepancies caused by cross-architecture instruction set variations. Conventional GNN architectures demonstrate severe performance degradation under such low signal-to-noise ratio conditions and cross-domain operational environments, particularly in security-sensitive vulnerability identification tasks where feature instability or domain shifts could trigger critical false judgments. To address these challenges, we propose GBsim, a novel approach that combines graph neural networks with natural language processing. GBsim employs a cross-architecture language model to transform binary functions into semantic graphs, leverages a multilayer GCN for structural feature extraction, uses a Transformer layer to integrate semantic information, and generates robust cross-architecture embeddings that maintain high performance despite significant distribution shifts. Extensive experiments on a large-scale cross-architecture dataset show that GBsim achieves an MRR of 0.901 and a Recall@1 of 0.831, outperforming state-of-the-art methods. In real-world vulnerability detection tasks, GBsim achieves an average recall rate of 81.3% on a 1-day vulnerability dataset, demonstrating its practical effectiveness in identifying security threats and outperforming existing methods by 2.1%. This performance advantage stems from GBsim’s ability to maximize information preservation across architectural boundaries, enhancing model robustness in the presence of noise and distribution shifts.

Science, Astrophysics

Page 21 of 552695