The Option-Critic Architecture
Pierre-Luc Bacon, J. Harb, Doina Precup
Temporal abstraction is key to scaling up learning and planning in reinforcement learning. While planning with temporally extended actions is well understood, creating such abstractions autonomously from data has remained challenging. We tackle this problem in the framework of options [Sutton, Precup and Singh, 1999; Precup, 2000]. We derive policy gradient theorems for options and propose a new option-critic architecture capable of learning both the internal policies and the termination conditions of options, in tandem with the policy over options, and without the need to provide any additional rewards or subgoals. Experimental results in both discrete and continuous environments showcase the flexibility and efficiency of the framework.
1235 citations
Computer Science
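As a reading aid for the option-critic entry above, here is a sketch of the two gradients its abstract refers to, written in the notation of Bacon, Harb and Precup (2017); the exact statements and regularity conditions are in the paper. With intra-option policies \pi_{\omega,\theta} and termination functions \beta_{\omega,\vartheta}:

```latex
% Intra-option policy gradient: improve each option's internal policy.
\frac{\partial Q_\Omega(s_0,\omega_0)}{\partial \theta}
  = \mathbb{E}\!\left[ \frac{\partial \log \pi_{\omega,\theta}(a \mid s)}{\partial \theta}\, Q_U(s,\omega,a) \right]

% Termination gradient: terminate where the option's advantage is negative.
\frac{\partial U(\omega_0, s_1)}{\partial \vartheta}
  = -\,\mathbb{E}\!\left[ \frac{\partial \beta_{\omega,\vartheta}(s')}{\partial \vartheta}\, A_\Omega(s',\omega) \right]
```

Here Q_U(s, \omega, a) is the value of taking action a in the context of the state-option pair (s, \omega), A_\Omega is the advantage of an option over the policy-over-options, and both expectations run over the discounted occupancy of the augmented (state, option) process.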
Hierarchical Representations for Efficient Architecture Search
Hanxiao Liu, K. Simonyan, O. Vinyals
et al.
We explore efficient neural architecture search methods and show that a simple yet powerful evolutionary algorithm can discover new architectures with excellent performance. Our approach combines a novel hierarchical genetic representation scheme that imitates the modularized design pattern commonly adopted by human experts, and an expressive search space that supports complex topologies. Our algorithm efficiently discovers architectures that outperform a large number of manually designed models for image classification, obtaining top-1 error of 3.6% on CIFAR-10 and 20.3% when transferred to ImageNet, which is competitive with the best existing neural architecture search approaches. We also present results using random search, achieving 0.3% less top-1 accuracy on CIFAR-10 and 0.1% less on ImageNet whilst reducing the search time from 36 hours down to 1 hour.
949 citations
Computer Science, Mathematics
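To make the search procedure in the entry above concrete, here is a minimal sketch of tournament-selection evolution over architecture genotypes. The flat list of primitives is a deliberate simplification: the paper's hierarchical representation composes small motif graphs level by level, and all names below are invented for illustration.

```python
import random

# Toy stand-in for the paper's hierarchical genotype: a fixed-length list of
# primitive operation choices (the real scheme composes motif graphs by level).
PRIMITIVES = ["identity", "conv1x1", "conv3x3", "maxpool3x3"]

def random_genotype(n_edges=8):
    return [random.choice(PRIMITIVES) for _ in range(n_edges)]

def mutate(genotype):
    # Point mutation: resample the operation on one randomly chosen edge.
    child = list(genotype)
    child[random.randrange(len(child))] = random.choice(PRIMITIVES)
    return child

def evolve(fitness, population_size=50, steps=200, tournament=5):
    # fitness: user-supplied callable mapping a genotype to a validation score.
    population = [random_genotype() for _ in range(population_size)]
    scores = [fitness(g) for g in population]
    for _ in range(steps):
        contenders = random.sample(range(population_size), tournament)
        parent = max(contenders, key=lambda i: scores[i])
        child = mutate(population[parent])
        # Replace the weakest contender, keeping the population size fixed.
        loser = min(contenders, key=lambda i: scores[i])
        population[loser], scores[loser] = child, fitness(child)
    best = max(range(population_size), key=lambda i: scores[i])
    return population[best], scores[best]

# Example with a dummy fitness (prefers fewer identity edges):
# best, score = evolve(lambda g: -g.count("identity"))
```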
Auto-Keras: An Efficient Neural Architecture Search System
Haifeng Jin, Qingquan Song, Xia Hu
Neural architecture search (NAS) has been proposed to automatically tune deep neural networks, but existing search algorithms, e.g., NASNet, PNAS, usually suffer from expensive computational cost. Network morphism, which keeps the functionality of a neural network while changing its neural architecture, could be helpful for NAS by enabling more efficient training during the search. In this paper, we propose a novel framework enabling Bayesian optimization to guide the network morphism for efficient neural architecture search. The framework develops a neural network kernel and a tree-structured acquisition function optimization algorithm to efficiently explore the search space. Extensive experiments on real-world benchmark datasets demonstrate the superior performance of the developed framework over the state-of-the-art methods. Moreover, we build an open-source AutoML system based on our method, namely Auto-Keras. The code and documentation are available at https://autokeras.com. The system runs in parallel on CPU and GPU, with an adaptive search strategy for different GPU memory limits.
902 citations
Mathematics, Computer Science
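A minimal sketch of the search loop the Auto-Keras abstract describes, under loud assumptions: architectures are reduced to tuples of layer widths, `morph_neighbors`, `edit_distance`, and the nearest-neighbour acquisition below are illustrative stand-ins rather than Auto-Keras APIs, and the paper's Gaussian-process surrogate and tree-structured acquisition optimizer are replaced by much cruder greedy proxies.

```python
import math

# Assumed toy encoding: an architecture is a tuple of hidden-layer widths.
def morph_neighbors(arch):
    # Network-morphism-style edits: widen one layer, or insert a layer.
    widen = [arch[:i] + (arch[i] * 2,) + arch[i + 1:] for i in range(len(arch))]
    deepen = [arch[:i] + (arch[max(i - 1, 0)],) + arch[i:] for i in range(len(arch) + 1)]
    return widen + deepen

def edit_distance(a, b):
    # Crude proxy for the paper's architecture edit distance.
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))

def kernel(a, b, rho=1.0):
    # Similarity decays with edit distance, in the spirit of the paper's kernel.
    return math.exp(-rho * edit_distance(a, b))

def acquisition(arch, history, beta=2.0):
    # Nearest-neighbour surrogate: predicted accuracy of the most similar
    # evaluated architecture, plus a bonus that grows with dissimilarity.
    sim, acc = max((kernel(arch, a), acc) for a, acc in history)
    return acc + beta * (1.0 - sim)

def search(train_and_eval, init_arch=(64, 64), rounds=10):
    # train_and_eval: user-supplied callable returning validation accuracy.
    history = [(init_arch, train_and_eval(init_arch))]
    for _ in range(rounds):
        base = max(history, key=lambda h: h[1])[0]
        cand = max(morph_neighbors(base), key=lambda a: acquisition(a, history))
        history.append((cand, train_and_eval(cand)))
    return max(history, key=lambda h: h[1])
```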
The genetic architecture of Parkinson's disease.
C. Blauwendraat, M. Nalls, A. Singleton
Parkinson's disease is a complex neurodegenerative disorder for which both rare and common genetic variants contribute to disease risk, onset, and progression. Mutations in more than 20 genes have been associated with the disease, most of which are highly penetrant and often cause early onset or atypical symptoms. Although our understanding of the genetic basis of Parkinson's disease has advanced considerably, much remains to be done. Further disease-related common genetic variability remains to be identified and the work in identifying rare risk alleles has only just begun. To date, genome-wide association studies have identified 90 independent risk-associated variants. However, most of them have been identified in patients of European ancestry and we know relatively little of the genetics of Parkinson's disease in other populations. We have a limited understanding of the biological functions of the risk alleles that have been identified, although Parkinson's disease risk variants appear to be in close proximity to known Parkinson's disease genes and lysosomal-related genes. In the past decade, multiple efforts have been made to investigate the genetic architecture of Parkinson's disease, and emerging technologies, such as machine learning, single-cell RNA sequencing, and high-throughput screens, will improve our understanding of genetic risk.
834 citations
Medicine, Biology
A Comprehensive Survey of Neural Architecture Search
Pengzhen Ren, Yun Xiao, Xiaojun Chang
et al.
Deep learning has made substantial breakthroughs in many fields thanks to its powerful automatic representation capabilities. Neural architecture design has proven crucial to the feature representation of data and to final performance. However, architecture design relies heavily on researchers' prior knowledge and experience, and the limits of human knowledge make it difficult to move beyond established design paradigms toward an optimal model. An intuitive remedy is to reduce human intervention as much as possible and let an algorithm design the neural architecture automatically. Neural Architecture Search (NAS) is exactly such an approach, and the related research is rich and varied, so a comprehensive and systematic survey of NAS is essential. Previous surveys have classified existing work mainly by the key components of NAS: search space, search strategy, and evaluation strategy. While this classification is intuitive, it makes it hard for readers to grasp the challenges and the landmark work involved. In this survey, we therefore take a new perspective: we begin with an overview of the characteristics of the earliest NAS algorithms, summarize the problems in those early algorithms, and then present the solutions proposed by subsequent research. In addition, we provide a detailed and comprehensive analysis, comparison, and summary of these works. Finally, we suggest possible future research directions.
829 citations
Computer Science, Mathematics
CNN Variants for Computer Vision: History, Architecture, Application, Challenges and Future Scope
Dulari Bhatt, Chirag I. Patel, Hardik N. Talsania
et al.
Computer vision is an increasingly prominent field within image processing. With the emergence of computer vision applications, there is significant demand for automatic object recognition. Deep CNNs (convolutional neural networks) have benefited the computer vision community by producing excellent results in video processing, object recognition, image classification and segmentation, natural language processing, speech recognition, and many other areas. Furthermore, the availability of large amounts of data and of capable hardware has opened new avenues for CNN research. Several inspirational concepts for the progress of CNNs have been investigated, including alternative activation functions, regularization, parameter optimization, and architectural advances. Innovations in architecture, in particular, have yielded tremendous gains in the capacity of deep CNNs, with significant emphasis placed on leveraging channel and spatial information, depth of architecture, and multi-path information processing. This survey focuses on the primary taxonomy and newly released deep CNN architectures, dividing recent developments in CNN architectures into eight groups: spatial exploitation, multi-path, depth, breadth, dimension, channel boosting, feature-map exploitation, and attention-based CNNs. The main contribution of this manuscript is a comparison of architectural evolutions in CNNs in terms of their architectural changes, strengths, and weaknesses. It also covers the CNN's components, the strengths and weaknesses of various CNN variants, open research challenges, CNN applications, and future research directions.
A survey of mobile cloud computing: architecture, applications, and approaches
D. Hoang, Chonho Lee, D. Niyato
et al.
2517 citations
Computer Science
Random Search and Reproducibility for Neural Architecture Search
Liam Li, Ameet Talwalkar
Neural architecture search (NAS) is a promising research direction that has the potential to replace expert-designed networks with learned, task-specific architectures. In this work, in order to help ground the empirical results in this field, we propose new NAS baselines that build off the following observations: (i) NAS is a specialized hyperparameter optimization problem; and (ii) random search is a competitive baseline for hyperparameter optimization. Leveraging these observations, we evaluate both random search with early-stopping and a novel random search with weight-sharing algorithm on two standard NAS benchmarks, PTB and CIFAR-10. Our results show that random search with early-stopping is a competitive NAS baseline, e.g., it performs at least as well as ENAS, a leading NAS method, on both benchmarks. Additionally, random search with weight-sharing outperforms random search with early-stopping, achieving a state-of-the-art NAS result on PTB and a highly competitive result on CIFAR-10. Finally, we explore the existing reproducibility issues of published NAS results. We note the lack of source material needed to exactly reproduce these results, and further discuss the robustness of published results given the various sources of variability in NAS experimental setups. Relatedly, we provide all information (code, random seeds, documentation) needed to exactly reproduce our results, and report our random search with weight-sharing results for each benchmark on multiple runs.
788 citations
Computer Science, Mathematics
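As a rough illustration of the first baseline in the entry above, here is a generic random-search-with-early-stopping loop in a successive-halving style. `sample_config` and `train_for` are user-supplied stand-ins, and the paper itself uses a more refined asynchronous scheme (ASHA), so treat this as a sketch of the idea rather than the authors' algorithm.

```python
def random_search_early_stop(sample_config, train_for, n_configs=64,
                             min_epochs=1, max_epochs=64, keep=0.5):
    """Sample configurations at random, then repeatedly train the survivors
    for longer and drop the worse half, so most configurations are stopped
    early on small budgets.

    sample_config: () -> config; train_for: (config, epochs) -> val score.
    """
    survivors = [sample_config() for _ in range(n_configs)]
    epochs = min_epochs
    while epochs <= max_epochs and len(survivors) > 1:
        scored = sorted(((train_for(cfg, epochs), cfg) for cfg in survivors),
                        key=lambda t: t[0], reverse=True)
        survivors = [cfg for _, cfg in scored[: max(1, int(len(scored) * keep))]]
        epochs *= 2
    return survivors[0]
```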
Understanding and Simplifying One-Shot Architecture Search
Gabriel Bender, Pieter-Jan Kindermans, Barret Zoph
et al.
808 citations
Computer Science
Toward an Architecture for Never-Ending Language Learning
Andrew Carlson, J. Betteridge, B. Kisiel
et al.
We consider here the problem of building a never-ending language learner; that is, an intelligent computer agent that runs forever and that each day must (1) extract, or read, information from the web to populate a growing structured knowledge base, and (2) learn to perform this task better than on the previous day. In particular, we propose an approach and a set of design principles for such an agent, describe a partial implementation of such a system that has already learned to extract a knowledge base containing over 242,000 beliefs with an estimated precision of 74% after running for 67 days, and discuss lessons learned from this preliminary attempt to build a never-ending learning agent.
2236 citations
Computer Science
The Cascade-Correlation Learning Architecture
S. Fahlman, C. Lebiere
3049 citations
Computer Science
Space is the machine: A configurational theory of architecture
B. Hillier
2450 citations
Computer Science
The functional architecture of human empathy.
J. Decety, P. Jackson
3051 citations
Psychology, Medicine
Pursuing Happiness: The Architecture of Sustainable Change
S. Lyubomirsky, Kennon M. Sheldon, D. Schkade
3164 citations
Psychology
MDA explained - the Model Driven Architecture: practice and promise
Anneke Kleppe, J. Warmer, Wim Bast
2244 citations
Engineering, Computer Science
Systematic determination of genetic network architecture
Saeed Tavazoie, J. Hughes, M. Campbell
et al.
2743 citations
Biology, Medicine
Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps
G. Carpenter, S. Grossberg, Natalya Markuzon
et al.
2220 citations
Medicine, Computer Science
Modeling Rational Agents within a BDI-Architecture
Anand Srinivasa Rao, M. Georgeff
2641 citations
Computer Science
SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
Hanrui Wang, Zhekai Zhang, Song Han
The attention mechanism is becoming increasingly popular in Natural Language Processing (NLP) applications, showing superior performance to convolutional and recurrent architectures. However, general-purpose platforms such as CPUs and GPUs are inefficient when performing attention inference due to complicated data movement and low arithmetic intensity. Moreover, existing NN accelerators mainly focus on optimizing convolutional or recurrent models, and cannot efficiently support attention. In this paper, we present SpAtten, an efficient algorithm-architecture co-design that leverages token sparsity, head sparsity, and quantization opportunities to reduce the attention computation and memory access. Inspired by the high redundancy of human languages, we propose the novel cascade token pruning to prune away unimportant tokens in the sentence. We also propose cascade head pruning to remove unessential heads. Cascade pruning is fundamentally different from weight pruning since there is no trainable weight in the attention mechanism, and the pruned tokens and heads are selected on the fly. To efficiently support them on hardware, we design a novel top-k engine to rank token and head importance scores with high throughput. Furthermore, we propose progressive quantization that first fetches MSBs only and performs the computation; if the confidence is low, it fetches LSBs and recomputes the attention outputs, trading computation for memory reduction. Extensive experiments on 30 benchmarks show that, on average, SpAtten reduces DRAM access by 10.0× with no accuracy loss, and achieves 1.6×, 3.0×, 162×, 347× speedup, and 1.4×, 3.2×, 1193×, 4059× energy savings over A3 accelerator, MNNFast accelerator, TITAN Xp GPU, Xeon CPU, respectively.
521 citations
Computer Science
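To show the flavour of cascade token pruning from the entry above in software (the paper implements it in hardware with a dedicated top-k engine), here is a small numpy sketch. The importance measure, accumulating the attention each token receives across heads and queries, is our reading of the abstract and not SpAtten's exact scoring.

```python
import numpy as np

def token_importance(attn_probs):
    # attn_probs: (heads, query_len, key_len) softmax attention from one layer.
    # A token's importance is the total attention probability it receives,
    # accumulated over heads and query positions.
    return attn_probs.sum(axis=(0, 1))  # shape: (key_len,)

def prune_tokens(token_ids, scores, keep_ratio=0.5):
    # Keep the top-k tokens by importance, preserving their original order;
    # pruned tokens stay dropped for all later layers (the "cascade").
    k = max(1, int(len(token_ids) * keep_ratio))
    kept = np.sort(np.argsort(scores)[-k:])
    return [token_ids[i] for i in kept]
```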
Neural Architecture Search without Training
J. Mellor, Jack Turner, A. Storkey
et al.
The time and effort involved in hand-designing deep neural networks is immense. This has prompted the development of Neural Architecture Search (NAS) techniques to automate this design. However, NAS algorithms tend to be extremely slow and expensive; they need to train vast numbers of candidate networks to inform the search process. This could be remedied if we could infer a network's trained accuracy from its initial state. In this work, we examine how the linear maps induced by data points correlate for untrained network architectures in the NAS-Bench-201 search space, and motivate how this can be used to give a measure of modelling flexibility which is highly indicative of a network's trained performance. We incorporate this measure into a simple algorithm that allows us to search for powerful networks without any training in a matter of seconds on a single GPU. Code to reproduce our experiments is available at this https URL.
467 citations
Computer Science, Mathematics
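Finally, a minimal numpy sketch of the training-free score this last entry describes, as we understand the published method: each minibatch input induces a binary ReLU activation pattern in the untrained network, and an architecture is scored by the log-determinant of the pairwise Hamming-similarity kernel of those patterns, so architectures that separate inputs into distinct linear regions score higher. Extracting the binary codes from a real network is assumed to happen elsewhere.

```python
import numpy as np

def naswot_score(binary_codes):
    # binary_codes: (batch, units) array of 0/1 ReLU activation indicators
    # for one minibatch pushed through an untrained network.
    c = binary_codes.astype(np.float64)
    # Agreements per input pair = units where both are 1 plus units where
    # both are 0; this equals n_units minus the Hamming distance.
    k = c @ c.T + (1.0 - c) @ (1.0 - c).T
    sign, logdet = np.linalg.slogdet(k)
    return logdet if sign > 0 else float("-inf")

# Example with random codes standing in for a real network's activations:
# rng = np.random.default_rng(0)
# print(naswot_score(rng.integers(0, 2, size=(32, 256))))
```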