Competition-level code generation with AlphaCode
Yujia Li, David Choi, Junyoung Chung
et al.
Programming is a powerful and ubiquitous problem-solving tool. Systems that can assist programmers or even generate programs themselves could make programming more productive and accessible. Recent transformer-based neural network models show impressive code-generation abilities yet still perform poorly on more complex tasks requiring problem-solving skills, such as competitive programming problems. Here, we introduce AlphaCode, a system for code generation that achieved an average ranking in the top 54.3% in simulated evaluations on recent programming competitions on the Codeforces platform. AlphaCode solves problems by generating millions of diverse programs using specially trained transformer-based networks and then filtering and clustering those programs down to a maximum of just 10 submissions. This result marks the first time an artificial intelligence system has performed competitively in programming competitions.
Editor's summary (Machine learning systems can program too): Computer programming competitions are popular tests among programmers that require critical thinking informed by experience and the creation of solutions to unforeseen problems, both of which are key aspects of human intelligence but challenging for machine learning models to mimic. Using self-supervised learning and an encoder-decoder transformer architecture, Li et al. developed AlphaCode, a deep-learning model that can achieve approximately human-level performance on the Codeforces platform, which regularly hosts these competitions and attracts numerous participants worldwide (see the Perspective by Kolter). The development of such coding platforms could have a huge impact on programmers' productivity. It may even change the culture of programming by shifting human work toward formulating problems, with machine learning responsible for generating and executing the code. —YS
Modern machine learning systems can achieve average human-level performance in popular competitive programming contests.
2050 citations
en
Computer Science, Medicine
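The abstract above describes AlphaCode's selection step: generate many candidate programs, filter them against the problem's example tests, then cluster the survivors by behavior so that at most 10 diverse submissions remain. The following is a minimal illustrative sketch of that filter-then-cluster idea; the function and parameter names are invented here and are not AlphaCode's actual implementation.

```python
from collections import defaultdict

def select_submissions(programs, example_tests, clustering_inputs, k=10):
    """Filter candidate programs on the example tests, then cluster the
    survivors by their outputs on extra inputs and keep one program per
    cluster, largest clusters first (illustrative sketch only)."""
    # Keep only programs that pass every provided example test.
    survivors = [p for p in programs
                 if all(p(inp) == out for inp, out in example_tests)]
    # Group programs whose outputs agree on all clustering inputs:
    # behaviorally identical programs share a signature.
    clusters = defaultdict(list)
    for p in survivors:
        signature = tuple(p(inp) for inp in clustering_inputs)
        clusters[signature].append(p)
    # Submit one representative from each of the k largest clusters.
    ranked = sorted(clusters.values(), key=len, reverse=True)
    return [c[0] for c in ranked[:k]]
```

Clustering by observed behavior on generated inputs lets semantically duplicate programs count as one candidate, which is why millions of samples can be reduced to 10 submissions.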
Deep One-Class Classification
Lukas Ruff, Nico Görnitz, Lucas Deecke
et al.
2485 citations
en
Computer Science
Model Cards for Model Reporting
Margaret Mitchell, Simone Wu, Andrew Zaldivar
et al.
Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework, called model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type [15]) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: one trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related artificial intelligence technology, increasing transparency into how well artificial intelligence technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similarly detailed evaluation numbers and other relevant documentation.
2509 citations
en
Computer Science
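The abstract above lists the sections a model card should carry: intended use, out-of-scope contexts, evaluation procedure, and metrics disaggregated by group. A minimal skeleton of such a card might look like the following; the class and field names are illustrative choices, not the paper's schema.

```python
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    """A minimal model-card skeleton following the kinds of sections the
    paper proposes; field names here are illustrative, not prescribed."""
    model_name: str
    intended_use: str
    out_of_scope_uses: str
    evaluation_procedure: str
    # Benchmarked metrics disaggregated by group,
    # e.g. {"sex=female": {"accuracy": 0.91}}
    disaggregated_metrics: dict = field(default_factory=dict)

    def report(self) -> str:
        """Render the card as a short plain-text document."""
        lines = [f"Model card: {self.model_name}",
                 f"Intended use: {self.intended_use}",
                 f"Out-of-scope uses: {self.out_of_scope_uses}",
                 f"Evaluation: {self.evaluation_procedure}"]
        for group, metrics in self.disaggregated_metrics.items():
            lines.append(f"  {group}: " +
                         ", ".join(f"{k}={v}" for k, v in metrics.items()))
        return "\n".join(lines)
```

The disaggregated-metrics mapping is the key point: a single aggregate accuracy number can hide large performance gaps between the demographic or phenotypic groups the paper discusses.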
Artificial intelligence in healthcare: past, present and future
F. Jiang, Yong Jiang, Hui Zhi
et al.
Artificial intelligence (AI) aims to mimic human cognitive functions. It is bringing a paradigm shift to healthcare, powered by the increasing availability of healthcare data and rapid progress in analytics techniques. We survey the current status of AI applications in healthcare and discuss its future. AI can be applied to various types of healthcare data (structured and unstructured). Popular AI techniques include machine learning methods for structured data, such as the classical support vector machine and neural network, and modern deep learning, as well as natural language processing for unstructured data. Major disease areas that use AI tools include cancer, neurology and cardiology. We then review in more detail the AI applications in stroke, in three major areas: early detection and diagnosis, treatment, and outcome prediction and prognosis evaluation. We conclude with a discussion of pioneering AI systems, such as IBM Watson, and hurdles to real-life deployment of AI.
3496 citations
en
Computer Science, Medicine
Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks
Nicolas Papernot, P. Mcdaniel, Xi Wu
et al.
Deep learning algorithms have been shown to perform extremely well on many classical machine learning problems. However, recent studies have shown that deep learning, like other machine learning techniques, is vulnerable to adversarial samples: inputs crafted to force a deep neural network (DNN) to provide adversary-selected outputs. Such attacks can seriously undermine the security of the system supported by the DNN, sometimes with devastating consequences. For example, autonomous vehicles can be crashed, illicit or illegal content can bypass content filters, or biometric authentication systems can be manipulated to allow improper access. In this work, we introduce a defensive mechanism called defensive distillation to reduce the effectiveness of adversarial samples on DNNs. We analytically investigate the generalizability and robustness properties granted by the use of defensive distillation when training DNNs. We also empirically study the effectiveness of our defense mechanisms on two DNNs placed in adversarial settings. The study shows that defensive distillation can reduce the success rate of adversarial sample creation from 95% to less than 0.5% on a studied DNN. Such dramatic gains can be explained by the fact that distillation reduces the gradients used in adversarial sample creation by a factor of 10^30. We also find that distillation increases the average minimum number of features that need to be modified to create adversarial samples by about 800% on one of the DNNs we tested.
3263 citations
en
Computer Science, Mathematics
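Defensive distillation, as summarized above, trains a second network on the soft labels that a first network produces when its softmax is run at a high temperature. The core ingredient is the temperature-scaled softmax; the sketch below illustrates how raising the temperature softens a confident output distribution. This is only the softmax component, not the full training procedure from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax. Higher temperatures yield softer
    probability vectors, which defensive distillation uses as training
    targets for the distilled network."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# At T=1 the network is nearly one-hot confident; at high T the soft
# labels spread probability mass across classes. A network trained on
# these soft labels ends up with much smaller input gradients, which is
# what blunts gradient-based adversarial sample crafting.
logits = [10.0, 2.0, 1.0]
hard = softmax(logits, temperature=1.0)
soft = softmax(logits, temperature=20.0)
```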
Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures
Matt Fredrikson, S. Jha, Thomas Ristenpart
3217 citations
en
Computer Science
A survey on feature selection methods
Girish Chandrashekar, F. Sahin
4676 citations
en
Computer Science
Gradient boosting machines, a tutorial
Alexey Natekin, A. Knoll
Gradient boosting machines are a family of powerful machine-learning techniques that have shown considerable success in a wide range of practical applications. They are highly customizable to the particular needs of the application, for example by being learned with respect to different loss functions. This article gives a tutorial introduction to the methodology of gradient boosting with a strong focus on the machine learning aspects of modeling. The theory is complemented with descriptive examples and illustrations covering all stages of gradient boosting model design. Considerations for handling model complexity are discussed. Three practical examples of gradient boosting applications are presented and comprehensively analyzed.
3291 citations
en
Computer Science, Medicine
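The tutorial abstract above describes the core loop of gradient boosting: repeatedly fit a weak learner to the current residuals (the negative gradient of the loss) and add it, shrunk by a learning rate, to the ensemble. A minimal sketch for squared loss with 1-D regression stumps follows; it assumes numeric inputs and is meant to show the loop, not to replace a library implementation.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump on 1-D inputs (squared loss)."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lmean = sum(left) / len(left) if left else 0.0
        rmean = sum(right) / len(right) if right else 0.0
        err = sum((r - (lmean if x <= threshold else rmean)) ** 2
                  for x, r in zip(xs, residuals))
        if best is None or err < best[0]:
            best = (err, threshold, lmean, rmean)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boosting for squared loss: each round fits a stump to the current
    residuals (the negative gradient) and adds it, shrunk by the rate."""
    base = sum(ys) / len(ys)  # constant initial model
    stumps = []
    def predict(x):
        return base + sum(learning_rate * s(x) for s in stumps)
    for _ in range(n_rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict
```

Swapping the loss function changes only what the "residuals" are (the negative gradient of that loss), which is the customizability the abstract highlights.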
A Practical Guide to Training Restricted Boltzmann Machines
Geoffrey E. Hinton
3184 citations
en
Computer Science
Spark: Cluster Computing with Working Sets
M. Zaharia, Mosharaf Chowdhury, Michael J. Franklin
et al.
4948 citations
en
Computer Science
Collective Classification in Network Data
Prithviraj Sen, Galileo Namata, M. Bilgic
et al.
4517 citations
en
Computer Science
Random Features for Large-Scale Kernel Machines
A. Rahimi, B. Recht
4781 citations
en
Mathematics, Computer Science
Experiments with a New Boosting Algorithm
Y. Freund, R. Schapire
9798 citations
en
Computer Science
Genetic Algorithms + Data Structures = Evolution Programs
Z. Michalewicz
12925 citations
en
Computer Science, Mathematics
Laplacian Eigenmaps for Dimensionality Reduction and Data Representation
Mikhail Belkin, P. Niyogi
8424 citations
en
Computer Science, Mathematics
The relationship between Precision-Recall and ROC curves
Jesse Davis, Mark H. Goadrich
6567 citations
en
Computer Science, Mathematics
Introduction to Statistical Pattern Recognition
P. Pudil, P. Somol, M. Haindl
4577 citations
en
Computer Science
An Introduction to Genetic Algorithms
D. Heiss-Czedik
10011 citations
en
Computer Science
Learning Multiple Tasks with Kernel Methods
T. Evgeniou, C. Micchelli, M. Pontil
995 citations
en
Computer Science, Mathematics
Active Learning
Burr Settles
696 citations
en
Computer Science