Artificial intelligence to deep learning: machine intelligence approach for drug discovery
Rohan Gupta, Devesh Srivastava, Mehar Sahu
et al.
Drug designing and development is an important area of research for pharmaceutical companies and chemical scientists. However, low efficacy, off-target delivery, time consumption, and high cost impose a hurdle and challenges that impact drug design and discovery. Further, complex and big data from genomics, proteomics, microarray data, and clinical trials also impose an obstacle in the drug discovery pipeline. Artificial intelligence and machine learning technology play a crucial role in drug discovery and development. In other words, artificial neural networks and deep learning algorithms have modernized the area. Machine learning and deep learning algorithms have been implemented in several drug discovery processes such as peptide synthesis, structure-based virtual screening, ligand-based virtual screening, toxicity prediction, drug monitoring and release, pharmacophore modeling, quantitative structure–activity relationship, drug repositioning, polypharmacology, and physiochemical activity. Evidence from the past strengthens the implementation of artificial intelligence and deep learning in this field. Moreover, novel data mining, curation, and management techniques provided critical support to recently developed modeling algorithms. In summary, artificial intelligence and deep learning advancements provide an excellent opportunity for rational drug design and discovery process, which will eventually impact mankind. The primary concern associated with drug design and development is time consumption and production cost. Further, inefficiency, inaccurate target delivery, and inappropriate dosage are other hurdles that inhibit the process of drug delivery and development. With advancements in technology, computer-aided drug design integrating artificial intelligence algorithms can eliminate the challenges and hurdles of traditional drug design and development. Artificial intelligence is referred to as superset comprising machine learning, whereas machine learning comprises supervised learning, unsupervised learning, and reinforcement learning. Further, deep learning, a subset of machine learning, has been extensively implemented in drug design and development. The artificial neural network, deep neural network, support vector machines, classification and regression, generative adversarial networks, symbolic learning, and meta-learning are examples of the algorithms applied to the drug design and discovery process. Artificial intelligence has been applied to different areas of drug design and development process, such as from peptide synthesis to molecule design, virtual screening to molecular docking, quantitative structure–activity relationship to drug repositioning, protein misfolding to protein–protein interactions, and molecular pathway identification to polypharmacology. Artificial intelligence principles have been applied to the classification of active and inactive, monitoring drug release, pre-clinical and clinical development, primary and secondary drug screening, biomarker development, pharmaceutical manufacturing, bioactivity identification and physiochemical properties, prediction of toxicity, and identification of mode of action.
DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning
Tianshi Chen, Zidong Du, Ninghui Sun
et al.
1635 sitasi
en
Computer Science
Scaling Distributed Machine Learning with the Parameter Server
Mu Li, D. Andersen, J. Park
et al.
1898 sitasi
en
Computer Science
Evasion Attacks against Machine Learning at Test Time
B. Biggio, Igino Corona, Davide Maiorca
et al.
In security-sensitive applications, the success of machine learning depends on a thorough vetting of their resistance to adversarial data. In one pertinent, well-motivated attack scenario, an adversary may attempt to evade a deployed system at test time by carefully manipulating attack samples. In this work, we present a simple but effective gradient-based approach that can be exploited to systematically assess the security of several, widely-used classification algorithms against evasion attacks. Following a recently proposed framework for security evaluation, we simulate attack scenarios that exhibit different risk levels for the classifier by increasing the attacker's knowledge of the system and her ability to manipulate attack samples. This gives the classifier designer a better picture of the classifier performance under evasion attacks, and allows him to perform a more informed model selection (or parameter setting). We evaluate our approach on the relevant security task of malware detection in PDF files, and show that such systems can be easily evaded. We also sketch some countermeasures suggested by our analysis.
2290 sitasi
en
Computer Science
Next-Generation Machine Learning for Biological Networks.
Diogo M. Camacho, K. M. Collins, Rani K. Powers
et al.
Machine learning, a collection of data-analytical techniques aimed at building predictive models from multi-dimensional datasets, is becoming integral to modern biological research. By enabling one to generate models that learn from large datasets and make predictions on likely outcomes, machine learning can be used to study complex cellular systems such as biological networks. Here, we provide a primer on machine learning for life scientists, including an introduction to deep learning. We discuss opportunities and challenges at the intersection of machine learning and network biology, which could impact disease biology, drug discovery, microbiome research, and synthetic biology.
802 sitasi
en
Medicine, Biology
Machine Learning With Big Data: Challenges and Approaches
Alexandra L’Heureux, Katarina Grolinger, H. ElYamany
et al.
The Big Data revolution promises to transform how we live, work, and think by enabling process optimization, empowering insight discovery and improving decision making. The realization of this grand potential relies on the ability to extract value from such massive data through data analytics; machine learning is at its core because of its ability to learn from data and provide data driven insights, decisions, and predictions. However, traditional machine learning approaches were developed in a different era, and thus are based upon multiple assumptions, such as the data set fitting entirely into memory, what unfortunately no longer holds true in this new context. These broken assumptions, together with the Big Data characteristics, are creating obstacles for the traditional techniques. Consequently, this paper compiles, summarizes, and organizes machine learning challenges with Big Data. In contrast to other research that discusses challenges, this work highlights the cause–effect relationship by organizing challenges according to Big Data Vs or dimensions that instigated the issue: volume, velocity, variety, or veracity. Moreover, emerging machine learning approaches and techniques are discussed in terms of how they are capable of handling the various challenges with the ultimate objective of helping practitioners select appropriate solutions for their use cases. Finally, a matrix relating the challenges and approaches is presented. Through this process, this paper provides a perspective on the domain, identifies research gaps and opportunities, and provides a strong foundation and encouragement for further research in the field of machine learning with Big Data.
825 sitasi
en
Computer Science
Machine learning in chemoinformatics and drug discovery.
Yu-Chen Lo, Stefano E. Rensi, Wen Torng
et al.
Chemoinformatics is an established discipline focusing on extracting, processing and extrapolating meaningful data from chemical structures. With the rapid explosion of chemical 'big' data from HTS and combinatorial synthesis, machine learning has become an indispensable tool for drug designers to mine chemical information from large compound databases to design drugs with important biological properties. To process the chemical data, we first reviewed multiple processing layers in the chemoinformatics pipeline followed by the introduction of commonly used machine learning models in drug discovery and QSAR analysis. Here, we present basic principles and recent case studies to demonstrate the utility of machine learning techniques in chemoinformatics analyses; and we discuss limitations and future directions to guide further development in this evolving field.
747 sitasi
en
Computer Science, Medicine
eDoctor: machine learning and the future of medicine
G. Handelman, H. Kok, R. Chandra
et al.
Machine learning (ML) is a burgeoning field of medicine with huge resources being applied to fuse computer science and statistics to medical problems. Proponents of ML extol its ability to deal with large, complex and disparate data, often found within medicine and feel that ML is the future for biomedical research, personalized medicine, computer‐aided diagnosis to significantly advance global health care. However, the concepts of ML are unfamiliar to many medical professionals and there is untapped potential in the use of ML as a research tool. In this article, we provide an overview of the theory behind ML, explore the common ML algorithms used in medicine including their pitfalls and discuss the potential future of ML in medicine.
Super-resolution reconstruction of turbulent flows with machine learning
Kai Fukami, K. Fukagata, K. Taira
We use machine learning to perform super-resolution analysis of grossly under-resolved turbulent flow field data to reconstruct the high-resolution flow field. Two machine learning models are developed, namely, the convolutional neural network (CNN) and the hybrid downsampled skip-connection/multi-scale (DSC/MS) models. These machine learning models are applied to a two-dimensional cylinder wake as a preliminary test and show remarkable ability to reconstruct laminar flow from low-resolution flow field data. We further assess the performance of these models for two-dimensional homogeneous turbulence. The CNN and DSC/MS models are found to reconstruct turbulent flows from extremely coarse flow field images with remarkable accuracy. For the turbulent flow problem, the machine-leaning-based super-resolution analysis can greatly enhance the spatial resolution with as little as 50 training snapshot data, holding great potential to reveal subgrid-scale physics of complex turbulent flows. With the growing availability of flow field data from high-fidelity simulations and experiments, the present approach motivates the development of effective super-resolution models for a variety of fluid flows.
Current Applications and Future Impact of Machine Learning in Radiology.
M. M. •. Garry Choy, MD Mph Omid Khalilzadeh, MD • Mark Michalski
et al.
Recent advances and future perspectives of machine learning techniques offer promising applications in medical imaging. Machine learning has the potential to improve different steps of the radiology workflow including order scheduling and triage, clinical decision support systems, detection and interpretation of findings, postprocessing and dose estimation, examination quality control, and radiology reporting. In this article, the authors review examples of current applications of machine learning and artificial intelligence techniques in diagnostic radiology. In addition, the future impact and natural extension of these techniques in radiology practice are discussed.
Interpretable Machine Learning in Healthcare
M. Ahmad, A. Teredesai, C. Eckert
This tutorial extensively covers the definitions, nuances, challenges, and requirements for the design of interpretable and explainable machine learning models and systems in healthcare. We discuss many uses in which interpretable machine learning models are needed in healthcare and how they should be deployed. Additionally, we explore the landscape of recent advances to address the challenges model interpretability in healthcare and also describe how one would go about choosing the right interpretable machine learnig algorithm for a given problem in healthcare.
661 sitasi
en
Computer Science
Waiting for a sales renaissance in the fourth industrial revolution: Machine learning and artificial intelligence in sales research and practice
Niladri B. Syam, Arun Sharma
Machine Learning from Theory to Algorithms: An Overview
J. Alzubi, A. Nayyar, Akshi Kumar
The current SMAC (Social, Mobile, Analytic, Cloud) technology trend paves the way to a future in which intelligent machines, networked processes and big data are brought together. This virtual world has generated vast amount of data which is accelerating the adoption of machine learning solutions & practices. Machine Learning enables computers to imitate and adapt human-like behaviour. Using machine learning, each interaction, each action performed, becomes something the system can learn and use as experience for the next time. This work is an overview of this data analytics method which enables computers to learn and do what comes naturally to humans, i.e. learn from experience. It includes the preliminaries of machine learning, the definition, nomenclature and applications’ describing it’s what, how and why. The technology roadmap of machine learning is discussed to understand and verify its potential as a market & industry practice. The primary intent of this work is to give insight into why machine learning is the future.
617 sitasi
en
Computer Science, Physics
A Review of Machine Learning and Deep Learning Applications
Pramila Shinde, Seema Shah
Machine learning is one of the fields in the modern computing world. A plenty of research has been undertaken to make machines intelligent. Learning is a natural human behavior which has been made an essential aspect of the machines as well. There are various techniques devised for the same. Traditional machine learning algorithms have been applied in many application areas. Researchers have put many efforts to improve the accuracy of that machinelearning algorithms. Another dimension was given thought which leads to deep learning concept. Deep learning is a subset of machine learning. So far few applications of deep learning have been explored. This is definitely going to cater to solving issues in several new application domains, sub-domains using deep learning. A review of these past and future application domains, sub-domains, and applications of machine learning and deep learning are illustrated in this paper.
538 sitasi
en
Computer Science
iml: An R package for Interpretable Machine Learning
Christoph Molnar, Giuseppe Casalicchio, B. Bischl
Complex, non-parametric models, which are typically used in machine learning, have proven to be successful in many prediction tasks. But these models usually operate as black boxes: While they are good at predicting, they are often not interpretable. Many inherently interpretable models have been suggested, which come at the cost of losing predictive power. Another option is to apply interpretability methods to a black box model after model training. Given the velocity of research on new machine learning models, it is preferable to have model-agnostic tools which can be applied to a random forest as well as to a neural network. Tools for model-agnostic interpretability methods should improve the adoption of machine learning.
522 sitasi
en
Computer Science
Stealing Hyperparameters in Machine Learning
Binghui Wang, N. Gong
Hyperparameters are critical in machine learning, as different hyperparameters often result in models with significantly different performance. Hyperparameters may be deemed confidential because of their commercial value and the confidentiality of the proprietary algorithms that the learner uses to learn them. In this work, we propose attacks on stealing the hyperparameters that are learned by a learner. We call our attacks hyperparameter stealing attacks. Our attacks are applicable to a variety of popular machine learning algorithms such as ridge regression, logistic regression, support vector machine, and neural network. We evaluate the effectiveness of our attacks both theoretically and empirically. For instance, we evaluate our attacks on Amazon Machine Learning. Our results demonstrate that our attacks can accurately steal hyperparameters. We also study countermeasures. Our results highlight the need for new defenses against our hyperparameter stealing attacks for certain machine learning algorithms.
501 sitasi
en
Computer Science, Mathematics
Accelerating the Machine Learning Lifecycle with MLflow
M. Zaharia, Andrew Chen, A. Davidson
et al.
489 sitasi
en
Computer Science
Machine Learning in Banking Risk Management: A Literature Review
M. Leo, Suneel Sharma, K. Maddulety
There is an increasing influence of machine learning in business applications, with many solutions already implemented and many more being explored. Since the global financial crisis, risk management in banks has gained more prominence, and there has been a constant focus around how risks are being detected, measured, reported and managed. Considerable research in academia and industry has focused on the developments in banking and risk management and the current and emerging challenges. This paper, through a review of the available literature seeks to analyse and evaluate machine-learning techniques that have been researched in the context of banking risk management, and to identify areas or problems in risk management that have been inadequately explored and are potential areas for further research. The review has shown that the application of machine learning in the management of banking risks such as credit risk, market risk, operational risk and liquidity risk has been explored; however, it doesn’t appear commensurate with the current industry level of focus on both risk management and machine learning. A large number of areas remain in bank risk management that could significantly benefit from the study of how machine learning can be applied to address specific problems.
Machine learning and soil sciences: a review aided by machine learning tools
J. Padarian, B. Minasny, A. McBratney
Abstract. The application of machine learning (ML) techniques in various fields of science has increased rapidly, especially in the last 10 years. The increasing availability of soil data that can be efficiently acquired remotely and proximally, and freely available open-source algorithms, have led to an accelerated adoption of ML techniques to analyse soil data. Given the large number of publications, it is an impossible task to manually review all papers on the application of ML in soil science without narrowing down a narrative of ML application in a specific research question. This paper aims to provide a comprehensive review of the application of ML techniques in soil science aided by a ML algorithm (latent Dirichlet allocation) to find patterns in a large collection of text corpora. The objective is to gain insight into publications of ML applications in soil science and to discuss the research gaps in this topic. We found that (a) there is an increasing usage of ML methods in soil sciences, mostly concentrated in developed countries, (b) the reviewed publications can be grouped into 12 topics, namely remote sensing, soil organic carbon, water, contamination, methods (ensembles), erosion and parent material, methods (NN, neural networks, SVM, support vector machines), spectroscopy, modelling (classes), crops, physical, and modelling (continuous), and (c) advanced ML methods usually perform better than simpler approaches thanks to their capability to capture non-linear relationships. From these findings, we found research gaps, in particular, about the precautions that should be taken (parsimony) to avoid overfitting, and that the interpretability of the ML models is an important aspect to consider when applying advanced ML methods in order to improve our knowledge and understanding of soil. We foresee that a large number of studies will focus on the latter topic.
345 sitasi
en
Computer Science
Behavioral dataset for Long-Evans and its schizophrenia-like substrain through several generations
Gábor Kőrösi, Oliver Czimbalmos, Gabriella Kekesi
et al.
Abstract We present a high-throughput behavioral dataset acquired with Ambitus, an automated reward-based corridor system that records locomotor and exploratory activities and cognitive functions after minimal handling. The collection contains 91 raw and derived variables, each measured across four consecutive trials, for 1,342 Long-Evans rats, including a triple-hit schizophrenia-like substrain (Lisket) bred through 16 generations. All data files, detailed metadata and analysis scripts are openly available on Zenodo. This resource enables longitudinal and multivariate studies of behavioral phenotypes, trans-generational effects, and strain differences, and it provides a benchmark for machine-learning-based marker discovery in rodent models.