Deep learning and its applications to machine health monitoring
Rui Zhao, Ruqiang Yan, Zhenghua Chen
et al.
Abstract Since 2006, deep learning (DL) has become a rapidly growing research direction, redefining state-of-the-art performances in a wide range of areas such as object recognition, image segmentation, speech recognition and machine translation. In modern manufacturing systems, data-driven machine health monitoring is gaining in popularity due to the widespread deployment of low-cost sensors and their connection to the Internet. Meanwhile, deep learning provides useful tools for processing and analyzing these big machinery data. The main purpose of this paper is to review and summarize the emerging research work of deep learning on machine health monitoring. After the brief introduction of deep learning techniques, the applications of deep learning in machine health monitoring systems are reviewed mainly from the following aspects: Auto-encoder (AE) and its variants, Restricted Boltzmann Machines and its variants including Deep Belief Network (DBN) and Deep Boltzmann Machines (DBM), Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). In addition, an experimental study on the performances of these approaches has been conducted, in which the data and code have been online. Finally, some new trends of DL-based machine health monitoring methods are discussed.
2317 sitasi
en
Computer Science
Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning
S. Raschka
The correct use of model evaluation, model selection, and algorithm selection techniques is vital in academic machine learning research as well as in many industrial settings. This article reviews different techniques that can be used for each of these three subtasks and discusses the main advantages and disadvantages of each technique with references to theoretical and empirical studies. Further, recommendations are given to encourage best yet feasible practices in research and applications of machine learning. Common methods such as the holdout method for model evaluation and selection are covered, which are not recommended when working with small datasets. Different flavors of the bootstrap technique are introduced for estimating the uncertainty of performance estimates, as an alternative to confidence intervals via normal approximation if bootstrapping is computationally feasible. Common cross-validation techniques such as leave-one-out cross-validation and k-fold cross-validation are reviewed, the bias-variance trade-off for choosing k is discussed, and practical tips for the optimal choice of k are given based on empirical evidence. Different statistical tests for algorithm comparisons are presented, and strategies for dealing with multiple comparisons such as omnibus tests and multiple-comparison corrections are discussed. Finally, alternative methods for algorithm selection, such as the combined F-test 5x2 cross-validation and nested cross-validation, are recommended for comparing machine learning algorithms when datasets are small.
999 sitasi
en
Computer Science, Mathematics
Ensuring Fairness in Machine Learning to Advance Health Equity
A. Rajkomar, Michaela Hardt, M. Howell
et al.
Machine learning can identify the statistical patterns of data generated by tens of thousands of physicians and billions of patients to train computers to perform specific tasks with sometimes superhuman ability, such as detecting diabetic eye disease better than retinal specialists (1). However, historical data also capture patterns of health care disparities, and machine-learning models trained on these data may perpetuate these inequities. This concern is not just academic. In a model used to predict future crime on the basis of historical arrest records, African American defendants who did not reoffend were classified as high risk at a substantially higher rate than white defendants who did not reoffend (2, 3). Similar biases have been observed in predictive policing (4) and identifying which calls to a child protective services agency required an in-person investigation (5, 6). The implications for health care led the American Medical Association to pass policy recommendations to promote development of thoughtfully designed, high-quality, clinically validated health care AI [artificial or augmented intelligence, such as machine learning] that . . . identifies and takes steps to address bias and avoids introducing or exacerbating health care disparities including when testing or deploying new AI tools on vulnerable populations (7). We argue that health care organizations and policymakers should go beyond the American Medical Association's position of doing no harm and instead proactively design and use machine-learning systems to advance health equity. Whereas much health disparities work has focused on discriminatory decision making and implicit biases by clinicians, policymakers, organizational leaders, and researchers are increasingly focusing on the ill health effects of structural racism and classismhow systems are shaped in ways that harm the health of disempowered, marginalized populations (8). For example, the United States has a shameful history of purposive decisions by government and private businesses to segregate housing. Zoning laws, discrimination in mortgage lending, prejudicial practices by real estate agents, and the ghettoization of public housing all contributed to the concentration of urban African Americans in inferior housing that has led to poor health (9, 10). Even when the goal of decision makers is not outright discrimination against disadvantaged groups, actions may lead to inequities. For example, if the goal of a machine-learning system is to maximize efficiency, that might come at the expense of disadvantaged populations. As a society, we value health equity. For example, the Healthy People 2020 vision statement aims for a society in which all people live long, healthy lives, and one of the mission's goals is to achieve health equity, eliminate disparities, and improve the health of all groups (11). The 4 classic principles of Western clinical medical ethics are justice, autonomy, beneficence, and nonmaleficence. However, health equity will not be attained unless we purposely design our health and social systems, which increasingly will be infused with machine learning (12), to achieve this goal. To ensure fairness in machine learning, we recommend a participatory process that involves key stakeholders, including frequently marginalized populations, and considers distributive justice within specific clinical and organizational contexts. Different technical approaches can configure the mathematical properties of machine-learning models to render predictions that are equitable in various ways. The existence of mathematical levers must be supplemented with criteria for when and why they should be usedeach tool comes with tradeoffs that require ethical reasoning to decide what is best for a given application. We propose incorporating fairness into the design, deployment, and evaluation of machine-learning models. We discuss 2 clinical applications in which machine learning might harm protected groups by being inaccurate, diverting resources, or worsening outcomes, especially if the models are built without consideration for these patients. We then describe the mechanisms by which a model's design, data, and deployment may lead to disparities; explain how different approaches to distributive justice in machine learning can advance health equity; and explore what contexts are more appropriate for different equity approaches in machine learning. Case Study 1: Intensive Care Unit Monitoring A common area of predictive modeling research focuses on creating a monitoring systemfor example, to warn a rapid response team about inpatients at high risk for deterioration (1315), requiring their transfer to an intensive care unit within 6 hours. How might such a system inadvertently result in harm to a protected group? In this thought experiment, we consider African Americans as a protected group. To build the model, our hypothetical researchers collected historical records of patients who had clinical deterioration and those who did not. The model acts like a diagnostic test of risk for intensive care unit transfer. However, if too few African American patients were included in the training datathe data used to construct the modelthe model might be inaccurate for them. For example, it might have a lower sensitivity and miss more patients at risk for deterioration. African American patients might be harmed if clinical teams started relying on alerts to identify at-risk patients without realizing that the prediction system underdetects patients in that group (automation bias) (16). If the model had a lower positive predictive value for African Americans, it might also disproportionately harm them through dismissal biasa generalization of alert fatigue in which clinicians may learn to discount or dismiss alerts for African Americans because they are more likely to be false-positive (17). Case Study 2: Reducing Length of Stay Imagine that a hospital created a model with clinical and social variables to predict which inpatients might be discharged earliest so that it could direct limited case management resources to them to prevent delays. If residence in ZIP codes of socioeconomically depressed or predominantly African American neighborhoods predicted greater lengths of stay (18), this model might disproportionately allocate case management resources to patients from richer, predominantly white neighborhoods and away from African Americans in poorer ones. What Is Machine Learning? Traditionally, computer systems map inputs to outputs according to manually specified ifthen rules. With increasingly complex tasks, such as language translation, manually specifying rules becomes infeasible, and instead the mapping (or model) is learned by the system given only input examples represented through a set of features together with their desired output, referred to as labels. The quality of a model is assessed by computing evaluation metrics on data not used to build the model, such as sensitivity, specificity, or the c-statistic, which measures the ability of a model to distinguish patients with a condition from those without it (19, 20). Once the model's quality is deemed satisfactory, it can be deployed to make predictions on new examples for which the label is unknown when the prediction is made. The quality of the models on retrospective data must be followed with tests of clinical effectiveness, safety, and comparison with current practice, which may require clinical trials (21). Traditionally, statistical models for prediction, such as the pooled-cohort equation (22), have used few variables to predict clinical outcomes, such as cardiovascular risk (23). Modern machine-learning techniques, however, can consider many more features. For example, a recent model to predict hospital readmissions examined hundreds of thousands of pieces of information, including the free text of clinical notes (24). Complex data and models can drive more personalized and accurate predictions but may also make algorithms hard to understand and trust (25). What Can Cause a Machine-Learning System to Be Unfair? The Glossary lists key biases in the design, data, and deployment of a machine-learning model that may perpetuate or exacerbate health care disparities if left unchecked. The Figure reveals how the various biases relate to one another and how the interactions of model predictions with clinicians and patients may exacerbate health care disparities. Biases may arise during the design of a model. For example, if the label is marred by health care disparities, such as predicting the onset of clinical depression in environments where protected groups have been systematically misdiagnosed, then the model will learn to perpetuate this disparity. This represents a generalization of test-referral bias (26) that we refer to as label bias. Moreover, the data on which the model is developed may be biased. Data on patients in the protected group might be distributed differently from those in the nonprotected group because of biological or nonbiological variation (9, 27). For example, the data may not contain enough examples from a group to properly tailor the predictions to them (minority bias) (28), or the data set of the protected group may be less informative because features are missing not at random as a result of more fragmented care (29, 30). Glossary Figure. Conceptual framework of how various biases relate to one another. During model development, differences in the distribution of features used to predict a label between the protected and nonprotected groups may bias a model to be less accurate for protected groups. Moreover, the data used to develop a model may not generalize to the data used during model deployment (trainingserving skew). Biases in model design and data affect patient outcomes through the model's interaction with clinicians and patients. The immediate effect of these differences is that the model may
Manipulating Machine Learning: Poisoning Attacks and Countermeasures for Regression Learning
Matthew Jagielski, Alina Oprea, B. Biggio
et al.
As machine learning becomes widely used for automated decisions, attackers have strong incentives to manipulate the results and models generated by machine learning algorithms. In this paper, we perform the first systematic study of poisoning attacks and their countermeasures for linear regression models. In poisoning attacks, attackers deliberately influence the training data to manipulate the results of a predictive model. We propose a theoretically-grounded optimization framework specifically designed for linear regression and demonstrate its effectiveness on a range of datasets and models. We also introduce a fast statistical attack that requires limited knowledge of the training process. Finally, we design a new principled defense method that is highly resilient against all poisoning attacks. We provide formal guarantees about its convergence and an upper bound on the effect of poisoning attacks when the defense is deployed. We evaluate extensively our attacks and defenses on three realistic datasets from health care, loan assessment, and real estate domains.
860 sitasi
en
Computer Science
Machine-learning reprogrammable metasurface imager
Lianlin Li, Hengxin Ruan, Che Liu
et al.
Conventional microwave imagers usually require either time-consuming data acquisition, or complicated reconstruction algorithms for data post-processing, making them largely ineffective for complex in-situ sensing and monitoring. Here, we experimentally report a real-time digital-metasurface imager that can be trained in-situ to generate the radiation patterns required by machine-learning optimized measurement modes. This imager is electronically reprogrammed in real time to access the optimized solution for an entire data set, realizing storage and transfer of full-resolution raw data in dynamically varying scenes. High-accuracy image coding and recognition are demonstrated in situ for various image sets, including hand-written digits and through-wall body gestures, using a single physical hardware imager, reprogrammed in real time. Our electronically controlled metasurface imager opens new venues for intelligent surveillance, fast data acquisition and processing, imaging at various frequencies, and beyond. Conventional imagers require time-consuming data acquisition, or complicated reconstruction algorithms for data post-processing. Here, the authors demonstrate a real-time digital-metasurface imager that can be trained in-situ to show high accuracy image coding and recognition for various image sets.
505 sitasi
en
Medicine, Computer Science
Machine learning-assisted directed protein evolution with combinatorial libraries
Zachary Wu, S. Kan, Russell D. Lewis
et al.
Significance Proteins often function poorly when used outside their natural contexts; directed evolution can be used to engineer them to be more efficient in new roles. We propose that the expense of experimentally testing a large number of protein variants can be decreased and the outcome can be improved by incorporating machine learning with directed evolution. Simulations on an empirical fitness landscape demonstrate that the expected performance improvement is greater with this approach. Machine learning-assisted directed evolution from a single parent produced enzyme variants that selectively synthesize the enantiomeric products of a new-to-nature chemical transformation. By exploring multiple mutations simultaneously, machine learning efficiently navigates large regions of sequence space to identify improved proteins and also produces diverse solutions to engineering problems. To reduce experimental effort associated with directed protein evolution and to explore the sequence space encoded by mutating multiple positions simultaneously, we incorporate machine learning into the directed evolution workflow. Combinatorial sequence space can be quite expensive to sample experimentally, but machine-learning models trained on tested variants provide a fast method for testing sequence space computationally. We validated this approach on a large published empirical fitness landscape for human GB1 binding protein, demonstrating that machine learning-guided directed evolution finds variants with higher fitness than those found by other directed evolution approaches. We then provide an example application in evolving an enzyme to produce each of the two possible product enantiomers (i.e., stereodivergence) of a new-to-nature carbene Si–H insertion reaction. The approach predicted libraries enriched in functional enzymes and fixed seven mutations in two rounds of evolution to identify variants for selective catalysis with 93% and 79% ee (enantiomeric excess). By greatly increasing throughput with in silico modeling, machine learning enhances the quality and diversity of sequence solutions for a protein engineering problem.
489 sitasi
en
Computer Science, Biology
Machine Learning at the Network Edge: A Survey
M. G. Sarwar Murshed, Chris Murphy, Daqing Hou
et al.
Resource-constrained IoT devices, such as sensors and actuators, have become ubiquitous in recent years. This has led to the generation of large quantities of data in real-time, which is an appealing target for AI systems. However, deploying machine learning models on such end-devices is nearly impossible. A typical solution involves offloading data to external computing systems (such as cloud servers) for further processing but this worsens latency, leads to increased communication costs, and adds to privacy concerns. To address this issue, efforts have been made to place additional computing devices at the edge of the network, i.e., close to the IoT devices where the data is generated. Deploying machine learning systems on such edge computing devices alleviates the above issues by allowing computations to be performed close to the data sources. This survey describes major research efforts where machine learning systems have been deployed at the edge of computer networks, focusing on the operational aspects including compression techniques, tools, frameworks, and hardware used in successful applications of intelligent edge systems.
480 sitasi
en
Computer Science, Mathematics
Estimation of energy consumption in machine learning
E. García-Martín, Crefeda Faviola Rodrigues, G. Riley
et al.
Energy consumption has been widely studied in the computer architecture field for decades. While the adoption of energy as a metric in machine learning is emerging, the majority of research is stil ...
473 sitasi
en
Computer Science
Machine Learning–Based Model for Prediction of Outcomes in Acute Stroke
Joonnyung Heo, Jihoon G. Yoon, Hyungjong Park
et al.
Background and Purpose— The prediction of long-term outcomes in ischemic stroke patients may be useful in treatment decisions. Machine learning techniques are being increasingly adapted for use in the medical field because of their high accuracy. This study investigated the applicability of machine learning techniques to predict long-term outcomes in ischemic stroke patients. Methods— This was a retrospective study using a prospective cohort that enrolled patients with acute ischemic stroke. Favorable outcome was defined as modified Rankin Scale score 0, 1, or 2 at 3 months. We developed 3 machine learning models (deep neural network, random forest, and logistic regression) and compared their predictability. To evaluate the accuracy of the machine learning models, we also compared them to the Acute Stroke Registry and Analysis of Lausanne (ASTRAL) score. Results— A total of 2604 patients were included in this study, and 2043 (78%) of them had favorable outcomes. The area under the curve for the deep neural network model was significantly higher than that of the ASTRAL score (0.888 versus 0.839; P<0.001), while the areas under the curves of the random forest (0.857; P=0.136) and logistic regression (0.849; P=0.413) models were not significantly higher than that of the ASTRAL score. Using only the 6 variables that are used for the ASTRAL score, the performance of the machine learning models did not significantly differ from that of the ASTRAL score. Conclusions— Machine learning algorithms, particularly the deep neural network, can improve the prediction of long-term outcomes in ischemic stroke patients.
A Systematic Review on Supervised and Unsupervised Machine Learning Algorithms for Data Science
M. Alloghani, D. Al-Jumeily, Jamila Mustafina
et al.
463 sitasi
en
Computer Science
Hands-On Machine Learning with R
Bradley C. Boehmke, Brandon M. Greenwell
459 sitasi
en
Computer Science
Machine learning in acoustics: Theory and applications.
Michael J. Bianco, P. Gerstoft, James Traer
et al.
Acoustic data provide scientific and engineering insights in fields ranging from biology and communications to ocean and Earth science. We survey the recent advances and transformative potential of machine learning (ML), including deep learning, in the field of acoustics. ML is a broad family of techniques, which are often based in statistics, for automatically detecting and utilizing patterns in data. Relative to conventional acoustics and signal processing, ML is data-driven. Given sufficient training data, ML can discover complex relationships between features and desired labels or actions, or between features themselves. With large volumes of training data, ML can discover models describing complex acoustic phenomena such as human speech and reverberation. ML in acoustics is rapidly developing with compelling results and significant future promise. We first introduce ML, then highlight ML developments in four acoustics research areas: source localization in speech processing, source localization in ocean acoustics, bioacoustics, and environmental sounds in everyday scenes.
457 sitasi
en
Engineering, Computer Science
How to Read Articles That Use Machine Learning: Users' Guides to the Medical Literature.
Yun Liu, Po-Hsuan Cameron Chen, Jonathan Krause
et al.
In recent years, many new clinical diagnostic tools have been developed using complicated machine learning methods. Irrespective of how a diagnostic tool is derived, it must be evaluated using a 3-step process of deriving, validating, and establishing the clinical effectiveness of the tool. Machine learning-based tools should also be assessed for the type of machine learning model used and its appropriateness for the input data type and data set size. Machine learning models also generally have additional prespecified settings called hyperparameters, which must be tuned on a data set independent of the validation set. On the validation set, the outcome against which the model is evaluated is termed the reference standard. The rigor of the reference standard must be assessed, such as against a universally accepted gold standard or expert grading.
Exploiting machine learning for end-to-end drug discovery and development
S. Ekins, A. C. Puhl, Kimberley M. Zorn
et al.
452 sitasi
en
Medicine, Computer Science
What Is Machine Learning: a Primer for the Epidemiologist.
Qifang Bi, K. Goodman, J. Kaminsky
et al.
Machine learning is a branch of computer science that has the potential to transform epidemiological sciences. Amid a growing focus on "Big Data," it offers epidemiologists new tools to tackle problems for which classical methods are not well-suited. In order to critically evaluate the value of integrating machine learning algorithms and existing methods, however, it is essential to address language and technical barriers between the two fields that can make it difficult for epidemiologists to read and assess machine learning studies. Here, we provide an overview of the concepts and terminology used in machine learning literature, which encompasses a diverse set of tools with goals ranging from prediction, to classification, to clustering. We provide a brief introduction to five common machine learning algorithms and four ensemble-based approaches. We then summarize epidemiological applications of machine learning techniques in the published literature. We recommend approaches to incorporate machine learning in epidemiological research and discuss opportunities and challenges for integrating machine learning and existing epidemiological research methods.
443 sitasi
en
Medicine, Computer Science
Machine Learning Made Easy: A Review of Scikit-learn Package in Python Programming Language
J. Hao, T. Ho
Machine learning is a popular topic in data analysis and modeling. Many different machine learning algorithms have been developed and implemented in a variety of programming languages over the past 20 years. In this article, we first provide an overview of machine learning and clarify its difference from statistical inference. Then, we review Scikit-learn, a machine learning package in the Python programming language that is widely used in data science. The Scikit-learn package includes implementations of a comprehensive list of machine learning methods under unified data and modeling procedure conventions, making it a convenient toolkit for educational and behavior statisticians.
432 sitasi
en
Computer Science
Benchmark and Survey of Automated Machine Learning Frameworks
M. Zöller, Marco F. Huber
Machine learning (ML) has become a vital part in many aspects of our daily life. However, building well performing machine learning applications requires highly specialized data scientists and domain experts. Automated machine learning (AutoML) aims to reduce the demand for data scientists by enabling domain experts to automatically build machine learning applications without extensive knowledge of statistics and machine learning. This paper is a combination of a survey on current AutoML methods and a benchmark of popular AutoML frameworks on real data sets. Driven by the selected frameworks for evaluation, we summarize and review important AutoML techniques and methods concerning every step in building an ML pipeline. The selected AutoML frameworks are evaluated on 137 different data sets.
430 sitasi
en
Computer Science, Mathematics
Machine Learning for Survival Analysis
Ping Wang, Yan Li, Chandan K. Reddy
Survival analysis is a subfield of statistics where the goal is to analyze and model data where the outcome is the time until an event of interest occurs. One of the main challenges in this context is the presence of instances whose event outcomes become unobservable after a certain time point or when some instances do not experience any event during the monitoring period. This so-called censoring can be handled most effectively using survival analysis techniques. Traditionally, statistical approaches have been widely developed in the literature to overcome the issue of censoring. In addition, many machine learning algorithms have been adapted to deal with such censored data and tackle other challenging problems that arise in real-world data. In this survey, we provide a comprehensive and structured review of the statistical methods typically used and the machine learning techniques developed for survival analysis, along with a detailed taxonomy of the existing methods. We also discuss several topics that are closely related to survival analysis and describe several successful applications in a variety of real-world application domains. We hope that this article will give readers a more comprehensive understanding of recent advances in survival analysis and offer some guidelines for applying these approaches to solve new problems arising in applications involving censored data.
421 sitasi
en
Computer Science
Machine learning assisted materials design and discovery for rechargeable batteries
Yue Liu, Biru Guo, Xinxin Zou
et al.
Abstract Machine learning plays an important role in accelerating the discovery and design process for novel electrochemical energy storage materials. This review aims to provide the state-of-the-art and prospects of machine learning for the design of rechargeable battery materials. After illustrating the key concepts of machine learning and basic procedures for applying machine learning in rechargeable battery materials science, we focus on how to obtain the most important features from the specific physical, chemical and/or other properties of material by using wrapper feature selection method, embedded feature selection method, and the combination of these two methods. And then, the applications of machine learning in rechargeable battery materials design and discovery are reviewed, including the property prediction for liquid electrolytes, solid electrolytes, electrode materials, and the discovery of novel rechargeable battery materials through component prediction and structure prediction. More importantly, we discuss the key challenges related to machine learning in rechargeable battery materials science, including the contradiction between high dimension and small sample, the conflict between the complexity and accuracy of machine learning models, and the inconsistency between learning results and domain expert knowledge. In response to these challenges, we propose possible countermeasures and forecast potential directions of future research. This review is expected to shed light on machine learning in rechargeable battery materials design and property optimization.
354 sitasi
en
Materials Science
Statistical and machine learning models in credit scoring: A systematic literature survey
X. Dastile, T. Çelik, M. Potsane
Abstract In practice, as a well-known statistical method, the logistic regression model is used to evaluate the credit-worthiness of borrowers due to its simplicity and transparency in predictions. However, in literature, sophisticated machine learning models can be found that can replace the logistic regression model. Despite the advances and applications of machine learning models in credit scoring, there are still two major issues: the incapability of some of the machine learning models to explain predictions; and the issue of imbalanced datasets. As such, there is a need for a thorough survey of recent literature in credit scoring. This article employs a systematic literature survey approach to systematically review statistical and machine learning models in credit scoring, to identify limitations in literature, to propose a guiding machine learning framework, and to point to emerging directions. This literature survey is based on 74 primary studies, such as journal and conference articles, that were published between 2010 and 2018. According to the meta-analysis of this literature survey, we found that in general, an ensemble of classifiers performs better than single classifiers. Although deep learning models have not been applied extensively in credit scoring literature, they show promising results.
335 sitasi
en
Computer Science