Tackling Climate Change with Machine Learning
D. Rolnick, P. Donti, L. Kaack
et al.
Climate change is one of the greatest challenges facing humanity, and we, as machine learning (ML) experts, may wonder how we can help. Here we describe how ML can be a powerful tool in reducing greenhouse gas emissions and helping society adapt to a changing climate. From smart grids to disaster management, we identify high impact problems where existing gaps can be filled by ML, in collaboration with other fields. Our recommendations encompass exciting research questions as well as promising business opportunities. We call on the ML community to join the global effort against climate change.
1050 citations
en
Computer Science, Mathematics
Data Mining Practical Machine Learning Tools And Techniques With Java Implementations
Marcel Abendroth
1609 citations
en
Computer Science
Machine Learning and Deep Learning Methods for Intrusion Detection Systems: A Survey
Hongyu Liu, Bo Lang
Networks play important roles in modern life, and cyber security has become a vital research area. An intrusion detection system (IDS), an important cyber security technique, monitors the state of software and hardware running in the network. Despite decades of development, existing IDSs still face challenges in improving detection accuracy, reducing the false alarm rate, and detecting unknown attacks. To solve these problems, many researchers have focused on developing IDSs that capitalize on machine learning methods. Machine learning methods can automatically discover the essential differences between normal data and abnormal data with high accuracy. In addition, machine learning methods have strong generalizability, so they are also able to detect unknown attacks. Deep learning is a branch of machine learning whose performance is remarkable and which has become a research hotspot. This survey proposes a taxonomy of IDS that takes data objects as the main dimension to classify and summarize machine learning-based and deep learning-based IDS literature. We believe that this type of taxonomy framework is fit for cyber security researchers. The survey first clarifies the concept and taxonomy of IDSs. Then, the machine learning algorithms frequently used in IDSs, metrics, and benchmark datasets are introduced. Next, combined with the representative literature, we take the proposed taxonomic system as a baseline and explain how to solve key IDS issues with machine learning and deep learning techniques. Finally, challenges and future developments are discussed by reviewing recent representative studies.
878 citations
en
Engineering
Efficient and Robust Automated Machine Learning
Matthias Feurer, Aaron Klein, Katharina Eggensperger
et al.
1860 citations
en
Computer Science
Machine learning applications in cancer prognosis and prediction
Konstantina D. Kourou, T. Exarchos, K. Exarchos
et al.
Cancer has been characterized as a heterogeneous disease consisting of many different subtypes. The early diagnosis and prognosis of a cancer type have become a necessity in cancer research, as they can facilitate the subsequent clinical management of patients. The importance of classifying cancer patients into high or low risk groups has led many research teams, from the biomedical and the bioinformatics fields, to study the application of machine learning (ML) methods. Therefore, these techniques have been used to model the progression and treatment of cancerous conditions. In addition, the ability of ML tools to detect key features from complex datasets reveals their importance. A variety of these techniques, including Artificial Neural Networks (ANNs), Bayesian Networks (BNs), Support Vector Machines (SVMs) and Decision Trees (DTs), have been widely applied in cancer research for the development of predictive models, resulting in effective and accurate decision making. Even though it is evident that the use of ML methods can improve our understanding of cancer progression, an appropriate level of validation is needed in order for these methods to be considered in everyday clinical practice. In this work, we present a review of recent ML approaches employed in the modeling of cancer progression. The predictive models discussed here are based on various supervised ML techniques as well as on different input features and data samples. Given the growing trend in the application of ML methods in cancer research, we present here the most recent publications that employ these techniques to model cancer risk or patient outcomes.
2859 citations
en
Computer Science, Medicine
Predicting the state of charge and health of batteries using data-driven machine learning
M. Ng, Jin Zhao, Qingyu Yan
et al.
600 citations
en
Computer Science
A survey on machine learning for data fusion
Tong Meng, Xuyang Jing, Zheng Yan
et al.
Data fusion is a prevalent way to deal with imperfect raw data for capturing reliable, valuable and accurate information. Compared with a range of classical probabilistic data fusion techniques, machine learning, which automatically learns from past experience without explicit programming, remarkably renovates fusion techniques by offering strong computing and predicting abilities. Nevertheless, the literature still lacks a thorough review of the recent advances of machine learning for data fusion. Therefore, it is beneficial to review and summarize the state of the art in order to gain a deep insight into how machine learning can benefit and optimize data fusion. In this paper, we provide a comprehensive survey on data fusion methods based on machine learning. We first offer a detailed introduction to the background of data fusion and machine learning in terms of definitions, applications, architectures, processes, and typical techniques. Then, we propose a number of requirements and employ them as criteria to review and evaluate the performance of existing fusion methods based on machine learning. Through the literature review, analysis and comparison, we finally come up with a number of open issues and propose future research directions in this field.
560 citations
en
Computer Science
InterpretML: A Unified Framework for Machine Learning Interpretability
H. Nori, Samuel Jenkins, Paul Koch
et al.
InterpretML is an open-source Python package which exposes machine learning interpretability algorithms to practitioners and researchers. InterpretML exposes two types of interpretability - glassbox models, which are machine learning models designed for interpretability (ex: linear models, rule lists, generalized additive models), and blackbox explainability techniques for explaining existing systems (ex: Partial Dependence, LIME). The package enables practitioners to easily compare interpretability algorithms by exposing multiple methods under a unified API, and by having a built-in, extensible visualization platform. InterpretML also includes the first implementation of the Explainable Boosting Machine, a powerful, interpretable, glassbox model that can be as accurate as many blackbox models. The MIT licensed source code can be downloaded from github.com/microsoft/interpret.
592 citations
en
Computer Science, Mathematics
The Machine‐Learning Approach
Advancing Biosensors with Machine Learning.
Feiyun Cui, Yun Yue, Yi Zhang
et al.
Chemometrics plays a critical role in biosensor-based detection, analysis, and diagnosis. Nowadays, as a branch of artificial intelligence (AI), machine learning (ML) has achieved impressive advances. However, novel advanced ML methods, especially deep learning, which is famous for image analysis, facial recognition, and speech recognition, have remained relatively elusive to the biosensor community. Herein, how ML can be beneficial to biosensors is systematically discussed. The advantages and drawbacks of the most popular ML algorithms are summarized on the basis of sensing data analysis. In particular, deep learning methods such as the convolutional neural network (CNN) and recurrent neural network (RNN) are emphasized. Diverse ML-assisted electrochemical biosensors, wearable electronics, SERS and other spectra-based biosensors, fluorescence biosensors and colorimetric biosensors are comprehensively discussed. Furthermore, biosensor networks and multibiosensor data fusion are introduced. This review aims to bridge ML with biosensors and to expand chemometrics for detection, analysis, and diagnosis.
534 citations
en
Medicine, Computer Science
What Role Does Hydrological Science Play in the Age of Machine Learning?
G. Nearing, Frederik Kratzert, A. Sampson
et al.
This paper is derived from a keynote talk given at Google's 2020 Flood Forecasting Meets Machine Learning Workshop. Recent experiments applying deep learning to rainfall‐runoff simulation indicate that there is significantly more information in large‐scale hydrological data sets than hydrologists have been able to translate into theory or models. While there is a growing interest in machine learning in the hydrological sciences community, in many ways, our community still holds deeply subjective and nonevidence‐based preferences for models based on a certain type of “process understanding” that has historically not translated into accurate theory, models, or predictions. This commentary is a call to action for the hydrology community to focus on developing a quantitative understanding of where and when hydrological process understanding is valuable in a modeling discipline increasingly dominated by machine learning. We offer some potential perspectives and preliminary examples about how this might be accomplished.
525 citations
en
Computer Science
Ethical Machine Learning in Health Care
I. Chen, E. Pierson, Sherri Rose
et al.
The use of machine learning (ML) in healthcare raises numerous ethical concerns, especially as models can amplify existing health inequities. Here, we outline ethical considerations for equitable ML in the advancement of healthcare. Specifically, we frame ethics of ML in healthcare through the lens of social justice. We describe ongoing efforts and outline challenges in a proposed pipeline of ethical ML in health, ranging from problem selection to postdeployment considerations. We close by summarizing recommendations to address these challenges.
525 citations
en
Computer Science, Psychology
The rise of machine learning for detection and classification of malware: Research developments, trends and challenges
Daniel Gibert, Carles Mateu, Jordi Planes
The struggle between security analysts and malware developers is a never-ending battle, with the complexity of malware changing as quickly as innovation grows. Current state-of-the-art research focuses on the development and application of machine learning techniques for malware detection due to their ability to keep pace with malware evolution. This survey aims at providing a systematic and detailed overview of machine learning techniques for malware detection and, in particular, deep learning techniques. The main contributions of the paper are: (1) it provides a complete description of the methods and features in a traditional machine learning workflow for malware detection and classification, (2) it explores the challenges and limitations of traditional machine learning and (3) it analyzes recent trends and developments in the field with special emphasis on deep learning approaches. Furthermore, (4) it presents the research issues and unsolved challenges of the state-of-the-art techniques and (5) it discusses the new directions of research. The survey helps researchers to have an understanding of the malware detection field and of the new developments and directions of research explored by the scientific community to tackle the problem.
513 citations
en
Computer Science
Applications of machine learning to diagnosis and treatment of neurodegenerative diseases
Monika A. Myszczynska, P. Ojamies, A. Lacoste
et al.
Application of supervised machine learning paradigms in the prediction of petroleum reservoir properties: Comparative analysis of ANN and SVM models
Daniel Asante Otchere, T. Ganat, R. Gholami
et al.
The advent of Artificial Intelligence (AI) in the petroleum industry has seen an increase in its use in exploration, development, production, reservoir engineering and management planning to accelerate decision making and reduce cost and time. Supervised machine learning has gained much popularity in establishing a relationship between complex non-linear datasets. This type of machine learning algorithm has showcased its superiority over petroleum engineering regression techniques in terms of prediction errors for high dimensional data, computational power and memory. This review focuses on the most widely used machine learning algorithm employed in the petroleum industry, the Artificial Neural Network (ANN), with different shallow models used in reservoir characterisation. The Support Vector Machine (SVM) and Relevance Vector Machine (RVM) have over the years emerged as competitive algorithms that, in most cases covered by this review, outperformed the ANN. This makes them preferable to the ANN when data sets are limited. Finally, hybridisation of multiple algorithms also showed improved performance over singularly applied algorithms, offering a pathway to improving reservoir characterisation based on supervised machine learning as future scope of work.
411 citations
en
Computer Science
Machine learning and AI in marketing – Connecting computing power to human insights
Liye Ma, Baohong Sun
Artificial intelligence (AI) agents driven by machine learning algorithms are rapidly transforming the business world, generating heightened interest from researchers. In this paper, we review and call for marketing research to leverage machine learning methods. We provide an overview of common machine learning tasks and methods, and compare them with statistical and econometric methods that marketing researchers traditionally use. We argue that machine learning methods can process large-scale and unstructured data, and have flexible model structures that yield strong predictive performance. Meanwhile, such methods may lack model transparency and interpretability. We discuss salient AI-driven industry trends and practices, and review the still nascent academic marketing literature which uses machine learning methods. More importantly, we present a unified conceptual framework and a multi-faceted research agenda. From five key aspects of empirical marketing research: method, data, usage, issue, and theory, we propose a number of research priorities, including extending machine learning methods and using them as core components in marketing research, using the methods to extract insights from large-scale unstructured, tracking, and network data, using them in transparent fashions for descriptive, causal, and prescriptive analyses, using them to map out customer purchase journeys and develop decision-support capabilities, and connecting the methods to human insights and marketing theories. Opportunities abound for machine learning methods in marketing, and we hope our multi-faceted research agenda will inspire more work in this exciting area.
402 citations
en
Computer Science
Edge Machine Learning for AI-Enabled IoT Devices: A Review
M. Merenda, Carlo Porcaro, D. Iero
In a few years, the world will be populated by billions of connected devices that will be placed in our homes, cities, vehicles, and industries. Devices with limited resources will interact with the surrounding environment and users. Many of these devices will be based on machine learning models to decode meaning and behavior behind sensors’ data, to implement accurate predictions and make decisions. The bottleneck will be the high number of connected things that could congest the network. Hence the need to incorporate intelligence on end devices using machine learning algorithms. Deploying machine learning on such edge devices reduces network congestion by allowing computations to be performed close to the data sources. The aim of this work is to provide a review of the main techniques that guarantee the execution of machine learning models on low-performance hardware in the Internet of Things paradigm, paving the way to the Internet of Conscious Things. In this work, a detailed review of models, architecture, and requirements for solutions that implement edge machine learning on Internet of Things devices is presented, with the main goal of defining the state of the art and envisioning development requirements. Furthermore, an example of edge machine learning implementation on a microcontroller, commonly regarded as the machine learning “Hello World”, will be provided.
391 citations
en
Medicine, Computer Science
Integrating Physics-Based Modeling with Machine Learning: A Survey
J. Willard, X. Jia, Shaoming Xu
et al.
391 citations
en
Computer Science
A Review on Machine Learning for EEG Signal Processing in Bioengineering
M. Hosseini, Amin Hosseini, Kiarash Ahi
Electroencephalography (EEG) has been a staple method for identifying certain health conditions in patients since its discovery. Because many different types of classifiers are available, the analysis methods are equally numerous. In this review, we examine specifically machine learning methods that have been developed for EEG analysis with bioengineering applications. We reviewed literature from 1988 to 2018 to capture previous and current classification methods for EEG in multiple applications. From this information, we are able to determine the overall effectiveness of each machine learning method as well as the key characteristics. We have found that all the primary methods used in machine learning have been applied in some form in EEG classification. This ranges from Naive-Bayes to Decision Tree/Random Forest, to Support Vector Machine (SVM). Supervised learning methods are on average more accurate than their unsupervised counterparts. This includes SVM and KNN. While each of the methods individually is limited in its accuracy in its respective applications, there is hope that the combination of methods, when implemented properly, has a higher overall classification accuracy. This paper provides a comprehensive overview of machine learning applications used in EEG analysis. It also gives an overview of each of the methods and the general applications that each is best suited to.
373 citations
en
Medicine, Computer Science
Machine learning and algorithmic fairness in public and population health
Vishwali Mhasawade, Yuan Zhao, R. Chunara
174 citations
en
Computer Science