Machine learning techniques based on neural networks are achieving remarkable results in a wide variety of domains. Often, the training of models requires large, representative datasets, which may be crowdsourced and contain sensitive information. The models should not expose private information in these datasets. Addressing this goal, we develop new algorithmic techniques for learning and a refined analysis of privacy costs within the framework of differential privacy. Our implementation and experiments demonstrate that we can train deep neural networks with non-convex objectives, under a modest privacy budget, and at a manageable cost in software complexity, training efficiency, and model quality.
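The core step this line of work relies on (clipping each per-example gradient, then adding calibrated Gaussian noise before the update) can be sketched in a few lines. This is a minimal NumPy illustration on a toy squared-loss linear model, not the paper's implementation; the function name, the loss, and all parameter values are illustrative, and the privacy accounting that tracks the overall budget is omitted:

```python
import numpy as np

def dp_sgd_step(w, X, y, lr=0.1, clip=1.0, noise_mult=1.0, rng=None):
    """One differentially private SGD step on squared loss for a linear
    model (sketch). Each per-example gradient is clipped to L2 norm
    `clip`; the clipped gradients are summed, Gaussian noise with
    standard deviation `noise_mult * clip` is added, and the noisy sum
    is averaged into the update."""
    rng = np.random.default_rng(0) if rng is None else rng
    n = len(y)
    residuals = X @ w - y                       # shape (n,)
    grads = residuals[:, None] * X              # per-example gradients, (n, d)
    # Clip each per-example gradient to norm at most `clip`
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    grads = grads / np.maximum(1.0, norms / clip)
    # Sum, add calibrated Gaussian noise, then average and step
    noisy_sum = grads.sum(axis=0) + rng.normal(0.0, noise_mult * clip, size=w.shape)
    return w - lr * noisy_sum / n
```

The clip-then-noise order matters: clipping bounds each example's influence, which is what makes the added noise sufficient for a differential privacy guarantee.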
Learning from a few examples remains a key challenge in machine learning. Despite recent advances in important domains such as vision and language, the standard supervised deep learning paradigm does not offer a satisfactory solution for learning new concepts rapidly from little data. In this work, we employ ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories. Our framework learns a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types. We then define one-shot learning problems on vision (using Omniglot, ImageNet) and language tasks. Our algorithm improves one-shot accuracy on ImageNet from 87.6% to 93.2% and from 88.0% to 93.8% on Omniglot compared to competing approaches. We also demonstrate the usefulness of the same model on language modeling by introducing a one-shot task on the Penn Treebank.
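The attention mechanism behind this kind of metric-based one-shot classifier can be sketched as a softmax over cosine similarities between a query embedding and the support-set embeddings, with the predicted label distribution given by an attention-weighted sum of one-hot support labels. A minimal NumPy sketch, assuming the embeddings have already been produced by some network (function name and shapes are illustrative):

```python
import numpy as np

def matching_predict(query, support, support_labels, n_classes):
    """One-shot classification by attention over a support set (sketch).

    `query` is a (d,) embedding, `support` a (k, d) array of support
    embeddings, `support_labels` a (k,) array of integer labels."""
    def normalize(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)
    sims = normalize(support) @ normalize(query)   # cosine similarities, (k,)
    attn = np.exp(sims) / np.exp(sims).sum()       # softmax attention weights
    onehot = np.eye(n_classes)[support_labels]     # (k, n_classes)
    return attn @ onehot                           # predicted class distribution
```

Because the prediction is a weighted vote over the support set, new classes are handled simply by swapping in a new support set, with no fine-tuning.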
Machine learning algorithms learn a desired input-output relation from examples in order to interpret new inputs. This is important for tasks such as image and speech recognition or strategy optimisation, with growing applications in the IT industry. In recent years, researchers have investigated whether quantum computing can help to improve classical machine learning algorithms. Ideas range from running computationally costly algorithms or their subroutines efficiently on a quantum computer to the translation of stochastic methods into the language of quantum theory. This contribution gives a systematic overview of the emerging field of quantum machine learning. It presents the approaches as well as technical details in an accessible way, and discusses the potential of a future theory of quantum learning.
Kangwook Lee, Maximilian Lam, Ramtin Pedarsani, et al.
Codes are widely used in many engineering applications to offer robustness against noise. In large-scale systems, there are several types of noise that can affect the performance of distributed machine learning algorithms—straggler nodes, system failures, or communication bottlenecks—but there has been little interaction cutting across codes, machine learning, and distributed systems. In this paper, we provide theoretical insights on how coded solutions can achieve significant gains compared with uncoded ones. We focus on two of the most basic building blocks of distributed learning algorithms: matrix multiplication and data shuffling. For matrix multiplication, we use codes to alleviate the effect of stragglers and show that if the number of homogeneous workers is $n$, and the runtime of each subtask has an exponential tail, coded computation can speed up distributed matrix multiplication by a factor of $\log n$. For data shuffling, we use codes to reduce communication bottlenecks, exploiting the excess in storage. We show that when a constant fraction $\alpha$ of the data matrix can be cached at each worker, and $n$ is the number of workers, coded shuffling reduces the communication cost by a factor of $\left(\alpha + \frac{1}{n}\right)\gamma(n)$ compared with uncoded shuffling, where $\gamma(n)$ is the ratio of the cost of unicasting $n$ messages to $n$ users to that of multicasting a common message (of the same size) to $n$ users. For instance, $\gamma(n) \simeq n$ if multicasting a message to $n$ users is as cheap as unicasting a message to one user. We also provide experimental results corroborating the theoretical gains of the coded algorithms.
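The coded matrix multiplication idea can be illustrated with a small MDS-style code: split $A$ into $k$ row blocks, hand $n > k$ coded blocks to workers, and recover $Ax$ from any $k$ of the worker results, so up to $n - k$ stragglers can simply be ignored. The toy NumPy sketch below uses Vandermonde (polynomial-evaluation) encoding, which is one standard MDS construction and not necessarily the paper's exact code:

```python
import numpy as np

def encode_blocks(A_blocks, alphas):
    """Encode k row-blocks of A into len(alphas) coded blocks with a
    Vandermonde code: coded block for point a is sum_j a**j * A_j."""
    k = len(A_blocks)
    return [sum(a ** j * A_blocks[j] for j in range(k)) for a in alphas]

def decode(results, alphas_used, k):
    """Recover the k uncoded products A_j @ x from any k worker results
    by inverting the k-by-k Vandermonde system."""
    V = np.vander(np.array(alphas_used), N=k, increasing=True)
    R = np.stack(results)            # each row: one worker's coded product
    return np.linalg.inv(V) @ R      # rows: A_j @ x in block order

# Example: k = 2 data blocks, n = 4 workers; any 2 results suffice.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3))
x = rng.normal(size=3)
A_blocks = [A[:2], A[2:]]
alphas = [1.0, 2.0, 3.0, 4.0]
coded = encode_blocks(A_blocks, alphas)
# Suppose workers 0 and 2 straggle: decode from workers 1 and 3 only.
results = [coded[1] @ x, coded[3] @ x]
recovered = decode(results, [alphas[1], alphas[3]], k=2)  # equals [A_blocks[0] @ x, A_blocks[1] @ x]
```

Each worker does the same amount of work as in the uncoded scheme; the redundancy buys tolerance to the slowest $n - k$ workers, which is the source of the $\log n$ speedup under exponential-tail runtimes.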
Ioannis Tsochantaridis, Thomas Hofmann, T. Joachims, et al.
Learning general functional dependencies is one of the main goals in machine learning. Recent progress in kernel-based methods has focused on designing flexible and powerful input representations. This paper addresses the complementary issue of problems involving complex outputs such as multiple dependent output variables and structured output spaces. We propose to generalize multiclass Support Vector Machine learning in a formulation that involves features extracted jointly from inputs and outputs. The resulting optimization problem is solved efficiently by a cutting plane algorithm that exploits the sparseness and structural decomposition of the problem. We demonstrate the versatility and effectiveness of our method on problems ranging from supervised grammar learning and named-entity recognition, to taxonomic text classification and sequence alignment.
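In the multiclass special case, the joint input-output feature map places $x$ in the block indexed by the label $y$, and the cutting-plane algorithm repeatedly asks a separation oracle for the most violated label. A minimal sketch of these two ingredients (illustrative NumPy code with 0/1 loss, not the paper's SVM-struct implementation):

```python
import numpy as np

def joint_feature(x, y, n_classes):
    """Joint feature map Psi(x, y) for the multiclass case: the input x
    is placed in the block of the stacked weight vector indexed by y."""
    psi = np.zeros(n_classes * len(x))
    psi[y * len(x):(y + 1) * len(x)] = x
    return psi

def predict(w, x, n_classes):
    """Inference: argmax over labels of the joint score w . Psi(x, y)."""
    return int(np.argmax([w @ joint_feature(x, y, n_classes)
                          for y in range(n_classes)]))

def most_violated_label(w, x, y_true, n_classes):
    """Separation oracle for the cutting-plane algorithm: the label that
    maximizes score plus loss (here 0/1 loss), i.e. the most violated
    margin constraint for this example."""
    scores = np.array([w @ joint_feature(x, y, n_classes)
                       for y in range(n_classes)])
    loss = np.ones(n_classes)
    loss[y_true] = 0.0
    return int(np.argmax(scores + loss))
```

For structured outputs the same pattern holds, but `joint_feature` encodes input-output structure (e.g. parse trees or alignments) and the oracle becomes a combinatorial search; the cutting-plane method only ever touches the constraints the oracle returns, which is what keeps the problem tractable.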
Last year, at least 30,000 scientific papers used the Kohn–Sham scheme of density functional theory to solve electronic structure problems in a wide variety of scientific fields. Machine learning holds the promise of learning the energy functional via examples, bypassing the need to solve the Kohn–Sham equations. This should yield substantial savings in computer time, allowing larger systems and/or longer time-scales to be tackled, but attempts to machine-learn this functional have been limited by the need to find its derivative. The present work overcomes this difficulty by directly learning the density-potential and energy-density maps for test systems and various molecules. We perform the first molecular dynamics simulation with a machine-learned density functional on malonaldehyde and are able to capture the intramolecular proton transfer process. Learning density models now allows the construction of accurate density functionals for realistic molecular systems. Machine learning allows electronic structure calculations to access larger system sizes and, in dynamical simulations, longer time scales. Here, the authors perform such a simulation using a machine-learned density functional that avoids direct solution of the Kohn-Sham equations.
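A common way to learn such maps from examples is kernel ridge regression with a Gaussian kernel; the sketch below shows the generic fit/predict machinery on stand-in features (the toy data and hyperparameters are illustrative, and this is not the paper's trained functional):

```python
import numpy as np

def krr_fit(X, y, gamma=1.0, lam=1e-6):
    """Kernel ridge regression with a Gaussian (RBF) kernel: solve
    (K + lam*I) alpha = y for the dual coefficients alpha."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise sq. dists
    K = np.exp(-gamma * sq)
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

def krr_predict(X_train, alpha, X_new, gamma=1.0):
    """Predict by kernel expansion over the training points."""
    sq = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq) @ alpha
```

In the density-functional setting, `X` would hold density (or potential) descriptors and `y` the corresponding energies; learning the energy-density map directly, as the abstract notes, sidesteps the functional-derivative problem.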
Background: Mitochondrial injury plays a critical role in type 2 diabetes mellitus (T2DM) pathogenesis by impairing cellular energy metabolism and insulin sensitivity. The Zhimu-Huangbai herb pair (ZB), a classic Traditional Chinese Medicine formulation composed of Anemarrhena asphodeloides and Phellodendron chinense, has shown efficacy in T2DM, but its molecular mechanisms remain unclear. In this study, we aimed to identify key mitochondria-related genes in type 2 diabetes and the potential mechanism of ZB.

Methods: Gene expression datasets for T2DM (GSE76894, GSE25724, and GSE38642) were retrieved from the GEO database. Intersection targets of the ZB herb pair and T2DM were identified by screening multiple databases, including TCMSP and HERB. Mitochondrial function-related genes were obtained from human mitochondria-associated databases. WGCNA was employed to identify differentially expressed genes, which were then intersected with bioactive compound–target genes and mitochondria-related genes to construct a PPI network. GO and KEGG enrichment analyses were subsequently performed. Four machine learning algorithms (SVM-RFE, RF, GLM, and XGB) were applied to screen feature genes and establish diagnostic models. Furthermore, the correlations between feature targets and immune cell infiltration were analyzed, single-gene GSEA was conducted, and molecular docking was performed to investigate the interactions between feature targets and bioactive constituents of ZB. For experimental validation, INS-1 cells were divided into six groups: control, model, metformin, and low-, medium-, and high-dose ZB. Cell viability, apoptosis, ROS levels, mitochondrial membrane potential, and mitochondrial morphology and function were assessed.
Western blot analysis was performed to evaluate the expression of mitochondria-related genes (BCAT2, CASP8, EPHX2, and UCP2) and components of the AMPK–SIRT1–PGC-1α signaling pathway.

Results: A total of eight mitochondria-related differentially expressed genes associated with ZB treatment of T2DM were identified. GO analysis revealed enrichment in multiple biological processes (e.g., response to nutrient levels), cellular components (e.g., pore complex), and molecular functions (e.g., toxic substance binding). KEGG pathway analysis indicated significant enrichment in pathways including apoptosis, the p53 signaling pathway, and necroptosis. Three key genes (BCAT2, CASP8, and EPHX2) were screened through the machine learning algorithms, and the constructed T2DM diagnostic models all exhibited area under the curve (AUC) values greater than 0.7, indicating satisfactory discriminative performance. Immune infiltration analysis revealed that all three key genes were significantly correlated with immune cell populations. Molecular docking demonstrated that the three key genes exhibited strong binding affinities (≤ −5.0 kcal/mol) for their corresponding bioactive compounds derived from ZB, with the exception of the CASP8-nicotinamide pair. Experimental validation showed that ZB significantly enhanced the viability of INS-1 cells subjected to high-glucose and high-lipid conditions, inhibited apoptosis, reduced intracellular ROS generation, and restored mitochondrial membrane potential, mitochondrial morphology, and respiratory function. Concurrently, the protein expression levels of UCP2 and BCAT2 were markedly upregulated, whereas those of CASP8 and EPHX2 were significantly downregulated. Additionally, ZB treatment increased the p-AMPK/AMPK ratio as well as the expression of SIRT1 and PGC-1α.

Conclusion: The diagnostic model featuring the genes BCAT2, CASP8, and EPHX2 provides new insights for T2DM diagnosis and treatment.
ZB’s therapeutic mechanism involves regulating mitochondrial-related genes (BCAT2, CASP8, EPHX2, UCP2) and activating the AMPK-SIRT1-PGC-1α pathway, thereby improving mitochondrial morphology and function, reducing oxidative damage, and enhancing energy metabolism.
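The AUC > 0.7 criterion used to judge the diagnostic models can be computed from model scores with the rank-sum (Mann-Whitney) identity. A small NumPy helper, assuming untied scores (an illustrative utility, not the authors' pipeline):

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the rank-sum identity:
    AUC = (sum of positive ranks - n_pos*(n_pos+1)/2) / (n_pos * n_neg).
    Assumes no tied scores; labels are 0/1."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)   # rank 1 = lowest score
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

An AUC of 0.5 corresponds to random scoring and 1.0 to perfect separation, so the reported values above 0.7 indicate that each feature-gene model ranks most patients above most controls.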
This study analyzes the determinants of informal employment in Bolivia using a combination of traditional econometric techniques, machine learning methods, and hybrid approaches. Drawing on data from the 2022 and 2023 Household Surveys (Encuestas de Hogares), it identifies the individual- and household-level factors that influence the probability of holding informal employment. The results show that variables such as age, educational attainment, household income, and gender are key determinants. Random Forest highlights the central role of labor income, a variable usually excluded because of endogeneity concerns. Adaptive Lasso identifies nonlinear relationships and complex interactions, such as those associated with gender, membership in indigenous groups, and the presence of young children in the household. The study concludes that informal employment responds to multidimensional dynamics that call for integrative analytical approaches in the design of more effective and better-targeted public policies.
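The traditional econometric baseline in this kind of study is a binary-choice model for the probability of informal employment. A plain logistic regression fit by gradient descent, on synthetic data (all variable names and values are illustrative, not the survey data):

```python
import numpy as np

def fit_logit(X, y, lr=0.1, steps=2000):
    """Logistic regression via batch gradient descent on the negative
    log-likelihood; X should include an intercept column. Returns the
    coefficient vector, whose signs indicate each covariate's
    association with the probability of the outcome."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))     # predicted probabilities
        w -= lr * X.T @ (p - y) / len(y)       # gradient step
    return w
```

Tree ensembles and penalized regressions like those in the study relax this model's linearity, which is how they surface the nonlinearities and interactions the abstract describes.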
Joshua I. Glaser, Raeed H. Chowdhury, M. Perich, et al.
Despite rapid advances in machine learning tools, the majority of neural decoding approaches still use traditional methods. Modern machine learning tools, which are versatile and easy to use, have the potential to significantly improve decoding performance. This tutorial describes how to effectively apply these algorithms for typical decoding problems. We provide descriptions, best practices, and code for applying common machine learning methods, including neural networks and gradient boosting. We also provide detailed comparisons of the performance of various methods at the task of decoding spiking activity in motor cortex, somatosensory cortex, and hippocampus. Modern methods, particularly neural networks and ensembles, significantly outperform traditional approaches, such as Wiener and Kalman filters. Improving the performance of neural decoding algorithms allows neuroscientists to better understand the information contained in a neural population and can help to advance engineering applications such as brain–machine interfaces. Our code package is available at github.com/kordinglab/neural_decoding.
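One of the traditional baselines the tutorial compares against, the Wiener filter, is simply least-squares regression from a window of lagged spike counts to the decoded variable. A toy NumPy sketch with synthetic stand-ins for binned spike counts (this is not the code from the linked package):

```python
import numpy as np

def wiener_decoder(spikes, kinematics, lags=3):
    """Wiener-filter decoder (sketch): ordinary least squares from the
    current and `lags` preceding bins of spike counts (plus an
    intercept) to the decoded variable. `spikes` is (T, n_neurons);
    `kinematics` is length T. Returns the regression coefficients."""
    T, n = spikes.shape
    rows = [np.concatenate([spikes[t - l] for l in range(lags + 1)])
            for t in range(lags, T)]
    X = np.column_stack([np.ones(T - lags), np.array(rows)])
    y = kinematics[lags:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef
```

The modern methods in the tutorial keep this same lagged-window input representation but replace the linear map with a neural network or gradient-boosted ensemble, which is where the reported performance gains come from.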