A note on the relations between mixture models, maximum-likelihood and entropic optimal transport
Titouan Vayer, Etienne Lasalle
This note aims to demonstrate that performing maximum-likelihood estimation for a mixture model is equivalent to minimizing over the parameters an optimal transport problem with entropic regularization. The objective is pedagogical: we seek to present this already known result in a concise and hopefully simple manner. We give an illustration with Gaussian mixture models by showing that the standard EM algorithm is a specific block-coordinate descent on an optimal transport loss.
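As a rough illustration of the correspondence the note describes, the sketch below (Python with NumPy/SciPy; not the authors' code, and the optimal-transport framing is only indicated in comments) runs EM for a Gaussian mixture as an alternating minimization: the E-step solves for the coupling (the responsibilities) with one marginal fixed, and the M-step minimizes the same objective over the mixture parameters.

```python
# Minimal sketch: EM for a Gaussian mixture written as block-coordinate descent.
# In the note's framing, the E-step is the entropic-OT step with one marginal fixed;
# the M-step minimizes the same objective over the mixture parameters.
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, K, replace=False)]
    covs = np.array([np.cov(X.T).reshape(d, d) + 1e-6 * np.eye(d) for _ in range(K)])
    weights = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: responsibilities (the coupling, given the current parameters)
        log_p = np.stack([
            np.log(weights[k]) + multivariate_normal.logpdf(X, means[k], covs[k])
            for k in range(K)
        ], axis=1)                                    # shape (n, K)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update weights, means and covariances given the coupling
        Nk = resp.sum(axis=0)
        weights = Nk / n
        means = (resp.T @ X) / Nk[:, None]
        for k in range(K):
            Xc = X - means[k]
            covs[k] = (resp[:, k][:, None] * Xc).T @ Xc / Nk[k] + 1e-6 * np.eye(d)
    return weights, means, covs
```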
The Unified Non-Convex Framework for Robust Causal Inference: Overcoming the Gaussian Barrier and Optimization Fragility
Eichi Uehara
This document proposes a Unified Robust Framework that re-engineers the estimation of the Average Treatment Effect on the Overlap (ATO). It synthesizes gamma-Divergence for outlier robustness, Graduated Non-Convexity (GNC) for global optimization, and a "Gatekeeper" mechanism to address the impossibility of higher-order orthogonality in Gaussian regimes.
Improving the Predictive Performances of $k$ Nearest Neighbors Learning by Efficient Variable Selection
Eddie Pei, Ernest Fokoue
This paper computationally demonstrates a sharp improvement in the predictive performance of $k$ nearest neighbors thanks to an efficient forward selection of the predictor variables. We show on both simulated and real-world data that this novel approach repeatedly outperforms regression models under stepwise selection.
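A minimal sketch of this kind of procedure, assuming scikit-learn is available; the paper's actual selection criterion and stopping rule may differ. Predictors are added greedily as long as they improve the cross-validated error of a kNN regressor.

```python
# Minimal sketch (assumption: scikit-learn; not necessarily the paper's criterion):
# greedy forward selection of predictors for kNN regression, scored by CV error.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import cross_val_score

def forward_knn_selection(X, y, k=5, cv=5):
    n_features = X.shape[1]
    selected, best_score = [], -np.inf
    while True:
        candidates = [j for j in range(n_features) if j not in selected]
        if not candidates:
            break
        scores = {
            j: cross_val_score(KNeighborsRegressor(n_neighbors=k),
                               X[:, selected + [j]], y, cv=cv,
                               scoring="neg_mean_squared_error").mean()
            for j in candidates
        }
        j_best = max(scores, key=scores.get)
        if scores[j_best] <= best_score:   # stop when no candidate improves the CV score
            break
        selected.append(j_best)
        best_score = scores[j_best]
    return selected
```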
A Note on Comparison of F-measures
Wei Ju, Wenxin Jiang
We comment on a recent TKDE paper "Linear Approximation of F-measure for the Performance Evaluation of Classification Algorithms on Imbalanced Data Sets", and make two improvements related to comparison of F-measures for two prediction rules.
Deep orthogonal linear networks are shallow
Pierre Ablin
We consider the problem of training a deep orthogonal linear network, which consists of a product of orthogonal matrices, with no non-linearity in-between. We show that training the weights with Riemannian gradient descent is equivalent to training the whole factorization by gradient descent. This means that there is no effect of overparametrization and implicit bias at all in this setting: training such a deep, overparametrized, network is perfectly equivalent to training a one-layer shallow network.
The Elliptical Potential Lemma Revisited
Alexandra Carpentier, Claire Vernade, Yasin Abbasi-Yadkori
This note proposes a new proof and new perspectives on the so-called Elliptical Potential Lemma. This result is important in online learning, especially for linear stochastic bandits. The original proof of the result, however short and elegant, does not give much flexibility on the type of potentials considered and we believe that this new interpretation can be of interest for future research in this field.
Multilevel Monte Carlo estimation of log marginal likelihood
Takashi Goda, Kei Ishikawa
In this short note we provide an unbiased multilevel Monte Carlo estimator of the log marginal likelihood and discuss its application to variational Bayes.
Multilayer Perceptron Algebra
Zhao Peng
Artificial Neural Networks (ANNs) have been phenomenally successful on various pattern recognition tasks. However, the design of neural networks relies heavily on the experience and intuitions of individual developers. In this article, the author introduces a mathematical structure called MLP algebra on the set of all Multilayer Perceptron Neural Networks (MLPs), which can serve as a guiding principle for building MLPs suited to particular data sets, and for building complex MLPs from simpler ones.
The energy landscape of a simple neural network
Anthony Collins Gamst, Alden Walker
We explore the energy landscape of a simple neural network. In particular, we expand upon previous work demonstrating that the empirical complexity of fitted neural networks is vastly less than a naive parameter count would suggest and that this implicit regularization is actually beneficial for generalization from fitted models.
A Contemporary Overview of Probabilistic Latent Variable Models
Rick Farouni
In this paper we provide a conceptual overview of latent variable models within a probabilistic modeling framework, an overview that emphasizes the compositional nature and the interconnectedness of the seemingly disparate models commonly encountered in statistical practice.
RSSL: Semi-supervised Learning in R
Jesse H. Krijthe
In this paper, we introduce a package for semi-supervised learning research in the R programming language called RSSL. We cover the purpose of the package, the methods it includes and comment on their use and implementation. We then show, using several code examples, how the package can be used to replicate well-known results from the semi-supervised learning literature.
A Simple Approach to Sparse Clustering
Ery Arias-Castro, Xiao Pu
Consider the problem of sparse clustering, where it is assumed that only a subset of the features are useful for clustering purposes. In the framework of the COSA method of Friedman and Meulman, subsequently improved in the form of the Sparse K-means method of Witten and Tibshirani, a natural and simpler hill-climbing approach is introduced. The new method is shown to be competitive with these two methods and others.
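A hedged sketch of one possible hill-climbing scheme in this spirit (not the authors' algorithm): swap features in and out of a subset of fixed size, keeping a swap whenever it lowers the k-means within-cluster sum of squares computed on standardized features.

```python
# Illustrative sketch only: swap-based hill climbing over feature subsets of fixed
# size s, scored by the k-means within-cluster sum of squares on standardized data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def hill_climb_sparse_kmeans(X, n_clusters, s, n_passes=5, seed=0):
    X = StandardScaler().fit_transform(X)
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    subset = list(rng.choice(p, s, replace=False))

    def score(feats):
        km = KMeans(n_clusters=n_clusters, n_init=5, random_state=seed)
        km.fit(X[:, feats])
        return km.inertia_  # within-cluster sum of squares (lower is better)

    best = score(subset)
    for _ in range(n_passes):
        improved = False
        for out in list(subset):
            for into in range(p):
                if into in subset:
                    continue
                trial = [f for f in subset if f != out] + [into]
                val = score(trial)
                if val < best:
                    subset, best, improved = trial, val, True
        if not improved:
            break
    return sorted(subset), best
```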
On the convergence rate of the three operator splitting scheme
Fabian Pedregosa
The three operator splitting scheme was recently proposed by [Davis and Yin, 2015] as a method to optimize composite objective functions with one convex smooth term and two convex (possibly non-smooth) terms whose proximity operators are accessible. In this short note we provide an alternative proof of the sublinear rate of convergence of this method.
Graphical Exponential Screening
Zhe Liu
In high dimensions we propose and analyze an aggregation estimator of the precision matrix for Gaussian graphical models. This estimator, called graphical Exponential Screening (gES), linearly combines a suitable set of individual estimators with different underlying graphs, and balances the estimation error and sparsity. We study the risk of this aggregation estimator and show that it is comparable to that of the best estimator based on a single graph, chosen by an oracle. Numerical performance of our method is investigated using both simulated and real datasets, in comparison with some state-of-the-art estimation procedures.
Spectral Convergence Rate of Graph Laplacian
Xu Wang
Laplacian eigenvectors of the graph constructed from a data set are used in many spectral manifold learning algorithms such as diffusion maps and spectral clustering. Given a graph constructed from a random sample of a $d$-dimensional compact submanifold $M$ in $\mathbb{R}^D$, we establish the spectral convergence rate of the graph Laplacian. This implies the consistency of the spectral clustering algorithm via a standard perturbation argument. A simple numerical study indicates the necessity of a denoising step before applying spectral algorithms.
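For concreteness, a minimal sketch (illustrative only, not the paper's construction) of the object under study: a Gaussian-kernel graph Laplacian built from a point cloud, together with its leading eigenvectors as used in diffusion maps and spectral clustering.

```python
# Illustrative sketch: Gaussian-kernel graph Laplacian of a point cloud and its
# leading eigenvectors (smallest eigenvalues first).
import numpy as np
from scipy.spatial.distance import cdist

def graph_laplacian_eigs(X, bandwidth, n_eig=5):
    W = np.exp(-cdist(X, X, "sqeuclidean") / (2 * bandwidth ** 2))  # affinity matrix
    d = W.sum(axis=1)
    L = np.diag(d) - W                          # unnormalized graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)        # eigenvalues in ascending order
    return eigvals[:n_eig], eigvecs[:, :n_eig]
```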
Variable and Fixed Interval Exponential Smoothing
Javier R. Movellan
Exponential smoothers are a simple and memory-efficient way to compute running averages of time series. Here we define exponential smoothers for signals observed at constant and variable intervals and describe their practical properties.
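A minimal sketch of both cases, under the common assumption that the variable-interval smoothing factor is set from the elapsed time as alpha = 1 - exp(-dt/tau); the paper's exact formulation may differ.

```python
# Illustrative sketch: exponential smoothing at fixed and at variable intervals.
# For variable intervals the effective smoothing factor depends on the elapsed time
# dt via alpha = 1 - exp(-dt / tau), which reduces to a constant alpha for fixed dt.
import numpy as np

def exp_smooth_fixed(x, alpha):
    """Running average of x with constant smoothing factor alpha in (0, 1]."""
    s = np.empty(len(x), dtype=float)
    s[0] = x[0]
    for t in range(1, len(x)):
        s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]
    return s

def exp_smooth_variable(times, x, tau):
    """Running average of x observed at irregular times, with time constant tau."""
    s = np.empty(len(x), dtype=float)
    s[0] = x[0]
    for t in range(1, len(x)):
        alpha = 1.0 - np.exp(-(times[t] - times[t - 1]) / tau)
        s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]
    return s
```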
Improved graph Laplacian via geometric self-consistency
Dominique Perrault-Joncas, Marina Meila
We address the problem of setting the kernel bandwidth used by Manifold Learning algorithms to construct the graph Laplacian. Exploiting the connection between manifold geometry, represented by the Riemannian metric, and the Laplace-Beltrami operator, we set the bandwidth by optimizing the Laplacian's ability to preserve the geometry of the data. Experiments show that this principled approach is effective and robust.
Most Correlated Arms Identification
Che-Yu Liu, Sébastien Bubeck
We study the problem of finding the most mutually correlated arms among many arms. We show that adaptive arms sampling strategies can have significant advantages over the non-adaptive uniform sampling strategy. Our proposed algorithms rely on a novel correlation estimator. The use of this accurate estimator allows us to get improved results for a wide range of problem instances.
Chebyshev Greedy Algorithm in convex optimization
Vladimir Temlyakov
The Chebyshev Greedy Algorithm is a generalization of the well-known Orthogonal Matching Pursuit, defined in a Hilbert space, to the setting of Banach spaces. We apply this algorithm to construct sparse approximate solutions (with respect to a given dictionary) to convex optimization problems. Rate-of-convergence results in the style of Lebesgue-type inequalities are proved.
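For reference, a sketch of the Hilbert-space special case that the algorithm generalizes, namely standard Orthogonal Matching Pursuit over a finite dictionary; this is illustrative only and is not the Banach-space algorithm itself.

```python
# Illustrative sketch: standard Orthogonal Matching Pursuit in a Hilbert space,
# the special case generalized by the Chebyshev Greedy Algorithm. Greedily picks
# the atom most correlated with the residual, then re-fits on the selected atoms.
import numpy as np

def omp(D, y, n_atoms):
    """D: (m, p) dictionary with unit-norm columns; y: (m,) target; n_atoms: sparsity."""
    residual, support = y.copy(), []
    coef = np.zeros(0)
    for _ in range(n_atoms):
        correlations = np.abs(D.T @ residual)
        correlations[support] = -np.inf         # do not reselect an atom
        support.append(int(np.argmax(correlations)))
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    return support, coef
```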
Sparsity-accuracy trade-off in MKL
Ryota Tomioka, Taiji Suzuki
We empirically investigate the best trade-off between sparse and uniformly-weighted multiple kernel learning (MKL) using the elastic-net regularization on real and simulated datasets. We find that the best trade-off parameter depends not only on the sparsity of the true kernel-weight spectrum but also on the linear dependence among kernels and the number of samples.