Zuogong Yue, Victor Solo
We develop hard clustering based on likelihood rather than distance and prove convergence. We also provide simulations and real data examples.
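To make the contrast concrete, here is a minimal, purely illustrative sketch (not the paper's algorithm): a nearest-mean assignment and a maximum-likelihood assignment under cluster-specific Gaussian covariances can disagree. All means, covariances, and the test point below are made-up values.

```python
import numpy as np

def gaussian_loglik(x, mean, cov):
    # Log density of a multivariate Gaussian, written out with NumPy only.
    d = x.size
    diff = x - mean
    return -0.5 * (d * np.log(2 * np.pi)
                   + np.log(np.linalg.det(cov))
                   + diff @ np.linalg.solve(cov, diff))

means = [np.array([0.0, 0.0]), np.array([3.0, 0.0])]
covs  = [np.eye(2), 9.0 * np.eye(2)]   # second cluster is much more spread out
x = np.array([1.8, 0.0])

dist_rule = int(np.argmin([np.linalg.norm(x - m) for m in means]))
lik_rule  = int(np.argmax([gaussian_loglik(x, m, c) for m, c in zip(means, covs)]))
print("nearest-mean assignment:", dist_rule, "maximum-likelihood assignment:", lik_rule)
```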
Dylan Engelbrecht
Harsh Dolhare, Vivek Borkar
We revisit the classical model of Tsitsiklis, Bertsekas and Athans for distributed stochastic approximation with consensus. The main result is an analysis of this scheme using the ODE approach to stochastic approximation, leading to a high probability bound for the tracking error between suitably interpolated iterates and the limiting differential equation. Several future directions are also highlighted.
Theodore Papamarkou, Alexey Lindo
Probability-generating function (PGF) kernels, a class of kernels supported on the unit hypersphere, are introduced for spherical data analysis. PGF kernels generalize RBF kernels in the context of spherical data. The properties of PGF kernels are studied, and a semi-parametric learning algorithm is introduced to enable their use with spherical data.
Arunesh Mittal, Paul Sajda, John Paisley
We propose a deep generative factor analysis model with beta process prior that can approximate complex non-factorial distributions over the latent codes. We outline a stochastic EM algorithm for scalable inference in a specific instantiation of this model and present some preliminary results.
Patrick Heas, Cedric Herzet
This work provides closed-form solutions and minimum achievable errors for a large class of low-rank approximation problems in Hilbert spaces. The proposed theorem extends, to the case of bounded linear operators, previous results obtained in the finite-dimensional case for the Frobenius norm. The theorem provides the basis for the design of tractable algorithms for kernel or continuous DMD.
Christopher Beckham, Christopher Pal
Procedural terrain generation for video games has traditionally been done with smartly designed but handcrafted algorithms that generate heightmaps. We propose a first step toward the learning and synthesis of heightmaps using recent advances in deep generative modelling with openly available satellite imagery from NASA.
Christopher Dienes
A composite loss framework is proposed for low-rank modeling of data consisting of interesting and common values, such as excess zeros or missing values. The methodology is motivated by the generalized low-rank framework and the hurdle method which is commonly used to analyze zero-inflated counts. The model is demonstrated on a manufacturing data set and applied to the problem of missing value imputation.
Andreas Maurer, Massimiliano Pontil
We present a framework to derive risk bounds for vector-valued learning with a broad class of feature maps and loss functions. Multi-task learning and one-vs-all multi-category learning are treated as examples. We discuss in detail vector-valued functions with one hidden layer, and demonstrate that the conditions under which shared representations are beneficial for multi-task learning are equally applicable to multi-category learning.
Niko Brümmer
This note compares two recently published machine learning methods for constructing flexible, but tractable families of variational hidden-variable posteriors. The first method, called "hierarchical variational models", enriches the inference model with an extra variable, while the other, called "auxiliary deep generative models", enriches the generative model instead. We conclude that the two methods are mathematically equivalent.
Ian Goodfellow
This technical report describes an efficient technique for computing the norm of the gradient of the loss function for a neural network with respect to its parameters. This gradient norm can be computed efficiently for every example.
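The report's contribution is an efficient way to obtain these norms; the sketch below only illustrates the quantity in question, computing per-example gradient norms naively for a toy linear model via JAX's vmap over per-example gradients. The model, data, and parameter values are assumptions for illustration, not taken from the report.

```python
import jax
import jax.numpy as jnp

def loss_fn(params, x, y):
    # Squared-error loss of a toy linear model for a single example.
    w, b = params
    return (jnp.dot(x, w) + b - y) ** 2

# Per-example gradients: map over the batch axis of (x, y), sharing params.
grad_fn = jax.vmap(jax.grad(loss_fn), in_axes=(None, 0, 0))

def per_example_grad_norms(params, X, Y):
    gw, gb = grad_fn(params, X, Y)              # gw: (n, d), gb: (n,)
    return jnp.sqrt(jnp.sum(gw ** 2, axis=1) + gb ** 2)

X = jax.random.normal(jax.random.PRNGKey(0), (8, 3))
Y = jnp.ones(8)
print(per_example_grad_norms((jnp.zeros(3), 0.0), X, Y))
```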
Tianwen Wei
This contribution summarizes the results on the asymptotic performance of several variants of the FastICA algorithm. A number of new closed-form expressions are presented.
Niko Brümmer
The Laplace approximation calls for the computation of second derivatives at the likelihood maximum. When the maximum is found by the EM-algorithm, there is a convenient way to compute these derivatives. The likelihood gradient can be obtained from the EM-auxiliary, while the Hessian can be obtained from this gradient with the Pearlmutter trick.
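A minimal sketch of the Pearlmutter trick: a Hessian-vector product is the directional derivative of the gradient, obtained here with JAX's forward-over-reverse differentiation. The toy log-likelihood and the point of evaluation are assumptions; the note instead differentiates the gradient obtained from the EM-auxiliary.

```python
import jax
import jax.numpy as jnp

def hvp(grad_fn, theta, v):
    # Pearlmutter's trick: differentiate the gradient function in direction v.
    return jax.jvp(grad_fn, (theta,), (v,))[1]

def log_lik(theta):
    # Toy stand-in for a log-likelihood (not the note's EM-auxiliary).
    return -0.5 * jnp.sum(theta ** 2) + jnp.sum(jnp.cos(theta))

grad_fn = jax.grad(log_lik)
theta_hat = jnp.array([0.3, -0.1])          # pretend maximum-likelihood estimate

# Assemble the full Hessian column by column, as needed for the Laplace approximation.
eye = jnp.eye(theta_hat.size)
hessian = jnp.stack([hvp(grad_fn, theta_hat, e) for e in eye], axis=1)
print(hessian)
```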
Vladimir Temlyakov
The paper gives a systematic study of the approximate versions of three greedy-type algorithms that are widely used in convex optimization. By an approximate version we mean one in which some of the evaluations are made with an error. The importance of such versions of greedy-type algorithms in convex optimization and in approximation theory has been emphasized in previous literature.
Yangbo He, Jinzhu Jia, Bin Yu
This supplementary material includes some preliminary results, four examples, an experiment, three new algorithms, and all proofs of the results in the paper "Reversible MCMC on Markov equivalence classes of sparse directed acyclic graphs".
Daniel Khashabi, Mojtaba Ziyadi, Feng Liang
In this work we propose a heteroscedastic generalization of the RVM, a fast Bayesian framework for regression, building on recent related work. We use variational approximation and expectation propagation to tackle the problem. The work is still in progress; we are examining the results and comparing them with previous work.
Péter Kövesárki
This paper describes a novel method to approximate the polynomial coefficients of regression functions, with particular interest on multi-dimensional classification. The derivation is simple, and offers a fast, robust classification technique that is resistant to over-fitting.
Ian Goodfellow, Aaron Courville, Yoshua Bengio
We introduce a new method for training deep Boltzmann machines jointly. Prior methods either require an initial learning pass that trains the deep Boltzmann machine greedily, one layer at a time, or do not perform well on classification tasks.
Joe Staines, David Barber
We discuss a general technique that can be used to form a differentiable bound on the optima of non-differentiable or discrete objective functions. We form a unified description of these methods and consider under which circumstances the bound is concave. In particular we consider two concrete applications of the method, namely sparse learning and support vector classification.
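A minimal Monte Carlo sketch of the general idea: the expectation of a non-differentiable objective under a smooth distribution upper-bounds its minimum and is differentiable in the distribution's parameters. The Gaussian family, the score-function gradient estimator, and all constants below are assumptions for illustration; this is not the paper's treatment of sparse learning or support vector classification.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Non-differentiable objective: an L0-style penalty plus a quadratic term.
    return np.sum(x != 0.0) + np.sum((x - 2.0) ** 2)

mu, log_sigma = np.zeros(3), np.zeros(3)
lr, n_samples = 0.05, 256

for _ in range(200):
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal((n_samples, mu.size))
    x = mu + sigma * eps                          # samples from q(x | mu, sigma)
    fx = np.array([f(xi) for xi in x])
    baseline = fx.mean()                          # simple variance reduction
    # Score-function (REINFORCE) gradients of E_q[f(x)] w.r.t. mu and log_sigma.
    g_mu = ((fx - baseline)[:, None] * eps / sigma).mean(axis=0)
    g_ls = ((fx - baseline)[:, None] * (eps ** 2 - 1.0)).mean(axis=0)
    mu -= lr * g_mu
    log_sigma -= lr * g_ls

print("differentiable bound E_q[f] ~", fx.mean(), " mu ~", mu)
```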