This paper provides a formalism for an important class of causal inference problems inspired by user-advertiser interaction in online advertising. This formalism is then specialized to an extension of temporal marked point processes, and neural point processes are suggested as practical solutions to some interesting special cases.
This article considers a semi-supervised classification setting on a Gaussian mixture model, where the data are not given hard labels as usual, but instead uncertain labels. Our main aim is to compute the Bayes risk for this model. We compare the behavior of the Bayes risk with that of the best known algorithm for this model; this comparison eventually gives new insights into the algorithm.
This note addresses the computational difficulty of the Gromov-Wasserstein distance that is frequently mentioned in the literature. We provide details on the structure of the Gromov-Wasserstein optimization problem that show its non-convex quadratic nature for any instance of the input data. We further illustrate the non-convexity of the problem with several explicit examples.
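As a minimal numerical illustration (our own toy instance, not one of the paper's examples), the quadratic Gromov-Wasserstein objective already violates convexity along a segment between two couplings of two 2-point metric spaces:

```python
import numpy as np

# Two 2-point metric spaces with uniform marginals (1/2, 1/2).
C1 = np.array([[0.0, 1.0], [1.0, 0.0]])  # distances in the first space
C2 = np.array([[0.0, 2.0], [2.0, 0.0]])  # distances in the second space

def gw_objective(T):
    """Quadratic GW objective: sum_{ijkl} (C1[i,k] - C2[j,l])^2 T[i,j] T[k,l]."""
    M = (C1[:, None, :, None] - C2[None, :, None, :]) ** 2
    return np.einsum('ijkl,ij,kl->', M, T, T)

T0 = np.array([[0.5, 0.0], [0.0, 0.5]])  # identity coupling
T1 = np.array([[0.0, 0.5], [0.5, 0.0]])  # swap coupling
Tm = 0.5 * (T0 + T1)                     # midpoint of the segment

f0, f1, fm = gw_objective(T0), gw_objective(T1), gw_objective(Tm)
print(f0, f1, fm)  # 0.5, 0.5, 1.5: the midpoint value exceeds the chord,
                   # so the objective is non-convex on the coupling polytope
```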
In this note, we consider the problem of robustly learning mixtures of linear regressions. Via a simple thresholding, we connect mixtures of linear regressions to mixtures of Gaussians, so that a quasi-polynomial-time algorithm can be obtained under a mild separation condition. This algorithm has significantly better robustness than previous results.
We prove Carl-type inequalities for the error of approximation of compact sets $K$ by deep and shallow neural networks. This in turn gives lower bounds on how well the functions in $K$ can be approximated when the approximants are required to be outputs of such networks. Our results are obtained as a byproduct of the study of the recently introduced Lipschitz widths.
Current mainstream gradient optimization algorithms neglect, or treat inconsistently, the fluctuation in the expectation and variance of the gradient caused by the parameter update between consecutive iterations. This paper remedies this issue by introducing a novel unbiased stratified statistic $\bar{G}_{mst}$, and establishes a sufficient condition for fast convergence under $\bar{G}_{mst}$. A novel algorithm named MSSG, designed based on $\bar{G}_{mst}$, outperforms other SGD-like algorithms. Theoretical conclusions and experimental evidence strongly suggest employing MSSG when training deep models.
We study the problem of testing the existence of a heterogeneous dense subhypergraph. The null hypothesis corresponds to a heterogeneous Erdős-Rényi uniform random hypergraph, and the alternative hypothesis corresponds to a heterogeneous uniform random hypergraph that contains a dense subhypergraph. We establish detection boundaries when the edge probabilities are known and construct an asymptotically powerful test for distinguishing the hypotheses. We also construct an adaptive test that does not involve the edge probabilities and is hence more useful in practice.
Background: Several factors can affect the results of laboratory examination; one pre-analytic factor that can affect erythrocyte counts is the ratio between blood volume and anticoagulant. If the blood volume is insufficient, the excess anticoagulant causes red blood cells to become crenated, and if the blood volume is excessive, the insufficient anticoagulant can cause blood clots. Research Objective: This study aims to compare erythrocyte counts in blood sample volumes of 3 mL, 2 mL, and 1 mL with the anticoagulant K2EDTA. Research Methods: This study used primary data from hematological examination at the UTD RSUD Dr. H. Abdul Moeloek, Bandar Lampung. It is a quantitative study using an observational analytic design with a cross-sectional approach, performing hematological examination on the Mindray BC-3600 Hematology Analyzer with a sample of 40 respondents who met the inclusion and exclusion criteria. Results: The mean erythrocyte counts for blood volumes of 1 mL, 2 mL, and 3 mL with the anticoagulant K2EDTA differed, with the 3 mL volume showing the lowest result. Conclusion: There is no significant difference in erythrocyte counts among blood sample volumes of 1 mL, 2 mL, and 3 mL in the K2EDTA vacutainer tube.
Keywords: Hematology Examination; Blood Volume; K2EDTA.
We explore whether it is possible to learn a directed acyclic graph (DAG) from data without explicitly imposing the acyclicity constraint. In particular, for Gaussian distributions, we frame structural learning as a sparse matrix factorization problem, and we empirically show that solving an $\ell_1$-penalized optimization yields good recovery of the true graph and, in general, almost-DAG graphs. Moreover, this approach is computationally efficient and does not suffer from the combinatorial explosion that affects classical structural learning algorithms.
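The flavor of $\ell_1$-penalized structure recovery can be sketched on a toy linear SEM (this uses a plain Lasso regression as an illustration, not the paper's sparse matrix factorization; the graph and coefficients are our own choices):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n = 2000
# Linear SEM over a known DAG: X1 -> X2 -> X3
x1 = rng.normal(size=n)
x2 = 2.0 * x1 + rng.normal(size=n)
x3 = 2.0 * x2 + rng.normal(size=n)

# l1-penalized regression of X3 on the remaining variables:
# the true parent X2 should get a large coefficient, X1 a negligible one.
X = np.column_stack([x1, x2])
lasso = Lasso(alpha=0.1).fit(X, x3)
print(lasso.coef_)  # roughly [~0, ~2]
```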
We study the implicit bias of AdaGrad on separable linear classification problems. We show that AdaGrad converges to a direction that can be characterized as the solution of a quadratic optimization problem with the same feasible set as the hard SVM problem. We also discuss how different choices of the AdaGrad hyperparameters may affect this direction. This provides a deeper understanding of why adaptive methods do not seem to generalize as well as gradient descent in practice.
Bias and heterogeneity in peer assessment can lead to unfair scoring in educational settings. To deal with this problem, we propose a reference ranking method for an online peer assessment system using HodgeRank. Such a scheme provides instructors with an objective, mathematically grounded scoring reference.
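A minimal sketch of the HodgeRank least-squares step on a toy set of pairwise comparisons (the latent scores and edges are hypothetical, and real peer-assessment data would be noisy):

```python
import numpy as np

# Pairwise comparisons: edge (i, j) with observed score difference y = s_j - s_i.
# Hypothetical true latent scores: s = [3, 1, 2] for three submissions.
edges = [(0, 1), (1, 2), (0, 2)]
y = np.array([-2.0, 1.0, -1.0])  # s_j - s_i for each edge

# Incidence matrix B: one row per comparison, -1 at i, +1 at j.
n = 3
B = np.zeros((len(edges), n))
for r, (i, j) in enumerate(edges):
    B[r, i], B[r, j] = -1.0, 1.0

# HodgeRank global ranking: least-squares potential s with B s ~ y.
s, *_ = np.linalg.lstsq(B, y, rcond=None)
s -= s.mean()  # scores are identified only up to an additive constant
print(np.argsort(-s))  # ranking: item 0 first, then item 2, then item 1
```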
We propose a new class of transforms, which we call the {\it Lehmer transform}, motivated by the {\it Lehmer mean function}. The proposed transform decomposes a function of a sample into its constituent statistical moments. Theoretical properties of the proposed transform are presented. This transform could provide a useful alternative for analyzing non-stationary signals such as EEG brain waves.
In this paper, we analyze the Wisconsin Diagnostic Breast Cancer data using machine learning classification techniques, namely the SVM, Bayesian logistic regression (with a variational approximation), and $k$-nearest neighbors. We describe each model and compare their performance through different measures. We conclude that the SVM performs best among these classifiers, while competing closely with Bayesian logistic regression, which ranks second for this dataset.
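A rough sketch of such a comparison (the WDBC dataset ships with scikit-learn; a plain logistic regression stands in for the Bayesian variational version, which scikit-learn does not provide, and exact scores depend on the split):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)  # the WDBC data
Xtr, Xte, ytr, yte = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

models = {
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "LogReg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}
# Held-out accuracy for each classifier
scores = {name: m.fit(Xtr, ytr).score(Xte, yte) for name, m in models.items()}
print(scores)  # all typically above 0.93 on this split
```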
$\textit{Pymc-learn}$ is a Python package providing a variety of state-of-the-art probabilistic models for supervised and unsupervised machine learning. It is inspired by $\textit{scikit-learn}$ and focuses on bringing probabilistic machine learning to non-specialists. It provides a general-purpose, high-level API that mimics $\textit{scikit-learn}$. Emphasis is put on ease of use, productivity, flexibility, performance, documentation, and an API consistent with $\textit{scikit-learn}$. It depends on $\textit{scikit-learn}$ and $\textit{pymc3}$ and is distributed under the new BSD-3 license, encouraging its use in both academia and industry. Source code, binaries, and documentation are available at http://github.com/pymc-learn/pymc-learn.
The standard interpretation of importance-weighted autoencoders is that they maximize a tighter lower bound on the marginal likelihood than the standard evidence lower bound. We give an alternate interpretation of this procedure: that it optimizes the standard variational lower bound, but using a more complex distribution. We formally derive this result, present a tighter lower bound, and visualize the implicit importance-weighted distribution.
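The gap between the two bounds can be checked numerically on a toy Gaussian model (our own example, not the paper's; the proposal $q$ is deliberately mismatched so the bounds differ):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy model: z ~ N(0,1), x | z ~ N(z,1), observed x = 2; proposal q = prior.
x = 2.0

def log_w(z):
    # log importance weight: log p(x|z) + log p(z) - log q(z) = log p(x|z) here
    return -0.5 * np.log(2 * np.pi) - 0.5 * (x - z) ** 2

K, M = 10, 20000
z = rng.normal(size=(M, K))
lw = log_w(z)

elbo = lw.mean()                                     # standard (K = 1) bound
iwae = np.mean(np.log(np.mean(np.exp(lw), axis=1)))  # importance-weighted K = 10 bound
print(elbo, iwae)  # the K = 10 bound is noticeably tighter
                   # (closer to log p(x) = -2.27 for this model)
```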
We consider stationary autoregressive processes with coefficients restricted to an ellipsoid, which includes autoregressive processes with absolutely summable coefficients. We provide consistency results under different norms for the estimation of such processes using constrained and penalized estimators. As an application, we show a weak form of universal consistency. Simulations show that directly including the constraint in the estimation can lead to more robust results.
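A minimal sketch of penalized autoregressive estimation (ridge-penalized least squares on a simulated AR(1); the constraint sets and penalties studied in the paper are more general than this toy choice):

```python
import numpy as np

rng = np.random.default_rng(1)
# Simulate a stationary AR(1): X_t = 0.8 X_{t-1} + e_t
n, phi = 3000, 0.8
e = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + e[t]

# Penalized AR(p) estimation: ridge-regularized least squares on the lagged design
p, lam = 3, 1.0
Y = x[p:]
Z = np.column_stack([x[p - k:n - k] for k in range(1, p + 1)])
coef = np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ Y)
print(coef)  # first coefficient close to 0.8, higher lags near 0
```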
We study the restless bandit associated with an extremely simple scalar Kalman filter model in discrete time. Under certain assumptions, we prove that the problem is indexable in the sense that the Whittle index is a non-decreasing function of the relevant belief state. In spite of the long history of this problem, this appears to be the first such proof. We use results about Schur-convexity and mechanical words, which are particular binary strings intimately related to palindromes.
This technical note extends recent results on the computational complexity of globally minimizing the error of piecewise-affine models to the related problem of minimizing the error of switching linear regression models. In particular, we show that, on the one hand, the problem is NP-hard, but, on the other hand, it admits a polynomial-time algorithm with respect to the number of data points for any fixed data dimension and number of modes.
We show that if $F$ is a convex class of functions that is $L$-subgaussian, the error rate of learning problems generated by independent noise is equivalent to a fixed point determined by `local' covering estimates of the class, rather than by the Gaussian averages. To that end, we establish new sharp upper and lower estimates on the error rate for such problems.
It is shown that bootstrap approximations of support vector machines (SVMs) based on a general convex and smooth loss function and on a general kernel are consistent. This result is useful for approximating the unknown finite-sample distribution of SVMs by the bootstrap approach.
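The bootstrap procedure itself can be sketched as follows (synthetic data and an RBF kernel chosen for illustration; note that scikit-learn's SVC uses the non-smooth hinge loss, so this is only a schematic of the resampling scheme, not an instance covered by the smooth-loss result):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
x0 = X[:1]  # evaluate the SVM decision function at one fixed point

B = 200
vals = np.empty(B)
for b in range(B):
    idx = rng.integers(0, len(X), size=len(X))   # resample with replacement
    clf = SVC(kernel="rbf").fit(X[idx], y[idx])  # refit the SVM on the resample
    vals[b] = clf.decision_function(x0)[0]

# Percentile interval approximating the sampling distribution of the SVM output
lo, hi = np.percentile(vals, [2.5, 97.5])
print(lo, hi)
```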