Results for "math.ST"

Showing 20 of ~1,430,362 results · from DOAJ, arXiv, CrossRef

arXiv Open Access 2021
Optimal Binary Classification Beyond Accuracy

Shashank Singh, Justin Khim

The vast majority of statistical theory on binary classification characterizes performance in terms of accuracy. However, accuracy is known in many cases to poorly reflect the practical consequences of classification error, most famously in imbalanced binary classification, where data are dominated by samples from one of two classes. The first part of this paper derives a novel generalization of the Bayes-optimal classifier from accuracy to any performance metric computed from the confusion matrix. Specifically, this result (a) demonstrates that stochastic classifiers sometimes outperform the best possible deterministic classifier and (b) removes an empirically unverifiable absolute continuity assumption that is poorly understood but pervades existing results. We then demonstrate how to use this generalized Bayes classifier to obtain regret bounds in terms of the error of estimating regression functions under uniform loss. Finally, we use these results to develop some of the first finite-sample statistical guarantees specific to imbalanced binary classification. Specifically, we demonstrate that optimal classification performance depends on properties of class imbalance, such as a novel notion called Uniform Class Imbalance, that have not previously been formalized. We further illustrate these contributions numerically in the case of $k$-nearest neighbor classification.

en math.ST, cs.LG
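The contrast the abstract draws between accuracy and confusion-matrix metrics is easy to see concretely. The following sketch (illustrative only, with a made-up toy dataset; it is not the paper's generalized Bayes classifier) shows accuracy looking good while F1, a metric computed from the confusion matrix, exposes complete failure on the minority class:

```python
# Toy illustration: on imbalanced data, accuracy can look strong while a
# confusion-matrix metric such as F1 reveals poor minority-class performance.

def confusion(y_true, y_pred):
    """Return (tp, fp, fn, tn) treating label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def f1(y_true, y_pred):
    """F1 score computed from the confusion matrix."""
    tp, fp, fn, _ = confusion(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

# 9 negatives, 1 positive; a degenerate classifier that always predicts 0:
y_true = [0] * 9 + [1]
y_pred = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
# accuracy is 0.9, yet F1 for the minority class is 0.0
```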
arXiv Open Access 2019
Weak convergence theory for Poisson sampling designs

Leo Pasquazzi

This work provides some general theorems about unconditional and conditional weak convergence of empirical processes in the case of Poisson sampling designs. The theorems presented in this work are stronger than previously published results. Their proofs are based on the symmetrization technique and on a contraction principle.

en math.ST
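For readers unfamiliar with the design the theorems above concern: under a Poisson sampling design each unit enters the sample independently with its own inclusion probability, and the standard Horvitz-Thompson estimator reweights sampled values by the inverse of those probabilities. A minimal sketch with assumed population values and inclusion probabilities (not part of the paper):

```python
import random

def poisson_sample(pi, rng):
    """Poisson design: unit i is sampled independently with probability pi[i]."""
    return [i for i, p in enumerate(pi) if rng.random() < p]

def horvitz_thompson(y, pi, sample):
    """Unbiased estimate of the population total sum(y) under the design."""
    return sum(y[i] / pi[i] for i in sample)

rng = random.Random(0)
y = [float(i) for i in range(100)]   # assumed population values
pi = [0.3] * 100                     # assumed inclusion probabilities
total = sum(y)                       # true total: 4950.0
# averaging over many replicate samples illustrates unbiasedness
est = sum(horvitz_thompson(y, pi, poisson_sample(pi, rng))
          for _ in range(2000)) / 2000
```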
arXiv Open Access 2015
How to model the covariance structure in a spatial framework: variogram or correlation function?

Giovanni Pistone, Grazia Vicario

The basic Kriging model assumes a Gaussian distribution with stationary mean and stationary variance. In such a setting, the joint distribution of the spatial process is characterized by the common variance and the correlation matrix or, equivalently, by the common variance and the variogram matrix. We discuss in detail the option to actually use the variogram as a parameterization.

en math.ST
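The equivalence the abstract mentions rests on the stationary identity $\gamma(h) = \sigma^2 (1 - \rho(h))$ relating the semivariogram $\gamma$ to the correlation function $\rho$. A quick numerical check, assuming a hypothetical exponential correlation model (the model choice is an illustration, not the paper's):

```python
import math

sigma2 = 2.0                          # common (stationary) variance
rho = lambda h: math.exp(-h / 3.0)    # assumed exponential correlation model

def semivariogram(h):
    # gamma(h) = 0.5 * Var(Z(s+h) - Z(s))
    #          = 0.5 * (C(0) + C(0) - 2*C(h)) with C(h) = sigma2 * rho(h)
    return 0.5 * (2 * sigma2 - 2 * sigma2 * rho(h))

# under stationarity the two parameterizations carry the same information:
for h in (0.0, 1.0, 5.0):
    assert abs(semivariogram(h) - sigma2 * (1 - rho(h))) < 1e-12
```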
arXiv Open Access 2010
Identification and well-posedness in a class of nonparametric problems

Victoria Zinde-Walsh

This is a companion note to Zinde-Walsh (2010), arXiv:1009.4217v1 [math.ST], to clarify and extend results on identification in a number of problems that lead to a system of convolution equations. Examples include identification of the distribution of mismeasured variables, of a nonparametric regression function under Berkson type measurement error, some nonparametric panel data models, etc. The reason that identification in different problems can be considered in one approach is that they lead to the same system of convolution equations; moreover, the solution can be given under more general assumptions than those usually considered, by examining these equations in spaces of generalized functions. An important issue that did not receive sufficient attention is that of well-posedness. This note gives conditions under which well-posedness obtains, provides an example demonstrating that, when well-posedness does not hold, functions that are far apart can give rise to observable functions that are arbitrarily close, and discusses misspecification and estimation from the standpoint of well-posedness.

en math.ST, stat.ME
arXiv Open Access 2007
Information-theoretic limits on sparsity recovery in the high-dimensional and noisy setting

Martin J. Wainwright

The problem of recovering the sparsity pattern of a fixed but unknown vector $\beta^* \in \mathbb{R}^p$ based on a set of $n$ noisy observations arises in a variety of settings, including subset selection in regression, graphical model selection, signal denoising, compressive sensing, and constructive approximation. Of interest are conditions on the model dimension $p$, the sparsity index $s$ (number of non-zero entries in $\beta^*$), and the number of observations $n$ that are necessary and/or sufficient to ensure asymptotically perfect recovery of the sparsity pattern. This paper focuses on the information-theoretic limits of sparsity recovery: in particular, for a noisy linear observation model based on measurement vectors drawn from the standard Gaussian ensemble, we derive both a set of sufficient conditions for asymptotically perfect recovery using the optimal decoder, as well as a set of necessary conditions that any decoder, regardless of its computational complexity, must satisfy for perfect recovery. This analysis of optimal decoding limits complements our previous work (arXiv: math.ST/0605740) on sharp thresholds for sparsity recovery using the Lasso ($\ell_1$-constrained quadratic programming) with Gaussian measurement ensembles.

en math.ST, cs.IT
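The "optimal decoder" studied in the abstract amounts to exhaustive search over candidate supports, which is computationally infeasible at scale; that is precisely why its information-theoretic limits are of interest. The toy sketch below (all dimensions and the noise level are assumed for illustration) decodes by fitting least squares on every size-$s$ support of a Gaussian measurement ensemble and keeping the support with the smallest residual:

```python
import itertools
import random

rng = random.Random(1)
p, s, n, noise = 8, 2, 40, 0.1        # toy dimensions, assumed
true_support = (2, 5)
beta = [0.0] * p
for j in true_support:
    beta[j] = 1.0

# standard Gaussian ensemble and noisy linear observations y = X beta + w
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [sum(X[i][j] * beta[j] for j in range(p)) + rng.gauss(0, noise)
     for i in range(n)]

def rss(support):
    """Residual sum of squares of least squares on a size-2 support,
    solved via the 2x2 normal equations."""
    a, b = support
    saa = sum(X[i][a] ** 2 for i in range(n))
    sbb = sum(X[i][b] ** 2 for i in range(n))
    sab = sum(X[i][a] * X[i][b] for i in range(n))
    say = sum(X[i][a] * y[i] for i in range(n))
    sby = sum(X[i][b] * y[i] for i in range(n))
    det = saa * sbb - sab * sab
    ba = (say * sbb - sab * sby) / det
    bb = (saa * sby - sab * say) / det
    return sum((y[i] - ba * X[i][a] - bb * X[i][b]) ** 2 for i in range(n))

# exhaustive decoding: feasible only because p is tiny
best = min(itertools.combinations(range(p), s), key=rss)
```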
arXiv Open Access 2004
Least Angle Regression

Bradley Efron, Trevor Hastie, Iain Johnstone et al.

The purpose of model selection algorithms such as All Subsets, Forward Selection and Backward Elimination is to choose a linear model on the basis of the same set of data to which the model will be applied. Typically we have available a large collection of possible covariates from which we hope to select a parsimonious set for the efficient prediction of a response variable. Least Angle Regression (LARS), a new model selection algorithm, is a useful and less greedy version of traditional forward selection methods. Three main properties are derived: (1) A simple modification of the LARS algorithm implements the Lasso, an attractive version of ordinary least squares that constrains the sum of the absolute regression coefficients; the LARS modification calculates all possible Lasso estimates for a given problem, using an order of magnitude less computer time than previous methods. (2) A different LARS modification efficiently implements Forward Stagewise linear regression, another promising new model selection method;
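Forward Stagewise regression, the cautious cousin of forward selection that the abstract says LARS implements efficiently, takes many tiny steps in the coordinate most correlated with the current residual. A minimal sketch on assumed noiseless toy data (LARS itself computes the whole path in closed form; this is only the slow stagewise idea):

```python
import random

rng = random.Random(0)
n, p, eps, steps = 50, 3, 0.01, 2000
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
true_coef = [2.0, 0.0, -1.0]                 # assumed toy coefficients
y = [sum(X[i][j] * true_coef[j] for j in range(p)) for i in range(n)]

coef = [0.0] * p
resid = y[:]
for _ in range(steps):
    # covariate most correlated (in inner product) with the residual
    corr = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
    j = max(range(p), key=lambda k: abs(corr[k]))
    step = eps if corr[j] > 0 else -eps      # tiny step toward the fit
    coef[j] += step
    resid = [resid[i] - step * X[i][j] for i in range(n)]
# coef creeps toward the least-squares solution (2, 0, -1)
```

Shrinking the step size eps makes the stagewise path approach the Lasso/LARS path, which is the connection the paper makes precise.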

arXiv Open Access 2005
Breakdown and groups

P. Laurie Davies, Ursula Gather

The concept of breakdown point was introduced by Hampel [Ph.D. dissertation (1968), Univ. California, Berkeley; Ann. Math. Statist. 42 (1971) 1887-1896] and developed further by, among others, Huber [Robust Statistics (1981). Wiley, New York] and Donoho and Huber [In A Festschrift for Erich L. Lehmann (1983) 157-184. Wadsworth, Belmont, CA]. It has proved most successful in the context of location, scale and regression problems. Attempts to extend the concept to other situations have not met with general acceptance. In this paper we argue that this is connected to the fact that in the location, scale and regression problems the translation and affine groups give rise to a definition of equivariance for statistical functionals. Comparisons in terms of breakdown points seem only useful when restricted to equivariant functionals and even here the connection between breakdown and equivariance is a tenuous one.

arXiv Open Access 2005
Addendum to the discussion of "Breakdown and groups"

P. Laurie Davies, Ursula Gather

In his discussion of Davies and Gather [Ann. Statist. 33 (2005) 977--1035] [math.ST/0508497] Tyler pointed out that the theory developed there could not be applied to the case of directional data. He related the breakdown of directional functionals to the problem of definability. In this addendum we provide a concept of breakdown defined in terms of definability and not in terms of bias. If a group of finite order $k$ acts on the sample space we show that the breakdown point can be bounded above by $(k-1)/k$. In the case of directional data there is a group of order $k=2$ giving an upper bound of 1/2.

arXiv Open Access 2006
Discussion of "Equi-energy sampler" by Kou, Zhou and Wong

Yves F. Atchadé, Jun S. Liu

We congratulate Samuel Kou, Qing Zhou and Wing Wong [math.ST/0507080] (referred to subsequently as KZW) for this beautifully written paper, which opens a new direction in Monte Carlo computation. This discussion has two parts. First, we describe a very closely related method, multicanonical sampling (MCS), and report a simulation example that compares the equi-energy (EE) sampler with MCS. Overall, we found the two algorithms to be of comparable efficiency for the simulation problem considered. In the second part, we develop some additional convergence results for the EE sampler.

arXiv Open Access 2006
Classifier Technology and the Illusion of Progress

David J. Hand

A great many tools have been developed for supervised classification, ranging from early methods such as linear discriminant analysis through to modern developments such as neural networks and support vector machines. A large number of comparative studies have been conducted in attempts to establish the relative superiority of these methods. This paper argues that these comparisons often fail to take into account important aspects of real problems, so that the apparent superiority of more sophisticated methods may be something of an illusion. In particular, simple methods typically yield performance almost as good as more sophisticated methods, to the extent that the difference in performance may be swamped by other sources of uncertainty that generally are not considered in the classical supervised classification paradigm.

arXiv Open Access 2005
Equi-energy sampler with applications in statistical inference and statistical mechanics

S. C. Kou, Qing Zhou, Wing Hung Wong

We introduce a new sampling algorithm, the equi-energy sampler, for efficient statistical sampling and estimation. Complementary to the widely used temperature-domain methods, the equi-energy sampler, utilizing the temperature--energy duality, targets the energy directly. The focus on the energy function not only facilitates efficient sampling, but also provides a powerful means for statistical estimation, for example, the calculation of the density of states and microcanonical averages in statistical mechanics. The equi-energy sampler is applied to a variety of problems, including exponential regression in statistics, motif sampling in computational biology and protein folding in biophysics.

en math.ST, astro-ph
arXiv Open Access 2006
Discussion of "Equi-energy sampler" by Kou, Zhou and Wong

Peter Minary, Michael Levitt

Novel sampling algorithms can significantly impact open questions in computational biology, most notably the in silico protein folding problem. By using computational methods, protein folding aims to find the three-dimensional structure of a protein chain given the sequence of its amino acid building blocks. The complexity of the problem strongly depends on the protein representation and its energy function. The more detailed the model, the more complex its corresponding energy function and the more challenge it sets for sampling algorithms. Kou, Zhou and Wong [math.ST/0507080] have introduced a novel sampling method, which could contribute significantly to the field of structural prediction.

arXiv Open Access 2006
Causal Inference Through Potential Outcomes and Principal Stratification: Application to Studies with "Censoring" Due to Death

Donald B. Rubin

Causal inference is best understood using potential outcomes. This use is particularly important in more complex settings, that is, observational studies or randomized experiments with complications such as noncompliance. The topic of this lecture, the issue of estimating the causal effect of a treatment on a primary outcome that is "censored" by death, is another such complication. For example, suppose that we wish to estimate the effect of a new drug on Quality of Life (QOL) in a randomized experiment, where some of the patients die before the time designated for their QOL to be assessed. Another example with the same structure occurs with the evaluation of an educational program designed to increase final test scores, which are not defined for those who drop out of school before taking the test. A further application is to studies of the effect of job-training programs on wages, where wages are only defined for those who are employed. The analysis of examples like these is greatly clarified using potential outcomes to define causal effects, followed by principal stratification on the intermediate outcomes (e.g., survival).

arXiv Open Access 2004
Asymptotics in Quantum Statistics

Richard D. Gill

Observations or measurements taken of a quantum system (a small number of fundamental particles) are inherently random. If the state of the system depends on unknown parameters, then the distribution of the outcome depends on these parameters too, and statistical inference problems result. Often one has a choice of what measurement to take, corresponding to different experimental set-ups or settings of measurement apparatus. This leads to a design problem: which measurement is best for a given statistical problem? This paper gives an introduction to this field in the simplest of settings, that of estimating the state of a spin-half particle given $n$ independent copies of the particle. We show how in some cases asymptotically optimal measurements can be constructed. Other cases present interesting open problems, connected to the fact that for some models, quantum Fisher information is in some sense non-additive. In physical terms, we have non-locality without entanglement.

en math.ST, quant-ph
arXiv Open Access 2006
Support Vector Machines with Applications

Javier M. Moguerza, Alberto Muñoz

Support vector machines (SVMs) appeared in the early nineties as optimal margin classifiers in the context of Vapnik's statistical learning theory. Since then SVMs have been successfully applied to real-world data analysis problems, often providing improved results compared with other techniques. SVMs operate within the framework of regularization theory by minimizing an empirical risk in a well-posed and consistent way. A clear advantage of the support vector approach is that sparse solutions to classification and regression problems are usually obtained: only a few samples are involved in the determination of the classification or regression functions. This fact facilitates the application of SVMs to problems that involve a large amount of data, such as text processing and bioinformatics tasks. This paper is intended as an introduction to SVMs and their applications, emphasizing their key features. In addition, some algorithmic extensions and illustrative real-world applications of SVMs are shown.
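The empirical-risk-minimization view described above can be made concrete with a linear SVM trained by stochastic subgradient descent on the regularized hinge loss. This is only an illustrative sketch on an assumed well-separated toy dataset, not a production solver (real applications would use a dedicated library such as LIBSVM):

```python
import random

rng = random.Random(0)
# toy data, assumed: class +1 around (2, 2), class -1 around (-2, -2)
pos = [([rng.gauss(2, 0.5), rng.gauss(2, 0.5)], 1) for _ in range(50)]
neg = [([rng.gauss(-2, 0.5), rng.gauss(-2, 0.5)], -1) for _ in range(50)]
data = pos + neg

w, b = [0.0, 0.0], 0.0
lam, eta = 0.01, 0.01          # regularization strength and step size
for _ in range(3000):
    x, y = data[rng.randrange(len(data))]
    if y * (w[0] * x[0] + w[1] * x[1] + b) < 1:
        # hinge loss active: step toward the point, shrink by regularizer
        w = [wj + eta * (y * xj - lam * wj) for wj, xj in zip(w, x)]
        b += eta * y
    else:
        # margin satisfied: only the regularizer acts
        w = [wj * (1 - eta * lam) for wj in w]

errors = sum(1 for x, y in data
             if y * (w[0] * x[0] + w[1] * x[1] + b) <= 0)
# on this well-separated sample the trained classifier should make
# few or no training errors
```

The regularizer is what keeps the problem well-posed in the sense the abstract describes, and the hinge loss is why only the samples near the margin (the support vectors) end up determining the classifier.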

Page 3 of 71,519