Peter Matthew Jacobs, Foad Namjoo, Jeff M. Phillips
We revisit extending the Kolmogorov-Smirnov distance between probability distributions to the multidimensional setting and make new arguments about the proper way to approach this generalization. Our proposed formulation maximizes the difference over orthogonal dominating rectangular ranges (d-sided rectangles in R^d), and is an integral probability metric. We also prove that the distance between a distribution and a sample from the distribution converges to 0 as the sample size grows, and bound this rate. Moreover, we show that one can, up to this same approximation error, compute the distance efficiently in 4 or fewer dimensions; specifically the runtime is near-linear in the size of the sample needed for that error. With this, we derive a delta-precision two-sample hypothesis test using this distance. Finally, we show these metric and approximation properties do not hold for other popular variants.
This review paper is intended for the second edition of the Handbook of Markov Chain Monte Carlo. The authors welcome any suggestions for improving it.
We propose a Cholesky factor parameterization of correlation matrices that facilitates a priori restrictions on the correlation matrix. It is a smooth and differentiable transform that allows additional boundary constraints on the correlation values. Our particular motivation is random sampling under positivity constraints on the space of correlation matrices using MCMC methods.
We consider how to use Hamiltonian Monte Carlo to sample from a distribution whose log-density is piecewise quadratic, conditioned on the sample lying on the level set of a piecewise affine, continuous function.
In this comment we discuss relative strengths and weaknesses of simplex and Dirichlet Dempster-Shafer inference as applied to multi-resolution tests of independence.
We present a method for drawing random samples of unit vectors $x$ in $R^p$ with density proportional to $x^TAx$, where $A$ is a symmetric, positive definite matrix. An R function implementing the method is included.
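The abstract does not describe the sampling scheme itself, so the R implementation is not reproduced here. One simple way to sample such a density (a rejection sampler with a uniform-on-the-sphere proposal; an illustrative sketch, not necessarily the paper's method) accepts a uniform draw $x$ with probability $x^TAx/\lambda_{\max}(A)$, which is valid because $0 < x^TAx \le \lambda_{\max}(A)$ for positive definite $A$:

```python
import numpy as np

def sample_quadratic_density(A, n, rng=None):
    """Draw n unit vectors in R^p with density proportional to x^T A x,
    for symmetric positive definite A, by rejection sampling against the
    uniform distribution on the unit sphere. (Illustrative only; not the
    paper's algorithm, whose details are not given in the abstract.)"""
    rng = np.random.default_rng() if rng is None else rng
    lam_max = np.linalg.eigvalsh(A)[-1]  # largest eigenvalue bounds x^T A x
    p = A.shape[0]
    samples = []
    while len(samples) < n:
        z = rng.standard_normal(p)
        x = z / np.linalg.norm(z)        # uniform on the unit sphere
        if rng.random() < (x @ A @ x) / lam_max:
            samples.append(x)
    return np.array(samples)
```

The acceptance rate is the mean of $x^TAx/\lambda_{\max}(A)$ under the uniform proposal, so the sampler slows down when $A$ is badly conditioned; that is presumably what a purpose-built method would avoid.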
We introduce a method for incorporating second-order derivatives of the log likelihood into HMC algorithms; it requires the Hessian to be evaluated only at the start and end of each trajectory, not at every leapfrog step.
We demonstrate a novel approach for the random sampling of Latin squares of order~$n$ via probabilistic divide-and-conquer. The algorithm divides the entries of the table modulo powers of $2$, and samples a corresponding binary contingency table at each level. The sampling distribution is based on the Boltzmann sampling heuristic, along with probabilistic divide-and-conquer.
Statistical moments are widely used in descriptive statistics. Therefore, efficient and numerically stable implementations are important in practice. Pebay [1] derives online update formulas for central moments of arbitrary order. We present a simpler version that is also easier to implement.
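The simplified formulas themselves are not given in the abstract. To illustrate the kind of online (single-pass) update involved, here is the standard Welford-style recursion for the mean and second central moment; a minimal sketch of the idea only, not the arbitrary-order formulas of the paper:

```python
class OnlineMoments:
    """Numerically stable streaming estimates of the mean and second
    central moment (Welford's algorithm). Illustrative sketch; the paper
    extends this style of update to central moments of arbitrary order."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean          # deviation from the old mean
        self.mean += delta / self.n    # incremental mean update
        self.m2 += delta * (x - self.mean)  # uses old AND new mean

    def variance(self):
        return self.m2 / self.n if self.n else float("nan")
```

Unlike the naive two-pass formula (or accumulating sums of raw powers), this recursion avoids catastrophic cancellation when the variance is small relative to the mean.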
We use the minorization-maximization principle (Lange, Hunter and Yang 2000) to establish the monotonicity of a multiplicative algorithm for computing Bayesian D-optimal designs. This proves a conjecture of Dette, Pepelyshev and Zhigljavsky (2008).