Results for "cs.DS"

Showing 20 of ~109220 results · from DOAJ, arXiv, CrossRef

arXiv Open Access 2024
Diffusion Map Autoencoder

Julio Candanedo

Diffusion-Map-AutoEncoder (DMAE) pairs a diffusion-map encoder (using the Nyström method) with linear or RBF Gaussian-Process latent mean decoders, yielding closed-form inductive mappings and strong reconstructions.

en cs.DS, cs.LG
arXiv Open Access 2022
A Stack-Free Traversal Algorithm for Left-Balanced k-d Trees

Ingo Wald

We present an algorithm that allows for find-closest-point and kNN-style traversals of left-balanced k-d trees, without the need for either recursion or software-managed stacks; instead using only current and last previously traversed node to compute which node to traverse next.

en cs.DS
arXiv Open Access 2014
Suffix Arrays for Spaced-SNP Databases

Travis Gagie

Single-nucleotide polymorphisms (SNPs) account for most variations between human genomes. We show how, if the genomes in a database differ only by a reasonable number of SNPs and the substrings between those SNPs are unique, then we can store a fast compressed suffix array for that database.

en cs.DS
DOAJ Open Access 2012
Toward the asymptotic count of bi-modular hidden patterns under probabilistic dynamical sources: a case study

Loïck Lhote, Manuel E. Lladser

Consider a countable alphabet $\mathcal{A}$. A multi-modular hidden pattern is an $r$-tuple $(w_1,\ldots , w_r)$, where each $w_i$ is a word over $\mathcal{A}$ called a module. The hidden pattern is said to occur in a text $t$ when the latter admits the decomposition $t = v_0 w_1v_1 \cdots v_{r-1}w_r v_r$, for arbitrary words $v_i$ over $\mathcal{A}$. Flajolet, Szpankowski and Vallée (2006) proved via the method of moments that the number of matches (or occurrences) with a multi-modular hidden pattern in a random text $X_1\cdots X_n$ is asymptotically Normal, when $(X_n)_{n\geq1}$ are independent and identically distributed $\mathcal{A}$-valued random variables. Bourdon and Vallée (2002) had conjectured however that asymptotic Normality holds more generally when $(X_n)_{n\geq1}$ is produced by an expansive dynamical source. Whereas memoryless and Markovian sequences are instances of dynamical sources with finite memory length, general dynamical sources may be non-Markovian, i.e., convey an infinite memory length. The technical difficulty in counting hidden patterns under sources with memory is the context-free nature of these patterns as well as the lack of logarithm- and exponential-type transformations to rewrite the product of non-commuting transfer operators. In this paper, we address a case study in which we have successfully overcome the aforementioned difficulties and which may illuminate how to address more general cases via auto-correlation operators. Our main result shows that the number of matches with a bi-modular pattern $(w_1, w_2)$ normalized by the number of matches with the pattern $w_1$, where $w_1$ and $w_2$ are different alphabet characters, is indeed asymptotically Normal when $(X_n)_{n\geq1}$ is produced by a holomorphic probabilistic dynamical source.

Mathematics
DOAJ Open Access 2012
The Euclid Algorithm is totally gaussian

Brigitte Vallée

We consider Euclid’s gcd algorithm for two integers $(p, q)$ with $1 \leq p \leq q \leq N$, with the uniform distribution on input pairs. We study the distribution of the total cost of execution of the algorithm for an additive cost function $d$ on the set of possible digits, asymptotically for $N \to \infty$. For any additive cost of moderate growth $d$, Baladi and Vallée obtained a central limit theorem, and, in the case when the cost $d$ is lattice, a local limit theorem. In both cases, the optimal speed was attained. When the cost is non-lattice, the problem was later considered by Baladi and Hachemi, who obtained a local limit theorem under an intertwined diophantine condition which involves the cost $d$ together with the “canonical” cost $c$ of the underlying dynamical system. The speed depends on the irrationality exponent that intervenes in the diophantine condition. We show here how to replace this diophantine condition by another diophantine condition, much more natural, which already intervenes in simpler problems in the same vein, and only involves the cost $d$. This “replacement” is made possible by using the additivity of cost $d$, together with a strong property satisfied by the Euclidean Dynamical System, which states that the cost $c$ is both “strongly” non-additive and diophantine in a precise sense. We thus obtain a local limit theorem, whose speed is related to the irrationality exponent which intervenes in the new diophantine condition. We mainly use the previous proof of Baladi and Hachemi, and “just” explain how their diophantine condition may be replaced by our condition. Our result also provides a precise comparison between the rational trajectories of the Euclid dynamical system and the real trajectories.

Mathematics
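
As a rough illustration of the quantity studied in the Euclid entry above, the sketch below runs Euclid's algorithm on a pair $(p, q)$ with $1 \leq p \leq q$, collects the quotients (the "digits"), and sums a cost $d$ over them. The function names `euclid_digits` and `total_cost` and the two sample costs are assumptions made for this example; they do not come from the paper.

```python
def euclid_digits(p, q):
    """Return the successive quotients of Euclid's algorithm on (p, q), 1 <= p <= q."""
    digits = []
    while p != 0:
        m, r = divmod(q, p)   # q = m * p + r
        digits.append(m)
        q, p = p, r           # continue with the pair (r, p)
    return digits


def total_cost(p, q, d):
    """Additive total cost: the sum of d(m) over all digits m of the execution."""
    return sum(d(m) for m in euclid_digits(p, q))


# Two illustrative digit costs (hypothetical choices):
unit_cost = lambda m: 1                # total cost = number of division steps
bit_cost = lambda m: m.bit_length()    # total cost = total binary length of the digits

if __name__ == "__main__":
    print(euclid_digits(5, 13))
    print(total_cost(5, 13, unit_cost), total_cost(5, 13, bit_cost))
```

Averaging `total_cost` over all pairs with $q \leq N$ would give an empirical view of the distribution whose limit the abstract describes.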
DOAJ Open Access 2012
On the Number of 2-Protected Nodes in Tries and Suffix Trees

Jeffrey Gaither, Yushi Homma, Mark Sellke et al.

We use probabilistic and combinatorial tools on strings to discover the average number of 2-protected nodes in tries and in suffix trees. Our analysis covers both the uniform and non-uniform cases. For instance, in a uniform trie with $n$ leaves, the number of 2-protected nodes is approximately 0.803$n$, plus small first-order fluctuations. The 2-protected nodes are an emerging way to distinguish the interior of a tree from the fringe.

Mathematics
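
To make the notion in the entry above concrete, here is a minimal sketch that builds a trie from a small set of words and counts its 2-protected nodes, taking a 2-protected node to be one that is neither a leaf nor the parent of a leaf. The helper names `build_trie` and `count_2_protected` are hypothetical, and the sketch assumes the input is a non-empty set of distinct words in which no word is a prefix of another.

```python
LEAF = "leaf"


def build_trie(words, depth=0):
    """Build a trie as nested dicts; a subtree holding a single word becomes a leaf."""
    if len(words) == 1:
        return LEAF
    groups = {}
    for w in words:
        groups.setdefault(w[depth], []).append(w)   # split on the symbol at this depth
    return {symbol: build_trie(group, depth + 1) for symbol, group in groups.items()}


def count_2_protected(node):
    """Count internal nodes that have no leaf among their children."""
    if node == LEAF:
        return 0
    children = list(node.values())
    own = 0 if any(child == LEAF for child in children) else 1
    return own + sum(count_2_protected(child) for child in children)


if __name__ == "__main__":
    trie = build_trie(["000", "001", "010", "011"])
    print(count_2_protected(trie))
```

In this small example the root and its single child are both at distance at least two from every leaf, so they are the only nodes counted.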
arXiv Open Access 2012
The McDougal Cave and Counting issues

Sayandeep Khan

In this paper I investigate the problem of tagging elements of a set, and the elements of those elements, uniquely, when they admit an order, and two boundary elements are tagged. A heuristic sorting algorithm is also investigated. (Updated grammar and spellings.)

en cs.DS, math.GM
arXiv Open Access 2009
On evaluation of permanents

Andreas Björklund, Thore Husfeldt, Petteri Kaski et al.

We study the time and space complexity of matrix permanents over rings and semirings.

en cs.DS, cs.DM
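
For readers unfamiliar with the permanent, the sketch below evaluates it with Ryser's inclusion-exclusion formula, a classical method that is not necessarily the one used in the paper listed above. It takes exponential time in the matrix size, and because it relies on subtraction it applies over rings but not over general semirings. The function name `permanent_ryser` is an assumption of this example.

```python
from itertools import combinations


def permanent_ryser(a):
    """Permanent of a square matrix (list of rows) via Ryser's formula.

    perm(A) = sum over non-empty column subsets S of
              (-1)^(n - |S|) * prod over rows i of (sum of a[i][j] for j in S)
    """
    n = len(a)
    if n == 0:
        return 1  # convention: the permanent of the empty matrix is 1
    total = 0
    for k in range(1, n + 1):
        sign = (-1) ** (n - k)
        for cols in combinations(range(n), k):
            prod = 1
            for row in a:
                prod *= sum(row[j] for j in cols)
            total += sign * prod
    return total


if __name__ == "__main__":
    print(permanent_ryser([[1, 2], [3, 4]]))  # 1*4 + 2*3
```

The direct definition, a sum over all permutations with no signs, needs only addition and multiplication and therefore also makes sense over semirings, at the price of a factorial number of terms.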
DOAJ Open Access 2007
Why almost all satisfiable $k$-CNF formulas are easy

Amin Coja-Oghlan, Michael Krivelevich, Dan Vilenchik

Finding a satisfying assignment for a $k$-CNF formula $(k \geq 3)$, assuming one exists, is a notoriously hard problem. In this work we consider the uniform distribution over satisfiable $k$-CNF formulas with a linear number of clauses (clause-variable ratio greater than some constant). We rigorously analyze the structure of the space of satisfying assignments of a random formula in that distribution, showing that basically all satisfying assignments are clustered in one cluster, and agree on all but a small, though linear, number of variables. This observation enables us to describe a polynomial time algorithm that finds $\textit{whp}$ a satisfying assignment for such formulas, thus asserting that most satisfiable $k$-CNF formulas are easy (whenever the clause-variable ratio is greater than some constant). This should be contrasted with the setting of very sparse $k$-CNF formulas (which are satisfiable $\textit{whp}$), where experimental results show some regime of clause density to be difficult for many SAT heuristics. One explanation for this phenomenon, backed up by partially non-rigorous analytical tools from statistical physics, is the complicated clustering of the solution space in that regime, unlike the more "regular" structure that denser formulas possess. Thus in some sense, our result rigorously supports this explanation.

Mathematics

Page 5 of 5461