Jaume Baixeries
This is an account of the characterization of database dependencies with Formal Concept Analysis.
Christian Himpe
Metadata management for distributed data sources is a long-standing but ever-growing problem. To counter this challenge in a research-data and library-oriented setting, this work constructs a data architecture derived from the data-lake: the metadata-lake. A proof-of-concept implementation of the proposed metadata aggregator is also presented and evaluated.
Wang-Chiew Tan
This paper presents an opinion on the potential of using large language models to query both unstructured and structured data. It also outlines some research challenges in building question-answering systems for both types of data.
Andrey Goglachev
This article presents the PaSTiLa algorithm for automated labeling of large time series on a cluster with GPUs. The method automatically selects snippet length values based on a newly proposed criterion and enables high-performance pattern search. Experiments showed high accuracy of pattern search and an advantage of the method over comparable approaches.
Philip A. Bernstein
This article is a summary of eight of Jim Gray's transaction papers. It was written at the invitation of Pat Helland to be a chapter of a forthcoming book in the ACM Turing Award winners' series, "Curiosity, Clarity, and Caring: How Jim Gray's Passion for Learning, Teaching, and People Changed Computing."
Břetislav Šopík, Tomáš Strenáčik
We define an edit distance for hierarchically structured data that is compatible with the hierarchical multi-instance learning paradigm. An example of such data is a dataset represented in JSON format, where inner Array objects are interpreted as unordered bags of elements. We prove analytical properties of the defined distance.
Jonas Nüßlein
In this paper, we propose an R-step approximation to solve frequent itemset mining on quantum hardware using approaches such as quantum annealing or QAOA. The idea is to search for the set of items where the minimal 2-item frequency is maximal. This can be represented as a maximum clique problem.
Giacomo Bergami
This technical report provides a lightweight introduction and some generic use case scenarios motivating the definition of a database supporting uncertainties in both queries and data. The report provides only the logical framework; its implementation will be provided in the final paper.
Jef Wijsen
The helping Lemma 7 in [Maslowski and Wijsen, ICDT, 2014] is false. The lemma is used in (and only in) the proof of Theorem 3 of that same paper. In this corrigendum, we provide a new proof for the latter theorem.
Manish Sharma
Marius Rafailescu
Recently, a new simple and fault-tolerant mechanism was designed for solving the commit consensus problem. It is based on replicated validation of messages sent between transaction participants and a special dispatcher validator manager node. This paper presents correctness and safety proofs and a performance analysis of this algorithm.
Massimo Carro
In this document, I present the main notions of NoSQL databases and compare four selected products (Riak, MongoDB, Cassandra, Neo4J) according to their capabilities with respect to consistency, availability, and partition tolerance, as well as performance. I also propose a few criteria for selecting the right tool for the right situation.
David Toman, Grant Weddell
We resolve an open problem concerning finite logical implication for path functional dependencies (PFDs).
Kostyantyn Demchuk, Douglas J. Leith
A novel fast algorithm for finding quasi identifiers in large datasets is presented. Performance measurements on a broad range of datasets demonstrate substantial reductions in run-time relative to the state of the art and the scalability of the algorithm to realistically-sized datasets up to several million records.
Tomasz Gogacz, Jerzy Marcinkowski
The Bounded Derivation Depth property (BDD) and Finite Controllability (FC) are two properties of sets of datalog rules and tuple generating dependencies (known as Datalog +/- programs) that have recently attracted some attention. We conjecture that the first of these properties implies the second, and support this conjecture with some evidence, proving, among other results, that it holds true for all theories over a binary signature.
Nguyen Duc Thuan
Covering-based rough set theory is an extension of classical rough set theory. The main purpose of this paper is to study covering rough sets from a topological point of view. The relationships among upper approximations based on topological spaces are explored.
Albrecht Schmidt
This is a proposal of an algebra which aims at distributed array processing. The focus lies on re-arranging and distributing array data, which may be multi-dimensional. The context of the work is scientific processing; thus, the core science operations are assumed to be taken care of in external libraries or languages. A main design driver is the desire to carry over some of the strategies of the relational algebra into the array domain.
Sabu M. Thampi
Knowledge has lately been recognized as one of the most important assets of organizations. Managing knowledge has become imperative for the success of a company. This paper presents an overview of Knowledge Management and various aspects of secure knowledge management. A case study of knowledge management activities at Tata Steel is also discussed.