Results for "cs.DB" - JURNALIN

arXiv Open Access 2024

Database Dependencies and Formal Concept Analysis

Jaume Baixeries

This is an account of the characterization of database dependencies with Formal Concept Analysis.

en cs.DB, cs.LO

arXiv Open Access 2024

DatAasee -- A Metadata-Lake as Metadata Catalog for a Virtual Data-Lake

Christian Himpe

Metadata management for distributed data sources is a long-standing but ever-growing problem. To counter this challenge in a research-data and library-oriented setting, this work constructs a data architecture derived from the data-lake: the metadata-lake. A proof-of-concept implementation of the proposed metadata aggregator is also presented and evaluated.

en cs.DB, cs.DL

arXiv Open Access 2023

Unstructured and structured data: Can we have the best of both worlds with large language models?

Wang-Chiew Tan

This paper presents an opinion on the potential of using large language models to query both unstructured and structured data. It also outlines some research challenges in building question-answering systems for both types of data.

en cs.DB, cs.CL

arXiv Open Access 2023

A parallel algorithm for automated labeling of large time series

Andrey Goglachev

This article presents the PaSTiLa algorithm for automated labeling of large time series on a cluster with GPUs. The method automatically selects snippet length values based on a newly proposed criterion and supports high-performance pattern search. Experiments showed high pattern-search accuracy and an advantage over comparable methods.

en cs.DB

arXiv Open Access 2023

Eight Transaction Papers by Jim Gray

Philip A. Bernstein

This article is a summary of eight of Jim Gray's transaction papers. It was written at the invitation of Pat Helland to be a chapter of a forthcoming book in the ACM Turing Award winners' series, "Curiosity, Clarity, and Caring: How Jim Gray's Passion for Learning, Teaching, and People Changed Computing."

en cs.DB, cs.DC

arXiv Open Access 2022

Tree edit distance for hierarchical data compatible with HMIL paradigm

Břetislav Šopík, Tomáš Strenáčik

We define an edit distance for hierarchically structured data that is compatible with the hierarchical multi-instance learning paradigm. An example of such data is a dataset represented in JSON format, where inner Array objects are interpreted as unordered bags of elements. We prove correct analytical properties of the defined distance.

en cs.DB, cs.LG

arXiv Open Access 2019

Frequent Itemset Mining using QUBO

Jonas Nüßlein

In this paper, we propose an R-step approximation to solve frequent itemset mining on quantum hardware like quantum annealing or QAOA. The idea is to search for the set of items where the minimal 2-item frequency is maximal. This can be represented as a maximum clique problem.

en cs.DB

arXiv Open Access 2019

A framework supporting imprecise queries and data

Giacomo Bergami

This technical report provides a lightweight introduction and some generic use-case scenarios motivating the definition of a database that supports uncertainty in both queries and data. It provides only the logical framework; the implementation will be provided in the final paper.

en cs.DB

arXiv Open Access 2019

Corrigendum to "Counting Database Repairs that Satisfy Conjunctive Queries with Self-Joins"

Jef Wijsen

The helping Lemma 7 in [Maslowski and Wijsen, ICDT, 2014] is false. The lemma is used in (and only in) the proof of Theorem 3 of that same paper. In this corrigendum, we provide a new proof for the latter theorem.

en cs.DB

CrossRef Open Access 2018

Azure Cosmos DB Overview

Manish Sharma

1 citation, en

CrossRef Open Access 2018

Azure Cosmos DB Geo-Replication

Manish Sharma

1 citation, en

CrossRef Open Access 2018

Migrating to Azure Cosmos DB–MongoDB API

Manish Sharma

1 citation, en

arXiv Open Access 2017

Fault Tolerant Consensus Agreement Algorithm

Marius Rafailescu

Recently, a new, simple fault-tolerant mechanism was designed for solving the commit consensus problem. It is based on replicated validation of messages sent between transaction participants and a special dispatcher validator manager node. This paper presents correctness and safety proofs and a performance analysis of this algorithm.

en cs.DB, cs.DC

arXiv Open Access 2014

NoSQL Databases

Massimo Carro

In this document, I present the main notions of NoSQL databases and compare four selected products (Riak, MongoDB, Cassandra, Neo4J) according to their capabilities with respect to consistency, availability, and partition tolerance, as well as performance. I also propose a few criteria for selecting the right tool for the right situation.

en cs.DB

arXiv Open Access 2014

Undecidability of Finite Model Reasoning in DLFD

David Toman, Grant Weddell

We resolve an open problem concerning finite logical implication for path functional dependencies (PFDs).

en cs.DB, cs.LO

arXiv Open Access 2014

A Fast Minimal Infrequent Itemset Mining Algorithm

Kostyantyn Demchuk, Douglas J. Leith

A novel fast algorithm for finding quasi-identifiers in large datasets is presented. Performance measurements on a broad range of datasets demonstrate substantial reductions in run-time relative to the state of the art and the scalability of the algorithm to realistically sized datasets of up to several million records.

en cs.DB

arXiv Open Access 2014

On the BDD/FC Conjecture

Tomasz Gogacz, Jerzy Marcinkowski

The Bounded Derivation Depth property (BDD) and Finite Controllability (FC) are two properties of sets of datalog rules and tuple generating dependencies (known as Datalog +/- programs), which have recently attracted some attention. We conjecture that the first of these properties implies the second, and support this conjecture with some evidence, proving, among other results, that it holds true for all theories over a binary signature.

en cs.DB

arXiv Open Access 2012

Covering Rough Sets From a Topological Point of View

Nguyen Duc Thuan

Covering-based rough set theory is an extension of classical rough set theory. The main purpose of this paper is to study covering rough sets from a topological point of view. The relationships among upper approximations based on topological spaces are explored.

en cs.DB

arXiv Open Access 2008

An Array Algebra

Albrecht Schmidt

This is a proposal of an algebra which aims at distributed array processing. The focus lies on re-arranging and distributing array data, which may be multi-dimensional. The context of the work is scientific processing; thus, the core science operations are assumed to be taken care of in external libraries or languages. A main design driver is the desire to carry over some of the strategies of the relational algebra into the array domain.

en cs.DB

arXiv Open Access 2008

An Introduction to Knowledge Management

Sabu M. Thampi

Knowledge has lately been recognized as one of the most important assets of organizations. Managing knowledge has become imperative for the success of a company. This paper presents an overview of Knowledge Management and various aspects of secure knowledge management. A case study of knowledge management activities at Tata Steel is also discussed.

en cs.DB, cs.CR
