Results for "cs.DB"

Showing 20 of ~93,769 results · from DOAJ, CrossRef, arXiv

arXiv Open Access 2024
Limitations of Validity Intervals in Data Freshness Management

Kyoung-Don Kang

In data-intensive real-time applications, such as smart transportation and manufacturing, ensuring data freshness is essential, as using obsolete data can lead to negative outcomes. Validity intervals serve as the standard means to specify freshness requirements in real-time databases. In this paper, we bring attention to significant drawbacks of validity intervals that have largely been unnoticed and introduce a new definition of data freshness, while discussing future research directions to address these limitations.

en cs.DB
arXiv Open Access 2024
LLM+KG@VLDB'24 Workshop Summary

Arijit Khan, Tianxing Wu, Xi Chen

The unification of large language models (LLMs) and knowledge graphs (KGs) has emerged as a hot topic. At the LLM+KG'24 workshop, held in conjunction with VLDB 2024 in Guangzhou, China, one of the key themes explored was important data management challenges and opportunities due to the effective interaction between LLMs and KGs. This report outlines the major directions and approaches presented by various speakers during the LLM+KG'24 workshop.

en cs.DB, cs.AI
arXiv Open Access 2023
Algorithm for Invalidation of Cached Results of Queries to a Single Table

Jakub Łopuszański

One of the most popular setups for the back-end of a high-performance website consists of a relational database and a cache that stores the results of performed queries. Several application frameworks support caching of queries made to the database, but few of them handle cache invalidation correctly, resorting to simpler solutions such as short TTL values or flushing the whole cache after any write to the database. In this paper, a simple, correct, and efficient solution, tested in a real-world application, is presented; it allows for an infinite TTL and very fine-grained cache invalidation. The algorithm is proven correct in a concurrent environment, both theoretically and in practice.

en cs.DB
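A common pattern in this space, shown here as an illustrative sketch rather than the paper's exact algorithm, is generational cache keys: cached query results carry an effectively infinite TTL, and any write to a table invalidates all of its cached results at once by bumping a single version counter. All names below (`cache_key`, `cached_query`, `write`) are hypothetical.

```python
# Illustrative sketch of generational cache keys: a write bumps the
# table's version counter, making every old cache key unreachable
# without flushing the cache.
import hashlib

cache = {}                    # stands in for memcached/Redis
table_version = {"users": 0}  # one counter per table

def cache_key(table, sql):
    digest = hashlib.sha256(sql.encode()).hexdigest()
    return f"{table}:v{table_version[table]}:{digest}"

def cached_query(table, sql, run_query):
    key = cache_key(table, sql)
    if key not in cache:
        cache[key] = run_query(sql)   # miss: hit the database
    return cache[key]

def write(table, do_write):
    do_write()
    table_version[table] += 1         # old keys become unreachable

# Usage: the first call runs the query, the second is served from the
# cache, and a write forces the next call to recompute.
calls = []
def fake_run(sql):
    calls.append(sql)
    return ["alice", "bob"]

cached_query("users", "SELECT name FROM users", fake_run)
cached_query("users", "SELECT name FROM users", fake_run)
assert len(calls) == 1                # second call was a cache hit
write("users", lambda: None)
cached_query("users", "SELECT name FROM users", fake_run)
assert len(calls) == 2                # invalidated by the write
```

Stale entries are never deleted explicitly; they simply become unreachable and age out of the cache's eviction policy.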
arXiv Open Access 2022
Giving the Right Answer: a Brief Overview on How to Extend Ranking and Skyline Queries

Sergio Cuzzucoli

To retrieve the best results from a database we use Top-K queries and Skyline queries, but some problems arise. The former rely too heavily on user preferences, which are difficult to quantify and may skew the fetching of the data, while the latter tend to output too much data. In this paper, we explore three different branches of research that seek to overcome these limitations: Flexible/Restricted Skylines, Skyline Ordering/Ranking, and Regret Minimization. We analyze how they work and compare them, to guide the reader in choosing the approach that best fits their use case.

en cs.DB
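The two baseline query types the survey contrasts can be sketched on toy data. This is a minimal illustration (the hotel data and weights are invented): a skyline query needs no user preferences but may return many points, while a top-k query returns exactly k points but depends on a user-supplied scoring function.

```python
# Toy hotel data as (price, distance) pairs, both to be minimized.
def dominates(a, b):
    """a dominates b: no worse in every dimension, strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(points):
    """Points not dominated by any other point (no preferences needed)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def top_k(points, weights, k):
    """Best k points under a linear scoring function (lower is better)."""
    return sorted(points, key=lambda p: sum(w * x for w, x in zip(weights, p)))[:k]

hotels = [(50, 8), (60, 2), (80, 1), (90, 9)]
print(skyline(hotels))            # (90, 9) drops out: dominated by (50, 8)
print(top_k(hotels, (1, 10), 2))  # these weights value distance 10x over price
```

Choosing the weights `(1, 10)` is exactly the kind of hard-to-quantify preference the abstract mentions; the three surveyed research branches aim at the middle ground between the two extremes.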
arXiv Open Access 2019
Adaptive filter ordering in Spark

Nikodimos Nikolaidis, Anastasios Gounaris

This report describes a technical methodology for making the Apache Spark execution engine adaptive. It presents engineering solutions that adaptively reorder predicates in data streams with evolving statistics. The system extension developed is available as an open-source prototype. Indicative experimental results show its overhead and its sensitivity to tuning parameters.

en cs.DB, cs.DC
arXiv Open Access 2018
Measuring and Computing Database Inconsistency via Repairs

Leopoldo Bertossi

We propose a generic numerical measure of the inconsistency of a database with respect to a set of integrity constraints. It is based on an abstract repair semantics. A particular inconsistency measure associated with cardinality-repairs is investigated, and we show that it can be computed via answer-set programs. Keywords: Integrity constraints in databases, inconsistent databases, database repairs, inconsistency measure.

en cs.DB, cs.AI
arXiv Open Access 2018
Repair-Based Degrees of Database Inconsistency: Computation and Complexity

Leopoldo Bertossi

We propose a generic numerical measure of the inconsistency of a database with respect to a set of integrity constraints. It is based on an abstract repair semantics. In particular, an inconsistency measure associated with cardinality-repairs is investigated in detail. More specifically, it is shown that it can be computed via answer-set programs, but that its computation can sometimes be intractable in data complexity. However, polynomial-time deterministic and randomized approximations are exhibited. The behavior of this measure under small updates is analyzed, obtaining fixed-parameter tractability results. Furthermore, alternative inconsistency measures are proposed and discussed.

en cs.DB, cs.AI
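For the special case of a single key constraint, the cardinality-repair idea has a particularly simple shape, sketched below under that assumption (the general measure in the papers above covers arbitrary constraint sets and is not this simple): a minimum repair keeps one tuple per key value, and the measure is the fraction of tuples the repair deletes. The function name and data are illustrative.

```python
# Illustrative sketch for one key constraint: a cardinality repair keeps
# one tuple per key value, so the normalized inconsistency measure is
# (minimum number of deletions) / (total number of tuples).
from collections import Counter

def inconsistency_measure(tuples, key):
    groups = Counter(t[key] for t in tuples)
    deleted = sum(count - 1 for count in groups.values())  # min deletions
    return deleted / len(tuples)

# Two employees share id 1, so one of three tuples must go: measure = 1/3.
emps = [{"id": 1, "name": "a"}, {"id": 1, "name": "b"}, {"id": 2, "name": "c"}]
print(inconsistency_measure(emps, "id"))
```

A consistent database gets measure 0; the measure grows with the number of key violations, matching the intuition that more repair work means more inconsistency.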
arXiv Open Access 2016
DB-Nets: on The Marriage of Colored Petri Nets and Relational Databases

Marco Montali, Andrey Rivkin

The integrated management of business processes and master data is increasingly considered a fundamental problem, by both academia and industry. In this position paper, we focus on the foundations of the problem, arguing that contemporary approaches struggle to find a suitable equilibrium between data- and process-related aspects. We then propose db-nets, a new formal model that balances these two pillars through the marriage of colored Petri nets and relational databases. We invite the research community to build on this model, discussing its potential in modeling, formal verification, and simulation.

en cs.DB
arXiv Open Access 2015
Bag-of-Features Image Indexing and Classification in Microsoft SQL Server Relational Database

Marcin Korytkowski, Rafal Scherer, Pawel Staszewski et al.

This paper presents a novel relational database architecture aimed at visual object classification and retrieval. The framework is based on the bag-of-features image representation model combined with Support Vector Machine classification and is integrated into a Microsoft SQL Server database.

en cs.DB, cs.CV
arXiv Open Access 2015
Efficient Iterative Processing in the SciDB Parallel Array Engine

Emad Soroush, Magdalena Balazinska, Simon Krughoff et al.

Many scientific data-intensive applications perform iterative computations on array data. There exist multiple engines specialized for array processing. These engines efficiently support various types of operations, but none includes native support for iterative processing. In this paper, we develop a model for iterative array computations and a series of optimizations. We evaluate the benefits of an optimized, native support for iterative array processing on the SciDB engine and real workloads from the astronomy domain.

en cs.DB
arXiv Open Access 2013
Extending the ER Model to the Relational Model with a Novel Transformation Algorithm: Transforming Relationship Types among Subtypes

Dhammika Pieris

A novel approach for creating ER conceptual models, together with an algorithm for transforming them to the relational model, has been developed by modifying and extending existing methods. A part of the new algorithm has previously been presented; this paper presents the rest of it. One objective of this paper is to serve as a supporting document for ongoing empirical evaluations of the new approach, conducted using the cognitive engagement method with respondents drawn from different segments of the field.

en cs.DB
arXiv Open Access 2013
Geographica: A Benchmark for Geospatial RDF Stores

George Garbis, Kostis Kyzirakos, Manolis Koubarakis

Geospatial extensions of SPARQL like GeoSPARQL and stSPARQL have recently been defined and corresponding geospatial RDF stores have been implemented. However, there is no widely used benchmark for evaluating geospatial RDF stores which takes into account recent advances to the state of the art in this area. In this paper, we develop a benchmark, called Geographica, which uses both real-world and synthetic data to test the offered functionality and the performance of some prominent geospatial RDF stores.

en cs.DB
arXiv Open Access 2013
Performance analysis of modified algorithm for finding multilevel association rules

Arpna Shrivastava, R. C. Jain

Multilevel association rules explore the concept hierarchy at multiple levels, which provides more specific information, whereas the Apriori algorithm explores only single-level association rules. Many implementations of the Apriori algorithm are available. A fast Apriori implementation is modified to develop a new algorithm for finding multilevel association rules. In this study, the performance of this new algorithm is analyzed in terms of running time in seconds.

arXiv Open Access 2011
Modification of GTD from Flat File Format to OLAP for Data Mining

Karanjit Singh, Shuchita Bhasin

This document is part of original research work by the authors in a bid to explore new fields for applying Data Mining Techniques. The sample data is part of a large data set from University of Maryland (UMD) and outlines how more meaningful patterns can be discovered by preprocessing the data in the form of OLAP cubes.

en cs.DB
arXiv Open Access 2010
Removal of Communication Gap

Zeeshan Ahmed, Sudhir Ganti

This research concerns an online forum designed and developed to improve the communication process between alumni and new, current, and upcoming students. In this research paper we present, in detail, the targeted problems, the designed architecture, the technologies used in development, and the final end product.

en cs.DB
arXiv Open Access 2010
The WebContent XML Store

Benjamin Nguyen, Spyros Zoupanos

In this article, we describe the XML storage system used in the WebContent project. We begin by advocating the use of an XML database to store WebContent documents, and we present two different ways of storing and querying these documents: the use of a centralized XML database and the use of a P2P XML database.

en cs.DB
arXiv Open Access 2010
Proposing a New Method for Query Processing Adaption in DataBase

Mohammad-Reza Feizi-Derakhshi, Hasan Asil, Amir Asil

This paper proposes a multi-agent system that combines two technologies, query processing optimization and agents, and offers personalized queries and adaptation to changing requirements. The system uses a new algorithm, based on modeling users' long-term requirements, together with a genetic algorithm (GA) to gather users' query data. Experimental results show greater adaptation capability for the presented algorithm in comparison with classic algorithms.

en cs.DB
arXiv Open Access 2010
The universality of iterated hashing over variable-length strings

Daniel Lemire

Iterated hash functions process strings recursively, one character at a time. At each iteration, they compute a new hash value from the preceding hash value and the next character. We prove that iterated hashing can be pairwise independent, but never 3-wise independent. We show that it can be almost universal over strings much longer than the number of hash values; we bound the maximal string length given the collision probability.

en cs.DB, cs.DS
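The recursion the abstract describes, computing each new hash value from the preceding value and the next character, can be sketched in a few lines. This is a generic illustration of the iterated-hashing scheme; the constants and function name are invented, not taken from the paper, and a single fixed multiplier like this gives none of the independence guarantees the paper analyzes.

```python
# Minimal sketch of an iterated (recursive) string hash: at each step,
# the new value is a function of the old value and the next character.
def iterated_hash(s, seed=0, mult=31, mod=2**32):
    h = seed
    for ch in s:
        h = (h * mult + ord(ch)) % mod  # combine old value with next char
    return h

print(iterated_hash("database"))
```

The paper's results concern families of such functions (e.g. with random per-family parameters): iterated families can be pairwise independent but never 3-wise independent.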

Page 8 of 4689