Results for "cs.IR" - JURNALIN

arXiv Open Access 2026

Yuly Billig

We consider a matching problem for time series with values in an arbitrary metric space, with the stretching penalty given by the Hellinger kernel. To optimize this matching, we introduce the Elastic Time Warping algorithm, which has cubic computational complexity.

en cs.IR, cs.DS

arXiv Open Access 2025

White Hat Search Engine Optimization using Large Language Models

Niv Bardas, Tommy Mordo, Oren Kurland et al.

We present novel white-hat search engine optimization techniques based on genAI and demonstrate their empirical merits.

en cs.IR, cs.GT

arXiv Open Access 2025

A model and package for German ColBERT

Thuong Dang, Qiqi Chen

In this work, we introduce a German version of ColBERT, a late interaction multi-dense vector retrieval method, with a focus on RAG applications. We also present the main features of our package for ColBERT models, supporting both retrieval and fine-tuning workflows.

en cs.IR, cs.AI
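
For readers unfamiliar with late interaction, the scoring rule commonly associated with ColBERT (often called MaxSim) can be sketched as follows. This is a minimal illustration rather than the package described in the entry above: the function names and the toy two-dimensional token vectors are hypothetical, and a real model would produce the token embeddings with a trained encoder.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs: List[Vector], doc_vecs: List[Vector]) -> float:
    """Late-interaction score: for each query token, take the similarity of
    its best-matching document token, then sum over query tokens."""
    if not query_vecs or not doc_vecs:
        return 0.0
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Hypothetical toy embeddings (one vector per token).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
doc_b = [[0.5, 0.5]]

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
```

Here `doc_a` scores higher than `doc_b` because each query token finds an exact match among its token vectors.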

arXiv Open Access 2024

An Annotated Glossary for Data Commons, Data Meshes, and Other Data Platforms

Robert L. Grossman

Cloud-based data commons, data meshes, data hubs, and other data platforms are important ways to manage, analyze and share data to accelerate research and to support reproducible research. This is an annotated glossary of some of the more common terms used in articles and discussions about these platforms.

en cs.IR

arXiv Open Access 2024

On the Local Ultrametricity of Finite Metric Data

Patrick Erik Bradley

New local ultrametricity measures for finite metric data are proposed through the viewpoint that their Vietoris-Rips corners are samples from p-adic Mumford curves endowed with a Radon measure coming from a regular differential 1-form. This is experimentally applied to the iris dataset.

en cs.IR

arXiv Open Access 2023

Thistle: A Vector Database in Rust

Brad Windsor, Kevin Choi

We present Thistle, a fully functional vector database. Thistle is an entry into the domain of latent knowledge use in answering search queries, an ongoing research topic at both start-ups and search engine companies. We implement Thistle with several well-known algorithms, and benchmark results on the MS MARCO dataset. Results help clarify the latent knowledge domain as well as the growing Rust ML ecosystem.

en cs.IR, cs.AI

arXiv Open Access 2021

Optimized Recommender Systems with Deep Reinforcement Learning

Lucas Farris

Recommender Systems have been the cornerstone of online retailers. Traditionally they were based on rules, relevance scores, ranking algorithms, and supervised learning algorithms, but it is now feasible to use reinforcement learning algorithms to generate meaningful recommendations. This work investigates and develops means to set up a reproducible testbed and evaluate different state-of-the-art algorithms in a realistic environment. It comprises a proposal, literature review, methodology, results, and comments.

en cs.IR, cs.LG

arXiv Open Access 2019

Information Retrieval and Its Sister Disciplines

Grace Hui Yang

This article presents a summary graph showing the relationships between Information Retrieval (IR) and other related disciplines. The figure shows the key differences between them and the conditions under which one would transition into another.

en cs.IR, cs.AI

arXiv Open Access 2019

A Personalized Subreddit Recommendation Engine

Abhishek K Das, Nikhil Bhat, Sukanto Guha et al.

This paper aims to improve upon the generic recommendations that Reddit provides for its users. We propose a novel personalized recommender system that learns from both the presence and the content of user-subreddit interactions, using implicit and explicit signals to provide robust recommendations.

en cs.IR

arXiv Open Access 2019

A Progressive Visual Analytics Tool for Incremental Experimental Evaluation

Fabio Giachelle, Gianmaria Silvello

This paper presents a visual tool, AVIATOR, that integrates the progressive visual analytics paradigm into the IR evaluation process. The tool serves to speed up and facilitate the performance assessment of retrieval models, enabling result analysis through visual facilities. AVIATOR goes one step beyond the common "compute wait visualize" analytics paradigm, introducing a continuous evaluation mechanism that minimizes human and computational resource consumption.

en cs.IR

arXiv Open Access 2018

Deep Neural Network for Learning to Rank Query-Text Pairs

Baoyang Song

This paper considers the problem of document ranking in information retrieval systems by Learning to Rank. We propose ConvRankNet, which combines a Siamese Convolutional Neural Network encoder with the RankNet ranking model and can be trained in an end-to-end fashion. We prove a general result justifying the linear test-time complexity of the pairwise Learning to Rank approach. Experiments on the OHSUMED dataset show that ConvRankNet systematically outperforms existing feature-based models.

en cs.IR
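
As background for the entry above, the pairwise objective usually associated with RankNet can be sketched in a few lines. This is an illustrative sketch only: it does not reproduce ConvRankNet or its encoder, the scores are assumed to come from some scoring model, and the function names are hypothetical. It also shows why ranking at test time needs only one score per document followed by a sort.

```python
import math
from typing import Dict, List


def ranknet_pair_loss(score_preferred: float, score_other: float) -> float:
    """Pairwise logistic loss -log(sigmoid(s_i - s_j)), where document i
    should rank above document j. Written in a numerically stable form."""
    diff = score_preferred - score_other
    # softplus(-diff) = max(-diff, 0) + log(1 + exp(-|diff|))
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def rank_documents(scores: Dict[str, float]) -> List[str]:
    """At test time each document is scored once, then sorted by score;
    no pairwise comparisons between documents are required."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical scores for three documents.
ranking = rank_documents({"d1": 0.2, "d2": 1.5, "d3": -0.7})
```

The loss is small when the preferred document already has the higher score and grows roughly linearly when the order is badly wrong.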

arXiv Open Access 2017

User Profile Based Research Paper Recommendation

Harshita Sahijwani, Sourish Dasgupta

We design a recommender system for research papers based on topic modeling. The user's feedback on the results is used to make the results more relevant the next time they issue a query. The user's needs are understood by observing how the themes that the user prefers change over time.

en cs.IR

CrossRef Open Access 2016

IR-improved DGLAP-CS QCD parton showers in Pythia8

B.F.L. Ward

1 citation en

arXiv Open Access 2016

Matrix Factorization Method for Decentralized Recommender Systems

Wenjie Zheng

A decentralized recommender system does not rely on a central service provider, and the users can keep ownership of their ratings. This article brings the theoretically well-studied matrix factorization method into the decentralized recommender system, where the formerly prevalent algorithms are heuristic and hence lack theoretical guarantees. Our preliminary simulation results show that this method is promising.

en cs.IR, cs.DC

arXiv Open Access 2014

Marginalizing over the PageRank Damping Factor

Christian Bauckhage

In this note, we show how to marginalize over the damping parameter of the PageRank equation so as to obtain a parameter-free version known as TotalRank. Our discussion is meant as a reference and intended to provide a guided tour towards an interesting result that has applications in information retrieval and classification.

en cs.IR
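
To make the idea of marginalizing over the damping factor concrete, the sketch below approximates it numerically by averaging PageRank vectors over a grid of damping values. It assumes a uniform weighting of the damping factor on [0, 1], which is an assumption made here for illustration; the note itself derives a closed-form result that this code does not reproduce. The function names and the three-node graph are hypothetical.

```python
from typing import List


def pagerank(adj: List[List[int]], damping: float, iters: int = 200) -> List[float]:
    """Power iteration for PageRank on a small adjacency matrix.
    adj[i][j] > 0 means there is a link from node i to node j."""
    n = len(adj)
    out_deg = [sum(row) for row in adj]
    rank = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - damping) / n] * n
        for i in range(n):
            if out_deg[i] == 0:
                # Dangling node: spread its mass uniformly.
                for j in range(n):
                    new[j] += damping * rank[i] / n
            else:
                for j in range(n):
                    if adj[i][j]:
                        new[j] += damping * rank[i] * adj[i][j] / out_deg[i]
        rank = new
    return rank


def averaged_pagerank(adj: List[List[int]], steps: int = 20) -> List[float]:
    """Midpoint-rule average of PageRank over damping values in (0, 1)."""
    n = len(adj)
    total = [0.0] * n
    for k in range(steps):
        damping = (k + 0.5) / steps
        for j, value in enumerate(pagerank(adj, damping)):
            total[j] += value / steps
    return total


# Hypothetical graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
graph = [[0, 1, 1], [0, 0, 1], [1, 0, 0]]
scores = averaged_pagerank(graph)
```

The resulting vector still sums to one and needs no damping parameter from the caller, at the cost of running PageRank once per grid point.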

arXiv Open Access 2013

Mobile Recommender Systems Methods: An Overview

Djallel Bouneffouf

The information that mobile devices can access is now vast, and the user is faced with a dilemma: there is an unlimited pool of information available to him, but he is unable to find the exact information he is looking for. This is why current research aims to design Recommender Systems (RS) able to continually send information that matches the user's interests in order to reduce his navigation time. In this paper, we review the different approaches to recommendation.

en cs.IR

arXiv Open Access 2013

Towards User Profile Modelling in Recommender System

Djallel Bouneffouf

The notion of a profile appeared in the 1970s, mainly due to the need to create custom applications that could be adapted to the user. In this paper, we treat the different aspects of the user's profile, defining the profile, its features, and its indicators of interest, and then we describe the different approaches to modelling and acquiring the user's interests.

en cs.IR

arXiv Open Access 2013

A toy model of information retrieval system based on quantum probability

Roman Zapatrin

Recent numerical results show that non-Bayesian knowledge revision may be helpful in search engine training and optimization. In order to demonstrate how basic assumptions about the physical nature (and hence the observed statistics) of retrieved documents can affect the performance of search engines, we suggest an idealized toy model with a minimal number of parameters.

en cs.IR

arXiv Open Access 2012

Simple Search Engine Model: Adaptive Properties

Mahyuddin K. M. Nasution

In this paper we study the relationship between a query and a search engine by exploring the adaptive properties of a simple search engine. We used set theory and utilized words and terms to define the singleton event space in a search engine model, and then established the inclusion relation between one singleton and another.

en cs.IR

arXiv Open Access 2011

Information Retrieval of Jumbled Words

Venkata Ravinder Paruchuri

It is known that humans can easily read words where the letters have been jumbled in a certain way. This paper examines this problem by associating a distance measure with the jumbling process. Modifications to text were generated according to the Damerau-Levenshtein distance, and we checked whether users were able to read them. Graphical representations of the results are provided.

en cs.IR
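
For reference, the distance named in the entry above can be computed with a short dynamic program. The sketch below implements the restricted variant (often called optimal string alignment), in which each substring may be edited at most once; it is an illustration and not necessarily the exact variant or procedure used in the paper. The function name is hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance:
    insertions, deletions, substitutions, and adjacent transpositions."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "um" <-> "mu".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


# Swapping two adjacent letters counts as a single edit.
one_swap = osa_distance("jumbled", "jmubled")
```

Counting a swap of neighbouring letters as one edit, rather than two substitutions, is what makes this family of distances a natural fit for jumbled words.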
