Results for "cs.IR" - JURNALIN

arXiv Open Access 2024

PyTerrier-GenRank: The PyTerrier Plugin for Reranking with Large Language Models

Kaustubh D. Dhole

Using LLMs as rerankers requires experimenting with various hyperparameters, such as prompt formats, model choice, and reformulation strategies. We introduce PyTerrier-GenRank, a PyTerrier plugin to facilitate seamless reranking experiments with LLMs, supporting popular ranking strategies like pointwise and listwise prompting. We validate our plugin through HuggingFace and OpenAI hosted endpoints.

en cs.IR, cs.AI
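
The two ranking strategies named in the abstract above can be contrasted with a minimal sketch. This is not the PyTerrier-GenRank API: every function name here is hypothetical, `overlap_score` is a stand-in for an LLM relevance score, and the `[2] > [1] > [3]` answer format is an assumed convention for listwise prompts.

```python
import re

def pointwise_rerank(query, docs, score_fn):
    """Pointwise: score each (query, doc) pair independently, then sort."""
    return sorted(docs, key=lambda doc: score_fn(query, doc), reverse=True)

def listwise_prompt(query, docs):
    """Listwise: show all candidates in one prompt and ask for an ordering."""
    lines = [f"Rank the passages by relevance to the query: {query}"]
    lines += [f"[{i}] {doc}" for i, doc in enumerate(docs, start=1)]
    lines.append("Answer with identifiers only, e.g. [2] > [1] > [3].")
    return "\n".join(lines)

def parse_listwise(response, docs):
    """Turn a response such as '[2] > [3]' into a reordered document list."""
    seen, order = set(), []
    for match in re.findall(r"\[(\d+)\]", response):
        idx = int(match) - 1
        if 0 <= idx < len(docs) and idx not in seen:
            seen.add(idx)
            order.append(idx)
    # Documents the model did not mention keep their original relative order.
    order += [i for i in range(len(docs)) if i not in seen]
    return [docs[i] for i in order]

def overlap_score(query, doc):
    """Hypothetical stand-in for an LLM score: count of shared terms."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

docs = ["cats sleep a lot", "llm reranking with prompts", "reranking passages"]
pointwise = pointwise_rerank("llm reranking", docs, overlap_score)
listwise = parse_listwise("[2] > [3]", docs)
```

In practice `score_fn` and the listwise response would come from a model call; the parsing step matters because a listwise answer may omit or repeat identifiers.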

arXiv Open Access 2024

Mitigating Position Bias with Regularization for Recommender Systems

Hao Wang

Fairness has been a popular research topic in recent years. A closely related topic is bias and debiasing. Among the different types of bias, position bias is one of the most widely encountered. Position bias means that items at the top of a recommendation list are more likely to be clicked than items at the bottom of the same list. To mitigate this problem, we propose using a regularization technique to reduce the bias effect. In the experiment section, we show that our method is superior to other modern algorithms.

en cs.IR

arXiv Open Access 2023

Measuring Americanization: A Global Quantitative Study of Interest in American Topics on Wikipedia

Piotr Konieczny, Włodzimierz Lewoniewski

We conducted a global comparative analysis of the coverage of American topics in different language versions of Wikipedia, using over 90 million Wikidata items and 40 million Wikipedia articles in 58 languages. Our study aimed to investigate whether Americanization is more or less dominant in different regions and cultures and to determine whether interest in American topics is universal.

en cs.IR, stat.AP

arXiv Open Access 2023

MMRec: Simplifying Multimodal Recommendation

Xin Zhou

This paper presents an open-source toolbox, MMRec, for multimodal recommendation. MMRec simplifies and canonicalizes the process of implementing and comparing multimodal recommendation models. The objective of MMRec is to provide a unified and configurable arena that minimizes the effort of implementing and testing multimodal recommendation models. It supports multimodal models, ranging from traditional matrix factorization to modern graph-based algorithms, that can fuse information from multiple modalities simultaneously. Our documentation, examples, and source code are available at https://github.com/enoche/MMRec.

en cs.IR, cs.MM

arXiv Open Access 2022

Taking snapshots from a stream

Dominik Bojko, Jacek Cichoń

This work is devoted to a certain class of probabilistic snapshots for elements of an observed data stream. We show how one can control their probabilistic properties and present some potential applications. Our solution can be used to store information from the observed history with limited memory. It can be used for web server applications and ad hoc networks and, for example, for automatically taking snapshots online from a video stream of unknown size.

en cs.IR, math.PR
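
As a point of reference for keeping a bounded-memory sample of a stream of unknown length, the classical reservoir sampling scheme (Algorithm R) is sketched below. It is a well-known baseline, not the class of snapshots studied in the paper above; the function name and parameters are illustrative.

```python
import random

def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown size."""
    rng = rng or random.Random()
    reservoir = []
    for index, item in enumerate(stream):
        if index < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item number index+1 replaces a stored item with probability k/(index+1).
            slot = rng.randint(0, index)  # both bounds inclusive
            if slot < k:
                reservoir[slot] = item
    return reservoir

# Seeded generator so the run is repeatable; the stream is consumed once.
snapshots = reservoir_sample(iter(range(10_000)), k=5, rng=random.Random(0))
```

Memory use is fixed at k items regardless of how long the stream runs, which is the property the abstract emphasizes.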

arXiv Open Access 2022

IITD-DBAI: Multi-Stage Retrieval with Pseudo-Relevance Feedback and Query Reformulation

Shivani Choudhary

Resolving contextual dependency is one of the most challenging tasks in conversational systems. Our submission to CAsT-2021 aimed to preserve the key terms and the context in all subsequent turns and to use classical information retrieval methods. The goal was to pull as many relevant documents as possible from the corpus. We participated in the automatic track and submitted two runs to CAsT-2021. Our submission achieved a mean NDCG@3 better than that of the median model.

en cs.IR, cs.AI
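
For readers unfamiliar with the NDCG@3 metric mentioned above, a minimal sketch follows. It assumes the common exponential gain (2^rel - 1) and computes the ideal ordering from the labels of the ranked list itself; official evaluation tools may differ on both points.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k graded relevance labels."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=3):
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Relevance labels listed in ranked order, best-ranked document first.
perfect = ndcg_at_k([3, 2, 1, 0], k=3)   # already in ideal order
shuffled = ndcg_at_k([0, 2, 3, 1], k=3)  # a non-relevant document ranked first
```

A mean NDCG@3 is then the average of this value over all evaluated queries or turns.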

CrossRef Open Access 2021

Characterization of new primary air kerma standards for dosimetry in Co-60, Cs-137 and Ir-192 gamma ray sources

S. Pojtinger, L. Büermann

en

arXiv Open Access 2021

ELSKE: Efficient Large-Scale Keyphrase Extraction

Johannes Knittel, Steffen Koch, Thomas Ertl

Keyphrase extraction methods can provide insights into large collections of documents such as social media posts. Existing methods, however, are less suited for the real-time analysis of streaming data, because they are computationally too expensive or require restrictive constraints regarding the structure of keyphrases. We propose an efficient approach to extract keyphrases from large document collections and show that the method also performs competitively on individual documents.

en cs.IR

arXiv Open Access 2021

Multi-Objective Recommendations: A Tutorial

Yong Zheng, David Wang

Recommender systems (RecSys) have been well developed to assist user decision making. Traditional RecSys usually optimize a single objective (e.g., rating prediction errors or ranking quality) in the model. Demand for multi-objective optimization in RecSys has emerged recently, especially in the area of multi-stakeholder and multi-task recommender systems. This article provides an overview of multi-objective recommendations, followed by discussions with case studies. The document serves as supplementary material for our tutorial on multi-objective recommendations at ACM SIGKDD 2021.

en cs.IR

arXiv Open Access 2020

Grouping headlines

Ciro Javier Diaz Penedo, Lucas Leonardo Silveira Costa

In this work we deal with the problem of grouping headlines of the newspaper ABC (Australian Broadcasting Corporation) using unsupervised machine learning techniques. We present and discuss the results on the clusters found.

en cs.IR, cs.LG

arXiv Open Access 2019

Sudden Death: A New Way to Compare Recommendation Diversification

Derek Bridge, Mesut Kaya, Pablo Castells

This paper describes problems with the current way we compare the diversity of different recommendation lists in offline experiments. We illustrate the problems with a case study. We propose the Sudden Death score as a new and better way of making these comparisons.

en cs.IR, cs.LG

arXiv Open Access 2019

Open Data Chatbot

Sophia Keyner, Vadim Savenkov, Svitlana Vakulenko

Recently, chatbots have received increased attention from industry and diverse research communities as a dialogue-based interface providing advanced human-computer interaction. At the same time, Open Data continues to be an important trend and a potential enabler for government transparency and citizen participation. This paper shows how these two paradigms can be combined to help non-expert users find and discover open government datasets through dialogue.

en cs.IR

arXiv Open Access 2018

Regret vs. Bandwidth Trade-off for Recommendation Systems

Linqi Song, Christina Fragouli, Devavrat Shah

We consider recommendation systems that need to operate under wireless bandwidth constraints, measured as number of broadcast transmissions, and demonstrate a (tight for some instances) tradeoff between regret and bandwidth for two scenarios: the case of multi-armed bandit with context, and the case where there is a latent structure in the message space that we can exploit to reduce the learning phase.

en cs.IR, cs.LG

arXiv Open Access 2018

Evaluation of Information Retrieval Systems Using Structural Equation Modelling

Massimo Melucci

The interpretation of the experimental data collected by testing systems across input datasets and model parameters is of strategic importance for system design and implementation. In particular, finding relationships between variables and detecting the latent variables affecting retrieval performance can provide designers, engineers and experimenters with useful if not necessary information about how a system is performing. This paper discusses the use of Structural Equation Modelling (SEM) in providing an in-depth explanation of evaluation results and an explanation of failures and successes of a system; in particular, we focus on the case of Information Retrieval.

en cs.IR

arXiv Open Access 2015

On Projection Based Operators in Lp space for Exact Similarity Search

Andreas Wichert, Catarina Moreira

We investigate exact indexing for high dimensional Lp norms based on the 1-Lipschitz property and projection operators. The orthogonal projection that satisfies the 1-Lipschitz property for the Lp norm is described. The adaptive projection defined by the first principal component is introduced.

en cs.IR

arXiv Open Access 2014

Fast Spammer Detection Using Structural Rank

Seungyeon Kim, Haesun Park, Guy Lebanon

Comments on products or news articles are growing rapidly and have become a medium for measuring the quality of products or services. Consequently, spammers have emerged in this area to bias comments in their favor. In this paper, we propose an efficient spammer detection method using the structural rank of author-specific term-document matrices. The use of structural rank was found to be effective and far faster than similar methods.

en cs.IR, cs.CR

CrossRef Open Access 2013

Field-emission microscopy of the surface of an Ir-C-Cs point emitter

D. P. Bernatskii, V. G. Pavlov

4 citations en

arXiv Open Access 2012

Design, implementation and experiment of a YeSQL Web Crawler

Pierre Joulin, Romain Deveaud, Eric SanJuan-Ibekwe et al.

We describe a novel, "focusable", scalable, distributed web crawler based on GNU/Linux and PostgreSQL that we designed to be easily extendible and which we have released under a GNU public licence. We also report a first use case: an analysis of Twitter streams about the 2012 French presidential elections and the URLs they contain.

en cs.IR

arXiv Open Access 2012

Information Retrieval Model: A Social Network Extraction Perspective

Mahyuddin K. M. Nasution, Shahrul Azman Noah

Future information retrieval, especially in connection with the internet, will incorporate content descriptions generated with social network extraction technologies and will preferably use probability theory for assigning semantics. Although interest in social network extraction is increasing, little of this work has had a significant impact on information retrieval. Therefore, this paper proposes a model of information retrieval based on social network extraction.

en cs.IR

arXiv Open Access 2012

Statistical reliability and path diversity based PageRank algorithm improvements

Dohy Hong

In this paper we present new ideas for improving the original PageRank algorithm. The first idea is to introduce an evaluation of the statistical reliability of the ranking score of each node based on the local graph property, and the second is to introduce the notion of path diversity. Path diversity can be exploited to dynamically modify the increment value of each node in the random surfer model or to dynamically adapt the damping factor. We illustrate the impact of such modifications through examples and simple simulations.

en cs.IR, cs.DM
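
For context, the baseline that the abstract above sets out to improve can be sketched as a plain power iteration over the random surfer model with a fixed damping factor. This shows only the original PageRank; it does not implement the reliability or path-diversity modifications, and the uniform handling of dangling nodes is one common convention assumed here.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration for PageRank on a dict mapping node -> list of out-links."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly (assumed convention).
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n
                    for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            for target in targets:
                # Each node splits its damped rank equally among its out-links.
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
scores = pagerank(graph)
```

Here `damping` is the constant that the paper proposes to adapt dynamically; in this sketch it stays fixed for every node and iteration.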
