Results for "cs.IR"

Showing 20 of ~218177 results · from CrossRef, arXiv, DOAJ

arXiv Open Access 2023
Do you MIND? Reflections on the MIND dataset for research on diversity in news recommendations

Sanne Vrijenhoek

The MIND dataset is, at the time of writing, the most extensive dataset available for the research and development of news recommender systems. This work analyzes the suitability of the dataset for research on diverse news recommendations. On the one hand, we analyze the effect that the different steps in the recommendation pipeline have on the distribution of article categories; on the other hand, we check whether the supplied data would be sufficient for more sophisticated diversity analysis. We conclude that while MIND is a great step forward, there is still a lot of room for improvement.

en cs.IR
arXiv Open Access 2022
Patent Search Using Triplet Networks Based Fine-Tuned SciBERT

Utku Umur Acikalin, Mucahid Kutlu

In this paper, we propose a novel method for the prior-art search task. We fine-tune the SciBERT transformer model using a Triplet Network approach, allowing us to represent each patent with a fixed-size vector. This also enables us to conduct efficient vector similarity computations to rank patents at query time. In our experiments, we show that our proposed method outperforms baseline methods.
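
The idea of ranking patents by similarity of fixed-size vectors, with a triplet objective pulling relevant pairs together, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the distance function, the margin, or how vectors are produced, so the cosine distance, the `margin=0.5` default, and the hand-written vectors are all assumptions, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge-style triplet loss on cosine distance (assumed; margin is illustrative).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`.
    """
    d_pos = 1.0 - cosine_similarity(anchor, positive)
    d_neg = 1.0 - cosine_similarity(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def rank_by_similarity(query_vec, patent_vecs):
    """Return (patent_id, score) pairs sorted from most to least similar."""
    scored = [(pid, cosine_similarity(query_vec, vec))
              for pid, vec in patent_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy 2-dimensional vectors standing in for encoder outputs.
    patents = {"A": [0.0, 1.0], "B": [1.0, 0.1]}
    print(rank_by_similarity([1.0, 0.0], patents))
```

Because every patent is reduced to one vector ahead of time, query-time work is limited to encoding the query and computing similarities, which is what makes the ranking step cheap.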

en cs.IR
arXiv Open Access 2022
Unsupervised Search Algorithm Configuration using Query Performance Prediction

Haggai Roitman

Search engine configuration can be quite difficult for inexpert developers. Instead, an auto-configuration approach can be used to speed up development time. Yet, such an automatic process usually requires relevance labels to train a supervised model. In this work, we suggest a simple solution based on query performance prediction that requires no relevance labels but only a sample of queries in a given domain. Using two example use cases, we demonstrate the merits of our solution.

en cs.IR, cs.CL
arXiv Open Access 2022
Matching Theory-based Recommender Systems in Online Dating

Yoji Tomita, Riku Togashi, Daisuke Moriwaki

Online dating platforms provide people with the opportunity to find a partner. Recommender systems in online dating platforms suggest users on one side to users on the other side. We discuss the potential interactions between reciprocal recommender systems (RRSs) and matching theory. We present our ongoing project to deploy a matching theory-based recommender system (MTRS) in a real-world online dating platform.

arXiv Open Access 2021
Customising Ranking Models for Enterprise Search on Bilingual Click-Through Dataset

Gizem Gezici

In this work, we describe the process of establishing an end-to-end system for enterprise search on a bilingual click-through dataset. The first part of the paper covers the high-level workflow of the system. In the second part, we describe in detail the ranking models used to improve the search results in the vertical search of technical documents in the enterprise domain. Throughout the paper, we explain how we combine methods from the IR literature. Finally, in the last part of the paper, we report our results for different ranking algorithms using $NDCG@k$, where k is the cut-off value.
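
For readers unfamiliar with the metric, $NDCG@k$ can be computed as in the sketch below. The abstract does not say which gain function the paper uses, so this assumes the linear-gain form, where the discounted cumulative gain is the sum of `rel_i / log2(i + 1)` over the top k ranks, and it normalizes against the ideal ordering of the same judged list. The function names are hypothetical.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k results (linear gain).

    `relevances` holds graded relevance labels in ranked order; the result at
    1-based rank i is discounted by log2(i + 1).
    """
    return sum(rel / math.log2(position + 2)
               for position, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG@k divided by the DCG@k of the ideal (descending) ordering.

    Returns 0.0 when no result in the list is relevant, to avoid dividing by zero.
    """
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0
    return dcg_at_k(relevances, k) / ideal_dcg


if __name__ == "__main__":
    # Relevance labels of the results in the order a ranker returned them.
    print(ndcg_at_k([3, 2, 1], k=3))  # already ideally ordered
    print(ndcg_at_k([0, 1], k=2))     # relevant result ranked second
```

Other variants use an exponential gain, `2**rel - 1`, in place of `rel`; the two agree for binary labels but differ for graded ones.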

en cs.IR
arXiv Open Access 2020
Automatic Discourse Segmentation: Review and Perspectives

Iria da Cunha, Juan-Manuel Torres-Moreno

Multilingual discourse parsing is a very prominent research topic. The first stage of discourse parsing is discourse segmentation. The study reported in this article reviews two discourse segmenters available online (for English and Portuguese). We evaluate the possibility of developing similar discourse segmenters for Spanish, French and African languages.

en cs.IR, cs.CL
arXiv Open Access 2019
An Updated Duet Model for Passage Re-ranking

Bhaskar Mitra, Nick Craswell

We propose several small modifications to Duet, a deep neural ranking model, and evaluate the updated model on the MS MARCO passage ranking task. We report significant improvements from the proposed changes based on an ablation study.

en cs.IR, cs.CL
arXiv Open Access 2019
An InfoVis Tool for Interactive Component-Based Evaluation

Giacomo Rocco, Gianmaria Silvello

In this paper, we present an InfoVis tool based on Sankey diagrams for the exploration of large sets of combinations of IR components, known as the Grid of Points (GoP). The goal of this tool is to ease the comprehension of the behavior of single IR components within fully functioning off-the-shelf IR systems without resorting to complex statistical tools.

en cs.IR
arXiv Open Access 2017
Patterns of Multistakeholder Recommendation

Robin Burke, Himan Abdollahpouri

Recommender systems are personalized information systems. However, in many settings, the end-user of the recommendations is not the only party whose needs must be represented in recommendation generation. Incorporating this insight gives rise to the notion of multistakeholder recommendation, in which the interests of multiple parties are represented in recommendation algorithms and evaluation. In this paper, we identify patterns of stakeholder utility that characterize different multistakeholder recommendation applications, and provide a taxonomy of the different possible systems, only some of which have currently been implemented.

en cs.IR
arXiv Open Access 2017
A survey of Community Question Answering

Barun Patra

With the advent of numerous community forums, tasks associated with the same have gained importance in the recent past. With the influx of new questions every day on these forums, the issues of identifying methods to find answers to said questions, or even trying to detect duplicate questions, are of practical importance and are challenging in their own right. This paper aims at surveying some of the aforementioned issues, and methods proposed for tackling the same.

en cs.IR
arXiv Open Access 2015
A novel design of hidden web crawler using ontology

Manvi, Komal Kumar Bhatia, Ashutosh Dixit

The Deep Web is content hidden behind HTML forms. Since it represents a large portion of the structured, unstructured and dynamic data on the Web, accessing Deep Web content has been a long-standing challenge for the database community. This paper describes a crawler for accessing the Deep Web using ontologies. Performance evaluation of the proposed work showed that this new approach has promising results.

arXiv Open Access 2015
Metadata Embeddings for User and Item Cold-start Recommendations

Maciej Kula

I present a hybrid matrix factorisation model representing users and items as linear combinations of their content features' latent factors. The model outperforms both collaborative and content-based models in cold-start or sparse interaction data scenarios (using both user and item metadata), and performs at least as well as a pure collaborative matrix factorisation model where interaction data is abundant. Additionally, feature embeddings produced by the model encode semantic information in a way reminiscent of word embedding approaches, making them useful for a range of related tasks such as tag recommendations.
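
The representation described above, in which a user or item is a combination of the latent factors of its content features, can be sketched as follows. This is an illustration under stated assumptions, not the paper's model: it uses an unweighted sum of feature vectors, omits bias terms and training entirely, and the feature names, factor values, and function names are made up for the example.

```python
def embed(features, factors):
    """Sum the latent vectors of the given features (unweighted sum, assumed)."""
    dim = len(next(iter(factors.values())))
    vec = [0.0] * dim
    for feature in features:
        for i, value in enumerate(factors[feature]):
            vec[i] += value
    return vec


def score(user_features, item_features, user_factors, item_factors):
    """Dot product of the summed user and item feature embeddings."""
    u = embed(user_features, user_factors)
    v = embed(item_features, item_factors)
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    # Hand-set 2-dimensional factors; a real model would learn these.
    user_factors = {"country:uk": [1.0, 0.0], "age:30s": [0.0, 1.0]}
    item_factors = {"tag:python": [0.5, 0.5], "tag:ml": [1.0, -1.0]}

    user = ["country:uk", "age:30s"]
    # An item with no interaction history can still be scored from its tags.
    print(score(user, ["tag:python"], user_factors, item_factors))
    print(score(user, ["tag:python", "tag:ml"], user_factors, item_factors))
```

Since scores depend only on feature vectors, a new user or item with metadata but no interactions still receives a usable representation, which is the cold-start property the abstract highlights.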

en cs.IR
arXiv Open Access 2015
Efficient filtering of adult content using textual information

Thomas Largillier, Guillaume Peyronnet, Sylvain Peyronnet

Nowadays, adult content represents a non-negligible proportion of Web content. It is of the utmost importance to protect children from this content. Search engines, as an entry point for Web navigation, are ideally placed to deal with this issue. In this paper, we propose a method that builds a safe index, i.e., one free of adult content, for search engines. This method is based on a filter that uses only textual information from the web page and the associated URL.

en cs.IR
arXiv Open Access 2014
Designing an Ontology for the Data Documentation Initiative

Thomas Bosch, Andias Wira-Alam, Brigitte Mathiak

An ontology of the DDI 3 data model will be designed by following an ontology engineering methodology that will itself be developed from state-of-the-art methodologies. DDI 3 data and metadata can then be represented in RDF, a standard web interchange format, and processed by highly available RDF tools. As a consequence, the DDI community will be able to publish and link LOD data sets and so become part of the LOD cloud.

en cs.IR, cs.DL
arXiv Open Access 2014
Étude des dimensions spécifiques du contexte dans un système de filtrage d'informations

Djallel Bouneffouf

In the context of business information systems, e-commerce and access to knowledge, the relevance of the information provided to users is a key factor in the success of information systems. The quality of access is therefore determined by access to the right information at the right time and in the right place. In this context, it is important to consider the user's needs when accessing information, as well as their contextual situation, in order to provide relevant information tailored to their needs and context of use. In what follows, we describe the prelude to a project that tries to combine all of these needs to improve information systems.

en cs.IR
arXiv Open Access 2012
The model of information retrieval based on the theory of hypercomplex numerical systems

D. V. Lande, Ya. A. Kalinovskiy, Yu. E. Boyarinova

The paper describes a new model of information retrieval, which is an extension of the vector-space model and is based on the principles of the theory of hypercomplex numerical systems. The model makes it possible, to some extent, to realize the idea of fuzzy search and to apply practical developments in the field of hypercomplex numerical systems to information retrieval.

en cs.IR

Page 10 of 10909