Results for "cs.IR"

Showing 20 of ~218174 results · from DOAJ, arXiv, CrossRef

arXiv Open Access 2024
New Method for Keyword Extraction for Patent Claims

Julien Rossi

The search for prior art is crucial in patent application processing; it consists of retrieving other documents relevant to the invention of the application. Most methods feed a search engine with keywords that are extracted by frequency-analysis methods. We suggest and demonstrate a new method that relies on the way information is provided in patent claims.

arXiv Open Access 2024
Supplier Recommendation in Online Procurement

Victor Coscrato, Derek Bridge

Supply chain optimization is key to a healthy and profitable business. Many companies use online procurement systems to agree contracts with suppliers. It is vital that the most competitive suppliers are invited to bid for such contracts. In this work, we propose a recommender system to assist with supplier discovery in road freight online procurement. Our system is able to provide personalized supplier recommendations, taking into account customer needs and preferences. This is a novel application of recommender systems, calling for design choices that fit the unique requirements of online procurement. Our preliminary results, using real-world data, are promising.

en cs.IR, cs.LG
arXiv Open Access 2023
Methodologies for Improving Modern Industrial Recommender Systems

Shusen Wang

The recommender system (RS) is an established technology with successful applications in social media, e-commerce, entertainment, and more. RSs are indeed key to the success of many popular apps, such as YouTube, TikTok, Xiaohongshu, Bilibili, and others. This paper explores the methodology for improving modern industrial RSs. It is written for experienced RS engineers who are diligently working to improve their key performance indicators, such as retention and duration. The experiences shared in this paper have been tested in some real industrial RSs and are likely to generalize to other RSs as well. Most of the content in this paper is industry experience without publicly available references.

en cs.IR, cs.LG
arXiv Open Access 2022
Similarity search on neighbor's graphs with automatic Pareto optimal performance and minimum expected quality setups based on hyperparameter optimization

Eric S. Tellez, Guillermo Ruiz

This manuscript introduces an autotuned algorithm for searching nearest neighbors based on neighbor graphs and optimization metaheuristics to produce Pareto-optimal searches for quality and search speed automatically; the same strategy is also used to produce indexes that achieve a minimum quality. Our approach is described and benchmarked with other state-of-the-art similarity search methods, showing convenience and competitiveness.

en cs.IR, cs.AI
arXiv Open Access 2021
Web Mining for Estimating Regulatory Blockchain Readiness

Elias Iosif, Klitos Christodoulou, Andreas Vlachos

The regulatory framework of cryptocurrencies (and, in general, blockchain tokens) is of paramount importance. This framework drives nearly all key decisions in the respective business areas. In this work, a computational model is proposed for quantitatively estimating the regulatory stance of countries with respect to cryptocurrencies. This is conducted via web mining utilizing web search engines. The proposed model is experimentally validated. In addition, unsupervised learning (clustering) is applied for better analyzing the automatically derived estimations. Overall, very good performance is achieved by the proposed algorithmic approach.

en cs.IR
arXiv Open Access 2020
Checking Fact Worthiness using Sentence Embeddings

Sidharth Singla

Checking and confirming factual information in texts and speeches is vital to determine the veracity and correctness of factual statements. This work was previously done manually by journalists and others, but it is a time-consuming task. With the advancements in Information Retrieval and NLP, research on automating fact-checking is gaining attention. CLEF-2018 and 2019 organised tasks related to Fact-checking and invited participants. This project focuses on CLEF-2019 Task-1 Check-Worthiness; experiments are performed using the latest Sentence-BERT pre-trained embeddings, topic modeling, and sentiment scores. Evaluation metrics such as MAP, Mean Reciprocal Rank, Mean R-Precision and Mean Precision@N show the improvement in the results obtained with these techniques.

en cs.IR
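
The ranking metrics named in the abstract above can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names and the binary relevance list are hypothetical, and average precision is normalised here by the number of relevant items that appear in the ranking, which assumes every relevant item was retrieved.

```python
from typing import Sequence


def reciprocal_rank(relevance: Sequence[int]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is relevant."""
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def precision_at_n(relevance: Sequence[int], n: int) -> float:
    """Return the fraction of the top-n ranked items that are relevant."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sum(1 for rel in relevance[:n] if rel) / n


def average_precision(relevance: Sequence[int]) -> float:
    """Average the precision values measured at each relevant position.

    Assumption: all relevant items appear somewhere in `relevance`.
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Hypothetical ranking: 1 marks a check-worthy sentence, 0 a non-check-worthy one.
ranking = [0, 1, 0, 1]
print(reciprocal_rank(ranking), precision_at_n(ranking, 2), average_precision(ranking))
```

Averaging `reciprocal_rank` and `average_precision` over a set of ranked lists gives Mean Reciprocal Rank and MAP respectively.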
arXiv Open Access 2019
Song Hit Prediction: Predicting Billboard Hits Using Spotify Data

Kai Middlebrook, Kian Sheik

In this work, we attempt to solve the Hit Song Science problem, which aims to predict which songs will become chart-topping hits. We constructed a dataset with approximately 1.8 million hit and non-hit songs and extracted their audio features using the Spotify Web API. We test four models on our dataset. Our best model was random forest, which was able to predict Billboard song success with 88% accuracy.

en cs.IR, cs.LG
arXiv Open Access 2019
How robust is MovieLens? A dataset analysis for recommender systems

Anne-Marie Tousch

Research publication requires public datasets. In recommender systems, some datasets are widely used to compare algorithms against a (supposedly) common benchmark. Problem: for various reasons, these datasets are heavily preprocessed, making the comparison of results across papers difficult. This paper makes explicit the variety of preprocessing and evaluation protocols to test the robustness of a dataset (or lack of flexibility). While robustness is good to compare results across papers, for flexible datasets we propose a method to select a preprocessing protocol and share results more transparently.

en cs.IR, cs.LG
arXiv Open Access 2019
Multi-stakeholder Recommendation and its Connection to Multi-sided Fairness

Himan Abdollahpouri, Robin Burke

There is growing research interest in recommendation as a multi-stakeholder problem, one where the interests of multiple parties should be taken into account. This category subsumes some existing well-established areas of recommendation research including reciprocal and group recommendation, but a detailed taxonomy of different classes of multi-stakeholder recommender systems is still lacking. Fairness-aware recommendation has also grown as a research area, but its close connection with multi-stakeholder recommendation is not always recognized. In this paper, we define the most commonly observed classes of multi-stakeholder recommender systems and discuss how different fairness concerns may come into play in such systems.

en cs.IR
arXiv Open Access 2018
EventKG+TL: Creating Cross-Lingual Timelines from an Event-Centric Knowledge Graph

Simon Gottschalk, Elena Demidova

The provision of multilingual event-centric temporal knowledge graphs such as EventKG enables structured access to representations of a large number of historical and contemporary events in a variety of language contexts. Timelines provide an intuitive way to facilitate an overview of events related to a query entity - i.e., an entity or an event of user interest - over a certain period of time. In this paper, we present EventKG+TL - a novel system that generates cross-lingual event timelines using EventKG and facilitates an overview of the language-specific event relevance and popularity along with the cross-lingual differences.

arXiv Open Access 2017
TableQA: Question Answering on Tabular Data

Svitlana Vakulenko, Vadim Savenkov

Tabular data is difficult to analyze and to search through, calling for new tools and interfaces that would allow even non-tech-savvy users to gain insights from open datasets without resorting to specialized data analysis tools or even without having to fully understand the dataset structure. The goal of our demonstration is to showcase answering natural language questions from tabular data, and to discuss related system configuration and model training aspects. Our prototype is publicly available and open-sourced (see https://svakulenko.ai.wu.ac.at/tableqa).

en cs.IR
arXiv Open Access 2016
Discriminative Information Retrieval for Knowledge Discovery

Tongfei Chen, Benjamin Van Durme

We propose a framework for discriminative Information Retrieval (IR) atop linguistic features, trained to improve the recall of tasks such as answer candidate passage retrieval, the initial step in text-based Question Answering (QA). We formalize this as an instance of linear feature-based IR (Metzler and Croft, 2007), illustrating how a variety of knowledge discovery tasks are captured under this approach, leading to a 44% improvement in recall for candidate triage for QA.

en cs.IR
arXiv Open Access 2016
A Minimum Spanning Tree Representation of Anime Similarities

Canggih Puspo Wibowo

In this work, a new way to represent Japanese animation (anime) is presented. We applied a minimum spanning tree to show the relation between anime. The distance between anime is calculated through three similarity measurements, namely crew, score histogram, and topic similarities. Finally, the centralities are also computed to reveal the most significant anime. The result shows that the minimum spanning tree can be used to determine the similarity between anime. Furthermore, by using centrality calculations, we found some anime that are significant to others.

en cs.IR, cs.DM
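
As a rough illustration of the idea in the abstract above, the sketch below builds a minimum spanning tree over pairwise distances with Kruskal's algorithm and then counts node degrees in the tree. It is not the author's implementation. The distances are made-up values, the function names are hypothetical, and degree is used only as one possible centrality, since the abstract does not say which centralities were computed.

```python
from collections import Counter
from typing import List, Tuple

Edge = Tuple[float, int, int]  # (distance, item_a, item_b)


def minimum_spanning_tree(num_nodes: int, edges: List[Edge]) -> List[Tuple[int, int, float]]:
    """Kruskal's algorithm with a simple union-find structure."""
    parent = list(range(num_nodes))

    def find(x: int) -> int:
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two separate components
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree


def degree_centrality(tree: List[Tuple[int, int, float]]) -> Counter:
    """Count how many tree edges touch each node."""
    degree = Counter()
    for u, v, _ in tree:
        degree[u] += 1
        degree[v] += 1
    return degree


# Hypothetical distances between four items (smaller means more similar).
edges = [(0.2, 0, 1), (0.9, 0, 2), (0.4, 1, 2), (0.7, 1, 3), (0.3, 2, 3), (0.8, 0, 3)]
tree = minimum_spanning_tree(4, edges)
print(tree)
print(degree_centrality(tree))
```

Nodes with a high degree in the tree sit between many close neighbours, which is one simple way to flag items that are "significant to others".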
arXiv Open Access 2015
A Method of Passage-Based Document Retrieval in Question Answering System

Man-Hung Jong, Chong-Han Ri, Hyok-Chol Choe et al.

We propose a method for using the scoring values of passages to effectively retrieve documents in a Question Answering system. To this end, we suggest an evaluation function that considers the proximity between question terms in a passage. Using this evaluation function, we extract the document that has the highest collection of scoring values as a suitable document for the question. The proposed method is very effective for document retrieval in a Korean question answering system.

en cs.IR
arXiv Open Access 2015
Regroupement sémantique de définitions en espagnol

Gerardo Sierra, Juan-Manuel Torres-Moreno, Alejandro Molina

This article focuses on the description and evaluation of a new unsupervised learning method for clustering definitions in Spanish according to their semantics. Textual Energy was used as a clustering measure, and we study an adaptation of Precision and Recall to evaluate our method.

en cs.IR, cs.CL
arXiv Open Access 2014
A Generalized Framework for Ontology-Based Information Retrieval Application to a public-transportation system

Amir Zidi, Mourad Abed

In this paper we present a generic framework for ontology-based information retrieval. We focus on the recognition of semantic information extracted from data sources and the mapping of this knowledge into an ontology. In order to achieve more scalability, we propose an approach for semantic indexing based on an entity retrieval model. In addition, we have used an ontology of the public transportation domain in order to validate these proposals. Finally, we evaluated our system using ontology mapping and real-world data sources. Experiments show that our framework can provide meaningful search results.

arXiv Open Access 2013
Optimizing an Utility Function for Exploration / Exploitation Trade-off in Context-Aware Recommender System

Djallel Bouneffouf

In this paper, we develop a dynamic exploration/exploitation (exr/exp) strategy for contextual recommender systems (CRS). Specifically, our methods can adaptively balance the two aspects of exr/exp by automatically learning the optimal tradeoff. This consists of optimizing a utility function represented by a linearized form of the probability distributions of the rewards of the clicked and the non-clicked documents already recommended. Within an offline simulation framework we apply our algorithms to a CRS and conduct an evaluation with real event log data. The experimental results and detailed analysis demonstrate that our algorithms outperform existing algorithms in terms of click-through-rate (CTR).

en cs.IR
arXiv Open Access 2013
A two-step model and the algorithm for recalling in recommender systems

Keisuke Hara, Tomihisa Kamada

When a user finds an interesting recommendation in a recommender system, the user may want to recall related items recommended in the past to reconsider or to enjoy them again. If the system can pick up such "recalled" items at each user's request, it must deepen the user experience. We propose a model and the algorithm for such personalized "recalling" in conventional recommender systems, which is an application of neural networks for associative memory. In our model, the "recalled" items can reflect each user's personality beyond naive similarities between items.

en cs.IR
