Results for "cs.IR"

Showing 20 of ~218147 results · from CrossRef, DOAJ, arXiv

arXiv Open Access 2024
A Survey of Retrieval Algorithms in Ad and Content Recommendation Systems

Yu Zhao, Fang Liu

This survey examines the most effective retrieval algorithms utilized in ad recommendation and content recommendation systems. Ad targeting algorithms rely on detailed user profiles and behavioral data to deliver personalized advertisements, thereby driving revenue through targeted placements. Conversely, organic retrieval systems aim to improve user experience by recommending content that matches user preferences. This paper compares these two applications and explains the most effective methods employed in each.

en cs.IR, cs.AI
CrossRef Open Access 2023
Stability Analysis of Retaining Walls and Slopes (Plaxis) (Landslide Handling for Road 028.15 Jl. Ir. Soekarno (Ngawi) Km. 185, Cs.)

Dani Satrio Wibowo, Azis Al Huda

An evaluation of the existing design for the retaining wall reinforcement shows that it meets current criteria. The geotechnical analysis of the retaining wall was carried out at the package location along a 900-meter stretch of the Jl. Ir. Soekarno road section (Ringroad Ngawi), from KM 183+200 to KM 184+100. The Landslide Handling Package for Road 028.15 Jl. Ir. Soekarno (Ngawi) Km. 185 Cs is a road landslide handling project carried out by the Ministry of Public Works (Kementerian PU) through the Satuan Kerja Pelaksanaan Jalan Nasional Wilayah II of East Java Province, funded by the state budget (APBN) for fiscal year 2020, under the coordination of Pejabat Pembuat Komitmen (PPK) 2.6 East Java.

arXiv Open Access 2021
Simulations for novel problems in recommendation: analyzing misinformation and data characteristics

Alejandro Bellogín, Yashar Deldjoo

In this position paper, we discuss recent applications of simulation approaches for recommender systems tasks. In particular, we describe how they were used to analyze the problem of misinformation spreading and understand which data characteristics affect the performance of recommendation algorithms more significantly. We also present potential lines of future work where simulation methods could advance the work in the recommendation community.

en cs.IR
arXiv Open Access 2021
A Theoretical Framework for Online Information Search

Rohit Negi

A significant part of human activity today consists of searching for a piece of information online, utilizing knowledge repositories. This endeavor may be time-consuming if the individual searching for the information is unfamiliar with the subject matter of that information. However, experts can help individuals find relevant information by searching online. This paper describes a theoretical framework to model the dynamic process by which requests for information come to a system of experts, who then answer the requests by searching for those pieces of information.

en cs.IR, cs.IT
arXiv Open Access 2021
A Benchmark Dataset for Micro-video Thumbnail Selection

Liu Bo

The thumbnail, as the first sight of a micro-video, plays a pivotal role in attracting users to click and watch. Although several pioneering efforts have jointly considered quality and representativeness when selecting the thumbnail, they are limited in exploring the influence of users' interests. In real scenarios, however, the more the thumbnails satisfy the users, the more likely the micro-videos will be clicked. In this paper, we aim to select the thumbnail of a given micro-video that meets most users' interests. Towards this end, we construct a large-scale dataset for the micro-video thumbnails. Ultimately, we evaluate several baselines on the dataset and demonstrate the effectiveness of our dataset.

en cs.IR
arXiv Open Access 2020
Linking Social Media Posts to News with Siamese Transformers

Jacob Danovitch

Many computational social science projects examine online discourse surrounding a specific trending topic. These works often involve the acquisition of large-scale corpora relevant to the event in question to analyze aspects of the response to the event. Keyword searches present a precision-recall trade-off and crowd-sourced annotations, while effective, are costly. This work aims to enable automatic and accurate ad-hoc retrieval of comments discussing a trending topic from a large corpus, using only a handful of seed news articles.

en cs.IR, cs.CL
arXiv Open Access 2019
Interactive Topic Modeling with Anchor Words

Sanjoy Dasgupta, Stefanos Poulis, Christopher Tosh

The formalism of anchor words has enabled the development of fast topic modeling algorithms with provable guarantees. In this paper, we introduce a protocol that allows users to interact with anchor words to build customized and interpretable topic models. Experimental evidence validating the usefulness of our approach is also presented.

en cs.IR, cs.LG
arXiv Open Access 2017
An Empirical Study of Some Selected IR Models for Bengali Monolingual Information Retrieval

Kamal Sarkar, Avisek Gupta

This paper presents an evaluation and an analysis of some selected information retrieval models for the Bengali monolingual information retrieval task. Two models, the TF-IDF model and the Okapi BM25 model, have been considered for our study. The developed IR models are tested on the FIRE ad hoc retrieval data sets released for different years from 2008 to 2012, and the obtained results are reported in this paper.

en cs.IR
arXiv Open Access 2017
Clustering of Musical Pieces through Complex Networks: an Assessment over Guitar Solos

Stefano Ferretti

Musical pieces can be modeled as complex networks. This fosters innovative ways to categorize music, paving the way towards novel applications in multimedia domains, such as music didactics, multimedia entertainment and digital music generation. Clustering these networks through their main metrics allows grouping similar musical tracks. To show the viability of the approach, we provide results on a dataset of guitar solos.

en cs.IR, cs.SD
arXiv Open Access 2017
Semantic classifier approach to document classification

Piotr Borkowski, Krzysztof Ciesielski, Mieczysław A. Kłopotek

In this paper we propose a new document classification method, bridging discrepancies (so-called semantic gap) between the training set and the application sets of textual data. We demonstrate its superiority over classical text classification approaches, including traditional classifier ensembles. The method consists in combining a document categorization technique with a single classifier or a classifier ensemble (SEMCOM algorithm - Committee with Semantic Categorizer).

en cs.IR, cs.CL
arXiv Open Access 2017
Spotting Information biases in Chinese and Western Media

Dominik Wurzer, Yumeng Qin

Newswire and Social Media are the major sources of information in our time. While the topical demographic of Western Media has been the subject of past studies, less is known about Chinese Media. In this paper, we apply event detection and tracking technology to examine the information overlap and differences between Chinese and Western Traditional Media and Social Media. Our experiments reveal a biased interest of China towards the West, which becomes particularly apparent when comparing the interest in celebrities.

en cs.IR
arXiv Open Access 2017
Multi-Stakeholder Recommendation: Applications and Challenges

Yong Zheng

Recommender systems have been successfully applied to assist decision making by producing a list of item recommendations tailored to user preferences. Traditional recommender systems only focus on optimizing the utility of the end users who are the receivers of the recommendations. By contrast, multi-stakeholder recommendation attempts to generate recommendations that satisfy the needs of both the end users and other parties or stakeholders. This paper provides an overview and discussion of multi-stakeholder recommendation from the perspective of practical applications, available data sets, corresponding research challenges and potential solutions.

en cs.IR
arXiv Open Access 2017
Character-based Neural Embeddings for Tweet Clustering

Svitlana Vakulenko, Lyndon Nixon, Mihai Lupu

In this paper we show how the performance of tweet clustering can be improved by leveraging character-based neural networks. The proposed approach overcomes the limitations related to the vocabulary explosion in the word-based models and allows for the seamless processing of the multilingual content. Our evaluation results and code are available on-line at https://github.com/vendi12/tweet2vec_clustering

en cs.IR, cs.CL
arXiv Open Access 2016
Learning to Rank Personalized Search Results in Professional Networks

Viet Ha-Thuc, Shakti Sinha

LinkedIn search is deeply personalized - for the same queries, different searchers expect completely different results. This paper presents our approach to achieving this by mining various data sources available in LinkedIn to infer searchers' intents (such as hiring, job seeking, etc.), as well as extending the concept of homophily to capture the searcher-result similarities on many aspects. Then, learning-to-rank (LTR) is applied to combine these signals with standard search features.

en cs.IR, cs.LG
arXiv Open Access 2014
A Topic Model Approach to Multi-Modal Similarity

Rasmus Troelsgård, Bjørn Sand Jensen, Lars Kai Hansen

Calculating similarities between objects defined by many heterogeneous data modalities is an important challenge in many multimedia applications. We use a multi-modal topic model as a basis for defining such a similarity between objects. We propose to compare the resulting similarities from different model realizations using the non-parametric Mantel test. The approach is evaluated on a music dataset.

en cs.IR, stat.ML
arXiv Open Access 2014
Facets and Typed Relations as Tools for Reasoning Processes in Information Retrieval

Winfried Gödert

Faceted arrangement of entities and typed relations for representing different associations between the entities are established tools in knowledge representation. In this paper, a proposal is being discussed combining both tools to draw inferences along relational paths. This approach may yield new benefit for information retrieval processes, especially when modeled for heterogeneous environments in the Semantic Web. Faceted arrangement can be used as a selection tool for the semantic knowledge modeled within the knowledge representation. Typed relations between the entities of different facets can be used as restrictions for selecting them across the facets.

en cs.IR
arXiv Open Access 2014
Parallel and Distributed Collaborative Filtering: A Survey

Efthalia Karydi, Konstantinos G. Margaritis

Collaborative filtering is amongst the most preferred techniques when implementing recommender systems. Recently, great interest has turned towards parallel and distributed implementations of collaborative filtering algorithms. This work is a survey of the parallel and distributed collaborative filtering implementations, aiming not only to provide a comprehensive presentation of the field's development, but also to offer future research orientation by highlighting the issues that need to be further developed.

en cs.IR, cs.DC
arXiv Open Access 2014
Looking at Vector Space and Language Models for IR using Density Matrices

Alessandro Sordoni, Jian-Yun Nie

In this work, we conduct a joint analysis of both Vector Space and Language Models for IR using the mathematical framework of Quantum Theory. We shed light on how both models allocate the space of density matrices. A density matrix is shown to be a general representational tool capable of leveraging capabilities of both VSM and LM representations thus paving the way for a new generation of retrieval models. We analyze the possible implications suggested by our findings.

en cs.IR
arXiv Open Access 2012
Sequential Document Representations and Simplicial Curves

Guy Lebanon

The popular bag of words assumption represents a document as a histogram of word occurrences. While computationally efficient, such a representation is unable to maintain any sequential information. We present a continuous and differentiable sequential document representation that goes beyond the bag of words assumption, and yet is efficient and effective. This representation employs smooth curves in the multinomial simplex to account for sequential information. We discuss the representation and its geometric properties and demonstrate its applicability for the task of text classification.

en cs.IR, cs.LG
arXiv Open Access 2011
Clustering and Classification in Text Collections Using Graph Modularity

Grigory Pivovarov, Sergei Trunov

A new fast algorithm for clustering and classification of large collections of text documents is introduced. The new algorithm employs the bipartite graph that realizes the word-document matrix of the collection. Namely, the modularity of the bipartite graph is used as the optimization functional. Experiments performed with the new algorithm on a number of text collections showed competitive clustering (classification) quality and record-breaking speed.

en cs.IR, cs.DL
