Ngozichukwuka Onah, Nadine Steinmetz, Hani Al-Sayeh, et al.
The amount of text generated daily on social media is enormous, and analyzing this text is useful for many purposes. To understand what lies beneath such a huge amount of text, we need dependable and effective computational techniques, such as topic models. Nevertheless, thorough quantitative comparisons between these models remain relatively scarce. In this study, we compare these models and propose an assessment metric that captures how topics change over time.
This paper details an empirical investigation into using Graph Contrastive Learning (GCL) to generate mathematical equation representations, a critical aspect of Mathematical Information Retrieval (MIR). Our findings reveal that this simple approach consistently exceeds the performance of the current leading formula retrieval model, TangentCFT. To support ongoing research and development in this field, we have made our source code accessible to the public at https://github.com/WangPeiSyuan/GCL-Formula-Retrieval/.
This paper introduces the concept of accessibility from the field of transportation planning and adopts it within the context of Information Retrieval (IR). An analogy is drawn between the fields, which motivates the development of document accessibility measures for IR systems. Considering the accessibility of documents within a collection given an IR System provides a different perspective on the analysis and evaluation of such systems which could be used to inform the design, tuning and management of current and future IR systems.
This paper analyses LightGCN in the context of graph recommendation algorithms. Although Graph Convolutional Networks were initially designed for graph classification, their non-linear operations are not always essential. LightGCN enables linear propagation of embeddings, enhancing performance. We reproduce the original findings, assess LightGCN's robustness on diverse datasets and metrics, and explore Graph Diffusion as an augmentation of signal propagation in LightGCN.
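The linear propagation that LightGCN is built on can be illustrated with a minimal sketch: embeddings are repeatedly multiplied by the symmetrically normalized adjacency matrix (no non-linearities, no feature transforms) and the layer outputs are averaged. The toy bipartite graph and function name below are illustrative assumptions, not the paper's code.

```python
import numpy as np

def lightgcn_propagate(adj, emb, num_layers=3):
    """Layer-averaged embeddings after purely linear propagation."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg ** -0.5, 0.0)
    a_norm = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    layers = [emb]
    for _ in range(num_layers):
        emb = a_norm @ emb          # linear step: no activation, no weights
        layers.append(emb)
    return np.mean(layers, axis=0)  # average over all layer outputs

# Toy user-item bipartite graph: 2 users (nodes 0-1), 2 items (nodes 2-3).
adj = np.array([[0, 0, 1, 1],
                [0, 0, 1, 0],
                [1, 1, 0, 0],
                [1, 0, 0, 0]], dtype=float)
emb = np.random.default_rng(0).normal(size=(4, 8))
out = lightgcn_propagate(adj, emb)
print(out.shape)  # → (4, 8)
```

Dropping the per-layer weight matrices and activations is exactly what makes the propagation a fixed linear operator, which is why it can be computed cheaply.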
Baadr Suleman M Alwheepy, Leandros Maglaras, Nick Ayres
Due to the large number of users on social media and the massive volume of requests issued every second to share a new video, picture, or message, social platforms struggle to manage the enormous amount of data that endlessly streams in. HFTCT relies on wordlists to classify opinions. It carries out its tasks reasonably well; however, the wordlists themselves are sometimes unreliable, as they are a limited source of positive and negative words.
In this study, we investigate interaction-based neural matching models for ad-hoc cross-lingual information retrieval (CLIR) using cross-lingual word embeddings (CLWEs). With experiments conducted on the CLEF collection over four language pairs, we evaluate and provide insight into different neural model architectures, different ways to represent query-document interactions and word-pair similarity distributions in CLIR. This study paves the way for learning an end-to-end CLIR system using CLWEs.
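The query-document interactions mentioned above can be pictured as a word-pair similarity matrix built from cross-lingual word embeddings: each cell is the cosine similarity between a query term and a document term in the shared space. This is a hedged sketch; the embeddings below are random stand-ins for trained CLWEs, and the vocabulary is invented.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in "cross-lingual" vocabulary: real CLWEs would place
# translation pairs (e.g. dog/hund) close together in this space.
vocab = {w: rng.normal(size=32) for w in
         ["dog", "hund", "house", "haus", "cat"]}

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query, doc = ["dog", "house"], ["hund", "haus", "cat"]
interaction = np.array([[cos(vocab[q], vocab[d]) for d in doc]
                        for q in query])
# An interaction-based neural matcher consumes this |q| x |d| matrix.
print(interaction.shape)  # → (2, 3)
```

The neural architectures compared in such studies differ mainly in how they aggregate this matrix (or the distribution of its values) into a relevance score.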
Recent innovations in Transformer-based ranking models have advanced the state-of-the-art in information retrieval. However, these Transformers are computationally expensive, and their opaque hidden states make it hard to understand the ranking process. In this work, we modularize the Transformer ranker into separate modules for text representation and interaction. We show how this design enables substantially faster ranking using offline pre-computed representations and light-weight online interactions. The modular design is also easier to interpret and sheds light on the ranking process in Transformer rankers.
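The representation/interaction split described above can be sketched in miniature: an (assumed) expensive representation module is run offline over all documents, so only a cheap interaction, here a dot product, remains at query time. The hashing encoder below is a deterministic toy stand-in for a Transformer; all names are illustrative, not the paper's API.

```python
import numpy as np

def represent(text, dim=16):
    """Toy stand-in for a Transformer encoder: hash tokens into a vector."""
    vec = np.zeros(dim)
    for tok in text.lower().split():
        vec[sum(ord(c) for c in tok) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

docs = ["neural ranking models", "transformers for retrieval",
        "cooking pasta at home"]
# Offline phase: pre-compute and store all document representations once.
doc_matrix = np.stack([represent(d) for d in docs])

def rank(query):
    """Online phase: light-weight interaction (dot product) only."""
    scores = doc_matrix @ represent(query)
    return np.argsort(-scores)

print(rank("neural ranking"))  # → [0 1 2]
```

Because the interaction module is this small, the online cost no longer scales with the Transformer's depth, which is the source of the speedup the abstract refers to.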
We propose a sentiment classification method with a general machine learning framework. For feature representation, n-gram IDF is used to extract software-engineering-related, dataset-specific, positive, neutral, and negative n-gram expressions. For classifiers, an automated machine learning tool is used. In the comparison using publicly available datasets, our method achieved the highest F1 values in positive and negative sentences on all datasets.
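The n-gram IDF weighting underlying the feature representation can be sketched as follows: each n-gram is scored by log(N / df), so that dataset-specific expressions outrank ubiquitous ones. This is a simplified toy, with invented documents, and is only one plausible reading of the n-gram IDF scheme the abstract assumes.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

docs = ["this build fails again", "build fails on windows",
        "great patch thanks", "this patch looks great"]
N = len(docs)

# Document frequency of every unigram and bigram (set per document,
# so repeats within one document count once).
df = Counter()
for d in docs:
    toks = d.split()
    df.update(set(ngrams(toks, 1) + ngrams(toks, 2)))

idf = {g: math.log(N / c) for g, c in df.items()}
# "build fails" occurs in 2 of 4 documents: idf = log(4/2) = log 2.
print(sorted(idf, key=idf.get, reverse=True)[:3])
```

Expressions like "build fails" surface as coherent multi-word sentiment cues, which single-word lexicons would miss.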
We demonstrate that a graph-based search algorithm, relying on the construction of an approximate neighborhood graph, can directly work with challenging non-metric and/or non-symmetric distances without resorting to metric-space mapping and/or distance symmetrization, which, in turn, lead to substantial performance degradation. Although straightforward metrization and symmetrization are usually ineffective, we find that constructing an index using a modified, e.g., symmetrized, distance can improve performance. This observation paves the way for a new line of research on designing index-specific graph-construction distance functions.
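A minimal sketch of the idea, under assumed simplifications: build a neighborhood graph (here exact k-NN rather than approximate) with a symmetrized KL divergence, then greedily hill-climb on that graph while scoring candidates with the original asymmetric KL. All parameters and names are illustrative.

```python
import numpy as np

def kl(p, q):
    """Asymmetric, non-metric distance between two distributions."""
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(1)
data = rng.dirichlet(np.ones(5), size=50)  # points on the simplex

# Index construction uses a *symmetrized* distance, as the abstract
# suggests can help; k chosen arbitrarily for the toy example.
k = 5
sym = np.array([[kl(p, q) + kl(q, p) for q in data] for p in data])
graph = {i: list(np.argsort(sym[i])[1:k + 1]) for i in range(len(data))}

def greedy_search(query, start=0):
    """Hill-climb on the graph, scoring with the original asymmetric kl."""
    cur = start
    while True:
        best = min(graph[cur] + [cur], key=lambda i: kl(query, data[i]))
        if best == cur:
            return cur
        cur = best

q = rng.dirichlet(np.ones(5))
found = greedy_search(q)
exact = int(np.argmin([kl(q, p) for p in data]))
print(found, exact)
```

The point of the construction is that the graph's edges and the search-time distance need not agree; the abstract's proposal is precisely to tune the former independently of the latter.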
There has been a recent trend to migrate IT infrastructure into the cloud. In this paper, we discuss the impact of this trend on searching for textual and other data, i.e. the distributed indexing and retrieval of information, from an organizational context. Keywords: information retrieval (IR); federated search; cloud search.
This paper highlights our ongoing efforts to create effective information curator recommendation models that can be personalized for individual users, while maintaining important fairness properties. Concretely, we introduce the problem of information curator recommendation, provide a high-level overview of a fairness-aware recommender, and introduce some preliminary experimental evidence over a real-world Twitter dataset. We conclude with some thoughts on future directions.
The emergence of "Fake News" and misinformation via online news and social media has spurred an interest in computational tools to combat this phenomenon. In this paper we present a new "Related Fact Checks" service, which can help a reader critically evaluate an article and make a judgment on its veracity by bringing up fact checks that are relevant to the article. We describe the core technical problems that need to be solved in building a "Related Fact Checks" service, and present results from an evaluation of an implementation.
In this paper, we present a novel structure, Semi-AutoEncoder, based on AutoEncoder. We generalize it into a hybrid collaborative filtering model for rating prediction as well as personalized top-n recommendations. Experimental results on two real-world datasets demonstrate its state-of-the-art performance.
Memory-based Collaborative Filtering is a widely used approach to providing recommendations. It exploits similarities between ratings across a population of users by forming a weighted vote to predict unobserved ratings. Bespoke solutions are frequently adopted to deal with the problem of producing high-quality recommendations on large data sets. A disadvantage of this approach, however, is the loss of the generality and flexibility of general collaborative filtering systems. In this paper, we develop a methodology that allows one to build a scalable and effective collaborative filtering system on top of a conventional full-text search engine such as Apache Lucene.
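The weighted vote the abstract refers to is the classic memory-based prediction rule: a user's mean rating plus a similarity-weighted sum of neighbors' mean-centered ratings. A minimal sketch on a toy rating matrix (0 marks unobserved; Pearson similarity as a common, assumed choice):

```python
import numpy as np

# Rows are users, columns are items; 0 means "not rated".
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 4, 4]], dtype=float)

def predict(u, i):
    """Weighted vote: mean(u) + sum sim(u,v)*(r(v,i)-mean(v)) / sum |sim|."""
    mean_u = R[u][R[u] > 0].mean()
    num = den = 0.0
    for v in range(R.shape[0]):
        if v == u or R[v, i] == 0:
            continue
        common = (R[u] > 0) & (R[v] > 0)
        if common.sum() < 2:
            continue  # need at least 2 co-rated items for Pearson
        sim = np.corrcoef(R[u][common], R[v][common])[0, 1]
        if np.isnan(sim):
            continue
        mean_v = R[v][R[v] > 0].mean()
        num += sim * (R[v, i] - mean_v)
        den += abs(sim)
    return mean_u + (num / den if den else 0.0)

print(round(predict(0, 2), 2))  # → 2.0
```

Mapping this onto a full-text engine amounts to treating a user's ratings as a "document" and letting the engine's retrieval machinery find and score the neighbors, which is the scalability trick the abstract builds on.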
A click on an item is arguably the most widely used feature in recommender systems. However, a click is only one out of 174 events a browser can trigger. This paper presents a framework to effectively collect and store data from event streams. A set of mining methods is provided to extract user-engagement features such as attention span, scrolling depth, and visible impressions. In this work, we present an experiment in which recommendations based on attention span drove a 340% higher click-through rate than recommendations based on clicks.
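One plausible way to mine an "attention span" feature from such an event stream is sketched below: accumulate time only while the page is visible and the user has interacted recently. The event names, idle cutoff, and exact accounting rule are assumptions for illustration, not the paper's definitions.

```python
events = [  # (timestamp_seconds, event_type)
    (0.0, "pageview"), (1.0, "scroll"), (3.0, "mousemove"),
    (10.0, "visibilitychange:hidden"), (20.0, "visibilitychange:visible"),
    (21.0, "scroll"), (25.0, "pagehide"),
]

IDLE_CUTOFF = 5.0  # assumed: credit at most 5 s between interactions

def attention_span(events):
    total, last_active, visible = 0.0, None, True
    for ts, etype in events:
        if etype == "visibilitychange:hidden":
            if last_active is not None:
                total += min(ts - last_active, IDLE_CUTOFF)
            visible, last_active = False, None  # tab hidden: stop the clock
        elif etype == "visibilitychange:visible":
            visible = True                      # resume on next interaction
        elif visible:
            if last_active is not None:
                total += min(ts - last_active, IDLE_CUTOFF)
            last_active = ts
    return total

print(attention_span(events))  # → 12.0
```

Here the hidden interval (seconds 10-20) contributes nothing, and the long gap before it is capped at the idle cutoff, so the derived attention span is far shorter than the raw 25-second session.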
This article presents the main results of a pilot study of approaches to subject information search based on automated semantic processing of mass scientific and technical data. The authors focus on the technology of building and qualifying search queries, followed by the filtering and ranking of search results. The software architecture, specific features of subject search, and the application of the research results are considered.
In this paper, we present a software package for mining Twitter microblogs for the purpose of stock market analysis. The package is written in the R language using appropriate R packages. A model of tweets is considered. We have also compared stock market charts with frequent sets of keywords in Twitter microblog messages.
Despite having a large number of speakers, the Kurdish language is among the less-resourced languages. In this work we highlight the challenges and problems in providing the required tools and techniques for processing texts written in Kurdish. From a high-level perspective, the main challenges are: the inherent diversity of the language, standardization and segmentation issues, and the lack of language resources.