Results for "cs.IR"

Showing 20 of ~218172 results · from CrossRef, DOAJ, arXiv

arXiv Open Access 2024
Enhancing Semantic Interoperability Across Materials Science With HIVE4MAT

Jane Greenberg, Kio Polson, Scott McClellan et al.

HIVE4MAT is a linked data interactive application for navigating ontologies of value to materials science. HIVE enables automatic indexing of textual resources with standardized terminology. This article presents the motivation underlying HIVE4MAT, explains the system architecture, reports on two evaluations, and discusses future plans.

en cs.IR
CrossRef Open Access 2023
Insights into the Transport Cycle of LAT1 and Interaction with the Inhibitor JPH203

Chiara Brunocilla, Lara Console, Filomena Rovella et al.

The large Amino Acid Transporter 1 (LAT1) is an interesting target in drug discovery since this transporter is overexpressed in several human cancers. Furthermore, due to its location in the blood-brain barrier (BBB), LAT1 is interesting for delivering pro-drugs to the brain. In this work, we focused on defining the transport cycle of LAT1 using an in silico approach. So far, studies of the interaction of LAT1 with substrates and inhibitors have not considered that the transporter must undergo at least four different conformations to complete the transport cycle. We built outward-open and inward-occluded conformations of LAT1 using an optimized homology modelling procedure. We used these 3D models and the cryo-EM structures in outward-occluded and inward-open conformations to define the substrate/protein interaction during the transport cycle. We found that the binding scores for the substrate depend on the conformation, with the occluded states as the crucial steps affecting the substrate affinity. Finally, we analyzed the interaction of JPH203, a high-affinity inhibitor of LAT1. The results indicate that conformational states must be considered for in silico analyses and early-stage drug discovery. The two built models, together with the available cryo-EM 3D structures, provide important information on the LAT1 transport cycle, which could be used to speed up the identification of potential inhibitors through in silico screening.

arXiv Open Access 2023
THUIR at WSDM Cup 2023 Task 1: Unbiased Learning to Rank

Jia Chen, Haitao Li, Weihang Su et al.

This paper introduces the approaches we have used to participate in the WSDM Cup 2023 Task 1: Unbiased Learning to Rank. In brief, we have attempted a combination of both traditional IR models and transformer-based cross-encoder architectures. To further enhance the ranking performance, we also considered a series of features for learning to rank. As a result, we won 2nd place on the final leaderboard.

en cs.IR
arXiv Open Access 2022
An Analysis of the Features Considerable for NFT Recommendations

Dinuka Piyadigama, Guhanathan Poravi

This research explores methods by which NFTs can be recommended to people who interact with NFT marketplaces, so that they can find NFTs that match their preferences and resemble what they have been searching for. While exploring past methods that can be adopted for recommendations, the use of NFT traits for recommendations has been explored. The outcome of the research highlights the necessity of using multiple Recommender Systems to present the user with the best possible NFTs when interacting with decentralized systems.

en cs.IR, cs.CY
arXiv Open Access 2022
Survey of Query-based Text Summarization

Hang Yu, Jiawei Han

Query-based text summarization is an important real-world problem that requires condensing prolix text data into a summary under the guidance of the query information provided by users. The topic has been studied for a long time, and there is much interesting existing research related to query-based text summarization. Yet much of the work has not been systematically surveyed. This survey aims at summarizing some interesting work in query-based text summarization methods as well as related generic text summarization methods. To the best of our knowledge, not all taxonomies in this paper exist in the related work, and some analysis will be presented.

en cs.IR, cs.CL
arXiv Open Access 2020
Longformer for MS MARCO Document Re-ranking Task

Ivan Sekulić, Amir Soleimani, Mohammad Aliannejadi et al.

Two-step document ranking, where the initial retrieval is done by a classical information retrieval method, followed by a neural re-ranking model, is the new standard. The best performance is achieved by using transformer-based models as re-rankers, e.g., BERT. We employ Longformer, a BERT-like model for long documents, on the MS MARCO document re-ranking task. The complete code used for training the model can be found at: https://github.com/isekulic/longformer-marco

en cs.IR
arXiv Open Access 2020
Predicting Afrobeats Hit Songs Using Spotify Data

Adewale Adeagbo

This study approached the Hit Song Science problem with the aim of predicting which songs in the Afrobeats genre will become popular among Spotify listeners. A dataset of 2063 songs was generated through the Spotify Web API, with the provided audio features. Random Forest and Gradient Boosting algorithms proved to be successful, with F1 scores of approximately 86%.

en cs.IR, cs.LG
arXiv Open Access 2017
Sentiment analysis of twitter data

Hamid Bagheri, Md Johirul Islam

Social networks are the main resources for gathering information about people's opinions and sentiments towards different topics, as people spend hours daily on social media and share their opinions. In this technical paper, we show the application of sentiment analysis and how to connect to Twitter and run sentiment analysis queries. We run experiments on different queries, from politics to humanity, and show the interesting results. We found that the share of neutral sentiments for tweets is significantly high, which clearly shows the limitations of the current works.

en cs.IR, cs.CL
arXiv Open Access 2016
A Study on the usage of Data Structures in Information Retrieval

V. R. Kanagavalli, G. Maheeja

This paper tries to shed light on the usage of data structures in the field of information retrieval. Information retrieval is an area of study which is gaining momentum as the need and urge for sharing and exploring information grows day by day. Data structures have long been an area of research in computer science. The need for efficient data structures has become even more important as data grows exponentially.

en cs.IR
arXiv Open Access 2016
E3 : Keyphrase based News Event Exploration Engine

Nikita Jain, Swati Gupta, Dhaval Patel

This paper presents a novel system, E3, for extracting keyphrases from news content for the purpose of offering the news audience a broad overview of news events, especially those with high content volume. Given an input query, E3 extracts keyphrases and enriches them by tagging, ranking and finding roles for frequently associated keyphrases. E3 also finds the novelty and activeness of keyphrases using the news publication date, to identify the most interesting and informative keyphrases.

arXiv Open Access 2016
Image Retrieval with a Bayesian Model of Relevance Feedback

Dorota Glowacka, Yee Whye Teh, John Shawe-Taylor

A content-based image retrieval system based on multinomial relevance feedback is proposed. The system relies on an interactive search paradigm where at each round a user is presented with k images and selects the one closest to their ideal target. Two approaches, one based on the Dirichlet distribution and one based on the Beta distribution, are used to model the problem, motivating an algorithm that trades exploration and exploitation in presenting the images in each round. Experimental results show that the new approach compares favourably with previous work.

en cs.IR
arXiv Open Access 2015
Document Clustering using K-Medoids

Monica Jha

People constantly turn to the internet in search of information, but its huge volume of data makes it difficult for the reader to find the most accurate data. To make it easier for people to gather accurate data, similar information has to be clustered in one place. There are many algorithms for clustering relevant information together. In this paper, the K-Medoids clustering algorithm has been employed to form clusters, which are further used for document summarization.

en cs.IR
arXiv Open Access 2015
The Influence of Commercial Intent of Search Results on Their Perceived Relevance

Dirk Lewandowski

We carried out a retrieval effectiveness test on the three major web search engines (i.e., Google, Microsoft and Yahoo). In addition to relevance judgments, we classified the results according to their commercial intent and whether or not they carried any advertising. We found that all search engines provide a large number of results with a commercial intent. Google provides significantly more commercial results than the other search engines do. However, the commercial intent of a result did not influence jurors in their relevance judgments.

en cs.IR
arXiv Open Access 2014
Health Information Search Behavior on the Web: A Pilot Study

Shanu Sushmita, Si-Chi Chin

Searching for health information on the Web has become an integral part of today's world, and many people turn to the Web for healthcare information and healthcare assessment. Our pilot study investigates users' preferences for the type of search results (image, news, video, etc.), and investigates users' ability to accurately interpret online health information for the purpose of self-diagnosis. The preliminary results reveal that blog and news articles are most sought by users when searching for online information, and that there are challenges in the use of online health information for self-diagnosis.

en cs.IR
arXiv Open Access 2014
Text Classification Using Association Rules, Dependency Pruning and Hyperonymization

Yannis Haralambous, Philippe Lenca

We present new methods for pruning and enhancing itemsets for text classification via association rule mining. Pruning methods are based on dependency syntax and enhancing methods are based on replacing words by their hyperonyms of various orders. We discuss the impact of these methods, compared to pruning based on tfidf rank of words.

en cs.IR, cs.CL
arXiv Open Access 2013
Large scale citation matching using Apache Hadoop

Mateusz Fedoryszak, Dominika Tkaczyk, Łukasz Bolikowski

During the process of citation matching, links from bibliography entries to referenced publications are created. Such links are indicators of topical similarity between linked texts, are used in assessing the impact of the referenced document, and improve navigation in the user interfaces of digital libraries. In this paper we present a citation matching method and show how to scale it up to handle large amounts of data using appropriate indexing and the MapReduce paradigm in the Hadoop environment.

en cs.IR, cs.DL
arXiv Open Access 2012
Product/Brand extraction from WikiPedia

K. Massoudi, G. Modena

In this paper we describe the task of extracting product and brand pages from Wikipedia. We present an experimental environment and setup built on top of a dataset of Wikipedia pages we collected. We introduce a method for recognition of product pages modelled as a boolean probabilistic classification task. We show that this approach can lead to promising results and we discuss alternative approaches we considered.

en cs.IR, cs.AI

Page 7 of 10909