Leslie D. McIntosh, Alexandra Sinclair, Simon Linacre
This paper presents a forensic scientometric case study of the Pharmakon Neuroscience Research Network, a fabricated research collective that operated primarily between 2019 and 2022 while embedding itself within legitimate scholarly publishing channels.
The Platform for Content-Structure Inference (PCSI, pronounced "pixie") facilitates the sharing of information about the process of converting Web resources into structured content objects that conform to a predefined format. PCSI records encode methods for deriving structured content from classes of URLs, and report the results of applying particular methods to particular URLs. The methods are scripts written in Hex, a variant of Awk with facilities for traversing the HTML DOM.
Search and recommendation systems are essential in many services, and they are often developed separately, leading to complex maintenance and technical debt. In this paper, we present a unified deep learning model that efficiently handles key aspects of both tasks.
Search bias analysis has received growing attention in recent years. In this work, we aim to establish an automated model for evaluating ideological bias in online news articles. The dataset is composed of news articles from search results as well as newspaper articles. The current results show that the model's capability is not yet sufficient for annotating documents automatically and thereby computing bias in search results.
We review the services implementing the OpenRefine reconciliation API, comparing their design to the state of the art in record linkage. Due to the design of the API, the matching scores returned by the services are of little help to guide matching decisions. This suggests possible improvements to the specifications of the API, which could improve user workflows by giving more control over the scoring mechanism to the client.
This work presents a general query term weighting approach based on query performance prediction (QPP). A given term is weighted according to its predicted effect on query performance. This effect is assumed to be manifested in the responses of the underlying retrieval method to the original query and to its (simple) variants, each in the form of a single-term expanded query. Focusing on search re-ranking as the underlying application, the effectiveness of the proposed term weighting approach is demonstrated using several state-of-the-art QPP methods evaluated over TREC corpora.
The BM25 ranking function is one of the best-known functions for scoring the relevance of documents to a query, and many variations of it have been proposed. The BM25F function is one of its adaptations, designed for modeling documents with multiple fields. The Expanded Span method extends a BM25-like function by taking into consideration the proximity between term occurrences. In this note, we combine these two variations into one scoring method for proximity-based scoring of documents with multiple fields.
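For orientation, the baseline that both variations build on can be sketched as follows. This is a minimal sketch of plain Okapi BM25 only, not of BM25F, the Expanded Span method, or the combined scoring method of the note. The function name `bm25_scores`, the parameter defaults `k1=1.2` and `b=0.75`, and the non-negative form of the IDF term are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against a tokenized `query`.

    Plain single-field BM25; the k1 and b defaults are illustrative assumptions.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue  # a term absent from the document contributes nothing
            # Non-negative IDF variant (assumed here; other forms exist).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [["a", "b"], ["c", "d"], ["a", "a", "c"]]
scores = bm25_scores(["a"], docs)
```

A multi-field or proximity-aware variant would replace the raw term frequency `tf[term]` with a field-weighted or span-based quantity; how the note does this is not stated in the abstract.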
We share the implementation details and testing results for a video retrieval system based exclusively on features extracted by convolutional neural networks. We show that deep learned features can serve as a universal signature for the semantic content of video, useful in many search and retrieval tasks. We further show that a graph-based storage structure for the video index allows efficient retrieval of content with complicated spatial and temporal search queries.
Context-aware recommender systems extend traditional recommenders by adapting their suggestions to users' contextual situations. CARSKit is a Java-based open-source library designed specifically for context-aware recommendation, in which state-of-the-art context-aware recommendation algorithms have been implemented. This report provides a basic user's guide to CARSKit, including how to prepare the data set, how to configure the experimental settings, how to evaluate the algorithms, and how to interpret the outputs. The instructions in this guide apply to CARSKit v0.3.5 and above.
We present a system that constructs and maintains an up-to-date co-occurrence network of medical concepts based on continuously mining the latest biomedical literature. Users can explore this network visually via a concise online interface to quickly discover important and novel relationships between medical entities. This enables users to rapidly gain contextual understanding of their medical topics of interest, and we believe this constitutes a significant user experience improvement over contemporary search engines operating in the biomedical literature domain.
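The core data structure of such a system, a weighted co-occurrence network, can be illustrated with a small sketch. It assumes that concept extraction has already been done and that each document is reduced to a list of concept strings; the function name `cooccurrence_counts` and the sample concepts are hypothetical, and the abstract does not describe how the actual system computes or updates its network.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count how many documents mention each unordered pair of concepts.

    `documents` is an iterable of concept lists (assumed already extracted).
    Returns a Counter keyed by alphabetically ordered (concept_a, concept_b).
    """
    counts = Counter()
    for concepts in documents:
        # Deduplicate within a document and sort so each pair has one key.
        for a, b in combinations(sorted(set(concepts)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical input: three documents with pre-extracted concepts.
docs = [
    ["aspirin", "stroke"],
    ["stroke", "aspirin", "warfarin"],
    ["warfarin"],
]
edges = cooccurrence_counts(docs)
```

Each key of `edges` is an edge of the network and its count is the edge weight; keeping the network up to date would amount to adding the counts of newly mined documents to the existing ones.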
We present our solution to the Yandex Personalized Web Search Challenge. The aim of this challenge was to use historical search logs to personalize top-N document rankings for a set of test users. We used over 100 features extracted from user- and query-dependent contexts to train neural-network and tree-based learning-to-rank and regression models. Our final submission, a blend of several different models, achieved an NDCG@10 of 0.80476 and placed 4th among the 194 teams, winning 3rd prize.
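For readers unfamiliar with the metric, NDCG@10 can be computed as sketched below. This is a minimal sketch assuming the common exponential-gain formulation with a base-2 logarithmic discount; the abstract does not state which gain function or relevance grades the challenge used, and the function names are illustrative.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k graded relevance labels."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based, so +2
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k=10):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


perfect = ndcg_at_k([2, 1, 0])   # labels already in the ideal order
reversed_ = ndcg_at_k([0, 1, 2])  # worst ordering of the same labels
empty = ndcg_at_k([0, 0])         # no relevant documents at all
```

The input to `ndcg_at_k` is the list of relevance labels in the order the system ranked the documents; a challenge score is then typically the mean of this value over all test queries.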
This presentation focuses on the automatic expansion of Arabic requests using a morphological analyzer and Arabic WordNet. The expanded request is sent to Google.
The Information and Communication Technologies revolution brought a digital world with huge amounts of data available. Enterprises use mining technologies to search vast amounts of data for vital insight and knowledge. Mining tools such as data mining, text mining, and web mining are used to find hidden knowledge in large databases or the Internet.
Thanks to the development of social media technology, it has become easier for users to gather and form groups. On Last.fm, for example, users can join groups that interest them, where they can share their favorite songs and discuss songs and singers. However, as the number of groups grows over time, users need effective group recommendations in order to meet more like-minded users.
In this paper, we explain social information retrieval (SIR) and collaborative information retrieval (CIR). We see SIR as a way of knowing whom to collaborate with in resolving an information problem, while CIR entails the process by which collaborators reach a mutual understanding of an information problem and solve it. We are interested in the transition from SIR to CIR; hence, we developed a communication model to facilitate knowledge sharing during CIR.
The IR spectra of Cs–vanadates (C=[Cs]/[V]=0–0.332) were recorded in the 650–1200 cm−1 region. In V2O5, which had been carefully purified, the V=O stretching band was found at 1022 cm−1 and the V–O–V stretching band at 815 cm−1. Upon the addition of Cs, sharp new bands began to appear at 965 cm−1 at C=0.0042 and at 1000 cm−1 at C=0.0120, and a shift in the 815-cm−1 band was found. Finally, at C=0.332 the original 1022-cm−1 band was replaced by the 965 and 1000 cm−1 bands, the intensity ratio (I965⁄I1000) of which was 2.0. The X-ray diffraction pattern of the last sample was in agreement with that of CsV3O8. The IR and X-ray results suggest that the CsV3O8 phase appears at around C=0.004. Based on the band shifts from 1022 cm−1 to 965 and 1000 cm−1, the force constant and the bond strength of V=O in CsV3O8 are discussed. With respect to the band shift in the 815-cm−1 region with increasing Cs content, the correlation of Δν̃ with C^(1/3) is shown for C<0.020.
Apart from well known areas of overlap between endocrinology and psychiatry (e.g. studies, in psychiatric disorders, of neurohormones and of the response to manipulations of hypothalamic-pituitary-target gland axis, and analysis of behavioural and psychological disturbances in endocrinological disorders) there is a more intimate intrinsic relationship between the brain and the endocrine system which is less well known or studied. Many of the extracranial endocrine glands have autonomic innervation. Like the pituitary gland which is under direct neural (as well as humoral) diencephalic control, the extracranial endocrine glands are under direct neural control, integrated by the hypothalamus and “head ganglion of the autonomic nervous system”. Yet it is only in the case of the pancreatic islets that this integration has been clearly defined. It is postulated that by this innervation the somatic endocrine glands can respond to homeostatic needs with a rapid initial secretion before the more sustained outpouring of humoral agents typically regulated by blood-borne constituents including pituitary hormones. This is a vast area awaiting further investigation.