Explainable recommendation through counterfactual reasoning seeks to identify the influential aspects of items in recommendations, which can then be used as explanations. However, state-of-the-art approaches, which aim to minimize changes in product aspects while reversing their recommended decisions according to an aggregated decision boundary score, often lead to factual inaccuracies in explanations. To solve this problem, in this work we propose a novel method of Comparative Counterfactual Explanations for Recommendation (CoCountER). CoCountER creates counterfactual data based on soft swap operations, enabling explanations for recommendations of arbitrary pairs of comparative items. Empirical experiments validate the effectiveness of our approach.
Traditional methods for crawling and parsing web applications predominantly rely on extracting hyperlinks from initial pages and recursively following linked resources. This approach constructs a graph where nodes represent unstructured data from web pages, and edges signify transitions between them. However, these techniques are limited in capturing the dynamic and interactive behaviors inherent to modern web applications. In contrast, the proposed method models each node as a structured representation of the application's current state, with edges reflecting user-initiated actions or transitions. This structured representation enables a more comprehensive and functional understanding of web applications, offering valuable insights for downstream tasks such as automated testing and behavior analysis.
We present Legommenders, a unique library designed for content-based recommendation that enables the joint training of content encoders alongside behavior and interaction modules, thereby facilitating the seamless integration of content understanding directly into the recommendation pipeline. Legommenders allows researchers to effortlessly create and analyze over 1,000 distinct models across 15 diverse datasets. Further, it supports the incorporation of contemporary large language models, both as feature encoders and as data generators, offering a robust platform for developing state-of-the-art recommendation models and enabling more personalized and effective content delivery.
Assessing the degree of similarity of code fragments is crucial for ensuring software quality, but it remains challenging due to the need to capture the deeper semantic aspects of code. Traditional syntactic methods often fail to identify these connections. Recent advancements have addressed this challenge, though they frequently sacrifice interpretability. To address this limitation, we present an approach that improves the transparency of similarity assessment by using GraphCodeBERT, which enables the identification of semantic relationships between code fragments. This approach identifies similar code fragments and clarifies the reasons behind that identification, helping developers better understand and trust the results. The source code for our implementation is available at https://www.github.com/jorge-martinez-gil/graphcodebert-interpretability.
We characterize the compactness of embedding derivatives from Hardy space $H^p$ into Lebesgue space $L^q(\mu)$. We also completely characterize the boundedness and compactness of derivative area operators from $H^p$ into $L^q(\mathbb{S}_n)$, $0<p, q<\infty$. Some of the tools used in the proof of the one-dimensional case, such as the strong factorization of Hardy spaces, are not available in higher dimensions. Therefore, we need the theory of tent spaces, which was established by Coifman, Meyer and Stein in 1985.
The extraction of individual reference strings from the reference section of scientific publications is an important step in the citation extraction pipeline. Current approaches divide this task into two steps by first detecting the reference section areas and then grouping the text lines in such areas into reference strings. We propose a classification model that considers every line in a publication as a potential part of a reference string. By building line-based conditional random fields rather than constructing the graphical model over individual words, the model exploits dependencies and patterns typical of reference sections as strong features while reducing its overall complexity.
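The idea of labeling every line of a page as part of a reference or not can be illustrated with a small decoding sketch. The following is a minimal, hand-crafted example, not the paper's trained model: Viterbi decoding over a linear chain of line labels (B = beginning of a reference, I = continuation, O = other text), with toy transition and feature scores that a real CRF would learn from data.

```python
# Minimal sketch: label each line B (reference start), I (continuation),
# or O (other) via Viterbi decoding over a linear chain. All scores are
# hand-set toy values for illustration, not learned CRF weights.

LABELS = ["B", "I", "O"]

# Transition scores: reference lines tend to follow reference lines.
TRANS = {
    ("B", "I"): 2.0, ("I", "I"): 1.5, ("B", "B"): 1.0, ("I", "B"): 1.0,
    ("O", "O"): 1.5, ("O", "B"): 0.5, ("I", "O"): 0.2, ("B", "O"): 0.2,
    ("O", "I"): -2.0,  # a continuation should not follow non-reference text
}

def emission(line, label):
    """Very rough line-level features: a year-like token and a leading
    capital suggest the start of a reference string; indentation suggests
    a continuation line."""
    score = 0.0
    has_year = any(tok.strip("().,").isdigit() and len(tok.strip("().,")) == 4
                   for tok in line.split())
    if label == "B":
        score += 1.5 if has_year else -0.5
        score += 0.5 if line[:1].isupper() else -0.5
    if label == "I":
        score += 1.0 if line.startswith("  ") else -1.0
    if label == "O":
        score += 1.0 if not has_year else -1.0
    return score

def viterbi(lines):
    best = {lab: emission(lines[0], lab) for lab in LABELS}
    back = []
    for line in lines[1:]:
        new_best, ptr = {}, {}
        for lab in LABELS:
            prev = max(LABELS, key=lambda p: best[p] + TRANS.get((p, lab), 0.0))
            new_best[lab] = best[prev] + TRANS.get((prev, lab), 0.0) + emission(line, lab)
            ptr[lab] = prev
        back.append(ptr)
        best = new_best
    last = max(LABELS, key=best.get)
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

lines = [
    "Smith, J. (2019). A study of citations.",
    "  Journal of Examples, 12(3).",
    "Doe, A. (2021). Another paper title.",
    "Page 14",
]
print(viterbi(lines))  # → ['B', 'I', 'B', 'O']
```

Working over whole lines keeps the chain short (one node per line instead of one per word), which is the complexity reduction the abstract refers to.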
The proposed methodology is procedural, i.e., it follows a finite number of steps to extract documents relevant to a user's query. It is based on principles of Data Mining for analyzing web data: data is first integrated to generate a warehouse, and useful information is then extracted algorithmically. The extracted documents are represented using a vector-based statistical approach, in which each document is represented as a set of terms.
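The term-vector representation and query matching described above can be sketched as follows. This is an illustrative, generic vector-space example, not the paper's exact algorithm: documents become term-frequency vectors and are ranked against the query by cosine similarity.

```python
# Illustrative sketch: represent each document as a vector of term
# frequencies and rank documents against a query by cosine similarity.

import math
from collections import Counter

def tokenize(text):
    return [t.lower().strip(".,") for t in text.split()]

def tf_vector(text):
    # term-frequency vector: term -> count
    return Counter(tokenize(text))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = {
    "d1": "data mining extracts useful information from web data",
    "d2": "the weather is sunny today",
    "d3": "web data warehouses support data mining algorithms",
}
query = tf_vector("web data mining")
ranked = sorted(docs, key=lambda d: cosine(query, tf_vector(docs[d])), reverse=True)
print(ranked)  # most relevant documents first
```

In practice one would weight terms (e.g. TF-IDF) rather than use raw counts, but the retrieval loop is the same.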
Traditional sentiment analysis approaches tackle problems like ternary (3-category) and fine-grained (5-category) classification by learning the tasks separately. We argue that such classification tasks are correlated, and we propose a multitask approach based on a recurrent neural network that benefits from jointly learning them. Our study demonstrates the potential of multitask models on this type of problem and improves the state-of-the-art results on the fine-grained sentiment classification problem.
This article describes how, with the arrival of the new millennium, a growing number of entrepreneurs embraced sustainable design, prompting companies to rethink the role they play in the development of the environment, the planet, and society. Sustainable design seeks to generate solutions through services and lifestyles, not exclusively through objects. To introduce a fuller definition of sustainable design, it is necessary to mention sustainable systems, which refer, essentially, to any kind of network or social service that can exist and be replicated. Beyond sustainable systems, there are other principles within sustainable design. Finally, any outcome produced to satisfy a need must be sustainable in the long term, understood as a process that allows a community to achieve a result through design strategies.
This paper explores the use of a learned classifier for post-OCR text correction. Experiments with the Arabic language show that this approach, which integrates a weighted confusion matrix and a shallow language model, corrects the vast majority of segmentation and recognition errors, the most frequent error types in our dataset.
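The combination of a weighted confusion matrix with a shallow language model follows the classic noisy-channel pattern, which can be sketched as below. All confusion weights and the lexicon are invented toy values, and the candidate generation is deliberately simplistic; this illustrates the scoring idea, not the paper's classifier.

```python
# Hedged sketch of noisy-channel post-OCR correction: candidates are
# scored by a weighted character confusion matrix (channel model) times
# a shallow unigram language model. All values below are toy examples.

# P(observed chars | intended chars): common OCR confusions, toy weights.
CONFUSION = {("i", "1"): 0.3, ("o", "0"): 0.3, ("rn", "m"): 0.2}

# Shallow unigram language model over a toy lexicon (add-one smoothing).
LEXICON = {"line": 40, "lane": 10, "model": 30}
TOTAL = sum(LEXICON.values()) + len(LEXICON)

def lm_prob(word):
    return (LEXICON.get(word, 0) + 1) / TOTAL

def channel_prob(candidate, observed):
    """Probability of observing `observed` if `candidate` was intended,
    allowing one substitution from the confusion matrix."""
    if candidate == observed:
        return 0.9
    for i in range(len(observed)):
        for (intended, seen), w in CONFUSION.items():
            if observed[i:i + len(seen)] == seen:
                fixed = observed[:i] + intended + observed[i + len(seen):]
                if fixed == candidate:
                    return w
    return 0.0

def correct(observed):
    candidates = set(LEXICON) | {observed}
    return max(candidates,
               key=lambda c: channel_prob(c, observed) * lm_prob(c))

print(correct("l1ne"))   # → "line"
print(correct("m0del"))  # → "model"
```

The language model breaks ties between channel-plausible candidates, which is why even a shallow model helps.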
Citation count is a quantifiable measure of the number of times an article is cited by other articles. It is believed that if an article is cited often, then it must be an important or influential article; however, there is no guarantee that the most cited articles are of good quality. In this paper, the author proposes argumentation count, a new metric for citation analysis. The proposed metric is a triplet of quantities for each concept of an article, which provides a quantifiable measure of the article's usefulness.
In this paper, we introduce a novel situation-aware approach to improve a context-based recommender system. To build situation-aware user profiles, we rely on evidence issued from retrieval situations. A retrieval situation refers to the social and spatio-temporal context of the user when interacting with the recommender system. A situation is represented as a combination of social and spatio-temporal concepts inferred from ontological knowledge given social group, location, and time information. The user's interests are inferred from past interactions with the recommender system related to the identified situations, and are represented using concepts issued from a domain ontology. We also propose a method to dynamically adapt the system to the evolution of the user's interests.
We propose in this paper a method for measuring the similarity between ontological concepts and terms. Our metric can take into account not only the words common to the two strings being compared, but also other features such as the positions of the words in these strings, or the number of word deletions, insertions, or replacements required to construct one string from the other. The proposed method was then used to determine the ontological concepts equivalent to the terms that qualify toponyms, aiming to find the topographical type of a toponym.
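The kind of multi-feature string similarity described can be sketched as a weighted combination of the three signals mentioned: shared words, positional agreement, and a word-level edit distance. The weights and feature definitions below are illustrative assumptions, not the paper's tuned metric.

```python
# Toy sketch: similarity between two multi-word strings combining
# (a) the proportion of shared words, (b) positional agreement, and
# (c) a word-level edit distance. Weights are illustrative only.

def word_edit_distance(a, b):
    """Levenshtein distance counted over words rather than characters."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i
    for j in range(len(b) + 1):
        dp[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1, dp[i][j - 1] + 1,
                           dp[i - 1][j - 1] + cost)
    return dp[len(a)][len(b)]

def similarity(s1, s2, w_common=0.4, w_pos=0.3, w_edit=0.3):
    a, b = s1.lower().split(), s2.lower().split()
    # (a) shared-word overlap (Jaccard)
    common = len(set(a) & set(b)) / max(len(set(a) | set(b)), 1)
    # (b) words appearing at the same position in both strings
    pos_hits = sum(1 for i, w in enumerate(a) if i < len(b) and b[i] == w)
    pos = pos_hits / max(len(a), len(b))
    # (c) normalized word-level edit distance, turned into a similarity
    edit = 1 - word_edit_distance(a, b) / max(len(a), len(b))
    return w_common * common + w_pos * pos + w_edit * edit

print(similarity("mount saint michel", "saint michel mount"))
print(similarity("mount saint michel", "river thames"))
```

Reordered variants of the same toponym score well on the shared-word and edit features even when positional agreement is low, which is the point of combining several features.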
Most existing approaches in Context-Aware Recommender Systems (CRS) focus on recommending relevant items to users while taking into account contextual information, such as time, location, or social aspects. However, few of them have considered the problem of the dynamicity of users' content. We introduce in this paper an algorithm that tackles this dynamicity by modeling the CRS as a contextual bandit and by including a situation clustering algorithm to improve the precision of the CRS. Within a deliberately designed offline simulation framework, we conduct evaluations with real online event log data. The experimental results and detailed analysis reveal several important discoveries for context-aware recommender systems.
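The idea of a contextual bandit that keeps separate arm statistics per situation cluster can be sketched as follows. This is a minimal epsilon-greedy illustration under heavy assumptions: the situation clustering here is a trivial weekday/weekend rule, whereas the paper's clustering over social and spatio-temporal concepts is far richer, and the reward simulation is invented.

```python
# Minimal sketch: contextual epsilon-greedy bandit with one set of arm
# statistics per situation cluster. Clustering rule and rewards are toy
# assumptions, not the paper's algorithm.

import random

class SituationBandit:
    def __init__(self, arms, epsilon=0.1):
        self.arms = arms
        self.epsilon = epsilon
        # cluster -> arm -> [pull count, running mean reward]
        self.stats = {}

    def cluster(self, situation):
        # toy situation clustering: only the day type matters
        return "weekend" if situation["day"] in ("sat", "sun") else "weekday"

    def select(self, situation, rng):
        c = self.cluster(situation)
        stats = self.stats.setdefault(c, {a: [0, 0.0] for a in self.arms})
        if rng.random() < self.epsilon:
            return rng.choice(self.arms)                      # explore
        return max(self.arms, key=lambda a: stats[a][1])      # exploit

    def update(self, situation, arm, reward):
        c = self.cluster(situation)
        stats = self.stats.setdefault(c, {a: [0, 0.0] for a in self.arms})
        n, mean = stats[arm]
        stats[arm] = [n + 1, mean + (reward - mean) / (n + 1)]

rng = random.Random(0)
bandit = SituationBandit(arms=["news", "sports", "music"])
# offline simulation: sports is better on weekends, news on weekdays
for _ in range(2000):
    day = rng.choice(["mon", "tue", "sat", "sun"])
    situation = {"day": day}
    arm = bandit.select(situation, rng)
    best = "sports" if day in ("sat", "sun") else "news"
    bandit.update(situation, arm, 1.0 if arm == best else 0.0)

print({c: max(s, key=lambda a: s[a][1]) for c, s in bandit.stats.items()})
```

Because statistics are kept per cluster, the bandit can learn a different best arm for each situation instead of averaging rewards across incompatible contexts.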
Luis Marujo, Márcio Viveiros, João Paulo da Silva Neto
This paper describes an enhanced automatic keyphrase extraction method applied to Broadcast News. The keyphrase extraction process is used to create a concept level for each news item, on top of the words produced by a speech recognition system and news indexation. It contributes to the generation of a tag/keyphrase cloud of the top news items included in a Multimedia Monitoring Solution system for TV and radio news/programs, which runs daily and monitors 12 TV channels and 4 radio stations.
The title compounds are synthesized by solid state reactions of Ba, Ga, Se, MCl (M: Cs, Rb, K), and BaCl2 in the ratio 5:10:20:2:1 (850 °C, 100 h).
We describe the Universal Recommender, a recommender system for semantic datasets that generalizes domain-specific recommenders such as content-based, collaborative, social, bibliographic, lexicographic, hybrid and other recommenders. In contrast to existing recommender systems, the Universal Recommender applies to any dataset that allows a semantic representation. We describe the scalable three-stage architecture of the Universal Recommender and its application to Internet Protocol Television (IPTV). To achieve good recommendation accuracy, several novel machine learning and optimization problems are identified. We finally give a brief argument supporting the need for machine learning recommenders.
This paper presents the principles of ontology-supported and ontology-driven conceptual navigation. Conceptual navigation realizes the independence between resources and links to facilitate interoperability and reusability. An engine builds dynamic links, assembles resources under an argumentative scheme and allows optimization with a possible constraint, such as the user's available time. Among several strategies, two are discussed in detail with examples of applications. On the one hand, conceptual specifications for linking and assembling are embedded in the resource meta-description with the support of the ontology of the domain to facilitate meta-communication. Resources are like agents looking for conceptual acquaintances with intention. On the other hand, the domain ontology and an argumentative ontology drive the linking and assembling strategies.
Muhammad Marwan Muhammad Fuad, Pierre-François Marteau
Similarity search is an important problem in information retrieval, and it is based on a distance measure. Symbolic representation of time series has attracted many researchers recently, since it reduces the dimensionality of these high-dimensional data objects. We propose a new distance metric that is applied to symbolic data objects, and we test it on time series databases in a classification task. We compare it to other distances for symbolic data objects that are well known in the literature. We also prove, mathematically, that our distance is a metric.
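A distance on symbolic sequences, and what it means for it to be a metric, can be illustrated with a small sketch. The symbol distance below (gap between alphabet positions, as in SAX-style representations) is an illustrative assumption, not the paper's proposed distance; the exhaustive check of the metric axioms on short sequences mirrors, informally, what the paper proves mathematically.

```python
# Illustrative sketch: a distance on symbolic sequences defined as the
# sum of a per-symbol distance. If the per-symbol distance is a metric,
# the sequence distance is a metric too. Alphabet and values are toys.

from itertools import product

ALPHABET = "abcd"

def symbol_dist(x, y):
    # distance between alphabet symbols: gap in their ordinal positions
    return abs(ALPHABET.index(x) - ALPHABET.index(y))

def seq_dist(s, t):
    assert len(s) == len(t), "sequences must be of equal length"
    return sum(symbol_dist(x, y) for x, y in zip(s, t))

# check the metric axioms exhaustively on all length-2 sequences
seqs = ["".join(p) for p in product(ALPHABET, repeat=2)]
for s, t in product(seqs, repeat=2):
    assert seq_dist(s, t) == seq_dist(t, s)                   # symmetry
    assert (seq_dist(s, t) == 0) == (s == t)                  # identity
for s, t, u in product(seqs, repeat=3):
    assert seq_dist(s, u) <= seq_dist(s, t) + seq_dist(t, u)  # triangle

print(seq_dist("ab", "cd"))  # → 4
```

Having a true metric matters for classification, since it allows indexing structures and pruning rules that rely on the triangle inequality.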
UNICORE (Uniform Interface to Computer Resources) is a software infrastructure supporting seamless and secure access to distributed resources. UNICORE allows uniform access to different hardware and software platforms as well as different organizational environments. Based on an abstract job model, it offers services for security, the translation of abstract jobs into real batch jobs for different target systems, and a public key infrastructure. This paper describes the UNICORE architecture and the services provided.