The article addresses the pressing problem of making library document collections visible on the Internet in order to provide reliable information about the events of the Great Patriotic War, to create high-quality information resources on military history, and to promote historical knowledge among the younger generation. Examples of various online resources about the Great Patriotic War are provided. Based on monitoring of the websites of central regional libraries, a selective review of online projects dedicated to the anniversary of the Great Victory is presented. However, identifying some of these projects is difficult (and not only for users), which leads to inaccuracies in their descriptions, information noise, and unreliable knowledge: most often, information is missing about the authors, creators, and partner organizations, as well as the imprint details of the digital publication. The author emphasizes that developing bibliographic culture and overcoming the anonymity of work in the digital environment are interconnected tasks that matter for the professional community. The rich potential of Russian libraries should be realized in this area.
Anusha M D, Deepthi Vikram, Bharathi Raja Chakravarthi
et al.
Tulu, a low-resource Dravidian language predominantly spoken in southern India, has limited computational resources despite its growing digital presence. This study presents the first benchmark dataset for Offensive Language Identification (OLI) in code-mixed Tulu social media content, collected from YouTube comments across various domains. The dataset, annotated with high inter-annotator agreement (Krippendorff's alpha = 0.984), includes 3,845 comments categorized into four classes: Not Offensive, Not Tulu, Offensive Untargeted, and Offensive Targeted. We evaluate a suite of deep learning models, including GRU, LSTM, BiGRU, BiLSTM, CNN, and attention-based variants, alongside transformer architectures (mBERT, XLM-RoBERTa). The BiGRU model with self-attention achieves the best performance with 82% accuracy and a 0.81 macro F1-score. Transformer models underperform, highlighting the limitations of multilingual pretraining in code-mixed, under-resourced contexts. This work lays the foundation for further NLP research in Tulu and similar low-resource, code-mixed languages.
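For illustration, here is a minimal sketch of a BiGRU text classifier with additive self-attention, in the spirit of the best-performing model described above; the vocabulary size, dimensions, and tokenization are placeholder assumptions, not details taken from the paper.

```python
# Hypothetical sketch of a BiGRU classifier with additive self-attention.
# Vocabulary size, dimensions, and the four classes are illustrative placeholders.
import torch
import torch.nn as nn

class BiGRUAttentionClassifier(nn.Module):
    def __init__(self, vocab_size=30000, embed_dim=128, hidden_dim=128, num_classes=4):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bigru = nn.GRU(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)            # one attention score per time step
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):                            # token_ids: (batch, seq_len)
        emb = self.embedding(token_ids)                      # (batch, seq_len, embed_dim)
        states, _ = self.bigru(emb)                          # (batch, seq_len, 2*hidden_dim)
        weights = torch.softmax(self.attn(states), dim=1)    # (batch, seq_len, 1)
        context = (weights * states).sum(dim=1)              # attention-pooled sentence vector
        return self.classifier(context)                      # logits over the classes

# Example: a batch of two dummy comments of length 10
logits = BiGRUAttentionClassifier()(torch.randint(1, 30000, (2, 10)))
```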
The purpose of the article is to examine the concept of content as a relevant term in the modern information sphere and to analyse the information content of different types of media Internet-resources. The research methodology consists of methods of analysis, synthesis, abstraction, generalisation, observation, information monitoring, and diagnostics. The scientific novelty lies in summarising existing knowledge and obtaining new knowledge about information content in media Internet-resources on the basis of theoretical analysis and practical research of popular media organisations represented in the digital environment. Conclusions. In today's media and digital environment, information content is an important element of an organisation's information policy. Analysis of regulatory sources shows that the legislative framework for content in Ukraine is currently insufficiently regulated. Theoretical analysis has shown that a significant challenge in working with content is the lack of a single definition of the term "content". However, most researchers treat content as the information content of any communication platform intended to meet the needs of users. In the digital environment, the key media for presenting information content are Internet-resources. It has been determined that the most effective channel for promoting content is social networks, as they provide wide coverage of the target audience. A characteristic feature of today's content is the teaser headline, a mandatory component for media that seek to increase their audience and popularity. A well-chosen and original teaser title can help attract new audiences and retain the existing one. The importance of the teaser headline should not be underestimated, as it is one of the key factors in the successful functioning of virtual media. Keywords: information content, media Internet-resources, digital environment, social networks, information and communication technologies.
The emergence of beauty vloggers is currently a trend in the world of beauty because they can influence the choice of skincare and make-up products through the video content they share. This study aims to identify the information seeking behavior of beauty vloggers in the Trenggalek Regency area as they meet their information needs in the process of creating video content, as well as the obstacles they face when searching for information and how they overcome them. The study uses a qualitative descriptive approach. Data were collected through interviews with three beauty vloggers, observation, and documentation, and analysed through data reduction, data presentation, and drawing conclusions. The results show that the information seeking behavior of beauty vloggers in the Trenggalek Regency area follows the model proposed by David Ellis, which comprises eight stages: Starting, Chaining, Browsing, Differentiating, Monitoring, Extracting, Verifying, and Ending, carried out via WhatsApp, TikTok, Instagram, and Google. The results also show that the obstacles beauty vloggers face fall into two categories: internal factors in the form of a lack of available information, and external factors in the form of inadequate internet networks. Obstacles arising from internal factors can be handled by re-searching for information, while external ones can be handled by postponing the information search or finding a place with adequate network access.
Bibliography. Library science. Information resources
The General Directorate of Taxes (Direction Générale des Impôts, DGI) is accelerating its digital transformation. Putting in place modern information management based on innovative technologies has become its strategic priority. It relies on digital tools to gain in productivity, performance and innovation. Yet the success of digital transformation depends on people before technology. Hence our research question: how is the digital acculturation of tax professionals' informational practices becoming an absolute necessity in these new digital environments? Our objective is to diagnose the level of appropriation of Internet-based monitoring (veille) as a core component of economic intelligence. The methodology adopted is exploratory, interpretative and qualitative. Semi-structured interviews were conducted with 70 senior managers working in the core business activities: tax management, collection, and control of reporting deficiencies. Among the most salient results, the average rate of need for acculturation is estimated at 85.55%.
Bibliography. Library science. Information resources
Neural networks are known to exploit spurious artifacts (or shortcuts) that co-occur with a target label, exhibiting heuristic memorization. On the other hand, networks have been shown to memorize training examples, resulting in example-level memorization. These kinds of memorization impede generalization of networks beyond their training distributions. Detecting such memorization could be challenging, often requiring researchers to curate tailored test sets. In this work, we hypothesize -- and subsequently show -- that the diversity in the activation patterns of different neurons is reflective of model generalization and memorization. We quantify the diversity in the neural activations through information-theoretic measures and find support for our hypothesis on experiments spanning several natural language and vision tasks. Importantly, we discover that information organization points to the two forms of memorization, even for neural activations computed on unlabelled in-distribution examples. Lastly, we demonstrate the utility of our findings for the problem of model selection. The associated code and other resources for this work are available at https://rachitbansal.github.io/information-measures.
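As a hedged illustration of the general idea (not the specific measures defined in the paper), one can discretize each neuron's activations over a batch of unlabelled examples and estimate a per-neuron entropy as a crude proxy for activation diversity:

```python
# Generic sketch: estimate the entropy of each neuron's activation distribution
# over a batch of (possibly unlabelled) in-distribution examples.
# This is an illustrative proxy, not the paper's exact information measures.
import numpy as np

def activation_entropy(activations, n_bins=10):
    """activations: (num_examples, num_neurons) array of post-activation values."""
    entropies = []
    for neuron in activations.T:                       # iterate over neurons
        hist, _ = np.histogram(neuron, bins=n_bins)
        p = hist / hist.sum()
        p = p[p > 0]
        entropies.append(-(p * np.log2(p)).sum())      # per-neuron entropy in bits
    return np.mean(entropies)                          # average diversity across neurons

# Low average entropy would suggest many neurons fire in near-identical patterns,
# a possible signature of memorization rather than generalization.
rng = np.random.default_rng(0)
print(activation_entropy(rng.normal(size=(512, 64))))
```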
Common information (CI) is ubiquitous in information theory and related areas such as theoretical computer science and discrete probability. However, because there are multiple notions of CI, a unified understanding of the deep interconnections between them is lacking. This monograph seeks to fill this gap by leveraging a small set of mathematical techniques that are applicable across seemingly disparate problems. In Part I, we review the operational tasks and properties associated with Wyner's and Gács-Körner-Witsenhausen's (GKW's) CI. In Part II, we discuss extensions of the former from the perspective of distributed source simulation. This includes the Rényi CI which forms a bridge between Wyner's CI and the exact CI. Via a surprising equivalence between the Rényi CI of order~$\infty$ and the exact CI, we demonstrate the existence of a joint source in which the exact CI strictly exceeds Wyner's CI. Other closely related topics discussed in Part II include the channel synthesis problem and the connection of Wyner's and exact CI to the nonnegative rank of matrices. In Part III, we examine GKW's CI with a more refined lens via the noise stability or NICD problem in which we quantify the agreement probability of extracted bits from a bivariate source. We then extend this to the $k$-user NICD and $q$-stability problems, and discuss various conjectures in information theory and discrete probability, such as the Courtade-Kumar, Li-Médard and Mossel-O'Donnell conjectures. Finally, we consider hypercontractivity and Brascamp-Lieb inequalities, which further generalize noise stability via replacing the Boolean functions therein by nonnegative functions. The key ideas behind the proofs in Part III can be presented in a pedagogically coherent manner and unified via information-theoretic and Fourier-analytic methods.
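For orientation, the two classical notions referenced above can be stated in their standard forms (textbook definitions, not quoted from the monograph):

```latex
% Wyner's CI: the least rate of a variable W that renders X and Y
% conditionally independent (Markov chain X - W - Y):
\[
  C_{\mathrm{W}}(X;Y) \;=\; \min_{P_{W\mid XY}\,:\, X - W - Y} I(X,Y;W).
\]
% Gacs-Korner-Witsenhausen CI: the most randomness extractable separately
% from X and from Y with zero error, and the standard sandwich inequality:
\[
  C_{\mathrm{GKW}}(X;Y) \;=\; \max_{W:\; H(W\mid X)=H(W\mid Y)=0} H(W),
  \qquad
  C_{\mathrm{GKW}}(X;Y) \;\le\; I(X;Y) \;\le\; C_{\mathrm{W}}(X;Y).
\]
```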
Stefano Alberto Russo, Sara Bertocco, Claudio Gheller
et al.
Rosetta is a science platform for resource-intensive, interactive data analysis which runs user tasks as software containers. It is built on top of a novel architecture that frames user tasks as microservices - independent and self-contained units - which makes it possible to fully support custom and user-defined software packages, libraries and environments. These include complete remote desktop and GUI applications, in addition to common analysis environments such as Jupyter Notebooks. Rosetta relies on Open Container Initiative containers, which allow for safe, effective and reproducible code execution; it can use a number of container engines and runtimes, and it seamlessly supports several workload management systems, thus enabling containerized workloads on a wide range of computing resources. Although developed in the astronomy and astrophysics domain, Rosetta can in principle support any science and technology domain where resource-intensive, interactive data analysis is required.
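As a generic illustration of the underlying idea only (this is not Rosetta's actual API), an interactive analysis environment can be launched as an isolated OCI container with the Docker SDK for Python; the image name, port, and resource limit below are placeholders:

```python
# Generic illustration: run an interactive analysis environment as an
# isolated, self-contained container. Image, port and limits are placeholders.
import docker

client = docker.from_env()
task = client.containers.run(
    "jupyter/base-notebook",        # any OCI image carrying the user's environment
    detach=True,                    # run as an independent background unit
    ports={"8888/tcp": 8888},       # expose the notebook's web interface
    mem_limit="2g",                 # per-task resource limit, as a scheduler might set
)
print(task.status, task.id)
```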
Purpose: The study's main purpose is to scrutinize Google Scholar profiles and find the answer to the question, "Do authors play fair or manipulate Google Scholar bibliometric indicators like the h-index and i10-index?" Design/methodology/approach: The authors scrutinized the Google Scholar profiles of the top 50 library and information science researchers claiming authorship of 21,022 publications. The bibliographic information of all 21,022 publications, such as authorship and subject details, was verified to identify accuracy, discrepancies and manipulation in their authorship claims. The actual and fabricated entries of all the authors, along with their citations, were recorded in Microsoft Office Excel 2007 for further analysis and interpretation using simple arithmetic calculations. Findings: The results show that the h-index of authors obtained from Google Scholar should not be accepted at face value, as variations exist in the publication count and citations, which ultimately affect their h-index and i10-index. The results reveal that the majority of the authors have variations in publication count (58%), citations (58%), h-index (42%) and i10-index (54%). The magnitude of variation in the number of publications, citations, h-index and i10-index is very high, especially for the top-ranked authors. Research limitations/implications: The scope of the study is strictly restricted to faculty members of library and information science and cannot be generalized across disciplines. Further, the scope of the study is limited to Google Scholar, and caution is needed when extending the results to other databases like Web of Science and Scopus. Practical implications: The study has practical implications for authors, publishers and academic institutions. Authors must stop unethical research practices; publishers must adopt techniques to overcome the problem; and academic institutions need to take precautions before hiring, recruiting, promoting and allocating resources to candidates on the face value of the Google Scholar h-index. Besides, Google needs to work on the weak areas of Google Scholar to improve its efficacy. Originality/value: The study brings to light new ways of manipulating bibliometric indicators like the h-index and i10-index provided by Google Scholar using false authorship claims.
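For reference, the two indicators at the centre of the study are computed from a profile's per-publication citation counts as follows; the citation counts in the example are invented, not data from the study:

```python
# Minimal sketch of how the two Google Scholar indicators are computed.
def h_index(citations):
    """Largest h such that at least h publications have >= h citations each."""
    counts = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(counts, start=1) if c >= rank)

def i10_index(citations):
    """Number of publications with at least 10 citations."""
    return sum(1 for c in citations if c >= 10)

profile = [120, 45, 33, 12, 10, 9, 4, 0]     # hypothetical citation counts
print(h_index(profile), i10_index(profile))  # -> 6 5
```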
Ananda Fernanda de Jesus, Fabiano Ferreira de Castro, Rogério Aparecido Sá Ramalho
Objective: A greater understanding of the role of libraries in Linked Data was sought, aiming to identify: 1) how the representation of information resources in Linked Data occurs; 2) how libraries can contribute to Linked Data; 3) how libraries can benefit from data published by other data sources. Methods: An exploratory analysis of the theme was carried out, based on a bibliographic survey in the Database in Information Science (BRAPCI) and in the Periodical Portal of the Coordination for the Improvement of Higher Education Personnel (CAPES). In addition, the official documents of the World Wide Web Consortium (W3C), the body responsible for monitoring the development of the Web and Linked Data, were analyzed. Results: It was identified that the cooperation process between libraries and information sources external to the bibliographic domain can take place in two roles, that of data publisher and that of data consumer. Each of these roles can be carried out independently, requires different decision-making and offers different advantages. Conclusions: Libraries should adopt the W3C recommendations to enhance the reuse of their linked data, and the institutions of the bibliographic domain need to mobilize to develop instruments and criteria that can underpin their role as data consumers.
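As a hypothetical sketch of the publisher role discussed above, a single bibliographic record could be exposed as Linked Data with rdflib and Dublin Core terms; the URI and field values are invented for illustration:

```python
# Hypothetical illustration: describe one bibliographic record as RDF triples
# and serialize it as Turtle, ready to be published and linked to other sources.
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS, RDF

g = Graph()
record = URIRef("http://example.org/catalog/record/123")   # placeholder URI
g.add((record, RDF.type, DCTERMS.BibliographicResource))
g.add((record, DCTERMS.title, Literal("Example Title")))
g.add((record, DCTERMS.creator, Literal("Example Author")))

print(g.serialize(format="turtle"))
```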
Objective. To represent library communication in the socio-cultural space of modern society. Methods. The methodology of the article was based on a set of general and special methods of scientific knowledge. Results. We substantiated the essence and content of the term "library communication" as a pre-planned activity in the socio-cultural space of modern society. The structural content elements of library communication are presented: goal, objective, essence, driving mechanism, purpose, implementation, basic tools, main communication product, indifferent components, sources, environment and specifics of functioning, levels, and result. We established that library practice today is constantly enriched with new phenomena and with concepts to denote them, for which library science should formulate appropriate terms and provide theoretical and methodological justification. Conclusions. It is argued that, in the context of the theory of social communications, the institutionalization of the term "library communication" as a separate type of professional communication and as a driving force of the transformation of the library business is relevant.
Bibliography. Library science. Information resources
The widespread use of artificial intelligence (AI) in many domains has revealed numerous ethical issues from data and design to deployment. In response, countless broad principles and guidelines for ethical AI have been published, and following those, specific approaches have been proposed for how to encourage ethical outcomes of AI. Meanwhile, library and information services too are seeing an increase in the use of AI-powered and machine learning-powered information systems, but no practical guidance currently exists for libraries to plan for, evaluate, or audit the ethics of intended or deployed AI. We therefore report on several promising approaches for promoting ethical AI that can be adapted from other contexts to AI-powered information services and in different stages of the software lifecycle.
In this paper, we focus on the convex mutual information, which arises at the lowest-level split in multilevel coding schemes for communication over the additive white Gaussian noise (AWGN) channel. Theoretical analysis shows that the achievable rates (ARs) of communication are not necessarily below the mutual information in the convex region. In addition, simulation results are provided as supporting evidence.
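For context, the standard chain-rule decomposition that underlies the per-level rate split in multilevel coding (a textbook identity, not a result of this paper) reads:

```latex
% The channel mutual information splits into per-level terms; the lowest
% level corresponds to the first term in the sum, I(B_1;Y).
\[
  I(X;Y) \;=\; I(B_1,\dots,B_m;Y)
        \;=\; \sum_{i=1}^{m} I\bigl(B_i;Y \mid B_1,\dots,B_{i-1}\bigr),
\]
% where the AWGN-channel input label X is addressed by the bits
% B_1,\dots,B_m of the individual coding levels.
```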
A new approach to teaching web source evaluation is necessary for an internet that is increasingly littered with sources of questionable merit and motivation. Initially pioneered by K–12 educational specialists, the journalistic model avoids the cognitive duality of the checklist and a reliance on opaque terms and concepts. Instead, it recommends students apply the six journalistic questions of what, who, where, when, why, and how when evaluating freely available web sources. This approach outlines an evaluative procedure that is open-ended, discursive, and analytic in nature as opposed to formulaic and binaristic. It also requires students to consider both the context of the information need and a source’s potential use as central to its evaluation.
Bibliography. Library science. Information resources, Information resources (General)
Human-robot interaction is becoming an interesting area of research in cognitive science, notably for the study of social cognition. Interaction theorists consider primary intersubjectivity a non-mentalist, pre-theoretical, non-conceptual sort of process that grounds a certain level of communication and understanding and supports higher-level cognitive skills. We argue that this sort of low-level cognitive interaction, where control is shared in dyadic encounters, is amenable to study with neural robots. Hence, in this work we pursue three main objectives. Firstly, starting from the concept of active inference, we study primary intersubjectivity as a second-person-perspective experience characterized by predictive engagement, where perception, cognition and action are accounted for by a hermeneutic circle in dyadic interaction. Secondly, we propose an open-source methodology named the neural robotics library (NRL) for experimental human-robot interaction, together with a demonstration program for interacting in real time with a virtual Cartesian robot (VCBot). Lastly, through a case study, we discuss some ways human-robot (hybrid) intersubjectivity can contribute to human science research, such as in the fields of developmental psychology, educational technology and cognitive rehabilitation.
Kunio M. Sayanagi, Cindy L. Young, Lynn Bowman
et al.
We advocate for a mission concept study for a space telescope dedicated to solar system science in Earth orbit. Such a study was recommended by the Committee on Astrobiology and Planetary Science (CAPS) report "Getting Ready for the Next Planetary Science Decadal Survey." The Mid-Decadal Review also recommended that NASA assess the role and value of space telescopes for planetary science. The need for high-resolution, UV-Visible capabilities is especially acute for planetary science with the impending end of the Hubble Space Telescope (HST); however, NASA has not funded a planetary telescope concept study, and the need to assess its value remains. Here, we present potential design options that should be explored to inform the decadal survey.
Alberto Baccini, Lucio Barabesi, Mahdi Khelfaoui
et al.
This paper explores, using suitable quantitative techniques, to what extent the intellectual proximity among scholarly journals is also a proximity in terms of the social communities gathered around the journals. Three fields are considered: statistics, economics, and information and library sciences. Co-citation networks (CC) represent the intellectual proximity among journals. The academic communities around the journals are represented by the networks of journals generated by authors writing in more than one journal (interlocking authorship: IA) and by the networks generated by scholars sitting on the editorial boards of more than one journal (interlocking editorship: IE). For comparing the whole structure of the networks, dissimilarity matrices are considered. The CC, IE and IA networks appear to be correlated for all three fields. The strongest correlation is between CC and IA in all three fields; lower and similar correlations are obtained for CC and IE, and for IE and IA. The CC, IE and IA networks are then partitioned into communities. Information and library sciences is the field where communities are most easily detectable, while the most difficult field is economics. The degrees of association among the detected communities show that they are not independent. For all fields, the strongest association is between the CC and IA networks; the minimum level of association is between IE and CC. Overall, these results indicate that intellectual proximity is also a proximity among the authors and the editors of the journals. Thus, the three maps of editorial power, intellectual proximity and author communities tell similar stories.
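As a generic sketch of one way such whole-network comparisons can be made (not necessarily the paper's exact procedure), two journal-by-journal dissimilarity matrices can be correlated over their upper triangles, in the spirit of a Mantel-type test:

```python
# Generic sketch: correlate the upper triangles of two dissimilarity matrices
# defined over the same set of journals. Data below are invented placeholders.
import numpy as np

def upper_triangle(m):
    i, j = np.triu_indices_from(m, k=1)
    return m[i, j]

def structural_correlation(dissim_a, dissim_b):
    """Pearson correlation between two journal-by-journal dissimilarity matrices."""
    return np.corrcoef(upper_triangle(dissim_a), upper_triangle(dissim_b))[0, 1]

# Toy 4-journal example with made-up symmetric dissimilarities
rng = np.random.default_rng(1)
a = rng.random((4, 4)); a = (a + a.T) / 2; np.fill_diagonal(a, 0)
b = rng.random((4, 4)); b = (b + b.T) / 2; np.fill_diagonal(b, 0)
print(structural_correlation(a, b))
```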