Results for "Bibliography. Library science. Information resources"

Showing 20 of ~18658169 results · from DOAJ, arXiv, CrossRef

arXiv Open Access 2024
AMIDER: A Multidisciplinary Research Database and Its Application to Promote Open Science

Masayoshi Kozai, Yoshimasa Tanaka, Shuji Abe et al.

AMIDER (Advanced Multidisciplinary Integrated-Database for Exploring new Research) is a newly developed research data catalog that demonstrates an advanced database application. AMIDER is a multidisciplinary database equipped with a user-friendly web application. Its catalog view displays diverse research data at once, across the boundaries of individual disciplines. Useful functions, such as selectable data download, data format conversion, and display of data visual information, are implemented. More advanced functions, such as visualization of mutual relationships between datasets, are implemented as a preliminary trial. These characteristics and functions are expected to improve access to individual research data, even for non-expert users, and to help collaborations among diverse scientific fields. Multidisciplinary data management is another distinctive feature of AMIDER: various metadata schemas can be mapped to a uniform metadata table, and standardized, self-describing data formats are adopted. The AMIDER website (https://amider.rois.ac.jp/) was launched in April 2024. As of July 2024, over 15,000 metadata records in various research fields of polar science have been registered in the database, and approximately 500 visitors view the website every day on average. Expansion of the database to further scientific fields beyond polar science is planned, and advanced approaches, such as applying Natural Language Processing (NLP) to metadata, have also been considered.

en cs.DB
DOAJ Open Access 2023
Impacts of Heuristic-Systematic Clues on Health Information Adoption of Mobile Short Video Apps: Based on SEM and fsQCA

LI Li, HAN Ping, ZHANG Hong, ZHANG Weijuan

[Purpose/Significance] Mobile short video apps like TikTok have become one of the most frequently used tools for obtaining and exchanging health information in the post-epidemic era. This study analyzed the complex relationships between the antecedents and users' health information adoption in mobile short video apps in terms of necessary and sufficient conditions, in order to systematically optimize users' experience with health information adoption, which in turn promotes users' sustained use. [Method/Process] Based on both the information adoption model (IAM) and the heuristic-systematic model (HSM), this study constructed a health information adoption model for mobile short videos, which included five predictor variables (content quality, expression quality, information source credibility, app's reputation, perceived information usefulness) and an outcome variable (health information adoption). The construct validity, reliability, and symmetrical relationships between the five predictor variables and the outcome variable were examined using structural equation modeling (SEM). Fuzzy-set qualitative comparative analysis (fsQCA) was used to examine and reveal the configuration models. [Results/Conclusions] The results from PLS-SEM and fsQCA validated the IAM and HSM in the context of mobile short video apps. Specifically, the PLS-SEM results show that systematic and heuristic clues jointly affect users' adoption of health information. Content quality, app's reputation and information resource credibility significantly impact perceived information usefulness, and further affect health information adoption. Expression quality has a significant effect on content quality, but no obvious effects on perceived information usefulness or health information adoption.
The results of fsQCA reveal three configurations that lead to users' health information adoption (e.g., content quality * expression quality * perceived information usefulness * information resource credibility), and another three configurations that lead to users' non-adoption (e.g., ~content quality * ~perceived information usefulness * ~information resource credibility), in which the latter configurations are asymmetric with the former. The configurations where systematic clues are predominant (S1a and S1b) are more sufficient for promoting users' health information adoption. N1 (~content quality * ~perceived information usefulness * ~information resource credibility) is the most sufficient configuration path for non-adoption. This study aims to explore users' adoption and non-adoption decision-making processes in the growing context of mobile short video apps like TikTok, in order to offer helpful advice for the better management of these apps, which will eventually optimize users' experience with health information behavior. Users engage in many kinds of information activities, such as information sharing, retweeting, and adoption, and relationships between these activities may exist. Due to limited length, we did not consider users' other information behaviors (e.g., retweeting) or the relationships between them in this context, which could be directions for further research.

Bibliography. Library science. Information resources, Agriculture
arXiv Open Access 2022
Generalised Mutual Information for Discriminative Clustering

Louis Ohl, Pierre-Alexandre Mattei, Charles Bouveyron et al.

In the last decade, successes in deep clustering have largely involved mutual information (MI) as an unsupervised objective for training neural networks with increasing regularisations. While the quality of the regularisations has been widely discussed as a route to improvement, little attention has been dedicated to the relevance of MI as a clustering objective. In this paper, we first highlight how the maximisation of MI does not lead to satisfying clusters. We identify the Kullback-Leibler divergence as the main reason for this behaviour. Hence, we generalise the mutual information by changing its core distance, introducing the generalised mutual information (GEMINI): a set of metrics for unsupervised neural network training. Unlike MI, some GEMINIs do not require regularisations when training. Some of these metrics are geometry-aware thanks to distances or kernels in the data space. Finally, we highlight that GEMINIs can automatically select a relevant number of clusters, a property that has been little studied in the deep clustering context, where the number of clusters is a priori unknown.

en stat.ML, cs.AI
arXiv Open Access 2022
Function Computation Without Secure Links: Information and Leakage Rates

Remi A. Chou, Joerg Kliewer

Consider $L$ users, who each hold private data, and one fusion center that must compute a function of the private data of the $L$ users. To accomplish this task, each user may utilize a public and noiseless broadcast channel in a non-interactive manner. In this setting, and in the absence of any additional resources such as secure links, we study the optimal communication rates and minimum information leakages on the private user data that are achievable. Specifically, we study the information leakage of the user data at the fusion center (beyond the knowledge of the function output), as well as at predefined groups of colluding users who eavesdrop on one another. We derive the capacity region when the user data is independent, and inner and outer regions for the capacity region when the user data is correlated.

en cs.IT
arXiv Open Access 2022
Optimal service resource management strategy for IoT-based health information system considering value co-creation of users

Ji Fang, Vincent CS Lee, Haiyan Wang

This paper explores an optimal service resource management strategy, a continuing challenge for health information services that seek to enhance service performance, optimise service resource utilisation and deliver interactive health information service. An adaptive optimal service resource management strategy was developed considering a value co-creation model in health information service, with a focus on collaboration and interaction with users. A deep reinforcement learning algorithm was embedded in the Internet of Things (IoT)-based health information service system (I-HISS) to allocate service resources by controlling service provision and service adaptation based on user engagement behaviour. Simulation experiments were conducted to evaluate the significance of the proposed algorithm under different user reactions to the health information service.

en cs.LG, math.OC
DOAJ Open Access 2021
Do I Have To Be An “Other” To Be Myself? Exploring Gender Diversity In Taxonomy, Data Collection, And Through The Research Data Lifecycle

Ari Gofman, Sam A. Leif, Hannah Gunderman et al.

Objective: Existing studies estimate that between 0.3% and 2% of adults in the U.S. (between 900,000 and 2.6 million in 2020) identify as a nonbinary gender or otherwise gender nonconforming. In response to the RDAP 2021 theme of radical change, this article examines the need to change how datasets represent nonbinary persons and how research involving gender data should approach the curation of this data at each stage of the research lifecycle. Methods: In this article, we examine some of the known challenges of gender inclusion in datasets and summarize some solutions underway. Using a critical lens, we examine the difference between current practice and inclusive practice in gender representation, describing inclusive practices at each stage of the research lifecycle from writing a data management plan to sharing data. Results: Data structures that limit gender to “male” and “female” or ontological structures that use mapping to collapse gender demographics to binary values exclude nonbinary and gender diverse populations. Some data collection instruments attempt inclusivity by adding the gender category of “other,” but using the “other” gender category labels nonbinary persons as intrinsically alien. Inclusive change must go farther, to move from alienation to inclusive categories. We describe several techniques for inclusively representing gender in data, from the data management planning stage, to collecting data, cleaning data, and sharing data. To facilitate better sharing of gender data, repositories must also allow mapping that includes nonbinary genders explicitly and allow for ontological mapping for long-term representation of diverse gender identities. Conclusions: A good practice during research design is to consider two levels of critique in the data collection plan. First, consider the research question at hand and remove unnecessary gendering from the data. 
Secondly, if the research question needs gender, make sure to include nonbinary genders explicitly. Allies must take on this problem without leaving it to those who are most affected by it. Further, more voices calling for inclusionary practices surrounding data rise to a crescendo that cannot be ignored.

Bibliography. Library science. Information resources
arXiv Open Access 2021
Information-Theoretic Limits for Steganography in Multimedia

Hassan Y. El-Arsh, Amr Abdelaziz, Ahmed Elliethy et al.

Steganography is the art and science of hiding data within innocent-looking objects (cover objects). Multimedia objects such as images and videos are an attractive type of cover object due to their high embedding rates. Many techniques for performing steganography exist in both the literature and practice. Meanwhile, the definition of the steganographic capacity for multimedia, and how it should be calculated, have not received full attention. In this paper, for multivariate quantized-Gaussian-distributed multimedia, we study the maximum achievable embedding rate with respect to the statistical properties of cover objects against the maximum achievable performance by any steganalytic detector. Toward this goal, we evaluate the maximum allowed entropy of the hidden message source subject to the maximum probability of error of the steganalytic detector, which is bounded by the KL-divergence between the statistical distributions for the cover and the stego objects. We give the exact scaling constant that governs the relationship between the entropies of the hidden message and the cover object.

en cs.CR, cs.IT
arXiv Open Access 2021
"What can I cook with these ingredients?" -- Understanding cooking-related information needs in conversational search

Alexander Frummet, David Elsweiler, Bernd Ludwig

As conversational search becomes more pervasive, it becomes increasingly important to understand the user's underlying information needs when they converse with such systems in diverse domains. We conduct an in-situ study to understand information needs arising in a home cooking context as well as how they are verbally communicated to an assistant. A human experimenter plays this role in our study. Based on the transcriptions of utterances, we derive a detailed hierarchical taxonomy of diverse information needs occurring in this context, which require different levels of assistance to be solved. The taxonomy shows that needs can be communicated through different linguistic means and require different amounts of context to be understood. In a second contribution we perform classification experiments to determine the feasibility of predicting the type of information need a user has during a dialogue using the turn provided. For this multi-label classification problem, we achieve average F1 measures of 40% using BERT-based models. We demonstrate with examples, which types of need are difficult to predict and show why, concluding that models need to include more context information in order to improve both information need classification and assistance to make such systems usable.

en cs.IR
arXiv Open Access 2021
A Class of Nonbinary Symmetric Information Bottleneck Problems

Michael Dikshtein, Shlomo Shamai

We study two dual settings of information processing. Let $ \mathsf{Y} \rightarrow \mathsf{X} \rightarrow \mathsf{W} $ be a Markov chain with fixed joint probability mass function $ \mathsf{P}_{\mathsf{X}\mathsf{Y}} $ and a mutual information constraint on the pair $ (\mathsf{W},\mathsf{X}) $. For the first problem, known as Information Bottleneck, we aim to maximize the mutual information between the random variables $ \mathsf{Y} $ and $ \mathsf{W} $, while for the second problem, termed as Privacy Funnel, our goal is to minimize it. In particular, we analyze the scenario for which $ \mathsf{X} $ is the input, and $ \mathsf{Y} $ is the output of modulo-additive noise channel. We provide analytical characterization of the optimal information rates and the achieving distributions.
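The quantity that both problems optimize, the mutual information between $\mathsf{Y}$ and $\mathsf{W}$ over the chain $\mathsf{Y} \rightarrow \mathsf{X} \rightarrow \mathsf{W}$, can be evaluated numerically for small alphabets. The following is a minimal sketch, not the authors' method: the function names `mutual_information` and `joint_y_w`, the list-of-lists representation, and the binary example values are all illustrative assumptions. It only evaluates the objective for a given conditional pmf of W given X; it does not solve either optimization.

```python
from math import log2


def mutual_information(joint):
    """Mutual information (in bits) of a joint pmf given as joint[a][b]."""
    p_a = [sum(row) for row in joint]        # marginal of the first variable
    p_b = [sum(col) for col in zip(*joint)]  # marginal of the second variable
    total = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:  # terms with zero probability contribute nothing
                total += p * log2(p / (p_a[a] * p_b[b]))
    return total


def joint_y_w(p_xy, p_w_given_x):
    """Joint pmf of (Y, W) for the Markov chain Y -> X -> W.

    p_xy[x][y] is the fixed joint pmf of (X, Y); p_w_given_x[x][w] is the
    conditional pmf of W given X (hypothetical representation).
    """
    n_x, n_y, n_w = len(p_xy), len(p_xy[0]), len(p_w_given_x[0])
    return [
        [sum(p_xy[x][y] * p_w_given_x[x][w] for x in range(n_x))
         for w in range(n_w)]
        for y in range(n_y)
    ]


# Illustrative values (assumed): uniform binary X, and Y = X + noise (mod 2)
# with crossover probability 0.1.
p_xy = [[0.45, 0.05], [0.05, 0.45]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # W = X, so W keeps everything X knows about Y
constant = [[1.0, 0.0], [1.0, 0.0]]  # W ignores X, so W reveals nothing about Y

i_yw_identity = mutual_information(joint_y_w(p_xy, identity))
i_yw_constant = mutual_information(joint_y_w(p_xy, constant))
```

The two extreme choices of W bracket the trade-off: the Information Bottleneck problem pushes I(Y;W) up and the Privacy Funnel pushes it down, each subject to the constraint on the pair (W, X) stated in the abstract.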

en cs.IT
arXiv Open Access 2020
A Graph Symmetrisation Bound on Channel Information Leakage under Blowfish Privacy

Tobias Edwards, Benjamin I. P. Rubinstein, Zuhe Zhang et al.

Blowfish privacy is a recent generalisation of differential privacy that enables improved utility while maintaining privacy policies with semantic guarantees, a factor that has driven the popularity of differential privacy in computer science. This paper relates Blowfish privacy to an important measure of privacy loss of information channels from the communications theory community: min-entropy leakage. Symmetry in an input data neighbouring relation is central to known connections between differential privacy and min-entropy leakage. But while differential privacy exhibits strong symmetry, Blowfish neighbouring relations correspond to arbitrary simple graphs owing to the framework's flexible privacy policies. To bound the min-entropy leakage of Blowfish-private mechanisms we organise our analysis over symmetrical partitions corresponding to orbits of graph automorphism groups. A construction meeting our bound with asymptotic equality demonstrates tightness.

en cs.IT, cs.CR
arXiv Open Access 2019
Multi-Server Private Information Retrieval with Coded Side Information

Fatemeh Kazemi, Esmaeil Karimi, Anoosheh Heidarzadeh et al.

In this paper, we study the multi-server setting of the \emph{Private Information Retrieval with Coded Side Information (PIR-CSI)} problem. In this problem, there are $K$ messages replicated across $N$ servers, and there is a user who wishes to download one message from the servers without revealing any information to any server about the identity of the requested message. The user has a side information which is a linear combination of a subset of $M$ messages in the database. The parameter $M$ is known to all servers in advance, whereas the indices and the coefficients of the messages in the user's side information are unknown to any server \emph{a priori}. We focus on a class of PIR-CSI schemes, referred to as \emph{server-symmetric schemes}, in which the queries/answers to/from different servers are symmetric in structure. We define the \emph{rate} of a PIR-CSI scheme as its minimum download rate among all problem instances, and define the \emph{server-symmetric capacity} of the PIR-CSI problem as the supremum of rates over all server-symmetric PIR-CSI schemes. Our main results are as follows: (i) when the side information is not a function of the user's requested message, the capacity is given by ${(1+{1}/{N}+\dots+{1}/{N^{\left\lceil \frac{K}{M+1}\right\rceil -1}})^{-1}}$ for any ${1\leq M\leq K-1}$; and (ii) when the side information is a function of the user's requested message, the capacity is equal to $1$ for $M=2$ and $M=K$, and it is equal to ${N}/{(N+1)}$ for any ${3 \leq M \leq K-1}$. The converse proofs rely on new information-theoretic arguments, and the achievability schemes are inspired by our recently proposed scheme for single-server PIR-CSI as well as the Sun-Jafar scheme for multi-server PIR.
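The two capacity expressions quoted in the abstract can be evaluated for concrete parameters. The sketch below simply transcribes those formulas using exact rational arithmetic; the function names `capacity_case_i` and `capacity_case_ii` are illustrative assumptions, and parameter combinations the abstract does not cover raise an error rather than being guessed.

```python
from fractions import Fraction


def capacity_case_i(K, M, N):
    """Case (i): side information is not a function of the requested message.

    Evaluates (1 + 1/N + ... + 1/N^(ceil(K/(M+1)) - 1))^(-1) for 1 <= M <= K-1.
    """
    if not 1 <= M <= K - 1:
        raise ValueError("case (i) is stated for 1 <= M <= K - 1")
    num_terms = -(-K // (M + 1))  # integer ceil(K / (M + 1))
    return 1 / sum(Fraction(1, N ** i) for i in range(num_terms))


def capacity_case_ii(K, M, N):
    """Case (ii): side information is a function of the requested message."""
    if M == 2 or M == K:
        return Fraction(1)
    if 3 <= M <= K - 1:
        return Fraction(N, N + 1)
    raise ValueError("case (ii) is stated for M = 2, M = K, or 3 <= M <= K - 1")


# Example: K = 4 messages, N = 2 servers, M = 1.
# ceil(4 / 2) = 2 terms, so the capacity is 1 / (1 + 1/2) = 2/3.
example_i = capacity_case_i(K=4, M=1, N=2)
example_ii = capacity_case_ii(K=5, M=3, N=2)
```

Using `Fraction` keeps the results exact, which makes it easy to compare values across different choices of $K$, $M$, and $N$.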

en cs.IT
arXiv Open Access 2019
Information-Theoretic Perspective of Federated Learning

Linara Adilova, Julia Rosenzweig, Michael Kamp

An approach to distributed machine learning is to train models on local datasets and aggregate these models into a single, stronger model. A popular instance of this form of parallelization is federated learning, where the nodes periodically send their local models to a coordinator that aggregates them and redistributes the aggregation back to continue training with it. The most frequently used form of aggregation is averaging the model parameters, e.g., the weights of a neural network. However, due to the non-convexity of the loss surface of neural networks, averaging can lead to detrimental effects and it remains an open question under which conditions averaging is beneficial. In this paper, we study this problem from the perspective of information theory: We measure the mutual information between representation and inputs as well as representation and labels in local models and compare it to the respective information contained in the representation of the averaged model. Our empirical results confirm previous observations about the practical usefulness of averaging for neural networks, even if local dataset distributions vary strongly. Furthermore, we obtain more insights about the impact of the aggregation frequency on the information flow and thus on the success of distributed learning. These insights will be helpful both in improving the current synchronization process and in further understanding the effects of model aggregation.
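The aggregation step described above, averaging the parameters of the local models, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: each model is represented as a dictionary mapping a parameter name to a flat list of floats, the average is unweighted, and the name `average_models` is hypothetical.

```python
def average_models(local_models):
    """Return the element-wise, unweighted average of the local models' parameters.

    Each model is a dict mapping a parameter name to a flat list of floats
    (an assumed representation); all models must share names and shapes.
    """
    if not local_models:
        raise ValueError("need at least one local model")
    n = len(local_models)
    averaged = {}
    for name in local_models[0]:
        # Group the i-th entry of this parameter across all local models.
        columns = zip(*(model[name] for model in local_models))
        averaged[name] = [sum(values) / n for values in columns]
    return averaged


# Two hypothetical local models sent to the coordinator.
node_a = {"w": [1.0, 3.0], "b": [0.0]}
node_b = {"w": [3.0, 5.0], "b": [2.0]}
global_model = average_models([node_a, node_b])
```

In a federated round, the coordinator would redistribute `global_model` to the nodes, which continue training from it. How often this happens is the aggregation frequency whose effect the abstract discusses.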

en cs.LG, cs.DC
DOAJ Open Access 2018
Eye Tracking Method in Human-Computer Interaction: Assessing the Interaction based on the Eye Movement Data

Mahdi Zahedi Nooghabi, Rahmatollah Fattahi, Javad Salehi Fadardi et al.

Nowadays most of the day-to-day services we receive are based on computer systems. Services such as information searching and online shopping are among the most frequently used services of online information systems. Users assess and process the information they receive from information systems. The theory of mind information processing asserts that humans process and analyze the information they receive from their environment. This theory also deals with perception and recognition. The user interface paves the way for end users to use the system and reach their goals. If the user interface is designed properly, the path by which the user reaches his/her goal is a logical one; otherwise, the lack of coherence results in system misuse. In other words, if the user interface grabs the user’s attention, the interaction with the user will be successful. Many methods have been proposed for studying human-computer interaction. Eye tracking is one of them. This method makes it possible to gather qualitative and quantitative data in this regard. Eyesight is very important to human perception, so its data can be invaluable. The basis of this method is the mind-eye theory, which says that eye movements can show a person’s attention to a picture or stimulus. This attention could indicate interest or a problem. There are different types of eye movement, such as fixations and saccades. Eye tracking delivers voluminous data about users’ attention in the form of quick and unconscious processes. Analyzing eye-movement data is hard, and extracting the data is tedious. Moreover, the test environment and users’ health are also of great importance. Finally, eye-movement data is invaluable for assessing the bottom-up cognition of the world and the top-down perception of mind.

Bibliography. Library science. Information resources
arXiv Open Access 2018
Supporting High-Performance and High-Throughput Computing for Experimental Science

E. A. Huerta, Roland Haas, Shantenu Jha et al.

The advent of experimental science facilities (instruments and observatories such as the Large Hadron Collider, the Laser Interferometer Gravitational Wave Observatory, and the upcoming Large Synoptic Survey Telescope) has brought about challenging, large-scale computational and data processing requirements. Traditionally, the computing infrastructure supporting these facilities' requirements was organized into separate infrastructures: those that supported their high-throughput needs and those that supported their high-performance computing needs. We argue that to enable and accelerate scientific discovery at the scale and sophistication that is now needed, this separation between high-performance computing and high-throughput computing must be bridged and an integrated, unified infrastructure provided. In this paper, we discuss several case studies where such infrastructure has been implemented. These case studies span different science domains, software systems, and application requirements as well as levels of sustainability. A further aim of this paper is to provide a basis to determine the common characteristics and requirements of such infrastructure, as well as to begin a discussion of how best to support the computing requirements of existing and future experimental science facilities.

en cs.DC, astro-ph.HE
DOAJ Open Access 2017
Stigmatization as One of the Means of Manipulative Influence of Mass Media on the Audience

M. V. Smyrnova

This article reviews publications that investigate the features of the stigmatization process, its causes, and its consequences. The understanding of the stigmatizer is described in two aspects: as a participant in the stigmatization process at the individual level and at the societal level. A definition of the concepts of stigma and stigmatization in social communications is proposed. Stigmatization is considered both as a consequence of errors in the audience's perception of information from the mass media and as a means of the media's manipulative influence on public consciousness. Types of stigmatization are identified on the basis of previous classifications, and the presence of these types in top-rated all-Ukrainian online publications is analyzed.

Bibliography. Library science. Information resources
arXiv Open Access 2017
Compressed Sensing with Prior Information via Maximizing Correlation

Xu Zhang, Wei Cui, Yulong Liu

Compressed sensing (CS) with prior information concerns the problem of reconstructing a sparse signal with the aid of a similar signal which is known beforehand. We consider a new approach to integrate the prior information into CS via maximizing the correlation between the prior knowledge and the desired signal. We then present a geometric analysis for the proposed method under sub-Gaussian measurements. Our results reveal that if the prior information is good enough, then the proposed approach can improve the performance of the standard CS. Simulations are provided to verify our results.

en cs.IT
