Popescu-Rohrlich box fraction of nonobjective information and distinguishing quantum theory
Chellasamy Jebarathinam
It is demonstrated that identifying information-theoretic limitations of quantum Bell nonlocality alone cannot completely distinguish quantum theory from generalized nonsignaling theories. To this end, an information-theoretic concept of certifying nonobjective information through the Popescu-Rohrlich box fraction is employed. This demonstration also yields a partial answer, beyond the one provided by the principle of information causality alone, to the question of what distinguishes quantum theory from generalized nonsignaling theories: the postquantum models singled out by information causality are isolated from the other nonsignaling models by the emergence of the Popescu-Rohrlich box fraction of nonobjective information in the Bell-local boxes of such a model.
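For reference, the Popescu-Rohrlich (PR) box invoked above is the extremal nonsignaling correlation with binary inputs $x, y$ and outputs $a, b$ given by
$$P_{\mathrm{PR}}(a, b \mid x, y) = \begin{cases} 1/2, & a \oplus b = x \wedge y,\\ 0, & \text{otherwise}, \end{cases}$$
which attains the algebraic maximum of the CHSH expression (value 4), beyond the quantum Tsirelson bound of $2\sqrt{2}$. The "PR-box fraction" of a nonsignaling box is, loosely, the weight carried by this extremal box in a decomposition of the given box (standard background, not a result of the paper).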
SCNR Maximization for MIMO ISAC Assisted by Fluid Antenna System
Yuqi Ye, Li You, Hao Xu
et al.
The integrated sensing and communication (ISAC) technology has been extensively researched to enhance communication rates and radar sensing capabilities. In addition, a new technology known as the fluid antenna system (FAS) has recently been proposed to achieve higher communication rates in future wireless networks by dynamically adjusting the antenna position to obtain more favorable channel conditions. The application of FAS technology in ISAC scenarios therefore holds significant research potential. In this paper, we investigate a FAS-assisted multiple-input multiple-output (MIMO) ISAC system that maximizes the radar sensing signal-to-clutter-plus-noise ratio (SCNR) under communication signal-to-interference-plus-noise ratio (SINR) and antenna position constraints. We devise an iterative algorithm that tackles the optimization problem by maximizing a lower bound of the SCNR with respect to the transmit precoding matrix and the antenna position. By addressing the non-convexity of the problem through this iterative approach, our method significantly improves the SCNR. Simulation results demonstrate that the proposed scheme achieves a higher SCNR than the baselines.
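A minimal sketch of the alternating structure described above, under strong simplifying assumptions (a toy position-dependent sensing channel, a surrogate SCNR with a fixed clutter-plus-noise level, a single scalar antenna-position variable, and no communication SINR constraint); it is not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
n_tx, n_rx = 4, 4
# Fixed random scattering component of the toy target response.
scatter = (rng.standard_normal((n_rx, n_tx)) + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)

def sensing_channel(pos):
    """Toy target response whose transmit side is shifted by the scalar
    fluid-antenna position 'pos' (purely illustrative)."""
    steering = np.exp(1j * 2 * np.pi * pos * np.arange(n_tx))
    return scatter @ np.diag(steering)

def surrogate_scnr(W, H, clutter_plus_noise=1.0):
    """Received target energy over a fixed clutter-plus-noise level."""
    return np.real(np.trace(H @ W @ W.conj().T @ H.conj().T)) / clutter_plus_noise

P_max, pos = 1.0, 0.0
W = np.sqrt(P_max) * np.eye(n_tx, 1, dtype=complex)   # rank-1 initial precoder
for _ in range(10):
    # Precoder step: put all power on the dominant right singular vector of H(pos).
    _, _, Vh = np.linalg.svd(sensing_channel(pos))
    W = np.sqrt(P_max) * Vh.conj().T[:, :1]
    # Position step: one-dimensional grid search (a stand-in for the paper's update).
    grid = np.linspace(0.0, 1.0, 101)
    pos = max(grid, key=lambda p: surrogate_scnr(W, sensing_channel(p)))

print(f"position ~ {pos:.2f}, surrogate SCNR ~ {surrogate_scnr(W, sensing_channel(pos)):.3f}")
```

In the paper's setting, the precoder step would additionally enforce the communication SINR constraints and the position update would respect the feasible antenna region.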
A Survey of Datasets for Information Diffusion Tasks
Fuxia Guo, Xiaowen Wang, Yanwei Xie
et al.
Information diffusion across various new media platforms increasingly influences the perceptions, decisions, and social behaviors of individual users. In communication studies, the well-known Five W's of Communication model (5W Model) clearly describes the process of information diffusion. Although plenty of studies and corresponding datasets about information diffusion have emerged, a systematic categorization of tasks and an integration of datasets are still lacking. To address this gap, we present a systematic taxonomy of information diffusion tasks and datasets based on the "5W Model" framework. We first categorize information diffusion into three main tasks, information diffusion prediction, social bot detection, and misinformation detection, and further divide them into ten subtasks, each with definitions and an analysis of its datasets. We also compile a repository of publicly available datasets for these tasks, with links, and compare the datasets along six attributes related to users and content: user information, social network, bot label, propagation content, propagation network, and veracity label. In addition, we discuss the limitations and future directions of current datasets and research topics to advance the future development of information diffusion research. The dataset repository can be accessed at our website https://github.com/fuxiaG/Information-Diffusion-Datasets.
A Framework for Curriculum Transformation in Quantum Information Science and Technology Education
Simon Goorney, Jonas Bley, Stefan Heusler
et al.
The field of Quantum Information Science & Technology (QIST) is booming. As a result, many new educational courses and university programs are needed to prepare a workforce for the developing industry. Owing to its specialist nature, teaching in this field can easily become disconnected from the substantial body of science education research that aims to identify the best approaches to teaching in Science, Technology, Engineering & Mathematics (STEM) fields. To connect these two communities with a pragmatic and repeatable methodology, we have synthesised this educational research into a decision-tree based theoretical model for the transformation of QIST curricula, intended to provide a didactical perspective for practitioners. The Quantum Curriculum Transformation Framework (QCTF) consists of four steps: 1. choose a topic, 2. choose one or more targeted skills, 3. choose a learning goal, and 4. choose a teaching approach that achieves this goal. We illustrate how this can be done using an example curriculum, and more specifically quantum teleportation as a basic concept of quantum communication within this curriculum. By approaching curriculum creation and transformation in this way, educational goals and outcomes are more clearly defined, which is in the interest of individuals and industry alike. The framework is intended to structure the narrative of QIST teaching, and with future testing and refinement it will form a basis for further research in the didactics of QIST.
Single-Server Private Information Retrieval with Side Information Under Arbitrary Popularity Profiles
Alejandro Gomez-Leos, Anoosheh Heidarzadeh
This paper introduces a generalization of the Private Information Retrieval with Side Information (PIR-SI) problem called Popularity-Aware PIR-SI (PA-PIR-SI). The PA-PIR-SI problem includes one or more remote servers storing copies of a dataset of $K$ messages, and a user who knows $M$ out of the $K$ messages -- the identities of which are unknown to the server -- as prior side information, and wishes to retrieve one of the remaining $K-M$ messages. The goal of the user is to minimize the amount of information they must download from the server while revealing no information about the identity of the desired message. In contrast to PIR-SI, in PA-PIR-SI the dataset messages are not assumed to be equally popular. That is, given the $M$ side information messages, the remaining $K-M$ messages are not necessarily equally likely to be the message desired by the user. In this work, we focus on the single-server setting of PA-PIR-SI, and establish lower and upper bounds on the capacity of this setting -- defined as the maximum achievable download rate. Our upper bound holds for any message popularity profile, and is the same as the capacity of single-server PIR-SI. We prove the lower bound by presenting a PA-PIR-SI scheme that takes a novel probabilistic approach -- carefully designed based on the popularity profile -- to integrate two existing PIR-SI schemes. The rate of our scheme is strictly higher than that of the only existing PIR-SI scheme applicable to the PA-PIR-SI setting.
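For context, taking the previously reported single-server PIR-SI capacity, $1/\lceil K/(M+1) \rceil$, which the abstract states its upper bound matches, a small hypothetical instance reads
$$K = 6,\; M = 2 \;\Longrightarrow\; C = \frac{1}{\lceil K/(M+1) \rceil} = \frac{1}{\lceil 6/3 \rceil} = \frac{1}{2},$$
i.e., at most one useful bit per two downloaded bits, for any popularity profile. The scheme proposed in the paper then exploits unequal popularities, randomizing between two existing PIR-SI schemes according to the popularity profile, to achieve a rate strictly above that of the single existing scheme applicable to this setting.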
Information-theoretically secure equality-testing protocol with dispute resolution
Go Kato, Mikio Fujiwara, Toyohiro Tsurumaru
There are often situations where two remote users each have data, and wish to (i) verify the equality of their data, and (ii) whenever a discrepancy is found afterwards, determine which of the two has modified their data. The most common example is where they want to authenticate messages they exchange. Another possible example is where they maintain a huge database and its mirror at remote sites and, whenever a discrepancy is found between their data, wish to determine which of the two users is to blame. Of course, if one is allowed to use computational assumptions, this function can be realized readily, e.g., by using digital signatures. However, if one needs information-theoretic security, there is no known method that realizes this function efficiently, i.e., with the secret key, the communication, and the involvement of trusted third parties all sufficiently small. In order to realize this function efficiently with information-theoretic security, we here define the ``equality-testing protocol with dispute resolution'' as a new framework. The most significant difference between our protocol and previous methods with similar functions is that we allow the intervention of a trusted third party when checking the equality of the data. In this new framework, we also present an explicit protocol that is information-theoretically secure and efficient.
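As background on how information-theoretic security is typically obtained for plain equality testing (without the dispute-resolution capability that is the point of the paper), a keyed polynomial hash lets two parties compare their data with exponentially small error probability using a short shared one-time secret; the sketch below is only this textbook building block, not the protocol proposed in the paper:

```python
import secrets

P = 2**127 - 1  # a large prime modulus (illustrative choice)

def poly_hash(data: bytes, key: int, p: int = P) -> int:
    """Evaluate the data bytes, viewed as polynomial coefficients, at the
    secret point 'key' modulo p; two distinct inputs of length n collide
    with probability at most about n / p."""
    h = 0
    for byte in data:
        h = (h * key + byte) % p
    return h

# Shared one-time secret key, assumed to have been distributed in advance.
key = secrets.randbelow(P - 1) + 1

alice_data = b"mirror copy of the database"
bob_data   = b"mirror copy of the database"

# Each side transmits only the short hash value; equal hashes certify equal
# data except with probability roughly len(data) / P.
print(poly_hash(alice_data, key) == poly_hash(bob_data, key))
```

The framework defined in the paper adds a trusted third party to the comparison step so that, when a mismatch is found later, it can also be determined which party altered their data.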
Understanding Technology Use in Global Virtual Teams: Research Methodologies and Methods
Tony Clear, Stephen G. MacDonell
Context: The globalisation of activities associated with software development and use has introduced many challenges in practice and for research. While the predominant approach to research in software engineering has followed a positivist science model, this approach may be sub-optimal when addressing problems with a dominant social or cultural dimension, such as those frequently encountered when studying work practices in a globally distributed team setting. The investigation of such a team reported in this paper provides one example of an alternative approach to research in a global context, through a longitudinal interpretive field study seeking to understand how global virtual teams mediated the use of technology. Objective: Our focus in this paper is on the conduct of research in the context of global software activities, particularly as applied to the actions and interactions of global virtual teams. Method: We describe how we undertook a substantial field study of global virtual teams, and highlight how our adoption of structuration theory enabled us to deliver effectively against our goals. Results: We believe that the approach taken suited a research context in which situated practices were occurring over time in a highly complex domain, ensuring that our results were both strongly grounded and relevant to practice. It has resulted in the generation of substantive theory and techniques that have been adapted and applied on a pilot basis in further field settings. Conclusion: We conclude that globally distributed teamwork presents a complex context which demands new research approaches, beyond the limited set customarily applied by software engineering researchers. We advocate experimenting with different research methodologies and methods so that we have a more rounded repertoire to address the most important and relevant issues in global software development research. (Abridged)
Information cohomology of classical vector-valued observables
Juan Pablo Vigneaux
We provide here a novel algebraic characterization of two information measures associated with a vector-valued random variable, its differential entropy and the dimension of the underlying space, purely based on their recursive properties (the chain rule and the nullity-rank theorem, respectively). More precisely, we compute the information cohomology of Baudot and Bennequin with coefficients in a module of continuous probabilistic functionals over a category that mixes discrete observables and continuous vector-valued observables, characterizing completely the 1-cocycles; evaluated on continuous laws, these cocycles are linear combinations of the differential entropy and the dimension.
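Concretely, the two recursive properties referred to above read
$$h(X, Y) = h(X) + h(Y \mid X) \qquad \text{(chain rule for differential entropy)},$$
$$\dim V = \operatorname{rank} T + \dim \ker T \qquad \text{(nullity-rank theorem for a linear map } T : V \to W\text{)},$$
and the cohomological computation shows that, up to linear combination, the differential entropy and the dimension are the only 1-cocycles arising in this setting when evaluated on continuous laws.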
Tensor Factorization with Label Information for Fake News Detection
Frosso Papanastasiou, Georgios Katsimpras, Georgios Paliouras
The buzz over so-called "fake news" has created concerns about a degenerated media environment and led to the need for technological solutions. As the detection of fake news is increasingly considered a technological problem, it has attracted considerable research. Most of these studies primarily focus on utilizing information extracted from textual news content. In contrast, we focus on detecting fake news solely based on the structural information of social networks. We suggest that the underlying network connections of users who share fake news are discriminative enough to support the detection of fake news. Accordingly, we model each post as a network of friendship interactions and represent a collection of posts as a multidimensional tensor. Taking into account the available labeled data, we propose a tensor factorization method which associates the class labels of data samples with their latent representations. Specifically, we combine a classification error term with the standard factorization in a unified optimization process. Results on real-world datasets demonstrate that our proposed method is competitive with state-of-the-art methods while following an arguably simpler approach.
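One common way to couple a factorization with label information in a single objective, in the spirit of the unified optimization described above (the notation and exact terms are illustrative and may differ from the authors' formulation), is
$$\min_{U, V, W,\, \theta}\; \bigl\lVert \mathcal{X} - [\![\, U, V, W \,]\!] \bigr\rVert_F^{2} + \lambda \sum_{i \in \mathcal{L}} \ell\bigl(y_i, f_\theta(u_i)\bigr) + \mu \bigl( \lVert U \rVert_F^{2} + \lVert V \rVert_F^{2} + \lVert W \rVert_F^{2} \bigr),$$
where $[\![U, V, W]\!]$ is the low-rank reconstruction of the post tensor $\mathcal{X}$, $u_i$ is the latent representation of labeled post $i \in \mathcal{L}$, $\ell$ is a classification loss with parameters $\theta$, and $\lambda, \mu$ balance the terms.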
Fabrication technology for high light-extraction ultraviolet thin-film flip-chip (UV TFFC) LEDs grown on SiC
Burhan K. SaifAddin, Abdullah Almogbel, Christian J. Zollner
et al.
The light output of deep ultraviolet (UV-C) AlGaN light-emitting diodes (LEDs) is limited by their poor light extraction efficiency (LEE). To improve the LEE of AlGaN LEDs, we developed a fabrication technology to process AlGaN LEDs grown on SiC into thin-film flip-chip LEDs (TFFC LEDs) with high LEE. This process transfers the AlGaN LED epi onto a new substrate by wafer-to-wafer bonding and removes the absorbing SiC substrate with a highly selective SF6 plasma etch that stops at the AlN buffer layer. We optimized the inductively coupled plasma (ICP) SF6 etch parameters to develop a substrate-removal process with high reliability and precise epitaxial control, without creating micromasking defects or degrading the health of the plasma etching system. The SiC etch rate by SF6 plasma was ~46 μm/hr at a high RF bias (400 W), and ~7 μm/hr at a low RF bias (49 W), with very high etch selectivity between SiC and AlN. The high SF6 etch selectivity between SiC and AlN was essential for removing the SiC substrate and exposing a pristine, smooth AlN surface. We demonstrated the epi-transfer process by fabricating high-light-extraction TFFC LEDs from AlGaN LEDs grown on SiC. To further enhance the light extraction, the exposed N-face AlN was anisotropically etched in dilute KOH. The LEE of the AlGaN LED improved by ~3X after KOH roughening at room temperature. This AlGaN TFFC LED process establishes a viable path to UV-C LEDs with high external quantum efficiency (EQE) and power conversion efficiency (PCE).
The Capacity of Symmetric Private Information Retrieval
Hua Sun, Syed A. Jafar
Private information retrieval (PIR) is the problem of retrieving as efficiently as possible, one out of $K$ messages from $N$ non-communicating replicated databases (each holds all $K$ messages) while keeping the identity of the desired message index a secret from each individual database. Symmetric PIR (SPIR) is a generalization of PIR to include the requirement that beyond the desired message, the user learns nothing about the other $K-1$ messages. The information theoretic capacity of SPIR (equivalently, the reciprocal of minimum download cost) is the maximum number of bits of desired information that can be privately retrieved per bit of downloaded information. We show that the capacity of SPIR is $1-1/N$ regardless of the number of messages $K$, if the databases have access to common randomness (not available to the user) that is independent of the messages, in the amount that is at least $1/(N-1)$ bits per desired message bit, and zero otherwise. Extensions to the capacity region of SPIR and the capacity of finite length SPIR are provided.
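As a worked instance of the stated result, take $N = 2$ databases and $L$-bit messages:
$$C_{\mathrm{SPIR}} = 1 - \frac{1}{N} = \frac{1}{2} \;\Longrightarrow\; D^{*} = \frac{L}{1 - 1/N} = 2L \text{ downloaded bits},$$
provided the two databases share at least $L/(N-1) = L$ bits of common randomness hidden from the user; with less common randomness than this threshold, and in particular with none, the capacity is zero.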
On the Effects of Low-Quality Training Data on Information Extraction from Clinical Reports
Diego Marcheggiani, Fabrizio Sebastiani
In the last five years there has been a flurry of work on information extraction from clinical documents, i.e., on algorithms capable of extracting, from the informal and unstructured texts that are generated during everyday clinical practice, mentions of concepts relevant to such practice. Most of this literature is about methods based on supervised learning, i.e., methods for training an information extraction system from manually annotated examples. While a lot of work has been devoted to devising learning methods that generate more and more accurate information extractors, no work has been devoted to investigating the effect of the quality of training data on the learning process. Low quality in training data often derives from the fact that the person who annotated the data is different from the one against whose judgment the automatically annotated data must be evaluated. In this paper we test the impact of such data quality issues on the accuracy of information extraction systems as applied to the clinical domain. We do this by comparing the accuracy obtained with training data annotated by the authoritative coder (i.e., the one who also annotated the test data, and by whose judgment we must abide) against the accuracy obtained with training data annotated by a different coder. The results indicate that, although the disagreement between the two coders (as measured on the training set) is substantial, the resulting difference in extraction accuracy is, perhaps surprisingly, not always statistically significant.
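One standard way to check whether such an accuracy gap is statistically significant (the abstract does not specify which test the authors used; the paired bootstrap below, applied to hypothetical per-document F1 scores, is only an illustration):

```python
import numpy as np

def paired_bootstrap_pvalue(scores_a, scores_b, n_boot=10_000, seed=0):
    """Two-sided paired bootstrap test on per-document score differences,
    e.g. per-report F1 of extractors trained on different annotations."""
    rng = np.random.default_rng(seed)
    diff = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    observed = diff.mean()
    centered = diff - observed            # impose the null of zero mean difference
    count = 0
    for _ in range(n_boot):
        resample = rng.choice(centered, size=len(diff), replace=True)
        if abs(resample.mean()) >= abs(observed):
            count += 1
    return count / n_boot

# Hypothetical per-document F1 scores under the two training conditions.
f1_authoritative_coder = [0.81, 0.78, 0.90, 0.69, 0.84, 0.77]
f1_different_coder     = [0.79, 0.74, 0.88, 0.70, 0.80, 0.75]
print(paired_bootstrap_pvalue(f1_authoritative_coder, f1_different_coder))
```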
Highly-cited papers in Library and Information Science (LIS): Authors, institutions, and network structures
Johann Bauer, Loet Leydesdorff, Lutz Bornmann
As a follow-up to the highly-cited authors list published by Thomson Reuters in June 2014, we analyze the top-1% most frequently cited papers published between 2002 and 2012 included in the Web of Science (WoS) subject category "Information Science & Library Science." In total, 798 authors contributed to 305 top-1% publications; these authors were employed at 275 institutions. The authors at Harvard University contributed the largest number of papers when addresses are counted on a whole-number basis; however, Leiden University leads the ranking if fractional counting is used. Twenty-three of the 798 authors were also listed as most highly-cited authors by Thomson Reuters in June 2014 (http://highlycited.com/). Twelve of these 23 authors were involved in publishing four or more of the 305 papers under study. Analysis of co-authorship relations among the 798 highly-cited scientists shows that co-authorships are based on common interests in a specific topic. Three topics were important between 2002 and 2012: (1) the collection and exploitation of information in clinical practices, (2) the use of the internet in public communication and commerce, and (3) scientometrics.
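The difference between whole-number and fractional counting of institutional addresses, which drives the Harvard/Leiden contrast noted above, can be illustrated with a toy example (institution names are hypothetical stand-ins):

```python
from collections import Counter

# Each paper lists the institutions appearing in its author addresses.
papers = [
    ["Univ A", "Univ B"],            # two addresses
    ["Univ A"],                      # single address
    ["Univ B", "Univ C", "Univ C"],  # three addresses, two institutions
]

whole, fractional = Counter(), Counter()
for addresses in papers:
    for inst in set(addresses):      # whole-number counting: one credit per paper
        whole[inst] += 1
    for inst in addresses:           # fractional counting: credit split over addresses
        fractional[inst] += 1 / len(addresses)

print("whole-number counts:", dict(whole))
print("fractional counts:  ", {k: round(v, 2) for k, v in fractional.items()})
```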
Mining Local Gazetteers of Literary Chinese with CRF and Pattern based Methods for Biographical Information in Chinese History
Chao-Lin Liu, Chih-Kai Huang, Hongsu Wang
et al.
Person names and location names are essential building blocks for identifying events and social networks in historical documents that were written in literary Chinese. We pioneer research on algorithmically recognizing named entities in literary Chinese for historical studies, using language-model-based and conditional-random-field-based methods, and extend our work to mining the document structures of historical documents. Practical evaluations were conducted with texts extracted from more than 220 volumes of local gazetteers (Difangzhi). Difangzhi constitute a vast collection and the single most important source of information about officers who served in local governments in Chinese history. Our methods performed very well on these realistic tests: thousands of names and addresses were identified from the texts. A good portion of the extracted names match the biographical information currently recorded in the China Biographical Database (CBDB) of Harvard University, and many others can be verified by historians and will become new additions to CBDB.
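A minimal conditional-random-field sketch for character-level named-entity tagging, assuming the third-party sklearn-crfsuite package and BIO labels; the features and the tiny training example are illustrative, not the ones engineered in the paper:

```python
import sklearn_crfsuite  # third-party package: pip install sklearn-crfsuite

def char_features(sent, i):
    """Simple character-window features for position i of a sentence."""
    return {
        "char": sent[i],
        "prev": sent[i - 1] if i > 0 else "<BOS>",
        "next": sent[i + 1] if i < len(sent) - 1 else "<EOS>",
    }

# Tiny illustrative example: a person name tagged with BIO labels.
sentences = ["王安石知縣事"]
labels    = [["B-PER", "I-PER", "I-PER", "O", "O", "O"]]

X = [[char_features(s, i) for i in range(len(s))] for s in sentences]
y = labels

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
crf.fit(X, y)
print(crf.predict(X))
```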
Can electoral popularity be predicted using socially generated big data?
Taha Yasseri, Jonathan Bright
Today, our increasingly digital lives leave significant footprints in cyberspace. Large-scale collections of these socially generated footprints, often known as big data, could help us to re-investigate different aspects of our collective social behaviour in a quantitative framework. In this contribution we discuss one such possibility: monitoring and predicting the popularity dynamics of candidates and parties through the analysis of socially generated data on the web during electoral campaigns. Such data offer considerable possibilities for improving our awareness of popularity dynamics. However, they also suffer from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss potential ways around such problems, suggesting that the nature of different political systems and contexts might lend differing levels of predictive power to certain types of data source. We offer an initial exploratory test of these ideas, focussing on two data streams, Wikipedia page views and Google search queries. On the basis of these data, we present popularity dynamics from real case examples of recent elections in three different countries.
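For the Wikipedia page-view stream, daily attention to each candidate's article can be pulled from the public Wikimedia pageviews REST API and aggregated directly; the endpoint format below is assumed from the public API documentation, and the candidate article titles are placeholders:

```python
import requests

def daily_views(article, start, end, project="en.wikipedia", agent="user"):
    """Fetch daily page-view counts for one article from the Wikimedia pageviews API."""
    url = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
           f"{project}/all-access/{agent}/{article}/daily/{start}/{end}")
    items = requests.get(url, headers={"User-Agent": "research-script"}).json()["items"]
    return {item["timestamp"][:8]: item["views"] for item in items}

candidates = ["Candidate_A", "Candidate_B"]  # placeholder article titles
for name in candidates:
    views = daily_views(name, "2024010100", "2024013100")
    print(name, sum(views.values()))
```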
Value of information in noncooperative games
Nils Bertschinger, David H. Wolpert, Eckehard Olbrich
et al.
In some games, additional information hurts a player, e.g., in games with first-mover advantage, the second-mover is hurt by seeing the first-mover's move. What properties of a game determine whether it has such negative "value of information" for a particular player? Can a game have negative value of information for all players? To answer such questions, we generalize the definition of marginal utility of a good to define the marginal utility of a parameter vector specifying a game. So rather than analyze the global structure of the relationship between a game's parameter vector and player behavior, as in previous work, we focus on the local structure of that relationship. This allows us to prove that generically, every game can have negative marginal value of information, unless one imposes a priori constraints on allowed changes to the game's parameter vector. We demonstrate these and related results numerically, and discuss their implications.
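A standard worked example of the opening claim (not taken from the paper): in a linear Cournot duopoly with inverse demand $P = 1 - q_1 - q_2$ and zero costs, letting player 2 observe player 1's quantity before choosing (the Stackelberg game) changes the equilibrium outcomes from the simultaneous-move case as
$$\text{Cournot: } q_1 = q_2 = \tfrac{1}{3},\; \pi_1 = \pi_2 = \tfrac{1}{9}; \qquad \text{Stackelberg: } q_1 = \tfrac{1}{2},\; q_2 = \tfrac{1}{4},\; \pi_1 = \tfrac{1}{8},\; \pi_2 = \tfrac{1}{16},$$
so the informed follower ends up with $1/16 < 1/9$: the additional information about the leader's move lowers the follower's equilibrium payoff.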
Backing off from Infinity: Performance Bounds via Concentration of Spectral Measure for Random MIMO Channels
Yuxin Chen, Andrea J. Goldsmith, Yonina C. Eldar
The performance analysis of random vector channels, particularly multiple-input multiple-output (MIMO) channels, has largely been established in the asymptotic regime of large channel dimensions, due to the analytical intractability of characterizing the exact distribution of the performance metrics of interest. This paper develops a new non-asymptotic framework that allows many canonical MIMO system performance metrics to be characterized to within a narrow interval under moderate-to-large channel dimensionality, provided that these metrics can be expressed as a separable function of the singular values of the channel matrix. The effectiveness of our framework is illustrated through two canonical examples. Specifically, we characterize the mutual information and power offset of random MIMO channels, as well as the minimum mean-square error of estimating MIMO channel inputs from the channel outputs. Our results lead to simple, informative, and reasonably accurate control of various performance metrics in the finite-dimensional regime, as corroborated by numerical simulations. Our analysis framework is established via the concentration of spectral measure phenomenon for random matrices uncovered by Guionnet and Zeitouni, which arises in a variety of random matrix ensembles irrespective of the precise distributions of the matrix entries.
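The separability requirement above is satisfied by the canonical metrics considered; for instance, with an isotropic Gaussian input at per-antenna SNR $\rho$, the mutual information of a MIMO channel with matrix $\mathbf{H}$ is
$$I(\mathbf{H}) = \log\det\bigl(\mathbf{I} + \rho\, \mathbf{H}\mathbf{H}^{\mathsf{H}}\bigr) = \sum_{i} \log\bigl(1 + \rho\, \sigma_i^2\bigr),$$
a sum over the singular values $\sigma_i$ of $\mathbf{H}$, so its fluctuations are governed by the concentration of the empirical spectral measure.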
Distributed High Dimensional Information Theoretical Image Registration via Random Projections
Zoltan Szabo, Andras Lorincz
Information theoretical measures, such as entropy, mutual information, and various divergences, exhibit robust characteristics in image registration applications. However, the estimation of these quantities is computationally intensive in high dimensions. On the other hand, consistent estimation from pairwise distances of the sample points is possible, which makes these measures well suited to random projection (RP) based low-dimensional embeddings. We adapt the RP technique to this task by means of a simple ensemble method. To the best of our knowledge, this is the first distributed, RP-based information theoretical image registration approach. The efficiency of the method is demonstrated through numerical examples.
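The ensemble idea can be sketched as follows: project the high-dimensional feature points several times with independent random matrices, compute a pairwise-distance-based statistic in each low-dimensional space, and average over projections. The statistic used below (mean log nearest-neighbour distance) is only a placeholder for the consistent entropy/divergence estimators this line of work relies on:

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.neighbors import NearestNeighbors

def rp_ensemble_statistic(points, target_dim=8, n_projections=10, seed=0):
    """Average a pairwise-distance-based statistic over an ensemble of random projections."""
    values = []
    for r in range(n_projections):
        proj = GaussianRandomProjection(n_components=target_dim, random_state=seed + r)
        low = proj.fit_transform(points)
        # Placeholder statistic: mean log-distance to the nearest neighbour.
        dist, _ = NearestNeighbors(n_neighbors=2).fit(low).kneighbors(low)
        values.append(np.mean(np.log(dist[:, 1] + 1e-12)))
    return float(np.mean(values))

rng = np.random.default_rng(0)
features = rng.standard_normal((500, 256))  # toy high-dimensional feature points
print(rp_ensemble_statistic(features))
```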
Living at the Edge: A Large Deviations Approach to the Outage MIMO Capacity
P. Kazakopoulos, P. Mertikopoulos, A. L. Moustakas
et al.
Using a large deviations approach, we calculate the probability distribution of the mutual information of MIMO channels in the limit of large antenna numbers. In contrast to previous methods that only focused on the distribution close to its mean (thus obtaining an asymptotically Gaussian distribution), we calculate the full distribution, including its tails, which deviate strongly from the Gaussian behavior valid near the mean. The resulting distribution interpolates seamlessly between the Gaussian approximation for rates $R$ close to the ergodic value of the mutual information and the approach of Zheng and Tse for large signal-to-noise ratios $ρ$. This calculation provides us with a tool to obtain outage probabilities analytically at any point in the $(R, ρ, N)$ parameter space, as long as the number of antennas $N$ is not too small. In addition, this method also yields the probability distribution of eigenvalues constrained to the subspace where the mutual information per antenna is fixed to $R$ for a given $ρ$. Quite remarkably, this eigenvalue density is of the form of the Marcenko-Pastur distribution with square-root singularities, and it depends on the values of $R$ and $ρ$.
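In this notation, the object of interest is the outage probability over the full range of rates rather than only near the mean, where a Gaussian approximation suffices (the display below paraphrases the abstract rather than reproducing a formula from the paper):
$$P_{\mathrm{out}}(R) = \Pr\bigl[ I_N < R \bigr], \qquad P_{\mathrm{out}}(R) \approx Q\!\left(\frac{\bar{I} - R}{\sigma_I}\right) \text{ only for } R \text{ near the ergodic value } \bar{I},$$
whereas the large deviations calculation retains the strongly non-Gaussian tails for rates far from $\bar{I}$.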
Information Spectrum Approach to Second-Order Coding Rate in Channel Coding
Masahito Hayashi
The second-order coding rate of channel coding is discussed for a general sequence of channels. The optimum second-order transmission rate with a constant error constraint $ε$ is obtained by using the information spectrum method. We apply this result to the discrete memoryless case, the discrete memoryless case with a cost constraint, the additive Markovian case, and the Gaussian channel case with an energy constraint. We also clarify that the Gallager bound does not give the optimum evaluation of the second-order coding rate.
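For the discrete memoryless case mentioned above, the optimum second-order rate takes the standard normal-approximation form (stated here as background, with $C$ the capacity, $V$ the channel dispersion, and $\Phi^{-1}$ the inverse Gaussian cumulative distribution function):
$$\log M^{*}(n, ε) = nC + \sqrt{nV}\, \Phi^{-1}(ε) + o(\sqrt{n}),$$
so for $ε < 1/2$ the second-order term is a back-off of order $\sqrt{n}$ below the first-order capacity term.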