Large language models (LLMs) are increasingly integral to information retrieval (IR), powering ranking, evaluation, and AI-assisted content creation. This widespread adoption necessitates a critical examination of potential biases arising from the interplay between these LLM-based components. This paper synthesizes existing research and presents novel experiment designs that explore how LLM-based rankers and assistants influence LLM-based judges. We provide the first empirical evidence of LLM judges exhibiting significant bias towards LLM-based rankers. Furthermore, we observe limitations in LLM judges' ability to discern subtle system performance differences. Contrary to some previous findings, our preliminary study does not find evidence of bias against AI-generated content. These results highlight the need for a more holistic view of the LLM-driven information ecosystem. To this end, we offer initial guidelines and a research agenda to ensure the reliable use of LLMs in IR evaluation.
With the development of innovation and entrepreneurship education, the classification and integration of educational resources have become key to improving education quality. However, traditional methods cannot handle complex, multi-dimensional educational resource data. Therefore, this study proposes a classification and integration model for innovation and entrepreneurship education resources based on GNN-PSO (graph neural networks and particle swarm optimization). The model uses the feature extraction ability of GNNs to mine the internal relationships between educational resources, and it optimizes the classification and integration process using the global search capability of the PSO algorithm. For the experiments, we constructed a dataset containing 10,000 innovation and entrepreneurship education resources, covering multi-dimensional information such as courses, cases, and teachers. In comparative experiments, the GNN-PSO model's classification accuracy reached 92.5%, 15.3 percentage points higher than that of the traditional machine learning model. Regarding resource integration efficiency, the processing time of the GNN-PSO model was 40% shorter, which significantly improves the management efficiency of educational resources. In addition, this study explores the model's performance on different educational resources and finds that the GNN-PSO model has good generalization ability and scalability. The experimental results confirm that the classification and integration model of innovation and entrepreneurship education resources based on the GNN-PSO algorithm improves classification accuracy and optimizes the resource integration process, providing strong support for the development of innovation and entrepreneurship education.
Information technology, Electronic computers. Computer science
With global climate change, urbanization, and agricultural resource limitations, precision agriculture and crop monitoring are crucial worldwide. Integrating multi-source remote sensing data with deep learning enables accurate crop mapping, but selecting optimal network architectures remains challenging. To improve remote sensing-based fruit planting classification and support orchard management and rural revitalization, this study explored feature selection and network optimization. We proposed an improved CF-EfficientNet model (incorporating FGMF and CGAR modules) for fruit planting classification. Multi-source remote sensing data (Sentinel-1, Sentinel-2, and SRTM) were used to extract spectral, vegetation, polarization, terrain, and texture features, thereby constructing a high-dimensional feature space. Feature selection identified 13 highly discriminative bands, forming an optimal dataset, namely the preferred bands (PBs). Two classification datasets, multi-spectral bands (MS) and preferred bands (PBs), were then constructed, and five typical deep learning models were compared: (1) EfficientNetB0, (2) AlexNet, (3) VGG16, (4) ResNet18, (5) RepVGG. The experimental results showed that the EfficientNetB0 model trained on the preferred bands performed best in terms of overall accuracy (87.1%) and Kappa coefficient (0.677). Furthermore, a Fine-Grained Multi-scale Fusion (FGMF) module and a Condition-Guided Attention Refinement (CGAR) module were incorporated into EfficientNetB0, and the traditional SGD optimizer was replaced with Adam to construct the CF-EfficientNet architecture. The results indicated that the improved CF-EfficientNet model achieved high performance in crop classification, with an overall accuracy of 92.6% and a Kappa coefficient of 0.830. These represent improvements of 5.5 percentage points and 0.153, respectively, compared with the baseline model, demonstrating superiority in both classification accuracy and stability.
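The two metrics reported above, overall accuracy and the Kappa coefficient, are both derived from a confusion matrix. The following is a minimal sketch of the standard definitions (Cohen's kappa); the function name and the example matrix are illustrative assumptions and are not taken from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Chance agreement: product of row and column marginals, summed over classes.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class example (not data from the paper).
oa, kappa = overall_accuracy_and_kappa([[40, 10], [5, 45]])
print(f"OA = {oa:.3f}, kappa = {kappa:.3f}")
```

Because kappa discounts agreement expected by chance, it can differ substantially from overall accuracy when class frequencies are unbalanced, which is why the two are usually reported together.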
Biological information processing varies enormously in complexity and capability among organisms, which presumably stems from evolutionary optimization under limited computational resources. Starting from the simplest memory-less responsive behaviors, more complicated information processing using internal memory may have developed over the course of evolution as more resources became available. In this letter, we report that the optimal information processing strategy can show discontinuous transitions as the available resources change, i.e., the reliability of sensing and intrinsic dynamics, or the cost of memory control. In addition, we show that the transition is not always progressive but can also be regressive. Our result, obtained under a minimal setup, suggests that the capability and complexity of information processing would be an evolvable trait that can switch back and forth between different strategies and architectures in a punctuated manner.
In the Internet of Vehicles (IoV), Age of Information (AoI) has become a vital performance metric for evaluating the freshness of information in communication systems. Although many studies aim to minimize the average AoI of the system through optimized resource scheduling schemes, they often fail to adequately consider the queue characteristics. Moreover, vehicle mobility leads to rapid changes in network topology and channel conditions, so AoI calculated under ideal channel conditions cannot accurately reflect the unique characteristics of vehicles. This paper examines the impact of Doppler shifts caused by vehicle speeds on data transmission in error-prone channels. Based on the M/M/1 and D/M/1 queuing theory models, we derive expressions for the Age of Information and optimize the system's average AoI by adjusting the data extraction rates of vehicles (which affect system utilization). We propose an online optimization algorithm that dynamically adjusts the vehicles' data extraction rates based on environmental changes to ensure optimal AoI. Simulation results demonstrate that adjusting the data extraction rates of vehicles can significantly reduce the system's AoI. Additionally, in the network scenario of this work, the AoI of the D/M/1 system is lower than that of the M/M/1 system.
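To illustrate how the average AoI depends on system utilization, the sketch below evaluates the textbook closed-form expressions for first-come-first-served M/M/1 and D/M/1 queues over an error-free channel. This is an assumption made for illustration: the expressions derived in the paper additionally account for Doppler shifts and channel errors, which are not modeled here. The function names and the grid search are hypothetical.

```python
import math


def aoi_mm1(rho, mu=1.0):
    """Average AoI of an FCFS M/M/1 queue with utilization rho and service rate mu."""
    if not 0 < rho < 1:
        raise ValueError("utilization must lie in (0, 1)")
    return (1 / mu) * (1 + 1 / rho + rho ** 2 / (1 - rho))


def dm1_beta(rho, tol=1e-12, max_iter=10000):
    """Root in (0, 1) of beta = exp(-(1 - beta) / rho), found by fixed-point iteration."""
    beta = 0.0
    for _ in range(max_iter):
        nxt = math.exp(-(1 - beta) / rho)
        if abs(nxt - beta) < tol:
            return nxt
        beta = nxt
    return beta


def aoi_dm1(rho, mu=1.0):
    """Average AoI of an FCFS D/M/1 queue: half an inter-arrival time plus mean system time."""
    if not 0 < rho < 1:
        raise ValueError("utilization must lie in (0, 1)")
    beta = dm1_beta(rho)
    return (1 / mu) * (1 / (2 * rho) + 1 / (1 - beta))


def best_utilization(aoi_fn):
    """Coarse grid search for the utilization that minimizes the given AoI function."""
    candidates = [i / 100 for i in range(1, 100)]  # 0.01, 0.02, ..., 0.99
    return min(candidates, key=aoi_fn)


for name, fn in (("M/M/1", aoi_mm1), ("D/M/1", aoi_dm1)):
    rho_star = best_utilization(fn)
    print(f"{name}: best rho ~ {rho_star:.2f}, AoI ~ {fn(rho_star):.3f} (units of 1/mu)")
```

In both models the AoI grows when the utilization is very low (updates are too rare) and when it is very high (updates queue up), so an intermediate data extraction rate minimizes the age.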
William B. Leacock, Kurt T. Smith, William W. Deacy
Background: Access to salmon resources is vital to coastal brown bear (Ursus arctos) populations. Deciphering patterns of travel allowing coastal brown bears to exploit salmon resources dispersed across the landscape is critical to understanding their behavioral ecology, maintaining landscape connectivity for the species, and developing conservation strategies. Methods: We modeled travel behavior of 51 radio-collared female Kodiak brown bears (U. a. middendorffi) from 2008 to 2015 during the sockeye salmon (Oncorhynchus nerka) stream spawning season to identify landscape patterns associated with travel pathways. To accomplish this, we first identified behavioral states of marked individuals, and then developed a resource selection function (RSF) to evaluate environmental covariates that were predictors of selection during travel behavior. Results: Landcover edges, elderberry-salmonberry stands, lowland tundra, elevation, terrain position, and stream length influenced selection for travel corridors. The RSF validated well and was comparable to corridors identified by pathways used by bears while travelling. Conclusions: Models identified spatial predictions of the relative probability of selection while bears were travelling during the salmon spawning season and identified areas that contained potential movement corridors important for bears inhabiting Kodiak Island. Our results characterized factors influencing travel, identified important movement corridors, and provided managers with information to make informed resource management decisions.
The Huangshui River Basin is one of the most important water sources in Qinghai Province and is of great importance for ecological protection, agricultural irrigation and tourism. Based on previous studies and fieldwork related to plant species in China, this study presents comprehensive data on vascular plants distributed in the Huangshui River Basin of Qinghai Province. The checklist of plants includes ferns, gymnosperms and angiosperms, covering three phyla, five classes, 49 orders, 139 families, 709 genera and 2,382 species. It includes numerous Asteraceae, Gramineae, Rosaceae and Fabaceae along with statistical data on the number of species distributed in different regions. The dataset presented in this article provides important background information on vascular plants in the Huangshui River Basin and, therefore, plays a crucial role in the protection and management of plant resources in this region.
Ethical Compliance: All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.
Data Access Statement: Research data supporting this publication are available from the repository located at https://www.scidb.cn/en/anonymous/QUpuZVEz.
Conflict of Interest declaration: The authors declare that they have NO affiliations with or involvement in any organisation or entity with any financial interest in the subject matter or materials discussed in this manuscript.
This paper considers a two-player game where each player chooses a resource from a finite collection of options. Each resource brings a random reward. Both players have statistical information regarding the rewards of each resource. Additionally, there exists an information asymmetry where each player has knowledge of the reward realizations of different subsets of the resources. If both players choose the same resource, the reward is divided equally between them, whereas if they choose different resources, each player gains the full reward of the resource they chose. We first implement the iterative best response algorithm to find an $\varepsilon$-approximate Nash equilibrium for this game. This method of finding a Nash equilibrium may not be desirable when players do not trust each other and place no assumptions on the incentives of the opponent. To handle this case, we solve the problem of maximizing the worst-case expected utility of the first player. The solution leads to counter-intuitive insights in certain special cases. To solve the general version of the problem, we develop an efficient algorithmic solution that combines online convex optimization and the drift-plus-penalty technique.
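The reward-sharing rule and the idea of iterating best responses can be sketched on a heavily simplified variant of the game. The sketch assumes that both players know only the mean reward of each resource and play pure strategies, so the information asymmetry and the $\varepsilon$-approximate equilibrium analysis of the paper are not modeled. All names and the example rewards are hypothetical.

```python
def best_response(mean_rewards, opponent_choice):
    """Index of the resource maximizing expected utility against a fixed opponent choice."""
    def utility(resource):
        reward = mean_rewards[resource]
        # The reward is split equally when both players pick the same resource.
        return reward / 2 if resource == opponent_choice else reward

    return max(range(len(mean_rewards)), key=utility)


def iterate_best_response(mean_rewards, start=(0, 0), max_rounds=100):
    """Alternate best responses until neither player wants to deviate."""
    a, b = start
    for _ in range(max_rounds):
        new_a = best_response(mean_rewards, b)      # player 1 responds to player 2
        new_b = best_response(mean_rewards, new_a)  # player 2 responds to the update
        if (new_a, new_b) == (a, b):
            return a, b  # fixed point: a pure Nash equilibrium of the simplified game
        a, b = new_a, new_b
    return a, b


print(iterate_best_response([10, 6, 3]))  # players split up: half of 10 is worth less than 6
print(iterate_best_response([10, 4, 3]))  # both stay on resource 0: half of 10 beats 4
```

The two calls show the basic trade-off in the game: sharing the best resource is only worthwhile when half of its reward still exceeds the full reward of the next-best alternative.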
Ruslan Suleymanov, Azamat Suleymanov, Gleb Zaitsev et al.
Traditional land-use systems can be modified under the conditions of climate change. Higher air temperatures and loss of productive soil moisture lead to reduced crop yields. Irrigation is a possible solution to these problems. However, intense irrigation may contribute to land degradation. This research assessed the ameliorative potential of soil and produced large-scale digital maps of soil properties for arable plot planning for the construction and operation of irrigation systems. Our research was carried out in the southern forest–steppe zone (Southern Ural, Russia). The soil cover of the site is represented by agrochernozem soils (Luvic Chernozem). We examined the morphological, physicochemical and agrochemical properties of the soil, as well as its heavy metal contents. The random forest (RF) non-linear approach was used to estimate the spatial distribution of the properties and produce maps. We found that soils were characterized by high soil organic carbon (SOC) content and neutral acidity and were well supplied with nitrogen and potassium. The agrochernozem was characterized by favorable water–physical properties and showed good values for water infiltration and moisture categories. The contents of heavy metals (lead, cadmium, mercury, cobalt, zinc and copper) did not exceed permissible levels. The soil quality rating interpretation confirms that these soils have high potential fertility and are suitable for irrigation activities. The spatial distribution of soil properties according to the generated maps was not homogeneous. The results showed that remote sensing covariates were the most critical variables in explaining the variability of soil properties. Our findings may be useful for developing reclamation strategies for similar soils that can restore soil health and improve crop productivity.
Somayeh Tamjid, Fatemeh Nooshinfard, Moluk S. Hoseini Beheshti et al.
Following recent trends in information management systems, conventional word-based information retrieval methods are giving way to concept-based approaches through the broad application of ontologies. More specifically, the use of ontologies for knowledge management is significant in the medical sciences and human disease domains due to the diversity of information and the necessity of sharing it between numerous data repositories such as medical records, health record systems, and so on. Furthermore, ontologies make natural language processing approaches more feasible by reducing semantic ambiguity and making concepts comprehensible to computer-based deductions. In this research, a semi-automated approach for ontology development is proposed, which assists in identifying structural components of an ontology and determining possible relations between them based on scientific text records. The proposed approach, in a general view, includes the gathering of a large volume of technical data in text format, processing, and extraction of results with minimal human supervision. The processing stage is implemented in a Matlab program named TmbOnt_Alfa and applies two main techniques, word frequency analysis and lexico-syntactic pattern analysis, to identify concepts and relations, respectively. The role of the human supervisor is narrowed to entering target terms, eliminating unnecessary outputs, and finalizing the ontology structure. In order to evaluate the efficiency of the proposed method, a case study for ontological development in the field of glaucoma has been conducted, and results are compared with the Medical Subject Headings (MeSH) descriptors, the Persian medical thesaurus, the ontology of diseases (DO), and the Bioassay ontology (BAO).
According to the results, the developed ontology, when compared on the Glaucoma entry, covered 80% of the medical titles in MeSH, 100% of the medical terms developed in the Persian Medical Thesaurus, and 100% of the Persian medical descriptors. Moreover, the resultant ontology structure is compatible with more than 90% of the same ontology represented in Bioassay and 57% of the ontology of diseases (DO). It also proposed an average of 30% more terms for existing ontological structures.
Bibliography. Library science. Information resources
The quadratic decaying property of the information rate function states that given a fixed conditional distribution $p_{\mathsf{Y}|\mathsf{X}}$, the mutual information between the (finite) discrete random variables $\mathsf{X}$ and $\mathsf{Y}$ decreases at least quadratically in the Euclidean distance as $p_\mathsf{X}$ moves away from the capacity-achieving input distributions. It is a property of the information rate function that is particularly useful in the study of higher order asymptotics and finite blocklength information theory, where it was already implicitly used by Strassen [1] and later, more explicitly, by Polyanskiy-Poor-Verdú [2]. However, the proofs outlined in both works contain gaps that are nontrivial to close. This comment provides an alternative, complete proof of this property.
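One way to write the property described above is as follows. This is a hedged formalization using assumed notation that does not appear in the abstract: $C$ denotes the channel capacity, $\Pi^*$ the set of capacity-achieving input distributions, and $\gamma > 0$ a constant depending only on the channel $p_{\mathsf{Y}|\mathsf{X}}$.

$$
I(p_{\mathsf{X}}, p_{\mathsf{Y}|\mathsf{X}}) \;\le\; C \;-\; \gamma \, \min_{p^* \in \Pi^*} \lVert p_{\mathsf{X}} - p^* \rVert_2^2 .
$$

Read this way, any input distribution at Euclidean distance $d$ from $\Pi^*$ loses at least $\gamma d^2$ of mutual information relative to capacity. The precise statement and the range of $p_{\mathsf{X}}$ over which it holds should be taken from the paper itself.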
Paris polyphylla is an important medicinal plant that can biosynthesize polyphyllins with multiple therapeutic effects, ranging from anti-inflammatory to antitumor; however, the genetic diversity of Paris polyphylla is still unclear. To explore the genetic characteristics of cultivated populations in primary planting areas, we developed 10 expressed sequence tag simple sequence repeat (EST-SSR) markers related to polyphyllin backbone biosynthesis and utilized them in 136 individuals from 10 cultivated populations of P. polyphylla var. yunnanensis. The genetic diversity index showed that the ten loci had relatively high genetic polymorphism levels. Shannon information of loci suggested that more information occurred within populations and less among populations. In addition, the overall populations exhibited a low degree of differentiation among populations, but maintained a high degree of genetic diversity among individuals, resulting in high gene flow and general hybridization. The genetic structure analysis revealed that the 10 populations were possibly derived from two ancestral groups and that all individuals showed different levels of admixture. The two groups were different from the cultivation groups at the population level, suggesting cross-pollination among cultivars. These findings will provide insights into the genetic diversity of the germplasm resources and facilitate marker-assisted breeding for this medicinal herb.
When an individual's DNA is sequenced, sensitive medical information becomes available to the sequencing laboratory. A recently proposed way to hide an individual's genetic information is to mix in DNA samples of other individuals. We assume that the genetic content of these samples is known to the individual but unknown to the sequencing laboratory. Thus, these DNA samples act as "noise" to the sequencing laboratory, but still allow the individual to recover their own DNA samples afterward. Motivated by this idea, we study the problem of hiding a binary random variable $X$ (a genetic marker) with the additive noise provided by mixing DNA samples, using mutual information as a privacy metric. This is equivalent to the problem of finding a worst-case noise distribution for recovering $X$ from the noisy observation among a set of feasible discrete distributions. We characterize upper and lower bounds to the solution of this problem, which are empirically shown to be very close. The lower bound is obtained through a convex relaxation of the original discrete optimization problem, and yields a closed-form expression. The upper bound is computed via a greedy algorithm for selecting the mixing proportions.
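The privacy metric used above can be made concrete with a small computation. The sketch assumes that the noise $Z$ is integer-valued and independent of the binary marker $X$, and that the laboratory observes $Y = X + Z$; in that case $I(X;Y) = H(Y) - H(Z)$. The function names and the example noise distributions are hypothetical and do not reproduce the feasible set or the bounds studied in the paper.

```python
import math


def entropy_bits(pmf):
    """Shannon entropy in bits of a collection of probabilities."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)


def leakage_bits(p_one, noise_pmf):
    """Mutual information I(X; X + Z) in bits.

    p_one:     P(X = 1) for the binary marker X.
    noise_pmf: dict mapping each integer noise value z to P(Z = z),
               with Z assumed independent of X.
    """
    # Distribution of the observation Y = X + Z.
    observed = {}
    for z, pz in noise_pmf.items():
        observed[z] = observed.get(z, 0.0) + (1 - p_one) * pz
        observed[z + 1] = observed.get(z + 1, 0.0) + p_one * pz

    # For independent additive noise, H(Y | X) = H(Z).
    return entropy_bits(observed.values()) - entropy_bits(noise_pmf.values())


print(leakage_bits(0.5, {0: 1.0}))          # no noise: the marker is fully revealed
print(leakage_bits(0.5, {0: 0.5, 1: 0.5}))  # one fair binary noise sample
```

Lower values mean better privacy, so searching for the noise distribution that minimizes this quantity over a feasible set corresponds to the worst-case noise problem described in the abstract.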
We consider universal quantization with side information for Gaussian observations, where the side information is a noisy version of the sender's observation with noise variance unknown to the sender. In this paper, we propose a universally rate-optimal and practical quantization scheme for all values of the unknown noise variance. Our scheme uses Polar lattices from prior work, and proceeds based on a structural decomposition of the underlying auxiliaries so that even when recovery fails in a round, the parties agree on a common "reference point" that is closer than the previous one. We also present a finite blocklength analysis showing sub-exponential convergence for distortion and exponential convergence for rate. The overall complexity of our scheme is $O(N^2\log^2 N)$ for any target distortion and fixed rate larger than the rate-distortion bound.
Quantum resource theories (QRTs) provide a unified theoretical framework for understanding inherent quantum-mechanical properties that serve as resources in quantum information processing, but resources motivated by physics may possess intractable mathematical structure to analyze, such as non-uniqueness of maximally resourceful states, lack of convexity, and infinite dimension. We investigate state conversion and resource measures in general QRTs under minimal assumptions to figure out universal properties of physically motivated quantum resources that may have such intractable mathematical structure. In the general setting, we prove the existence of maximally resourceful states in one-shot state conversion. Also analyzing asymptotic state conversion, we discover catalytic replication of quantum resources, where a resource state is infinitely replicable by free operations. In QRTs without assuming uniqueness of maximally resourceful states, we formulate the tasks of distillation and formation of quantum resources, and introduce distillable resource and resource cost based on the distillation and the formation, respectively. Furthermore, we introduce consistent resource measures that quantify the amount of quantum resources without contradicting the rate of state conversion even in QRTs with non-unique maximally resourceful states. Progressing beyond the previous work showing a uniqueness theorem for additive resource measures, we prove the corresponding uniqueness inequality for the consistent resource measures; that is, consistent resource measures of a quantum state take values between the distillable resource and the resource cost of the state. These formulations and results establish a foundation of QRTs applicable to mathematically intractable but physically motivated quantum resources in a unified way.
Chen, Sonia Chien-I, Liu, Chenglian, Wang, Zhenyuan et al.
Background: In remote areas, connected health (CH) is needed, but as local resources are often scarce and the purchasing power of residents is usually poor, it is a challenge to apply CH in these settings. In this study, CH is defended as a technological solution for reshaping the direction of health care to be more proactive, preventive, and precisely targeted, and thus more effective.
Objective: The objective of this study was to explore the identity of CH stakeholders in remote areas of Taiwan and their interests and power in order to determine ideal strategies for applying CH. We aimed to explore the respective unknowns and discover insights for those facing similar issues.
Methods: Qualitative research was conducted to investigate and interpret the phenomena of the aging population in a remote setting. An exploratory approach was employed involving semistructured interviews with 22 participants from 8 remote allied case studies. The interviews explored perspectives on stakeholder arrangements, including the power and interests of stakeholders and the needs of all the parties in the ecosystem.
Results: Results were obtained from in-depth interviews and focus groups that included identifying the stakeholders of remote health and determining how they influence its practice, as well as how associated agreements bring competitive advantages. Stakeholders included people in government sectors, industrial players, academic researchers, end users, and their associates who described their perspectives on their power and interests in remote health service delivery. Specific facilitators of and barriers to effective delivery were identified. A number of themes, such as government interests and power of decision making, were corroborated across rural and remote services. These themes were broadly grouped into the disclosure of conflicts of interest, asymmetry in decision making, and data development for risk assessment.
Conclusions: This study contributes to current knowledge by exploring the features of CH in remote areas and investigating its implementation from the perspectives of stakeholder management. It offers insights into managing remote health through a CH platform, which can be used for preliminary quantitative research. Consequently, these findings could help to more effectively facilitate diverse stakeholder engagement for health information sharing and social interaction.
Computer applications to medicine. Medical informatics, Public aspects of medicine