Reinhard Pekrun, Stephanie Lichtenfeld, Herbert W Marsh
et al.
Abstract A reciprocal effects model linking emotion and achievement over time is proposed. The model was tested using five annual waves of the Project for the Analysis of Learning and Achievement in Mathematics (PALMA) longitudinal study, which investigated adolescents’ development in mathematics (Grades 5–9; N = 3,425 German students; mean starting age = 11.7 years; representative sample). Structural equation modeling showed that positive emotions (enjoyment, pride) positively predicted subsequent achievement (math end-of-the-year grades and test scores), and that achievement positively predicted these emotions, controlling for students’ gender, intelligence, and family socioeconomic status. Negative emotions (anger, anxiety, shame, boredom, hopelessness) negatively predicted achievement, and achievement negatively predicted these emotions. The findings were robust across waves, achievement indicators, and school tracks, highlighting the importance of emotions for students’ achievement and of achievement for the development of emotions.
Predicting the future evolutionary trajectory of SARS-CoV-2 remains a critical challenge, particularly due to the pivotal role of spike protein mutations. It is therefore essential to develop evolutionary models capable of continuously integrating new experimental data. In this study, we employ a cladogram algorithm that incorporates established assumptions for mutant representation -- using both four-letter and two-letter formats -- along with an n-mer distance algorithm to construct a cladogenetic tree of SARS-CoV-2 mutations. This tree accurately captures the observed changes across macro-lineages. We introduce a stochastic method for generating new strains on this tree based on spike protein mutations. For a given set A of existing mutation sites, we define a set X comprising x randomly generated mutation sites on the spike protein. The intersection of A and X, denoted as set Y, contains y sites. Our analysis indicates that the position of a generated strain on the tree is primarily determined by x. Through large-scale stochastic sampling, we predict the emergence of new macro-lineages. As x increases, the dominance among macro-lineages shifts: lineage O surpasses N, P surpasses O, and eventually Q surpasses P. We identify threshold values of x that delineate transitions between these macro-lineages. Furthermore, we propose an algorithm for predicting the timeline of macro-lineage emergence. In conclusion, our findings demonstrate that SARS-CoV-2 evolution adheres to statistical principles: the emergence of new strains can be driven by randomly generated spike protein sites, and large-scale stochastic sampling reveals evolutionary patterns underlying the rise of distinct macro-lineages.
Diffusion and anomalous diffusion are widely observed and used to study movement across organisms, resulting in extensive use of the mean and mean-squared displacement (MSD). However, these measures -- corresponding to specific displacement moments -- do not capture the full complexity of movement behavior. Using high-resolution data from over 70 million localizations of young and adult free-ranging Barn Owls (\textit{Tyto alba}), we reveal strong anomalous diffusion as nonlinear growth of displacement moments. The moment spectrum function $λ_t(q)$ -- defined by $\left<|\bm{x}(t)|^q\right> \sim t^{λ_t(q)}$ -- displays piecewise linearity in $q$, with a critical moment marking the crossover between scaling regimes. This highlights the need for a broad spectrum of displacement moments to characterize movement, which we link to age-specific ecological drivers. Furthermore, a characteristic timescale of five minutes marks an unexpected transition from a convex to a concave $λ_t(q)$. Using two stochastic models -- a bounded Lévy walk and a multi-mode behavioral model -- we account for the observed phenomena, showing good agreement with data, relating age-specific behavioral states to environmentally confined movement, and demonstrating how Lévy walk-like patterns can arise from underlying behavioral structure. Finally, we discuss the ecological significance of our results, arguing that strong anomalous diffusion may be widespread in animal movement.
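The scaling definition $\left<|\bm{x}(t)|^q\right> \sim t^{λ_t(q)}$ can be illustrated with a minimal sketch that estimates the exponent for each order $q$ as the slope of a log-log regression of the $q$-th displacement moment against time. The function names, the input layout (a mapping from time to displacement magnitudes), and the toy ballistic data are assumptions made for illustration and are not taken from the study.

```python
import math


def _slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def moment_spectrum(magnitudes_by_time, orders):
    """Estimate lambda(q) for each order q.

    magnitudes_by_time maps a time t > 0 to a list of displacement
    magnitudes |x(t)| (hypothetical input layout). For each q, the q-th
    moment is averaged at every time and the exponent is read off as the
    slope of log(moment) against log(t).
    """
    times = sorted(magnitudes_by_time)
    log_t = [math.log(t) for t in times]
    spectrum = {}
    for q in orders:
        log_moment = []
        for t in times:
            mags = magnitudes_by_time[t]
            log_moment.append(math.log(sum(r ** q for r in mags) / len(mags)))
        spectrum[q] = _slope(log_t, log_moment)
    return spectrum


# Toy data (an assumption, not owl data): ballistic motion |x(t)| = v * t,
# for which the q-th moment scales as t**q, so lambda(q) = q.
speeds = [0.5, 1.0, 2.0]
toy = {t: [v * t for v in speeds] for t in (1.0, 2.0, 4.0, 8.0)}
spectrum = moment_spectrum(toy, [0.5, 1.0, 2.0, 4.0])
```

A spectrum that is a single straight line in $q$, as in this toy case, corresponds to ordinary scaling; piecewise linearity with a kink at a critical moment is the signature of strong anomalous diffusion described above.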
The concept of individual admixture (IA) assumes that the genome of individuals is composed of alleles inherited from $K$ ancestral populations. Each copy of each allele has the same chance $q_k$ to originate from population $k$, and together with the allele frequencies $p$ in all populations at all $M$ markers, comprises the admixture model. Here, we assume a supervised scheme, i.e.\ allele frequencies $p$ are given through a reference database of size $N$, and $q$ is estimated via maximum likelihood for a single sample. We study laws of large numbers and central limit theorems describing the effects of finiteness of both $M$ and $N$ on the estimate of $q$. We recall results for the effect of finite $M$, provide a central limit theorem for the effect of finite $N$, introduce a new way to express the uncertainty in estimates in standard barplots, give simulation results, and discuss applications in forensic genetics.
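The supervised maximum-likelihood step can be sketched with the standard expectation-maximization update for admixture proportions. This is a minimal sketch that assumes biallelic markers, a diploid sample, and reference frequencies strictly between 0 and 1; the function name and data layout are hypothetical, and the sketch is not claimed to be the estimator implementation used in the work summarized above.

```python
def estimate_admixture(genotypes, freqs, n_iter=500):
    """EM estimate of admixture proportions q for one diploid sample.

    genotypes[m] is the number of copies (0, 1 or 2) of the counted allele
    at marker m; freqs[k][m] is the frequency of that allele in reference
    population k. Both layouts are assumptions made for this sketch.
    """
    n_pops = len(freqs)
    n_markers = len(genotypes)
    q = [1.0 / n_pops] * n_pops  # start from equal proportions
    for _ in range(n_iter):
        new_q = [0.0] * n_pops
        for m, g in enumerate(genotypes):
            # Probability of drawing the counted / the other allele under q.
            mix_counted = sum(q[k] * freqs[k][m] for k in range(n_pops))
            mix_other = sum(q[k] * (1.0 - freqs[k][m]) for k in range(n_pops))
            for k in range(n_pops):
                # Expected number of allele copies at marker m that
                # originate from population k, given the current q.
                if g > 0:
                    new_q[k] += g * q[k] * freqs[k][m] / mix_counted
                if g < 2:
                    new_q[k] += (2 - g) * q[k] * (1.0 - freqs[k][m]) / mix_other
        q = [value / (2.0 * n_markers) for value in new_q]
    return q


# Toy reference data (assumed): two populations, ten markers.
toy_freqs = [[0.9] * 10, [0.1] * 10]
q_pure = estimate_admixture([2] * 10, toy_freqs)   # all copies look like population 0
q_mixed = estimate_admixture([1] * 10, toy_freqs)  # heterozygous at every marker
```

In the toy example the homozygous sample is assigned almost entirely to the first population, while the fully heterozygous sample is split evenly, since $0.9\,q_1 + 0.1\,(1-q_1) = 0.5$ gives $q_1 = 0.5$.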
While recent work has established divergence as a key framework for understanding evenness, there is currently no research exploring how the families of measures within the divergence-based framework relate to each other. This paper uses geometry to show that, holding order and richness constant, the families of divergence-based evenness measures nest. This property allows them to be ranked based on their reactivity to changes in relatively even assemblages or changes in relatively uneven ones. We establish this ranking and explore how the distance-based measures relate to it for both order q=2 and q=1. We also derive a new family of distance-based measures that captures the angular distance between the vector of relative abundances and a perfectly even vector and is highly reactive to changes in even assemblages. Finally, we show that if we only require evenness to be a divergence, then any smooth, monotonically increasing function of diversity can be made into an evenness measure. A deeper understanding of how to measure evenness will require empirical or theoretical research that uncovers which kind of divergence best reflects the underlying concept.
Inference of the evolutionary histories of species, commonly represented by a species tree, is complicated by the divergent evolutionary history of different parts of the genome. Different loci on the genome can have different histories from the underlying species tree (and each other) due to processes such as incomplete lineage sorting (ILS), gene duplication and loss, and horizontal gene transfer. The multispecies coalescent is a commonly used model for performing inference on species and gene trees in the presence of ILS. This paper introduces Lily-T and Lily-Q, two new methods for species tree inference under the multispecies coalescent. We then compare them to two frequently used methods, SVDQuartets and ASTRAL, using simulated and empirical data. Both methods generally showed improvement over SVDQuartets, and Lily-Q was superior to Lily-T for most simulation settings. The comparison to ASTRAL was more mixed: Lily-Q tended to be better than ASTRAL when the length of recombination-free loci was short, when the coalescent population parameter θ was small, or when the internal branch lengths were longer.
An accurate closed-form solution is obtained to the SIR Epidemic Model through the use of Asymptotic Approximants (Barlow et al., 2017, Q. Jl Mech. Appl. Math, 70 (1), 21-48). The solution is created by analytically continuing the divergent power series solution such that it matches the long-time asymptotic behavior of the epidemic model. The utility of the analytical form is demonstrated through its application to the COVID-19 pandemic.
The dynamics of an SIVR model with power-relationship incidence rates $(βI^p S^q)$ are investigated. It is assumed that an individual can still be susceptible after receiving the first dose of the vaccine, hence a second dose is required to attain permanent immunity. The steady-state conditions for the disease-free equilibrium and the endemic equilibrium are presented. Numerical simulations are carried out to determine the impact of the exponent parameters $(p, q)$ on infection.
Tsallis' non-extensive entropy is extended to incorporate the dependence on affinities between the microstates of a system. At the core of our construction of the extended entropy ($\mathcal{H}$) is the concept of the effective number of dissimilar states, termed the effective diversity ($\mathit{\Delta}$). It is a unique integrated measure derived from the probability distribution among states and the affinities between states. The effective diversity is related to the extended entropy through a Boltzmann-equation-like relation, $\mathcal{H}=\ln_{q}\mathit{\Delta}$, in terms of Tsallis' $q$-logarithm. A new principle called the Nesting Principle is established, stating that the effective diversity remains invariant under an arbitrary grouping of the constituent states. It is shown that this invariance property holds only for $q=2$; however, the invariance is recovered for general $q$ in the zero-affinity limit (i.e. the Tsallis and Boltzmann-Gibbs case). Using the affinity-based extended Tsallis entropy, the microcanonical and the canonical ensembles are constructed in the presence of general between-state affinities. It is shown that the classic postulate of equal a priori probabilities no longer holds but is modified by affinity-dependent terms. As an illustration, a two-level system is investigated by the extended canonical method, which shows that the thermal behaviour of the thermodynamic quantities at equilibrium is affected by the between-state affinity. Furthermore, some applications and implications of the affinity-based extended diversity/entropy for information theory and biodiversity theory are addressed in appendices.
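The relation $\mathcal{H}=\ln_{q}\mathit{\Delta}$ can be checked numerically in the zero-affinity limit. The sketch below assumes that, without affinities, the effective diversity reduces to the usual effective number of states $(\sum_i p_i^q)^{1/(1-q)}$, in which case $\ln_q$ of that number is the standard Tsallis entropy. The function names are hypothetical, and the affinity-dependent construction itself is not reproduced here.

```python
import math


def ln_q(x, q):
    """Tsallis q-logarithm: (x**(1-q) - 1) / (1 - q), with ln(x) at q = 1."""
    if x <= 0:
        raise ValueError("x must be positive")
    if abs(q - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def effective_number(p, q):
    """Effective number of states without affinities (assumed limit)."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum(pi * math.log(pi) for pi in p if pi > 0))
    return sum(pi ** q for pi in p if pi > 0) ** (1.0 / (1.0 - q))


def tsallis_entropy(p, q):
    """Standard Tsallis entropy (1 - sum p_i**q) / (q - 1)."""
    if abs(q - 1.0) < 1e-12:
        return -sum(pi * math.log(pi) for pi in p if pi > 0)
    return (1.0 - sum(pi ** q for pi in p if pi > 0)) / (q - 1.0)


# Example distribution (assumed): sum of p_i**2 is 0.375, so for q = 2 the
# effective number is 1 / 0.375 and both entropy expressions give 0.625.
p_example = [0.5, 0.25, 0.25]
delta = effective_number(p_example, 2.0)
entropy_from_delta = ln_q(delta, 2.0)
entropy_direct = tsallis_entropy(p_example, 2.0)
```

For a uniform distribution over $n$ states the effective number equals $n$ for every order $q$, which is the sense in which it counts states.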
We present a new non-Archimedean model of evolutionary dynamics, in which the genomes are represented by p-adic numbers. In this model the genomes have a variable length, not necessarily bounded, in contrast with the classical models where the length is fixed. The time evolution of the concentration of a given genome is controlled by a p-adic evolution equation. This equation depends on a fitness function f and on a mutation measure Q. By choosing a mutation measure of Gibbs type, and by using a p-adic version of the Maynard Smith Ansatz, we show the existence of a threshold function M_{c}(f,Q), such that the long-term survival of a genome requires that its length grows faster than M_{c}(f,Q). This implies that Eigen's paradox does not occur if the complexity of genomes grows at the right pace. About twenty years ago, Scheuring and Poole, Jeffares, and Penny proposed a hypothesis to explain Eigen's paradox. Our mathematical model shows that this biological hypothesis is feasible, but it requires p-adic analysis instead of real analysis. More exactly, the Darwin-Eigen cycle proposed by Poole et al. takes place if the length of the genomes exceeds M_{c}(f,Q).
I extend the traditional SAR, which has achieved the status of an ecological law and plays a critical role in global biodiversity assessment, to the general (alpha- or beta-diversity in Hill numbers) diversity area relationship (DAR). The extension was motivated by the need to remedy the limitation of the traditional SAR, which addresses only one aspect of biodiversity scaling, i.e., species richness scaling over space. The extension was made possible by the fact that all Hill numbers are in units of species (referred to as the effective number of species or as species equivalents), and I postulated that Hill numbers should follow the same or a similar pattern as the SAR. I selected three DAR models: the traditional power law (PL), PLEC (PL with exponential cutoff) and PLIEC (PL with inverse exponential cutoff). I defined three new concepts and derived their quantifications: (i) DAR profile: z-q series where z is the PL scaling parameter at different diversity order (q); (ii) PDO (pair-wise diversity overlap) profile: g-q series where g is the PDO corresponding to q; (iii) MAD (maximal accrual diversity) profile: Dmax-q series where Dmax is the MAD corresponding to q. Furthermore, the PDO-g is quantified based on the self-similarity property of the PL model, and Dmax can be estimated from the PLEC parameters. The three profiles constitute a novel DAR approach to biodiversity scaling. I verified the postulation with the American gut microbiome project (AGP) dataset of 1473 healthy North American individuals (the largest human dataset from a single project to date). The PL model was preferred due to its simplicity and established ecological properties such as self-similarity (necessary for establishing the PDO profile), and PLEC has an advantage in establishing the MAD profile. All three profiles for the AGP dataset were successfully quantified and compared with existing SAR parameters in the literature whenever possible.
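The DAR profile (z-q series) can be sketched by fitting the power law $D = cA^z$ separately for each diversity order $q$ through a least-squares line in log-log space. This is a minimal sketch on synthetic values; the function names, the input layout, and the numbers are assumptions for illustration, and the PLEC and PLIEC models are not covered.

```python
import math


def fit_power_law(areas, diversities):
    """Fit D = c * A**z by least squares on log D = log c + z * log A."""
    xs = [math.log(a) for a in areas]
    ys = [math.log(d) for d in diversities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    z = sxy / sxx
    c = math.exp(mean_y - z * mean_x)
    return c, z


def dar_profile(areas, diversity_by_order):
    """Return the z-q series: the fitted scaling exponent z for each order q."""
    return {q: fit_power_law(areas, values)[1]
            for q, values in sorted(diversity_by_order.items())}


# Synthetic Hill numbers (assumed values) that follow exact power laws.
areas = [1.0, 2.0, 4.0, 8.0, 16.0]
toy_diversity = {
    0: [10.0 * a ** 0.30 for a in areas],
    1: [6.0 * a ** 0.20 for a in areas],
    2: [4.0 * a ** 0.10 for a in areas],
}
profile = dar_profile(areas, toy_diversity)
c0, z0 = fit_power_law(areas, toy_diversity[0])
```

On real data the fit would be applied to diversity accumulated over increasing area (or number of sampled units), and the quality of the log-log fit would need to be checked for each order.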
Some methods aim to correct or test for relationships or to reconstruct the pedigree, or family tree. We show that these methods cannot resolve ties for correct relationships because of non-identifiability of the pedigree likelihood, which is the probability of inheriting the data under the pedigree model. This means that no likelihood-based method can produce a correct pedigree inference with high probability. This lack of reliability is critical both for health and forensics applications. In this paper we present the first discussion of multiple typed individuals in non-isomorphic pedigrees, $\mathcal{P}$ and $\mathcal{Q}$, where the likelihoods are non-identifiable, $Pr[G~|~\mathcal{P},θ] = Pr[G~|~\mathcal{Q},θ]$, for all input data $G$ and all recombination rate parameters $θ$. While there were previously known non-identifiable pairs, we give an example having data for multiple individuals. Additionally, we provide a deeper understanding of the general discrete structures driving these non-identifiability examples, as well as results to guide algorithms that wish to examine only identifiable pedigrees. This paper introduces a general criterion for establishing whether a pair of pedigrees is non-identifiable and two easy-to-compute criteria guaranteeing identifiability. Finally, we suggest a method for dealing with non-identifiable likelihoods: use Bayes' rule to obtain the posterior from the likelihood and prior. We propose a prior guaranteeing that the posterior distinguishes all pairs of pedigrees. Shortened version published as: B. Kirkpatrick. Non-identifiable pedigrees and a Bayesian solution. Int. Symp. on Bioinformatics Res. and Appl. (ISBRA), 7292:139-152 2012.
Contact tracing is an important control strategy for containing Ebola epidemics. From a theoretical perspective, explicitly incorporating contact tracing with disease dynamics presents challenges, and population level effects of contact tracing are difficult to determine. In this work, we formulate and analyze a mechanistic SEIR type outbreak model which considers the key features of contact tracing, and we characterize the impact of contact tracing on the effective reproduction number, $\mathcal R_e$, of Ebola. In particular, we determine how relevant epidemiological properties such as incubation period, infectious period and case reporting, along with varying monitoring protocols, affect the efficacy of contact tracing. In the special cases of either perfect monitoring of traced cases or perfect reporting of all cases, we derive simple formulae for the critical proportion of contacts that need to be traced in order to bring the effective reproduction number $\mathcal R_e$ below one. Also, in either case, we show that $\mathcal R_e$ can be expressed completely in terms of observable reported case/tracing quantities, namely $\mathcal R_e=k\dfrac{(1-q)}{q}+k_m$ where $k$ is the number of secondary traced infected contacts per primary untraced reported case, $k_m$ is the number of secondary traced infected contacts per primary traced reported case and $(1-q)/q$ is the odds that a reported case is not a traced contact. These formulae quantify contact tracing as both an intervention strategy that impacts disease spread and a probe into the current epidemic status at the population level. Data from the West Africa Ebola outbreak is utilized to form real-time estimates of $\mathcal R_e$, and inform our projections of the impact of contact tracing, and other control measures, on the epidemic trajectory.
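The expression $\mathcal R_e=k\dfrac{(1-q)}{q}+k_m$ can be evaluated directly from the three observable quantities. The sketch below is only a direct transcription of that formula; the function name and the example numbers are hypothetical and are not estimates from the West Africa outbreak data.

```python
def effective_reproduction_number(k, k_m, q):
    """Evaluate R_e = k * (1 - q) / q + k_m.

    k   : secondary traced infected contacts per primary untraced reported case
    k_m : secondary traced infected contacts per primary traced reported case
    q   : fraction of reported cases that are traced contacts, so that
          (1 - q) / q is the odds that a reported case is not a traced contact
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must lie in (0, 1]")
    return k * (1.0 - q) / q + k_m


# Illustrative inputs (assumed, not data): half of the reported cases are traced.
r_e_example = effective_reproduction_number(k=0.5, k_m=0.3, q=0.5)
```

With these assumed inputs the odds $(1-q)/q$ equal one, so the result is simply $k + k_m = 0.8$, which lies below the threshold of one.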
J. M. Ilnytskyi, Y. Kozitsky, H. I. Ilnytskyi
et al.
By means of the asynchronous cellular automata algorithm we study stationary states and spatial patterning in an $SIS$ model, in which the individuals are attached to the vertices of a graph and their mobility is mimicked by varying the neighbourhood size $q$. The versions with fixed $q$ and those with $q$ taken at random at each step and for each individual are studied. Numerical data on the local behaviour of the model are mapped onto the solution of its zero-dimensional version, corresponding to the limit $q\to +\infty$ and equivalent to the logistic growth model. This allows for deducing an explicit form of the dependence of the fraction of infected individuals on the curing rate $γ$. A detailed analysis of the appearance of spatial patterns of infected individuals in the stationary state is performed.
Yuri S. Semenov, Alexander S. Bratus, Artem S. Novozhilov
We study general properties of the leading eigenvalue $\overline{w}(q)$ of Eigen's evolutionary matrices depending on the probability $q$ of faithful reproduction. This is a linear algebra problem that has various applications in theoretical biology, including such diverse fields as the origin of life, evolution of cancer progression, and virus evolution. We present the exact expressions for $\overline{w}(q),\overline{w}'(q),\overline{w}''(q)$ for $q=0,0.5,1$ and prove that the absolute minimum of $\overline{w}(q)$, which always exists, belongs to the interval $[0,0.5]$. For the specific case of a single peaked landscape we also find lower and upper bounds on $\overline{w}(q)$, which are used to estimate the critical mutation rate, after which the distribution of the types of individuals in the population becomes almost uniform. This estimate is used as a starting point to conjecture another estimate, valid for any fitness landscape, and which is checked by numerical calculations. The last estimate stresses the fact that the inverse dependence of the critical mutation rate on the sequence length is not a generally valid fact. Therefore, the discussions of the error threshold applied to biological systems must take this fact into account.
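The leading eigenvalue $\overline{w}(q)$ can be explored numerically for short sequences. The sketch below assumes binary sequences of length `n_sites`, a per-site fidelity $q$, and the matrix with entries $q^{n-d}(1-q)^{d}\,w_j$, where $d$ is the Hamming distance between sequences $i$ and $j$ and $w_j$ is the fitness of sequence $j$. It uses a shifted power iteration, which is a generic numerical technique and not the analysis carried out in the work summarized above; all names are hypothetical.

```python
def leading_eigenvalue(fitness, q, n_sites, max_iter=5000, tol=1e-13):
    """Leading eigenvalue of the assumed Eigen matrix via power iteration.

    fitness[j] is the fitness of the sequence whose binary encoding is j.
    The iteration is applied to (M + I) so that it also converges when M
    itself is periodic (for example at q = 0).
    """
    size = 2 ** n_sites
    if len(fitness) != size:
        raise ValueError("fitness must have 2**n_sites entries")
    # Mutation probability depends only on the Hamming distance d.
    prob = [q ** (n_sites - d) * (1.0 - q) ** d for d in range(n_sites + 1)]
    matrix = [[prob[bin(i ^ j).count("1")] * fitness[j] for j in range(size)]
              for i in range(size)]
    v = [1.0 / size] * size  # positive start vector with entries summing to 1
    estimate = 0.0
    for _ in range(max_iter):
        mv = [sum(row[j] * v[j] for j in range(size)) for row in matrix]
        shifted = [a + b for a, b in zip(mv, v)]  # (M + I) v
        total = sum(shifted)                      # tends to 1 + eigenvalue
        v = [s / total for s in shifted]
        new_estimate = total - 1.0
        if abs(new_estimate - estimate) < tol:
            return new_estimate
        estimate = new_estimate
    return estimate


# Single peaked landscape (assumed values): master fitness 2, all others 1.
n_sites = 4
single_peak = [2.0] + [1.0] * (2 ** n_sites - 1)
w_perfect = leading_eigenvalue(single_peak, 1.0, n_sites)
w_half = leading_eigenvalue(single_peak, 0.5, n_sites)
w_high = leading_eigenvalue(single_peak, 0.9, n_sites)
```

Under these assumptions two limiting cases serve as sanity checks: at $q=1$ the matrix is diagonal and the leading eigenvalue is the largest fitness, and at $q=0.5$ every row is identical, so the leading eigenvalue equals the mean fitness. The matrix has $2^{n}$ rows, so the sketch is only practical for short sequences.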
Charalambos Neophytou, Aikaterini Dounavi, Filippos A. Aravanopoulos
Conservation of 16 nuclear microsatellite loci, originally developed for Quercus macrocarpa (section Albae), Q. petraea, Q. robur (section Robur) and Q. myrsinifolia (subgenus Cyclobalanopsis), was tested in a Q. infectoria ssp. veneris population from Cyprus. All loci could be amplified successfully and displayed allele size and diversity patterns that match those of oak species belonging to the section Robur. In at least one case, limited amplification and high levels of homozygosity support the occurrence of 'null alleles', caused by a possible mutation in the highly conserved primer areas, thus hindering PCR. The sampled population exhibited high levels of diversity despite the very limited distribution of this species in Cyprus and extended population fragmentation. Allele sizes of Q. infectoria at locus QpZAG9 partially match those of Q. alnifolia and Q. coccifera from neighboring populations. However, sequencing showed homoplasy, excluding a case of interspecific introgression with the latter, phylogenetically remote species. Q. infectoria ssp. veneris sequences at this locus were concordant with those of other species of section Robur, while sequences of Quercus alnifolia and Quercus coccifera were almost identical to Q. cerris.
The goal of this work is to propose a finite population counterpart to Eigen's model, which incorporates stochastic effects. We consider a Moran model describing the evolution of a population of size $m$ of chromosomes of length $\ell$ over an alphabet of cardinality $κ$. The mutation probability per locus is $q$. We deal only with the sharp peak landscape: the replication rate is $σ>1$ for the master sequence and 1 for the other sequences. We study the equilibrium distribution of the process in the regime where $\ell, m\to +\infty$, $q\to 0$, $\ell q \to a$, $m/\ell\toα$. We obtain an equation $αφ(a)=\lnκ$ in the parameter space $(a,α)$ separating the regime where the equilibrium population is totally random from the regime where a quasispecies is formed. We observe the existence of a critical population size necessary for a quasispecies to emerge and we recover the finite population counterpart of the error threshold. These results are supported by computer simulations.