One in four adults worldwide is either overweight or obese. Epidemiological studies indicate that the location and distribution of excess fat, rather than general adiposity, are most informative for predicting risk of obesity sequelae, including cardiometabolic disease and cancer. We performed a genome-wide association study meta-analysis of body fat distribution, measured by waist-to-hip ratio adjusted for BMI (WHRadjBMI), and identified 463 signals in 346 loci. Heritability and variant effects were generally stronger in women than men, and we found approximately one-third of all signals to be sexually dimorphic. The 5% of individuals carrying the most WHRadjBMI-increasing alleles were 1.62 times more likely than the bottom 5% to have a WHR above the thresholds used for metabolic syndrome. These data, made publicly available, will inform the biology of body fat distribution and its relationship with disease.
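The abstract above does not spell out how WHRadjBMI is derived. One common way to build a "trait adjusted for a covariate" phenotype is to take the residuals of a least-squares regression of the trait on the covariate. The sketch below assumes that approach; the function name `whr_adj_bmi` and the measurements are hypothetical, and a real analysis would likely include further covariates and a normalising transformation.

```python
from statistics import mean


def whr_adj_bmi(waist_cm, hip_cm, bmi):
    """Return residuals of waist-to-hip ratio after simple linear regression on BMI.

    Assumption: adjustment is a one-covariate least-squares fit. The source
    abstract does not state the exact procedure.
    """
    whr = [w / h for w, h in zip(waist_cm, hip_cm)]
    bmi_mean, whr_mean = mean(bmi), mean(whr)
    sxx = sum((b - bmi_mean) ** 2 for b in bmi)
    sxy = sum((b - bmi_mean) * (r - whr_mean) for b, r in zip(bmi, whr))
    slope = sxy / sxx
    intercept = whr_mean - slope * bmi_mean
    # Residual = observed WHR minus the WHR predicted from BMI alone.
    return [r - (intercept + slope * b) for b, r in zip(bmi, whr)]


# Made-up illustrative measurements, not data from the study.
waist = [80.0, 95.0, 102.0, 70.0, 88.0]
hip = [100.0, 105.0, 104.0, 95.0, 98.0]
bmi = [22.0, 27.5, 31.0, 20.0, 25.0]
residuals = whr_adj_bmi(waist, hip, bmi)
```

By construction the residuals average to zero and are uncorrelated with BMI, which is what lets the adjusted trait capture fat distribution rather than overall adiposity.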
The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has mapped transcription, DNase I hypersensitivity, transcription factor binding, chromatin modifications and replication domains throughout the mouse genome in diverse cell and tissue types. By comparing with the human genome, we not only confirm substantial conservation in the newly annotated potential functional sequences, but also find a large degree of divergence of sequences involved in transcriptional regulation, chromatin state and higher order chromatin organization. Our results illuminate the wide range of evolutionary forces acting on genes and their regulatory regions, and provide a general resource for research into mammalian biology and mechanisms of human diseases.
Determining the genome sequence of an organism is challenging, yet fundamental to understanding its biology. Over the past decade, thousands of human genomes have been sequenced, contributing deeply to biomedical research. In the vast majority of cases, these have been analyzed by aligning sequence reads to a single reference genome, biasing the resulting analyses and, in general, failing to capture sequences novel to a given genome. Some de novo assemblies have been constructed, free of reference bias, but nearly all were constructed by merging homologous loci into single ‘consensus’ sequences, generally absent from nature. These assemblies do not correctly represent the diploid biology of an individual. In exactly two cases, true diploid de novo assemblies have been made, at great expense. One was generated using Sanger sequencing and one using thousands of clone pools. Here we demonstrate a straightforward and low-cost method for creating true diploid de novo assemblies. We make a single library from ~1 ng of high molecular weight DNA, using the 10x Genomics microfluidic platform to partition the genome. We applied this technique to seven human samples, generating low-cost HiSeq X data, then assembled these using a new ‘pushbutton’ algorithm, Supernova. Each computation took two days on a single server. Each yielded contigs longer than 100 kb, phase blocks longer than 2.5 Mb, and scaffolds longer than 15 Mb. Our method provides a scalable capability for determining the actual diploid genome sequence in a sample, opening the door to new approaches in genomic biology and medicine.
Plasmodium falciparum is the causative agent of the most burdensome form of human malaria, affecting 200–300 million individuals per year worldwide. The recently sequenced genome of P. falciparum revealed over 5,400 genes, of which 60% encode proteins of unknown function. Insights into the biochemical function and regulation of these genes will provide the foundation for future drug and vaccine development efforts toward eradication of this disease. By analyzing the complete asexual intraerythrocytic developmental cycle (IDC) transcriptome of the HB3 strain of P. falciparum, we demonstrate that at least 60% of the genome is transcriptionally active during this stage. Our data demonstrate that this parasite has evolved an extremely specialized mode of transcriptional regulation that produces a continuous cascade of gene expression, beginning with genes corresponding to general cellular processes, such as protein synthesis, and ending with Plasmodium-specific functionalities, such as genes involved in erythrocyte invasion. The data reveal that genes contiguous along the chromosomes are rarely coregulated, while transcription from the plastid genome is highly coregulated and likely polycistronic. Comparative genomic hybridization between HB3 and the reference genome strain (3D7) was used to distinguish between genes not expressed during the IDC and genes not detected because of possible sequence variations. Genomic differences between these strains were found almost exclusively in the highly antigenic subtelomeric regions of chromosomes. The simple cascade of gene regulation that directs the asexual development of P. falciparum is unprecedented in eukaryotic biology. The transcriptome of the IDC resembles a “just-in-time” manufacturing process whereby induction of any given gene occurs once per cycle and only at a time when it is required. 
These data provide to our knowledge the first comprehensive view of the timing of transcription throughout the intraerythrocytic development of P. falciparum and provide a resource for the identification of new chemotherapeutic and vaccine candidates.
Game theory is one of the key paradigms behind many scientific disciplines from biology to behavioral sciences to economics. In its evolutionary form, and especially when the interacting agents are linked in a specific social network, the underlying solution concepts and methods are very similar to those applied in non-equilibrium statistical physics. This review gives a tutorial-type overview of the field for physicists. The first four sections introduce the necessary background in classical and evolutionary game theory from the basic definitions to the most important results. The fifth section surveys the topological complications implied by non-mean-field-type social network structures in general. The next three sections discuss in detail the dynamic behavior of three prominent classes of models: the Prisoner's Dilemma, the Rock–Scissors–Paper game, and Competing Associations. The major theme of the review is in what sense and how the graph structure of interactions can modify and enrich the picture of long term behavioral patterns emerging in evolutionary games.
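As a point of reference for the mean-field baseline that network structure is said to modify, the following is a minimal sketch of well-mixed replicator dynamics for the Rock–Scissors–Paper game. The payoff values (+1 win, -1 loss, 0 tie), the forward-Euler integrator, the step size, and the name `replicator_step` are illustrative assumptions and are not taken from the review.

```python
# Zero-sum payoff matrix, rows and columns ordered Rock, Scissors, Paper.
# Rock beats Scissors, Scissors beats Paper, Paper beats Rock.
PAYOFF = [
    [0, 1, -1],
    [-1, 0, 1],
    [1, -1, 0],
]


def replicator_step(x, payoff, dt):
    """Advance strategy frequencies x by one forward-Euler step of
    dx_i/dt = x_i * (f_i - mean fitness), where f = payoff @ x."""
    fitness = [sum(a * xj for a, xj in zip(row, x)) for row in payoff]
    mean_fitness = sum(xi * fi for xi, fi in zip(x, fitness))
    return [xi + dt * xi * (fi - mean_fitness) for xi, fi in zip(x, fitness)]


# Start away from the interior fixed point (1/3, 1/3, 1/3) and iterate.
x = [0.5, 0.3, 0.2]
trajectory = [x]
for _ in range(1000):
    x = replicator_step(x, PAYOFF, dt=0.01)
    trajectory.append(x)
```

In this well-mixed setting the frequencies cycle around the interior fixed point. Forward Euler adds a slow outward drift, so a smaller step or a higher-order integrator would be preferable for long runs.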
Marko Škrabić, Marija Majer, Zdravko Siketić
et al.
Thin amorphous oxide films (a-SiO<sub>2</sub>, a-Al<sub>2</sub>O<sub>3</sub>, a-MgO) were prepared by magnetron sputtering deposition. Their response to high-energy heavy ion beams (23 MeV I, 18 MeV Cu, 2.5 MeV Cu) and gamma-ray (1.25 MeV) irradiation was studied by elastic recoil detection analysis and infrared spectroscopy. It was established that their high radiation hardness is due to a high level of disorder, already present in as-prepared samples, so the high-energy heavy ion irradiation cannot change their structure much. In the case of a-SiO<sub>2</sub>, this resulted in a completely different response to high-energy heavy ion irradiation than that found previously in thermally grown a-SiO<sub>2</sub>. In the case of a-MgO, only gamma-ray irradiation was found to induce significant changes.
Nathaniel Linden-Santangeli, Jin Zhang, Boris Kramer
et al.
Mathematical models are indispensable to the systems biology toolkit for studying the structure and behavior of intracellular signaling networks. A common approach to modeling is to develop a system of equations that encode the known biology using approximations and simplifying assumptions. As a result, the same signaling pathway can be represented by multiple models, each with its own set of underlying assumptions, which opens up challenges for model selection and decreases certainty in model predictions. Here, we use Bayesian multimodel inference to develop a framework to increase certainty in systems biology models. Using models of the extracellular-regulated kinase (ERK) pathway, we first show that multimodel inference increases predictive certainty and yields predictors that are robust to changes in the set of available models. We then show that predictions made with multimodel inference are robust to data uncertainties introduced by decreasing the measurement duration and reducing the sample size. Finally, we use multimodel inference to identify a new model to explain experimentally measured sub-cellular location-specific ERK activity dynamics. In summary, our framework highlights multimodel inference as a disciplined approach to increasing the certainty of intracellular signaling activity predictions.
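The abstract above does not say which weighting scheme its multimodel inference uses. One standard option is Bayesian model averaging, in which each model's prediction is weighted by its posterior model probability. The sketch below assumes that scheme with equal prior model probabilities; the function names and all numbers are hypothetical.

```python
import math


def bma_weights(log_evidences):
    """Posterior model probabilities from log marginal likelihoods,
    assuming equal prior probability for every model."""
    m = max(log_evidences)  # subtract the max for numerical stability
    unnorm = [math.exp(le - m) for le in log_evidences]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def bma_predict(weights, means, variances):
    """Mean and variance of the weighted mixture of per-model predictions."""
    mean = sum(w * mu for w, mu in zip(weights, means))
    second_moment = sum(w * (v + mu * mu) for w, mu, v in zip(weights, means, variances))
    return mean, second_moment - mean * mean


# Made-up log evidences and predictions for three candidate models.
weights = bma_weights([-120.0, -118.5, -125.0])
pred_mean, pred_var = bma_predict(weights, [0.8, 1.1, 0.4], [0.04, 0.09, 0.02])
```

The mixture variance includes both each model's own uncertainty and the spread between model means, so disagreement among models shows up as wider predictive uncertainty.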
Large Language Models (LLMs), such as ChatGPT, have taken the world by storm and have passed certain forms of the Turing test. However, LLMs are not limited to human language and can also analyze sequential data, such as DNA, protein, and gene expression. The resulting foundation models can be repurposed to identify the complex patterns within the data, resulting in powerful, multi-purpose prediction tools able to explain cellular systems. This review outlines the different types of LLMs and showcases their recent uses in biology. Since LLMs have not yet been embraced by the plant community, we also cover how these models can be deployed for the plant kingdom.
A biological circuit is a neural or biochemical cascade, taking inputs and producing outputs. How have biological circuits learned to solve environmental challenges over the history of life? The answer certainly follows Dobzhansky's famous quote that "nothing in biology makes sense except in the light of evolution." But that quote leaves out the mechanistic basis by which natural selection's trial-and-error learning happens, which is exactly what we have to understand. How does the learning process that designs biological circuits actually work? How much insight can we gain about the form and function of biological circuits by studying the processes that have made those circuits? Because life's circuits must often solve the same problems as those faced by machine learning, such as environmental tracking, homeostatic control, dimensional reduction, or classification, we can begin by considering how machine learning designs computational circuits to solve problems. We can then ask: How much insight do those computational circuits provide about the design of biological circuits? How much does biology differ from computers in the particular circuit designs that it uses to solve problems? This article steps through two classic machine learning models to set the foundation for analyzing broad questions about the design of biological circuits. One insight is the surprising power of randomly connected networks. Another is the central role of internal models of the environment embedded within biological circuits, illustrated by a model of dimensional reduction and trend prediction. Overall, many challenges in biology have machine learning analogs, suggesting hypotheses about how biology's circuits are designed.
Pierre Joanne, Yeranuhi Hovhannisyan, Alexandre Simon
et al.
Myofibrillar myopathy (MFM) is a rare genetic disorder characterized by muscular dystrophy that is often associated with cardiac disease. This disease is caused by mutations in several genes, among which DES (encoding desmin) is the most frequently affected. Peripheral blood mononuclear cells from 5 different MFM patients with different DES mutations were reprogrammed into induced pluripotent stem cells (IPSC) using non-integrative vectors. For each patient, one IPSC clone was selected and demonstrated pluripotency hallmarks without genomic abnormalities. SNP profiles were identical to the cells of origin and all the clones had the capacity to differentiate into all three germ layers.
Fatima Al Dhaheri, Jens Thomsen, Dean Everett
et al.
The United Arab Emirates has very little data on the incidence or prevalence of fungal diseases. Using total and underlying disease risk populations and likely affected proportions, we have modelled the burden of fungal disease for the first time. The most prevalent serious fungal conditions are recurrent vulvovaginitis (~190,000 affected) and fungal asthma (~34,000 affected). Given the UAE’s low prevalence of HIV, we estimate an at-risk population of 204 with respect to serious fungal infections, with cryptococcal meningitis estimated at 2 cases annually, <i>Pneumocystis</i> pneumonia (PCP) at 15 cases annually, and esophageal candidiasis at 20 cases in the HIV population. PCP incidence in non-HIV patients is estimated at 150 cases annually. Likewise, with the same low prevalence of tuberculosis in the country, we estimate a total chronic pulmonary aspergillosis prevalence of 1002 cases. The estimated annual incidence of invasive aspergillosis is 505 patients, based on local data on rates of malignancy, solid organ transplantation, and chronic obstructive pulmonary disease (5.9 per 100,000). Based on the 2022 annual report of the UAE’s national surveillance database, candidaemia annual incidence is 1090 (11.8/100,000), of which 49.2% occurs in intensive care. Fungal diseases affect ~228,695 people (2.46% of the population) in the UAE.
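The abstract above reports both case counts and rates but not the population denominator. As a rough consistency check, the denominator can be back-calculated from each pair of figures. The sketch below does only that arithmetic; the helper names are hypothetical, and the implied populations are derived values, not figures stated in the source.

```python
def implied_population_from_rate(cases, rate_per_100k):
    """Population implied by a case count and a rate per 100,000."""
    return cases / rate_per_100k * 100_000


def implied_population_from_percent(cases, percent):
    """Population implied by a case count and a percentage of the population."""
    return cases / (percent / 100)


# Candidaemia: 1090 cases at 11.8 per 100,000 (figures from the abstract).
pop_from_candidaemia = implied_population_from_rate(1090, 11.8)

# Overall burden: ~228,695 affected, stated as 2.46% of the population.
pop_from_total = implied_population_from_percent(228_695, 2.46)

print(f"{pop_from_candidaemia:,.0f}")
print(f"{pop_from_total:,.0f}")
```

Both calculations point to a denominator of a little over 9 million, so the two reported rates are mutually consistent to within rounding.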