Divyalakshmi Bhaskaran, Joshua Savage, Amit Patel, et al.
Abstract Background Glioblastoma (GBM) is the most common adult malignant brain tumour, with an incidence of 5 per 100,000 per year in England. Patients with tumours showing O6-methylguanine-DNA methyltransferase (MGMT) promoter methylation represent around 40% of newly diagnosed GBM. Relapse/tumour recurrence is inevitable. There is no agreed standard treatment for patients with GBM; treatment is therefore aimed at delaying further tumour progression and maintaining health-related quality of life (HRQoL). Limited clinical trial data exist using cannabinoids in combination with temozolomide (TMZ) in this setting, but early phase data demonstrate prolonged overall survival compared to TMZ alone, with few additional side effects. Jazz Pharmaceuticals (previously GW Pharma Ltd.) have developed nabiximols (trade name Sativex®), an oromucosal spray containing a blend of cannabis plant extracts, which we aim to assess for preliminary efficacy in patients with recurrent GBM. Methods ARISTOCRAT is a phase II, multi-centre, double-blind, placebo-controlled, randomised trial to assess cannabinoids in patients with recurrent MGMT methylated GBM who are suitable for treatment with TMZ. Patients who have relapsed ≥ 3 months after completion of initial first-line treatment will be randomised 2:1 to receive either nabiximols or placebo in combination with TMZ. The primary outcome is overall survival time, defined as the time in whole days from the date of randomisation to the date of death from any cause. Secondary outcomes include overall survival at 12 months, progression-free survival time, HRQoL (using patient reported outcomes from QLQ-C30, QLQ-BN20 and EQ-5D-5L questionnaires), and adverse events. Discussion Patients with recurrent MGMT promoter methylated GBM represent a relatively good prognosis sub-group of patients with GBM. However, their median survival remains poor and, therefore, more effective treatments are needed. 
The phase II design of this trial was chosen, rather than phase III, due to the lack of data currently available on cannabinoid efficacy in this setting. A randomised, double-blind, placebo-controlled trial will ensure an unbiased robust evaluation of the treatment and will allow potential expansion of recruitment into a phase III trial should the emerging phase II results warrant this development. Trial registration ISRCTN: 11460478. ClinicalTrials.Gov: NCT05629702.
Abstract Purpose Self-management can have clinical and quality-of-life benefits. However, people with lower-grade gliomas (LGG) may face chronic tumour- and/or treatment-related symptoms and impairments (e.g. cognitive deficits, seizures), which could influence their ability to self-manage. Our study aimed to identify and understand the barriers and facilitators to self-management in people with LGG. Methods We conducted semi-structured interviews with 28 people with LGG across the United Kingdom, who had completed primary treatment. Sixteen participants were male, mean age was 50.4 years, and mean time since diagnosis was 8.7 years. Interviews were audio-recorded and transcribed. Following inductive open coding, we deductively mapped codes to Schulman-Green et al.’s framework of factors influencing self-management, developed in chronic illness. Results Data suggested extensive support for all five framework categories (‘Personal/lifestyle characteristics’, ‘Health status’, ‘Resources’, ‘Environmental characteristics’, ‘Healthcare system’), encompassing all 18 factors influencing self-management. How people with LGG experience many of these factors appears somewhat distinct from other cancers; participants described multiple, often co-occurring, challenges, primarily with knowledge and acceptance of their incurable condition, the impact of seizures and cognitive deficits, transport difficulties, and access to (in)formal support. Several factors were on a continuum; for example, sufficient knowledge was a facilitator, whereas a lack thereof was a barrier to self-management. Conclusions People with LGG described distinctive experiences with wide-ranging factors influencing their ability to self-manage. Implications for cancer survivors These findings will improve awareness of the potential challenges faced by people with LGG around self-management and inform development of self-management interventions for this population.
Abstract Purpose Lower-grade gliomas (LGG) are mostly diagnosed in working-aged adults and rarely cured. LGG patients may face chronic impairments (e.g. fatigue, cognitive deficits). Self-management can improve clinical and psychosocial outcomes, yet how LGG patients self-manage the consequences of their tumour and its treatment is not fully understood. This study, therefore, aimed to identify and understand how LGG patients engage in the self-management of their condition. Methods A diverse group of 28 LGG patients (age range 22–69 years; male n = 16, female n = 12; mean time since diagnosis = 8.7 years) who had completed primary treatment were recruited from across the United Kingdom. Semi-structured interviews were conducted. Informed by a self-management strategy framework developed in cancer, directed content analysis identified and categorised self-management types and strategies used by patients. Results Overall, 20 self-management strategy types, comprising 123 self-management strategies, were reported; each participant detailed extensive engagement in self-management. The most used strategy types were ‘using support’ (n = 28), ‘creating a healthy environment’ (n = 28), ‘meaning making’ (n = 27), and ‘self-monitoring’ (n = 27). The most used strategies were ‘accepting the tumour and its consequences’ (n = 26), ‘receiving support from friends (n = 24) and family’ (n = 24), and ‘reinterpreting negative consequences’ (n = 24). Conclusions This study provides a comprehensive understanding of the strategies used by LGG patients to self-manage their health and wellbeing, with a diverse and substantial number of self-management strategies reported. Implications for Cancer Survivors The findings will inform the development of a supported self-management intervention for LGG patients, which will be novel for this patient group.
Two distinct families of pan-primate endogenous retroviruses, namely HERVL and HERVH, infected the primate germline, colonized host genomes, and evolved into the global retroviral genomic regulatory dominion (GRD) operating during human embryogenesis (HE). The HE retroviral GRD comprises 8839 highly conserved fixed LTR elements linked to 5444 down-stream target genes forged by evolution into a functionally-consonant constellation of 26 genome-wide multimodular genomic regulatory networks (GRNs), each of which is defined by significant enrichment of numerous single gene ontology (GO)-specific traits. Locations of GRNs appear scattered across chromosomes to occupy from 5.5% to 15.09% of the human genome. Each GRN harbors from 529 to 1486 retroviral LTRs derived from LTR7, MLT2A1, and MLT2A2 sequences that are quantitatively balanced according to their genome-wide abundance. GRNs integrate activities from 199 to 805 down-stream target genes, including transcription factors, chromatin-state remodelers, signal-sensing and signal-transduction mediators, enzymatic and receptor binding effectors, intracellular complexes and extracellular matrix elements, and cell-cell adhesion molecules. GRN compositions consist of several hundred to thousands of smaller GO enrichment-defined genomic regulatory modules (GRMs) combining from a dozen to hundreds of LTRs and down-stream target genes, which appear to operate on the timescale of an individual's life span along specific phenotypic avenues to exert profound effects on patterns of transcription, protein-protein interactions, developmental phenotypes, physiological traits, and pathological conditions of Modern Humans. Overall, this study identifies 69,573 statistically significant retroviral LTR-linked GRMs (Binomial FDR q-value threshold of 0.001), including 27,601 GRMs validated by the single GO-specific directed acyclic graph (DAG) analyses across six GO annotations.
Yasmin Boyle, Terrance G. Johns, Emily V. Fletcher
Malignant central nervous system (CNS) cancers are among the most difficult to treat, with low rates of survival and a high likelihood of recurrence. This is primarily due to their location within the CNS, hindering adequate drug delivery and tumour access via surgery. Furthermore, CNS cancer cells are highly plastic, an adaptive property that enables them to bypass targeted treatment strategies and develop drug resistance. Potassium ion channels have long been implicated in the progression of many cancers due to their integral role in several hallmarks of the disease. Here, we will explore this relationship further, with a focus on malignant CNS cancers, including high-grade glioma (HGG). HGG is the most lethal form of primary brain tumour in adults, with the majority of patient mortality attributed to drug-resistant secondary tumours. Hence, targeting proteins that are integral to cellular plasticity could reduce tumour recurrence, improving survival. This review summarises the role of potassium ion channels in malignant CNS cancers, specifically how they contribute to proliferation, invasion, metastasis, angiogenesis, and plasticity. We will also explore how specific modulation of these proteins may provide a novel way to overcome drug resistance and improve patient outcomes.
Kleber Padovani, Roberto Xavier, Rafael Cabral Borges, et al.
De novo genome assembly is a relevant but computationally complex task in genomics. Although de novo assemblers have been used successfully in several genomics projects, there is still no 'best assembler', and the choice and setup of assemblers still rely on bioinformatics experts. Thus, as with other computationally complex problems, machine learning may emerge as an alternative (or complementary) way of developing more accurate and automated assemblers. Reinforcement learning has proven promising for solving complex activities without supervision - such as games - and there is a pressing need to understand the limits of this approach to 'real' problems, such as the DFA problem. This study aimed to shed light on the application of machine learning, using reinforcement learning (RL), in genome assembly. We expanded upon the sole previous approach found in the literature to solve this problem by carefully exploring the learning aspects of the proposed intelligent agent, which uses the Q-learning algorithm, and we provided insights for the next steps of automated genome assembly development. We improved the reward system and optimized the exploration of the state space based on pruning and in collaboration with evolutionary computing. We tested the new approaches on 23 new larger environments, which are all available on the internet. Our results suggest consistent performance progress; however, we also found limitations, especially concerning the high dimensionality of state and action spaces. Finally, we discuss paths for achieving efficient and automated genome assembly in real scenarios considering successful RL applications - including deep reinforcement learning.
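To make the Q-learning formulation concrete, the following is a minimal sketch, not the authors' agent. It assumes a toy environment in which a state is the tuple of reads placed so far, an action appends one unused read, and the reward is the suffix-prefix overlap with the previously placed read. The reads, the reward, the hyperparameters and the function names (`overlap`, `train`, `greedy_order`) are all illustrative assumptions. Exploration is purely random (epsilon = 1), which is acceptable for a toy because Q-learning is off-policy.

```python
import random
from collections import defaultdict


def overlap(a, b):
    """Length of the longest proper suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def train(reads, episodes=3000, alpha=0.5, gamma=1.0, epsilon=1.0, seed=0):
    """Tabular Q-learning over read orderings (toy formulation, assumed here)."""
    rng = random.Random(seed)
    q = defaultdict(float)  # (state, action) -> estimated return
    n = len(reads)
    for _ in range(episodes):
        state = ()
        while len(state) < n:
            actions = [i for i in range(n) if i not in state]
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[(state, a)])  # exploit
            # Reward: overlap between the last placed read and the new one.
            reward = overlap(reads[state[-1]], reads[action]) if state else 0
            nxt = state + (action,)
            remaining = [i for i in range(n) if i not in nxt]
            best_next = max((q[(nxt, a)] for a in remaining), default=0.0)
            # Standard Q-learning update.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q


def greedy_order(reads, q):
    """Follow the learned Q-table greedily to produce a read ordering."""
    state = ()
    while len(state) < len(reads):
        actions = [i for i in range(len(reads)) if i not in state]
        state += (max(actions, key=lambda a: q[(state, a)]),)
    return list(state)


if __name__ == "__main__":
    # Four length-6 reads sampled every 2 bases from "ATGCGTACCTGA".
    reads = ["ATGCGT", "GCGTAC", "GTACCT", "ACCTGA"]
    print(greedy_order(reads, train(reads)))
```

Even in this toy, the number of states grows factorially with the number of reads, which illustrates the state-space dimensionality limitation mentioned above.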
A dynamic model of nonlinear time-dependent ordinary differential equations (ODE) has been applied to the interactions of an HIV infection with the immune system cells. This model has been simplified into two compartments: lymph node and peripheral blood. The model includes CD4 T-lymphocytes in several states (quiescent Q, naive N and activated T), cytotoxic CD8 T-cells, B-cells and dendritic cells. Cytokines and immunoglobulins specific for each antigen (i.e. gp41 or p24) have also been included in the model, modelling the attraction effect of CD4 T-cells to the infected area and the reduction of virus concentration by immunoglobulins. HIV infection of CD4 T-lymphocytes is modelled in several stages: before fusion as HIV-attached (H) and after fusion as non-permissive / abortively infected (M), permissive / latently infected (L) and permissive / actively infected (I). These equations have been implemented in a C++/Python interface application, called Immune System app, which runs OpenModelica software to solve the ODE system through a 4th order Runge-Kutta numerical approximation. Results of the simulation show that although HIV concentration in both compartments is lower than $10^{-10}$ virus/$μL$ after t=2 years, quiescent lymphocytes reach an equilibrium with a concentration lower than the initial conditions, due to the latency state, which serves as a reservoir in time of virus production. In conclusion, this model can provide reliable results in other conditions, such as antiviral therapies.
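The classical 4th order Runge-Kutta scheme mentioned above can be sketched in a few lines. The example below is an illustration only: it does not reproduce the Immune System app or the model's equations, and the two-compartment exchange system (`exchange`, with rates `a` and `b`) is a hypothetical stand-in chosen because its exact solution is known.

```python
def rk4_step(f, t, y, h):
    """Advance the system y' = f(t, y) by one classical RK4 step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [
        yi + h / 6 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


def integrate(f, y0, t0, t1, steps):
    """Integrate from t0 to t1 with a fixed number of RK4 steps."""
    h = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


def exchange(t, y, a=1.0, b=3.0):
    """Hypothetical linear exchange between two compartments (not the HIV model).

    y[0] flows to y[1] at rate a, and y[1] flows back at rate b.
    """
    x, z = y
    return [-a * x + b * z, a * x - b * z]


if __name__ == "__main__":
    x, z = integrate(exchange, [1.0, 0.0], 0.0, 10.0, 1000)
    # The exact equilibrium is x = b / (a + b) = 0.75, with x + z conserved.
    print(x, z)
```

A fixed step size is used for simplicity; a stiff, nonlinear system like the one described would probably need much smaller steps or an adaptive solver.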
I extend the traditional SAR, which has achieved the status of an ecological law and plays a critical role in global biodiversity assessment, to the general (alpha- or beta-diversity in Hill numbers) diversity area relationship (DAR). The extension was motivated by the need to remedy the limitation of the traditional SAR, which addresses only one aspect of biodiversity scaling, i.e., species richness scaling over space. The extension was made possible by the fact that all Hill numbers are in units of species (referred to as the effective number of species or as species equivalents), and I postulated that Hill numbers should follow the same or similar pattern of SAR. I selected three DAR models: the traditional power law (PL), PLEC (PL with exponential cutoff) and PLIEC (PL with inverse exponential cutoff). I defined three new concepts and derived their quantifications: (i) DAR profile: z-q series where z is the PL scaling parameter at different diversity order (q); (ii) PDO (pair-wise diversity overlap) profile: g-q series where g is the PDO corresponding to q; (iii) MAD (maximal accrual diversity) profile: Dmax-q series where Dmax is the MAD corresponding to q. Furthermore, the PDO-g is quantified based on the self-similarity property of the PL model, and Dmax can be estimated from the PLEC parameters. The three profiles constitute a novel DAR approach to biodiversity scaling. I verified the postulation with the American gut microbiome project (AGP) dataset of 1473 healthy North American individuals (the largest human dataset from a single project to date). The PL model was preferred due to its simplicity and established ecological properties such as self-similarity (necessary for establishing PDO profile), and PLEC has an advantage in establishing the MAD profile. All three profiles for the AGP dataset were successfully quantified and compared with existing SAR parameters in the literature whenever possible.
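As an illustration of how one point of a DAR profile might be computed, the sketch below evaluates the Hill number of order q from an abundance vector and fits the power law D = cA^z by least squares in log-log space. The function names and the synthetic data are assumptions for illustration; the fitting procedure used in the study may differ.

```python
import math


def hill_number(abundances, q):
    """Hill number (effective number of species) of diversity order q."""
    total = sum(abundances)
    p = [a / total for a in abundances if a > 0]
    if abs(q - 1.0) < 1e-12:
        # The q -> 1 limit is the exponential of Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


def fit_power_law(areas, diversities):
    """Least-squares fit of ln D = ln c + z ln A; returns (c, z)."""
    xs = [math.log(a) for a in areas]
    ys = [math.log(d) for d in diversities]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    z = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    c = math.exp(my - z * mx)
    return c, z


if __name__ == "__main__":
    # Four equally abundant species give 4 effective species at every order q.
    print([round(hill_number([5, 5, 5, 5], q), 6) for q in (0, 1, 2)])
    # Synthetic accrual data generated from D = 3 * A**0.25 (illustrative only).
    areas = [1, 2, 4, 8, 16]
    diversities = [3 * a ** 0.25 for a in areas]
    print(fit_power_law(areas, diversities))
```

Repeating the fit for each diversity order q and collecting the resulting z values would give the z-q series that the text calls the DAR profile.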
DNA read mapping is a ubiquitous task in bioinformatics, and many tools have been developed to solve the read mapping problem. However, there are two trends that are changing the landscape of read mapping: First, new sequencing technologies provide very long reads with high error rates (up to 15%). Second, many genetic variants in the population are known, so the reference genome is not considered as a single string over ACGT, but as a complex object containing these variants. Most existing read mappers do not handle these new circumstances appropriately. We introduce a new read mapper prototype called VATRAM that considers variants. It is based on Min-Hashing of q-gram sets of reference genome windows. Min-Hashing is one form of locality sensitive hashing. The variants are directly inserted into VATRAM's index, which leads to a fast mapping process. Our results show that VATRAM achieves better precision and recall than state-of-the-art read mappers like BWA under certain circumstances. VATRAM is open source and can be accessed at https://bitbucket.org/Quedenfeld/vatram-src/.
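The idea of Min-Hashing q-gram sets can be sketched as follows. This is a generic illustration, not VATRAM's index: the hash family (seeded BLAKE2b), the parameters `q` and `num_hashes`, and all function names are assumptions, and variant handling is omitted.

```python
import hashlib


def qgrams(seq, q):
    """Set of all length-q substrings of seq."""
    return {seq[i:i + q] for i in range(len(seq) - q + 1)}


def _hash(gram, seed):
    """64-bit hash of a q-gram under a given seed (assumed hash family)."""
    data = f"{seed}:{gram}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(seq, q=4, num_hashes=64):
    """For each seeded hash function, keep the minimum value over the q-gram set."""
    grams = qgrams(seq, q)
    return [min(_hash(g, seed) for g in grams) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching signature entries estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


if __name__ == "__main__":
    window = "ACGTTGCAAGGCTTACGATCGATCGGATC"
    read = "ACGTTGCAAGGCTTACGTTCGATCGGATC"  # one substitution relative to window
    other = "TTTTTTTTTTTTGGGGGGGGGGGGCCCCC"
    sig_window = minhash_signature(window)
    print(estimated_jaccard(sig_window, minhash_signature(read)))
    print(estimated_jaccard(sig_window, minhash_signature(other)))
```

A read would then be assigned to the reference windows whose signatures collide with its own, so only a few candidate windows need full alignment.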
Expression quantitative trait loci (eQTL) mapping constitutes a challenging problem due to, among other reasons, the high-dimensional multivariate nature of gene-expression traits. Next to the expression heterogeneity produced by confounding factors and other sources of unwanted variation, indirect effects spread throughout genes as a result of genetic, molecular and environmental perturbations. From a multivariate perspective one would like to adjust for the effect of all of these factors to end up with a network of direct associations connecting the path from genotype to phenotype. In this paper we approach this challenge with mixed graphical Markov models, higher-order conditional independences and q-order correlation graphs. These models show that additive genetic effects propagate through the network as a function of gene-gene correlations. Our estimation of the eQTL network underlying a well-studied yeast data set leads to a sparse structure with more direct genetic and regulatory associations that enable a straightforward comparison of the genetic control of gene expression across chromosomes. Interestingly, it also reveals that eQTLs explain most of the expression variability of network hub genes.
Alejandro Ochoa, John D. Storey, Manuel Llinás, et al.
E-values have been the dominant statistic for protein sequence analysis for the past two decades: from identifying statistically significant local sequence alignments to evaluating matches to hidden Markov models describing protein domain families. Here we formally show that for "stratified" multiple hypothesis testing problems, controlling the local False Discovery Rate (lFDR) per stratum, or partition, yields the most predictions across the data at any given threshold on the FDR or E-value over all strata combined. For the important problem of protein domain prediction, a key step in characterizing protein structure, function and evolution, we show that stratifying statistical tests by domain family yields excellent results. We develop the first FDR-estimating algorithms for domain prediction, and evaluate how well thresholds based on q-values, E-values and lFDRs perform in domain prediction using five complementary approaches for estimating empirical FDRs in this context. We show that stratified q-value thresholds substantially outperform E-values. Contradicting our theoretical results, q-values also outperform lFDRs; however, our tests reveal a small but coherent subset of domain families, biased towards models for specific repetitive patterns, for which FDRs are greatly underestimated due to weaknesses in random sequence models. lFDR thresholds outperform q-values for the remaining families, which have as-expected noise, suggesting that further improvements in domain predictions can be achieved with improved modeling of random sequences. Overall, our theoretical and empirical findings suggest that the use of stratified q-values and lFDRs could result in improvements in a host of structured multiple hypothesis testing problems arising in bioinformatics, including genome-wide association studies, orthology prediction, motif scanning, and multi-microarray analyses.
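A rough sketch of what stratified q-value thresholds involve is given below. It uses Benjamini-Hochberg adjusted p-values as a conservative stand-in for q-values, that is, with the null proportion taken as 1, which is not necessarily the estimator developed in the paper. The strata keys (domain family names) and the p-values are made up for illustration.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (conservative q-value estimates)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


def stratified_qvalues(pvalues_by_stratum):
    """Compute q-values separately within each stratum (e.g. domain family)."""
    return {name: bh_qvalues(ps) for name, ps in pvalues_by_stratum.items()}


def significant(pvalues_by_stratum, threshold):
    """Indices of tests passing the q-value threshold, per stratum."""
    qs = stratified_qvalues(pvalues_by_stratum)
    return {
        name: [i for i, qv in enumerate(qvals) if qv <= threshold]
        for name, qvals in qs.items()
    }


if __name__ == "__main__":
    # Hypothetical per-family p-values for candidate domain matches.
    families = {
        "family_A": [0.01, 0.04, 0.03, 0.5],
        "family_B": [0.2, 0.6, 0.9],
    }
    print(stratified_qvalues(families))
    print(significant(families, 0.06))
```

Because the adjustment is done within each stratum, a family with many strong matches does not dilute, and is not diluted by, a family that contains mostly noise.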
Sequence organization is viewed from two perspectives: informational redundancy, or informational correlation (IC), and k-mer frequency statistics. Two problems are investigated. The first is how the ICs exceed the fluctuation bound and order emerges from fluctuation in a genome when the sequence length attains some critical value. We demonstrated that the transition from fluctuation to order takes place at a sequence length of about 200-300 thousand bases for the human and E. coli genomes. This suggests that life emerges from a region between the macroscopic and the microscopic. The second concerns the statistical law of k-mer organization in a genome under evolutionary pressure and functional selection. We deduced a sum rule Q(k,N) on the k-mer frequency deviations from randomness in a genome sequence of length N and deduced the relations of Q(k,N) with k and N. We found that Q(k,N) increases with length N at a constant rate for most genome sequences and demonstrated that when the functional selection of k-mers is accumulated to some critical value the ordering takes place. An important finding is that the sum rule correlates with the evolutionary complexity of the genome.
We determine the Renyi entropies K_q of symbol sequences generated by human chromosomes. These exhibit nontrivial behaviour as a function of the scanning parameter q. In the thermodynamic formalism, there are phase transition-like phenomena close to the q=1 region. We develop a theoretical model for this based on the superposition of two multifractal sets, which can be associated with the different statistical properties of coding and non-coding DNA sequences. This model is in good agreement with the human chromosome data.
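For orientation, the sketch below computes a finite-length Rényi block entropy of order q from the empirical frequencies of overlapping words in a symbol sequence. This is a simplified, assumed estimator. The K_q in the text are defined for long sequences in the thermodynamic formalism, which this short example does not attempt to reproduce, and the `word_len` parameter and the toy sequences are illustrative.

```python
import math
from collections import Counter


def renyi_entropy(seq, q, word_len=1):
    """Rényi entropy (in nats) of the empirical distribution of length-`word_len` words."""
    words = [seq[i:i + word_len] for i in range(len(seq) - word_len + 1)]
    counts = Counter(words)
    n = len(words)
    probs = [c / n for c in counts.values()]
    if abs(q - 1.0) < 1e-12:
        # The q -> 1 limit recovers the Shannon entropy.
        return -sum(p * math.log(p) for p in probs)
    return math.log(sum(p ** q for p in probs)) / (1.0 - q)


if __name__ == "__main__":
    seq = "ACGT" * 10  # toy sequence with uniform single-symbol frequencies
    for q in (0.0, 0.5, 1.0, 2.0):
        print(q, renyi_entropy(seq, q))
```

Dividing the block entropy by `word_len` gives a crude per-symbol rate. Scanning q then shows how strongly rare words (small q) or frequent words (large q) dominate the statistics.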
In this work it is shown that the 20 canonical amino acids (AAs) within the genetic code appear to be a whole system with a strict distinction in the Genetic Code Table (GCT) into different quanta: 20, 23 and 61 amino acid molecules. The distinction among these molecules is accompanied by specific balanced distinctions in atom number and/or nucleon number within those molecules. In this second version two appendices are added, as well as a new version of the Periodic system of numbers, whose first version is given in arXiv:1107.1998 [q-bio.OT].
Estimation of molecular evolutionary divergence times requires models of rate change. These vary with regard to the assumption of what quantity is penalized. The possibilities considered are the rate of evolution, the log of the rate of evolution and the inverse of the rate of evolution. These models also vary with regard to how time affects the expected variance of rate change. Here the alternatives are not at all, linearly with time and as the product of rate and time. This results in a set of nine models, both random walks and Brownian motion. A priori any of these models could be correct, yet different researchers may well prefer, or simply use, one rather than the others. Another variable is whether to use a scaling factor to take account of the variance of the process of rate change being unknown and therefore avoid minimizing the penalty function with unrealistically large times. Here the difference these models and assumptions make on a tree of mammals, with the root fixed and with a single internal node fixed, is measured. The similarity of models is measured as the correlation of their time estimates and visualized with a least squares tree. The fit of model to data is measured and Q-Q plots are shown. Comparing model estimates with each other, the ages of clades within Laurasiatheria are seen to vary far more across models than those within Supraprimates (informally called Euarchontoglires). Especially problematic are the often-used fossil calibrated nodes of horse/rhino and whale/hippo, which clash with times within Supraprimates and in particular with the absence of fossil rodent teeth older than ~60 mybp. A scaling factor in addition to penalizing rate change is seen to yield consistent relative time estimates irrespective of exactly where the calibration point is placed.
Motivation: The reconstruction of gene networks from gene expression microarrays is gaining popularity as methods improve and as more data become available. The reliability of such networks could be judged by the probability that a connection between genes is spurious, resulting from chance fluctuations rather than from a true biological relationship. Results: Unlike the false discovery rate and positive false discovery rate, the decisive false discovery rate (dFDR) is exactly equal to a conditional probability without assuming independence or the randomness of hypothesis truth values. This property is useful not only in the common application to the detection of differential gene expression, but also in determining the probability of a spurious connection in a reconstructed gene network. Estimators of the dFDR can estimate each of three probabilities: 1. The probability that two genes that appear to be associated with each other lack such association. 2. The probability that a time ordering observed for two associated genes is misleading. 3. The probability that a time ordering observed for two genes is misleading, either because they are not associated or because they are associated without a lag in time. The first probability applies to both static and dynamic gene networks, and the other two only apply to dynamic gene networks. Availability: Cross-platform software for network reconstruction, probability estimation, and plotting is free from http://www.davidbickel.com as R functions and a Java application.