Lung cancer clinical decision support demands precise reasoning across complex, multi-stage oncological workflows. Existing multimodal large language models (MLLMs) fail to handle guideline-constrained staging and treatment reasoning. We formalize three oncological precision treatment (OPT) tasks for lung cancer, spanning TNM staging, treatment recommendation, and end-to-end clinical decision support. We introduce LungCURE, the first standardized multimodal benchmark built from 1,000 real-world, clinician-labeled cases across more than 10 hospitals. We further propose LCAgent, a multi-agent framework that ensures guideline-compliant lung cancer clinical decision-making by suppressing cascading reasoning errors across the clinical pathway. Experiments reveal large differences across various large language models (LLMs) in their capabilities for complex medical reasoning, when given precise treatment requirements. We further verify that LCAgent, as a simple yet effective plugin, enhances the reasoning performance of LLMs in real-world medical scenarios.
Julie Williamson, Muhammad Zaki Hidayatullah Fadlullah, Magdalena Kovacsovics-Bankowski
et al.
Patients with advanced melanoma who progress on standard-dose ipilimumab (Ipi) + nivolumab continue to have a poor prognosis. Studies support a dose–response activity of Ipi, and one promising combination is Ipi 10 mg/kg (Ipi10) + temozolomide (TMZ). We performed a retrospective cohort analysis of patients with advanced melanoma treated with Ipi10 + TMZ in the immunotherapy refractory/resistant setting (n = 6, all progressed after prior Ipi + nivolumab), using similar patients treated with Ipi 3 mg/kg (Ipi3) + TMZ (n = 6) as the comparison group. Molecular profiling by whole-exome sequencing (WES) and RNA-sequencing (RNA-seq) of tumors harvested through one responder’s treatment was performed. With a median follow-up of 119 days, patients treated with Ipi10 + TMZ had a statistically significantly longer median progression-free survival of 144.5 days (range 27–219) vs. 44 days (26–75) with Ipi3 + TMZ (p = 0.04), and a trend toward longer median overall survival of 154.5 days (27–537) vs. 89.5 days (26–548). Two patients in the Ipi10 + TMZ cohort had a partial response, and both responders had BRAF V600E mutant melanoma. RNA-seq showed enrichment of inflammatory signatures, including interferon responses, in metastases after Ipi10 + TMZ compared to the primary tumor, and downregulated negative immune regulators. Ipi10 + TMZ demonstrated efficacy, including dramatic responses in patients refractory to prior Ipi + anti-PD1. Molecular data suggest a potential threshold of Ipi dose for activation of a sufficient anti-tumor immune response, and higher doses are required for some patients.
Neoplasms. Tumors. Oncology. Including cancer and carcinogens
Kathryn Bress, Patrick Bou-Samra, Cramer J. Kallem
et al.
Background: Hepatocellular carcinoma (HCC) is the fifth most common cancer worldwide. Due to the advanced stage at which HCC presents, most patients are only eligible for transarterial chemoembolization (TACE) or radioembolization (Y90). The purpose of this study is to examine the differences in survival and health-related quality of life (HRQOL) in patients diagnosed with HCC and treated with TACE or Y90.
Methods: Two hundred thirty-four patients with HCC were enrolled in studies examining HRQOL between 2003 and 2009. HRQOL was evaluated using the Functional Assessment of Cancer Therapy-Hepatobiliary (FACT-Hep). Between-group differences were examined using chi-square and ANOVA. Survival was assessed using Kaplan–Meier and Cox regression analyses.
Results: Significant baseline differences between patients treated with TACE versus Y90 were found. Patients who received Y90 tended to be older (p < 0.001) and female (p < 0.001), to have fewer lesions (p = 0.03) and smaller tumors (p = 0.03), and to be less likely to have vascular invasion (p = 0.04). After adjusting for demographic and disease-specific factors, no significant differences in HRQOL were observed at 3 months (p = 0.79) or 6 months (p = 0.75). Clinically meaningful differences were found, with the TACE group reporting greater physical, social, and emotional well-being at 3 and 6 months and greater overall HRQOL at 6 months. No significant differences in survival were found.
Conclusions: Treatment with TACE and Y90 was similar with regard to survival. However, TACE showed statistically and clinically meaningful benefits in physical, social/family, and emotional well-being. Further research is warranted to identify profiles of patients who may demonstrate a preferential response to either TACE or Y90.
Neoplasms. Tumors. Oncology. Including cancer and carcinogens
Abraham Francisco Arellano Tavara, Umesh Kumar, Jathurshan Pradeepkumar
et al.
Variants of Uncertain Significance (VUS) limit the clinical utility of prostate cancer genomics by delaying diagnosis and therapy when evidence for pathogenicity or benignity is incomplete. Progress is further limited by inconsistent annotations across sources and the absence of a prostate-specific benchmark for fair comparison. We introduce Prostate-VarBench, a curated pipeline for creating prostate-specific benchmarks that integrates COSMIC (somatic cancer mutations), ClinVar (expert-curated clinical variants), and TCGA-PRAD (prostate tumor genomics from The Cancer Genome Atlas) into a harmonized dataset of 193,278 variants supporting patient- or gene-aware splits to prevent data leakage. To ensure data integrity, we corrected a Variant Effect Predictor (VEP) issue that merged multiple transcript records, introducing ambiguity in clinical significance fields. We then standardized 56 interpretable features across eight clinically relevant tiers, including population frequency, variant type, and clinical context. AlphaMissense pathogenicity scores were incorporated to enhance missense variant classification and reduce VUS uncertainty. Building on this resource, we trained an interpretable TabNet model to classify variant pathogenicity, whose step-wise sparse masks provide per-case rationales consistent with molecular tumor board review practices. On the held-out test set, the model achieved 89.9% accuracy with balanced class metrics, and the VEP correction yielded a 6.5% absolute reduction in VUS.
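The gene-aware splitting mentioned above can be illustrated with a minimal sketch. This is not the Prostate-VarBench pipeline itself: the function name `group_split`, the record layout, and the toy gene labels are all hypothetical, and the only property shown is that every record sharing a group key (here a gene) lands on the same side of the split.

```python
import random
from collections import defaultdict

def group_split(records, group_key, test_fraction=0.2, seed=0):
    """Split records so that no group value appears in both train and test.

    Hypothetical helper for illustration; not taken from the benchmark code.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec)
    keys = sorted(groups)                      # deterministic starting order
    random.Random(seed).shuffle(keys)          # reproducible shuffle of groups
    n_test = max(1, round(len(keys) * test_fraction))
    test_keys = keys[:n_test]
    train_keys = keys[n_test:]
    train = [r for k in train_keys for r in groups[k]]
    test = [r for k in test_keys for r in groups[k]]
    return train, test

# Toy records with made-up identifiers (assumption, not real variants).
variants = [
    {"variant": "v1", "gene": "GENE_A"},
    {"variant": "v2", "gene": "GENE_A"},
    {"variant": "v3", "gene": "GENE_B"},
    {"variant": "v4", "gene": "GENE_C"},
    {"variant": "v5", "gene": "GENE_C"},
]
train, test = group_split(variants, "gene", test_fraction=0.34)
train_genes = {r["gene"] for r in train}
test_genes = {r["gene"] for r in test}
```

Splitting on whole groups rather than on individual rows is what prevents a model from seeing variants of the same gene (or patient) in both partitions.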
The development of accessible screening tools for early cancer detection in dogs represents a significant challenge in veterinary medicine. Routine laboratory data offer a promising, low-cost source for such tools, but their utility is hampered by the non-specificity of individual biomarkers and the severe class imbalance inherent in screening populations. This study assesses the feasibility of cancer risk classification using the Golden Retriever Lifetime Study (GRLS) cohort under real-world constraints, including the grouping of diverse cancer types and the inclusion of post-diagnosis samples. A comprehensive benchmark evaluation was conducted, systematically comparing 126 analytical pipelines that comprised various machine learning models, feature selection methods, and data balancing techniques. Data were partitioned at the patient level to prevent leakage. The optimal model, a Logistic Regression classifier with class weighting and recursive feature elimination, demonstrated moderate ranking ability (AUROC = 0.815; 95% CI: 0.793-0.836) but poor clinical classification performance (F1-score = 0.25, Positive Predictive Value = 0.15). While a high Negative Predictive Value (0.98) was achieved, insufficient recall (0.79) precludes its use as a reliable rule-out test. Interpretability analysis with SHapley Additive exPlanations (SHAP) revealed that predictions were driven by non-specific features like age and markers of inflammation and anemia. It is concluded that while a statistically detectable cancer signal exists in routine lab data, it is too weak and confounded for clinically reliable discrimination from normal aging or other inflammatory conditions. This work establishes a critical performance ceiling for this data modality in isolation and underscores that meaningful progress in computational veterinary oncology will require integration of multi-modal data sources.
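As a quick consistency check on the figures quoted above, the F1-score can be recomputed from the reported positive predictive value (precision) and recall. The sketch below uses only the rounded values given in the abstract, so agreement is expected only to two decimal places; the function name is an illustrative choice.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Rounded values as reported in the abstract.
reported_ppv = 0.15
reported_recall = 0.79

f1 = f1_from_precision_recall(reported_ppv, reported_recall)
print(round(f1, 2))  # 0.25, matching the reported F1-score
```

Because F1 is a harmonic mean, it is pulled toward the smaller of the two inputs, which is why a recall of 0.79 cannot compensate for a precision of 0.15.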
Background: Data on treatment outcomes of Epstein-Barr virus (EBV) associated nasopharyngeal carcinoma (NPC) largely come from endemic regions. There is limited literature regarding the epidemiology and treatment outcomes of EBV-associated NPC in South Africa.
Aim: The aim of the study was to compare overall survival (OS) of EBV positive and EBV negative NPC patients.
Setting: Groote Schuur Hospital, South Africa.
Methods: Data were collected on all patients with histologically confirmed NPC over an 11-year period, including prevalence of EBV, OS, disease-free survival (DFS), loco-regional control (LRC), and impact of treatment interruptions on OS.
Results: There were 53 patients in total. Non-keratinising carcinoma was the primary histological subtype (86.8%). EBV positive NPC was found in 47.2% of patients. The 2- and 5-year OS of EBV positive patients treated with curative intent were significantly higher than those of EBV negative patients, 84.0% versus 34.0% and 45.0% versus 17.0%, respectively (hazard ratio [HR] 0.25, 95% confidence interval [CI]: 0.10–0.63, p = 0.002). Two-year DFS was 55.0% versus 43.0% (HR: 0.59, 95% CI: 0.18–1.98, p = 0.38) and 2-year LRC was 76.2% versus 46.2% (HR: 0.40, 95% CI: 0.12–1.36, p = 0.13) for EBV positive and EBV negative patients, respectively.
Conclusion: Treatment of EBV-associated NPC is associated with superior OS compared to EBV negative tumours.
Contribution: Epstein-Barr virus was found to be a significant prognostic factor associated with superior OS compared to EBV negative NPC. These findings correlate with literature from endemic and non-endemic regions.
Neoplasms. Tumors. Oncology. Including cancer and carcinogens
Liver cancer is a leading cause of cancer-related mortality worldwide, with its high genetic heterogeneity complicating diagnosis and treatment. This study introduces DLSOM, a deep learning framework utilizing stacked autoencoders to analyze the complete somatic mutation landscape of 1,139 liver cancer samples, covering 20,356 protein-coding genes. By transforming high-dimensional mutation data into three low-dimensional features, DLSOM enables robust clustering and identifies five distinct liver cancer subtypes with unique mutational, functional, and biological profiles. Subtypes SC1 and SC2 exhibit higher mutational loads, while SC3 has the lowest, reflecting mutational heterogeneity. Novel and COSMIC-associated mutational signatures reveal subtype-specific molecular mechanisms, including links to hypermutation and chemotherapy resistance. Functional analyses further highlight the biological relevance of each subtype. This comprehensive framework advances precision medicine in liver cancer by enabling the development of subtype-specific diagnostics, biomarkers, and therapies, showcasing the potential of deep learning in addressing cancer complexity.
To track tumors during surgery, information from preoperative CT scans is used to determine their position. However, as the surgeon operates, the tumor may be deformed, which presents a major hurdle for accurately resecting the tumor and can lead to surgical inaccuracy, increased operation time, and excessive margins. This issue is particularly pronounced in robot-assisted partial nephrectomy (RAPN), where the kidney undergoes significant deformations during the operation. Toward addressing this, we introduce an occupancy network-based method for the localization of tumors within kidney phantoms undergoing deformations at interactive speeds. We validate our method by introducing a 3D hydrogel kidney phantom embedded with exophytic and endophytic renal tumors. It closely mimics real tissue mechanics to simulate kidney deformation during in vivo surgery, providing excellent contrast and clear delineation of tumor margins to enable automatic threshold-based segmentation. Our findings indicate that the proposed method can localize tumors in moderately deforming kidneys with a margin of 6 mm to 10 mm, while providing essential volumetric 3D information at over 60 Hz. This capability directly enables downstream tasks such as robotic resection.
Surgery for brain cancer is a major problem in neurosurgery. The diffuse infiltration of these tumors into the surrounding normal brain makes their accurate identification by the naked eye difficult. Since surgery is a common treatment for brain cancer, an accurate radical resection of the tumor leads to improved survival rates for patients. However, the identification of the tumor boundaries during surgery is challenging. Hyperspectral imaging is a noncontact, non-ionizing, and non-invasive technique suitable for medical diagnosis. This study presents the development of a novel classification method that takes into account the spatial and spectral characteristics of the hyperspectral images to help neurosurgeons accurately determine the tumor boundaries in surgical time during the resection, avoiding excessive excision of normal tissue or unintentionally leaving residual tumor. The algorithm proposed in this study consists of a hybrid framework that combines both supervised and unsupervised machine learning methods. To evaluate the proposed approach, five in vivo hyperspectral images of the brain surface affected by glioblastoma, from five different patients, have been used. The final classification maps obtained have been analyzed and validated by specialists. These preliminary results are promising, obtaining an accurate delineation of the tumor area.
Tumors can manifest in various forms and in different areas of the human body. Brain tumors are particularly hard to diagnose and treat because of the complexity of the organ in which they develop. Detecting them in time can lower the chances of death and facilitate the therapy process for patients. The use of Artificial Intelligence (AI) and, more specifically, deep learning, has the potential to significantly reduce costs in terms of time and resources for the discovery and identification of tumors from images obtained through imaging techniques. This research work aims to assess the performance of a multimodal model for the classification of Magnetic Resonance Imaging (MRI) scans processed as grayscale images. The results are promising, and in line with similar works, as the model reaches an accuracy of around 98%. We also highlight the need for explainability and transparency to ensure human control and safety.
Rectal cancer is one of the most common diseases and a major cause of mortality. T-staging is important for deciding rectal cancer treatment plans. However, evaluating the T-stage from preoperative MRI images requires a high level of radiologist skill and experience. Therefore, the aim of this study is to segment the mesorectum, rectum, and rectal cancer region so that the system can predict T-stage from the segmentation results. Generally, the shortage of large and diverse datasets and of high-quality annotations is known to be a bottleneck in the development of computer-aided diagnostics. For rectal cancer, advanced cancer images are very rare, and per-pixel annotation requires considerable radiologist skill and time. Therefore, it is not feasible to collect comprehensive disease patterns in a training dataset. To tackle this, we propose two approaches: image synthesis-based late-stage cancer augmentation and semi-supervised learning designed for T-stage prediction. In the image synthesis data augmentation approach, we generated advanced cancer images from labels. The real cancer labels were deformed to resemble advanced cancer labels by artificial cancer progression simulation. Next, we introduce a T-staging loss, which enables us to train segmentation models from per-image T-stage labels. The loss keeps the inclusion/invasion relationships between the rectum and the cancer region consistent with the ground truth T-stage. The verification tests show that the proposed method obtains the best sensitivity (0.76) and specificity (0.80) in distinguishing between stages over T3 and under T2. In the ablation studies, our semi-supervised learning approach with the T-staging loss improved specificity by 0.13. Adding the image synthesis-based data augmentation improved the Dice score of the invasion cancer area by 0.08 from baseline.
Yu Ando, Nora Jee-Young Park, Gun Oh Chong
et al.
Screening is critical for prevention and early detection of cervical cancer, but it is time-consuming and laborious. Supervised deep convolutional neural networks have been developed to automate pap smear screening, and the results are promising. However, interest in using only normal samples to train deep neural networks has increased owing to class imbalance problems and high labeling costs, both of which are prevalent in healthcare. In this study, we introduce a method to learn explainable deep cervical cell representations for pap smear cytology images based on one-class classification using variational autoencoders. Findings demonstrate that a score can be calculated for cell abnormality without training models with abnormal samples, and that abnormality can be localized to interpret our results using a novel metric based on the absolute difference in cross entropy in agglomerative clustering. The best model that discriminates squamous cell carcinoma (SCC) from normals gives 0.908 ± 0.003 area under the receiver operating characteristic curve (AUC), and the one that discriminates high-grade squamous intraepithelial lesion (HSIL) gives 0.920 ± 0.002 AUC. Compared to other clustering methods, our method enhances the V-measure and yields higher homogeneity scores, which more effectively isolate different abnormality regions, aiding in the interpretation of our results. Evaluation using an in-house dataset and an additional open dataset shows that our model can discriminate abnormality without the need for additional training of deep models.
Purpose of the study: Analysis of available data on genotype-phenotype correlations and atypical forms of neurofibromatosis type 1. Material and methods: We searched for relevant sources in the Scopus, Web of Science, and PubMed databases, including publications from May 1993 to October 2021. Of the 318 studies we identified, 59 were used to write a systematic review. Results: We found studies describing atypical forms of neurofibromatosis type 1 with an attenuated course without manifestation of a tumor syndrome, which are caused by specific mutations in the NF1 gene (causing substitutions of amino acids in neurofibromin: p.Arg1038, p.Met1149, p.Arg1809, or deletion of amino acids: p.Met990del, p.Met992del). NF1 patients with microdeletions are characterized by more severe disease symptoms (more often facial dysmorphism, skeletal and cardiovascular abnormalities, learning difficulties, and symptomatic spinal neurofibromas). Mutations of splicing sites and extended deletions of the NF1 gene are associated with early manifestation of tumors, and mutations at the 5’-end of the gene, causing a shortening of the protein product, are associated with optic nerve gliomas. The mutation c.3721C>T (p.R1241*) correlated with structural brain damage, and c.6855C>A (p.Y2285*) with endocrine disorders. Manifestations of NF1 similar to lipomatosis and Jaffe-Campanacci syndrome, not associated with a specific type of mutation, are described. Conclusion: In spite of pronounced clinical variability of the disease, even among members of the same family, several studies have described genotype-phenotype correlations. Therefore, the role of modifier genes and epigenetic factors in the pathogenesis of NF1 is assumed, since the neurofibromin protein has a complex structure with several functional domains. It has been shown that the severity of the tumor syndrome is influenced by the methylation characteristics of the NF1 gene and adjacent areas. In addition, the NF1 gene is associated with a variety of microRNAs. Therefore, targeted therapy aimed at specific non-coding RNAs to restore normal expression of the NF1 gene can become a promising treatment for NF1.
Neoplasms. Tumors. Oncology. Including cancer and carcinogens
Muthana Al Abo, Larisa Gearhart-Serna, Steven Van Laere
et al.
Aggressive breast cancer variants, like triple negative and inflammatory breast cancer, contribute to disparities in survival and clinical outcomes among African American (AA) patients compared to White (W) patients. We previously identified the dominant role of anti-apoptotic protein XIAP in regulating tumor cell adaptive stress response (ASR) that promotes a hyperproliferative, drug resistant phenotype. Using The Cancer Genome Atlas (TCGA), we identified 46–88 ASR genes that are differentially expressed (2-fold-change and adjusted p-value < 0.05) depending on PAM50 breast cancer subtype. On average, 20% of all 226 ASR genes exhibited race-related differential expression. These genes were functionally relevant in cell cycle, DNA damage response, signal transduction, and regulation of cell death-related processes. Moreover, 23% of the differentially expressed ASR genes were associated with AA and/or W breast cancer patient survival. These identified genes represent potential therapeutic targets to improve breast cancer outcomes and mitigate associated health disparities.
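The differential-expression criterion stated above (2-fold change and adjusted p-value < 0.05) can be sketched as a simple filter. The assumptions here are that the fold change is supplied as a positive ratio between the two groups, that a change of exactly 2-fold counts as passing, and that the p-values have already been adjusted upstream; the gene labels and numbers are made up for illustration and are not results from the study.

```python
import math

def is_differentially_expressed(fold_change, adj_p, min_fold=2.0, alpha=0.05):
    """Return True when |log2(fold_change)| >= log2(min_fold) and adj_p < alpha.

    fold_change is a positive expression ratio (group A / group B), so both
    up-regulation (>= 2) and down-regulation (<= 0.5) pass the fold filter.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be a positive ratio")
    return abs(math.log2(fold_change)) >= math.log2(min_fold) and adj_p < alpha

# Hypothetical (fold change, adjusted p-value) pairs.
toy_genes = {
    "GENE_1": (2.5, 0.01),   # up-regulated and significant
    "GENE_2": (0.4, 0.03),   # down-regulated and significant
    "GENE_3": (1.5, 0.001),  # significant but below the fold threshold
    "GENE_4": (3.0, 0.20),   # large change but not significant
}
selected = sorted(
    gene for gene, (fc, p) in toy_genes.items()
    if is_differentially_expressed(fc, p)
)
```

Working on the log2 scale makes the threshold symmetric, so halving and doubling of expression are treated alike.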
Neoplasms. Tumors. Oncology. Including cancer and carcinogens
Mariano Provencio, Manuel Cobo, Delvys Rodriguez-Abreu
et al.
Background: The survival of patients with lung cancer has increased substantially in the last decade, by about 15%. This increase is mainly due to the targeted therapies available for advanced stages and the emergence of immunotherapy. This work aims to study the situation of biomarker testing in Spain.
Patients and methods: The Thoracic Tumours Registry (TTR) is an observational, prospective, registry-based study that included patients diagnosed with lung cancer and other thoracic tumours from September 2016 to 2020. This TTR study was sponsored by the Spanish Lung Cancer Group (GECP) Foundation, an independent, scientific, multidisciplinary oncology society that coordinates more than 550 experts and 182 hospitals across the Spanish territory.
Results: A total of 9,239 patients diagnosed with stage IV non-small cell lung cancer (NSCLC) between 2016 and 2020 were analysed. Of these, 7,467 (80.8%) were non-squamous and 1,772 (19.2%) were squamous. Tumour marker testing was performed in 85.0% of patients with non-squamous tumours vs 56.3% of those with squamous tumours (p-value < 0.001). In non-squamous histology, the global testing rates for EGFR, ALK, and ROS1 were 78.9%, 64.7%, and 35.6%, respectively. PDL1 was determined in 46.9% of patients over the same period, although in the last 3 years the rate exceeds 85%. All determinations have increased significantly in the last few years, and close to 10% of molecular determinations concern targets that do not yet have targeted drug approval but will have it in the near future. A total of 4,115 cases (44.5%) had a positive result for either EGFR, ALK, KRAS, BRAF, ROS1, or high PDL1.
Conclusions: Despite the lack of a national project and standard protocol in Spain that regulates the determination of biomarkers, the situation is similar to that in other European countries. Given the growing number of different determinations and their high positivity, national strategies are urgently needed to implement next-generation sequencing (NGS) in an integrated and cost-effective way in lung cancer.
Neoplasms. Tumors. Oncology. Including cancer and carcinogens
Neoplasms (NPs) and neurological diseases and disorders (NDDs) are amongst the major classes of diseases underlying the deaths of a disproportionate number of people worldwide. To determine whether distinctive features exist in the local wiring patterns of protein interactions emerging at the onset of a disease belonging to either of these two classes, we examined 112 and 175 protein interaction networks belonging to NPs and NDDs, respectively. Orbit usage profiles (OUPs) for each of these networks were enumerated by investigating the networks' local topology. 56 non-redundant OUPs (nrOUPs) were derived and used as network features for classification between these two disease classes. Four machine learning classifiers, namely k-nearest neighbour (KNN), support vector machine (SVM), deep neural network (DNN), and random forest (RF), were trained on these data. DNN obtained the greatest average AUPRC (0.988) among these classifiers. DNNs developed on node2vec and the proposed nrOUP embeddings were compared using 5-fold cross-validation on the basis of the average values of six performance measures, viz., AUPRC, Accuracy, Sensitivity, Specificity, Precision, and MCC. It was found that the nrOUP-based classifier performed better on all six performance measures.
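Five of the six performance measures named above (all except AUPRC, which needs ranked prediction scores) can be computed directly from a binary confusion matrix. The sketch below is a minimal illustration under that assumption; the function name and the toy counts are hypothetical and do not come from the study.

```python
import math

def confusion_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, precision, and MCC from raw counts.

    Ratios with a zero denominator are reported as 0.0 by convention here.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "precision": ratio(tp, tp + fp),     # positive predictive value
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }

# Toy counts (assumptions): a perfect classifier and a chance-level one.
perfect = confusion_metrics(tp=5, fp=0, tn=5, fn=0)
chance = confusion_metrics(tp=5, fp=5, tn=5, fn=5)
```

In a 5-fold cross-validation setting, such a function would be applied to each held-out fold and the resulting values averaged, which is the kind of summary the comparison above reports.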