Results for "data science"

Showing 20 of ~44699232 results · from DOAJ, CrossRef, Semantic Scholar

S2 Open Access 2023
Machine Learning Methods for Small Data Challenges in Molecular Science.

Bozheng Dou, Zailiang Zhu, E. Merkurjev et al.

Small data are often used in scientific and engineering research due to the presence of various constraints, such as time, cost, ethics, privacy, security, and technical limitations in data acquisition. However, because big data have been the focus for the past decade, small data and their challenges have received little attention, even though they are technically more severe in machine learning (ML) and deep learning (DL) studies. Overall, the small data challenge is often compounded by issues such as data diversity, imputation, noise, imbalance, and high-dimensionality. Fortunately, the current big data era is characterized by technological breakthroughs in ML, DL, and artificial intelligence (AI), which enable data-driven scientific discovery, and many advanced ML and DL technologies developed for big data have inadvertently provided solutions for small data problems. As a result, significant progress has been made in ML and DL for small data challenges in the past decade. In this review, we summarize and analyze several emerging potential solutions to small data challenges in molecular science, including chemical and biological sciences. We review both basic machine learning algorithms, such as linear regression, logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), kernel learning (KL), random forest (RF), and gradient boosting trees (GBT), and more advanced techniques, including artificial neural network (ANN), convolutional neural network (CNN), U-Net, graph neural network (GNN), Generative Adversarial Network (GAN), long short-term memory (LSTM), autoencoder, transformer, transfer learning, active learning, graph-based semi-supervised learning, combining deep learning with traditional machine learning, and physical model-based data augmentation. We also briefly discuss the latest advances in these methods. Finally, we conclude the survey with a discussion of promising trends in small data challenges in molecular science.

360 citations en Medicine
DOAJ Open Access 2025
Comparison of the Efficacy of Laparoscopic Extraperitoneal versus Transperitoneal Para-Aortic Lymphadenectomy in the Treatment of Gynecological Malignancies: A Meta-Analysis

Mengjie Li, Hong Xue, Jing Sun et al.

Background: To evaluate the efficacy of laparoscopic para-aortic lymphadenectomy in the treatment of gynecologic malignancies through a literature review comparing the extraperitoneal and transperitoneal approaches. Methods: A comprehensive computerized search of PubMed, Embase, the Cochrane Library, Medline, Web of Science, and other relevant databases was conducted, covering the period from January 2010 to January 2025, to collect studies that compared the transperitoneal and extraperitoneal approaches to laparoscopic para-aortic lymphadenectomy in the treatment of gynecologic malignancies. Relevant data were extracted and analyzed using the Review Manager (RevMan) version 5.4.1 statistical software. Outcome indexes included operation time, intraoperative blood loss, number of para-aortic lymph nodes dissected, hospitalization days, and incidence of surgical complications. Results: A total of 525 manuscripts were retrieved, of which 8 were included. Our analysis showed no statistically significant differences between the extraperitoneal and transperitoneal groups in terms of operative time, intraoperative bleeding, and hospitalization days. However, the complication rate was significantly lower in the extraperitoneal group than in the transperitoneal group. Additionally, the number of para-aortic lymph nodes (PAL) retrieved was significantly higher in the extraperitoneal group compared to the transperitoneal group [mean difference (MD) = 0.43, 95% confidence interval (CI) (0.13 to 0.72), p = 0.004]. Conclusion: Laparoscopic para-aortic lymphadenectomy for gynecologic malignancies offers several advantages when performed via the extraperitoneal route. This approach reduces surgical trauma, shortens hospital stay, lowers the rate of complication, and increases the number of lymph nodes that can be resected compared to the transperitoneal route.
Registration: The study has been registered on https://www.crd.york.ac.uk/prospero/ (registration number: CRD420251033897; registration link: https://www.crd.york.ac.uk/PROSPERO/view/CRD420251033897).

Gynecology and obstetrics
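The mean difference quoted in the lymphadenectomy meta-analysis above (MD = 0.43, 95% CI 0.13 to 0.72, p = 0.004) can be sanity-checked with a little arithmetic. The sketch below is not the authors' RevMan procedure. It assumes a normal approximation and a symmetric 95% interval, recovers the standard error from the interval width, and derives the implied two-sided p-value. The function name `p_from_ci` is illustrative only.

```python
import math

def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Two-sided p-value implied by a symmetric normal-theory confidence interval."""
    se = (upper - lower) / (2 * z_crit)   # standard error recovered from the CI width
    z = estimate / se                      # Wald z statistic against a null of 0
    p = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return se, z, p

# Values quoted in the abstract: MD = 0.43, 95% CI 0.13 to 0.72
se, z, p = p_from_ci(0.43, 0.13, 0.72)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions the implied p-value lands close to the reported 0.004, which suggests the interval and the p-value are mutually consistent.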
DOAJ Open Access 2025
Survival analysis of Kasturi tobacco plants in Jember district using the stratified cox and extended cox models.

Angga Iryanto Pratama, Dwi Rantini, Ratih Ardiati Ningrum et al.

In 2022, tobacco was the top export commodity, generating $106.3 million in sales. East Java Province contributes significantly to Indonesia's tobacco production, with an annual output of 188.6 thousand tons. Jember Regency, in East Java Province, is a Kasturi tobacco cultivation area. The productivity of Kasturi tobacco continues to decrease due to several factors that can affect the survival of the tobacco. This study aims to analyze the survival of Kasturi tobacco by creating stratified Cox and extended Cox models in order to handle factors that do not fulfill the proportional hazards assumption. Based on the results of the analysis, the stratified Cox model is the best model, with AIC and BIC values of 798.108 and 805.5748, respectively, while the extended Cox model has AIC and BIC values of 840.2186 and 850.1732, respectively. The variables that significantly affect the survival of Kasturi tobacco are ZA fertilizer concentration and pesticide concentration. ZA fertilizer must be applied in appropriate amounts, because an excess can poison the tobacco plants; likewise, excessive pesticide damages the leaves. A policy of appropriate dosing can increase the productivity of Kasturi tobacco, allowing Jember Regency to contribute greater export capacity.
• This paper aims to determine the factors that affect the survival of Kasturi tobacco plants in Jember
• Among the extended Cox models, the best one uses the Heaviside function, with AIC and BIC values of 840.2186 and 850.1732, respectively
• The stratified Cox model, with AIC and BIC values of 798.108 and 805.5748, respectively, is better than the extended Cox model
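The model choice described in the Kasturi tobacco abstract follows the usual rule that a lower AIC or BIC indicates the preferred model. A minimal sketch of that comparison, using only the four values quoted in the abstract, might look as follows; the dictionary layout and the helper `best_model` are illustrative and not part of the study.

```python
# AIC and BIC values as quoted in the abstract
models = {
    "stratified Cox": {"AIC": 798.108, "BIC": 805.5748},
    "extended Cox": {"AIC": 840.2186, "BIC": 850.1732},
}

def best_model(scores, criterion):
    """Return the model name with the smallest value of the given criterion."""
    return min(scores, key=lambda name: scores[name][criterion])

for criterion in ("AIC", "BIC"):
    winner = best_model(models, criterion)
    others = [m for m in models if m != winner]
    gap = models[others[0]][criterion] - models[winner][criterion]
    print(f"{criterion}: {winner} preferred (lower by {gap:.4f})")
```

Both criteria point to the same model here, which matches the abstract's conclusion.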

DOAJ Open Access 2025
A Quantitative Legal Support System for Transnational Autonomous Vehicle Design

Zhe Yu, Yiwei Lu, Hao Zhan et al.

One of the key expectations of AI product manufacturers for their products is the ability to scale to larger markets, especially across legal systems, with fewer prototypes and lower adaptation costs. This paper focuses on the increasingly dynamic legal compliance challenges faced by designers of AI products in achieving this goal. Based on non-monotonic reasoning, we design an automated reasoning tool to help them better understand the legal implications of their designs in a transnational context and, ultimately, adjust the design of AI products more flexibly. This tool supports the quantitative representation of the strength of legal significance to help designers better understand the reasons for their decisions from their own perspective. To illustrate this functionality, a case study on traffic regulations across the UK, France, and Japan demonstrates the system’s ability to resolve legal conflicts—such as driving-side mandates and speed radar detector prohibitions—through quantitative evaluation.

Motor vehicles. Aeronautics. Astronautics
CrossRef Open Access 2025
Comparative Analysis of Data Visualization Techniques for Rainfall Data

Wan Hussain Wan Ishak, Fadhilah Yamin, Siti Sarah Maidin et al.

Rainfall data is essential for applications such as climate monitoring, agricultural planning, flood forecasting, and water resource management. However, the interpretation of this data is often hindered by its high volume, variability, and multi-scale temporal nature. Effective visualization is critical not only for summarizing complex datasets but also for uncovering patterns, detecting anomalies, and facilitating informed decision-making. Despite the availability of numerous visualization techniques, selecting the most suitable method for rainfall data, especially across varying temporal resolutions, is a challenging task. This study presents a comparative analysis of widely used data visualization techniques in the context of rainfall data. The methodology was structured into three phases: understanding the nature of rainfall data, reviewing relevant visualization techniques, and conducting a comparative content analysis. A SWOT (Strengths, Weaknesses, Opportunities, and Threats) evaluation was used to assess each technique’s analytical potential, while a temporal suitability comparison was performed across five time granularities: yearly, monthly, weekly, daily, and hourly. Findings show that no single technique is universally effective. Instead, each method demonstrates specific strengths and limitations depending on the temporal scale and analytical objective. Line charts and bar charts are well-suited for lower-frequency data, while heat maps and scatter plots are more effective for high-resolution, time-sensitive patterns. Box plots and histograms provide valuable insights into data distribution and variability, whereas map-based visualizations excel in spatial analysis but require enhancements for temporal exploration. The study concludes that visualization effectiveness depends on aligning method selection with data characteristics and analytical goals. A thoughtful combination of techniques is often necessary to achieve clarity, reduce misinterpretation, and enhance decision support in rainfall data analysis.

DOAJ Open Access 2024
Socioeconomic disadvantage and long-term survival duration in out-of-hospital cardiac arrest patients: A population-based cohort study

Dawn Yi Xin Lee, Chun En Yau, Maeve Pin Pin Pek et al.

Background: Socioeconomic status (SES) is a well-established determinant of cardiovascular health. However, the relationship between SES and clinical outcomes in long-term out-of-hospital cardiac arrest (OHCA) is less well-understood. The Singapore Housing Index (SHI) is a validated building-level SES indicator. We investigated whether SES as measured by SHI is associated with long-term OHCA survival in Singapore. Methods: We conducted an open cohort study with linked data from the Singapore Pan-Asian Resuscitation Outcomes Study (PAROS), and the Singapore Registry of Births and Deaths (SRBD) from 2010 to 2020. We fitted generalized structural equation models, calculating hazard ratios (HRs) using a Weibull model. We constructed Kaplan–Meier survival curves and calculated the predicted marginal probability for each SHI category. Results: We included 659 cases. In both univariable and multivariable analyses, SHI did not have a significant association with survival. Indirect pathways of SHI mediated through covariates such as Emergency Medical Services (EMS) response time (HR of low-medium, high-medium and high SHI when compared to low SHI: 0.98 (0.88–1.10), 1.01 (0.93–1.11), 1.02 (0.93–1.12) respectively), and age of arrest (HR of low-medium, high-medium and high SHI when compared to low SHI: 1.02 (0.75–1.38), 1.08 (0.84–1.38), 1.18 (0.91–1.54) respectively) had no significant association with OHCA survival. There was no clear trend in the predicted marginal probability of survival among the different SHI categories. Conclusions: We did not find a significant association between SES and OHCA survival outcomes in residential areas in Singapore. Among other reasons, this could be due to affordable healthcare across different socioeconomic classes.

Specialties of internal medicine
DOAJ Open Access 2024
Impact of metformin on melanoma: a meta-analysis and systematic review

Hua Feng, Shuxian Shang, Kun Chen et al.

Background: There is evidence of a modest reduction in skin cancer risk among metformin users. However, no studies have further examined the effects of metformin on melanoma survival and safety outcomes. This study aimed to quantitatively summarize any influence of metformin on the overall survival (OS) and immune-related adverse effects (irAEs) in melanoma patients. Methods: Selection criteria: The inclusion criteria were designed based on the PICOS principles. Information sources: PubMed, EMBASE, Cochrane Library, and Web of Science were searched for relevant literature published from the inception of these databases until November 2023 using ‘Melanoma’ and ‘Metformin’ as keywords. Survival outcomes were OS, progression-free survival (PFS), recurrence-free survival (RFS), and mortality; the safety outcome was irAEs. Risk of bias and data synthesis: The Cochrane tool for assessing the risk of bias in randomized trial 2 (RoB2) and methodological index for non-randomized studies (MINORS) were selected to assess the risk of bias. The Cochrane Q and I² statistics based on Stata 15.1 SE were used to test the heterogeneity among all studies. Funnel plot, Egger regression, and Begg tests were used to evaluate publication bias. The leave-one-out method was selected as the sensitivity analysis tool. Results: A total of 12 studies were included, involving 111,036 melanoma patients. The pooled HR for OS was 0.64 (95% CI [0.42, 1.00], p = 0.004, I² = 73.7%), HR for PFS was 0.89 (95% CI [0.70, 1.12], p = 0.163, I² = 41.4%), HR for RFS was 0.62 (95% CI [0.26, 1.48], p = 0.085, I² = 66.3%), and HR for mortality was 0.53 (95% CI [0.46, 0.63], p = 0.775, I² = 0.0%). There was no significant difference in irAEs incidence (OR = 1.01; 95% CI [0.42, 2.41]; p = 0.642) between metformin and no metformin groups. Discussion: The improvement in overall survival of melanoma patients with metformin may indirectly result from its diverse biological targets and beneficial effects on multiple systemic diseases. While we could not demonstrate a specific improvement in the survival of melanoma patients, the combined benefits and safety of metformin for patients taking the drug are worthy of recognition. Systematic review registration: https://www.crd.york.ac.uk/PROSPERO/, identifier CRD42024518182.

Neoplasms. Tumors. Oncology. Including cancer and carcinogens
DOAJ Open Access 2024
Comparing terminology mappings to ICD-10 coded data in Discharge Abstract Database (DAD) in Alberta, Canada

Namneet Sandhu, Bing Li, Danielle A Southern et al.

Introduction: Coding has become burdensome to healthcare systems due to patient complexity and resource requirements. In Alberta, Intelligent Medical Objects (IMO), an interface terminology mapping product, is integrated within the new province-wide Clinical Information System, named Connect Care (CC), to support documentation by clinicians and map clinical terminologies to ICD-10. This study evaluates the comparability of terminology-mapped ICD-10 codes with the ICD-10 codes in DAD. Approach: We conducted a retrospective analysis by linking acute care hospital DAD with CC between April 2021 and December 2023. The primary outcome was the level of agreement between the ‘hospital problem list’ of CC and DAD for ICD-10-CA codes at a 3-digit level. The number of diagnoses and rate of unspecified codes were also compared. The outcome measures were stratified by physician specialty, hospital type and location, and length-of-stay (LOS). Results: A total of 498,834 unique hospital records were linked. The average level of agreement between CC and DAD at the 3-digit level of ICD-10-CA code was 43.4%. The average number of diagnoses captured in CC (3.91) was slightly lower than in DAD (4.06), and the average rate of unspecified codes was higher in CC (26.4%) compared to DAD (23.0%). The level of agreement varied by specialty and length-of-stay, with specialties with more complex patients and longer lengths-of-stay having the lowest agreement (43% for generalists and internal medicine and 33% for LOS >3 months). Conclusion: The level of agreement between CC and DAD for ICD-10 data was identified as low, indicating significant disparities between terminology mappings and the coding process.

Demography. Population. Vital events
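The agreement measure in the Alberta study above compares codes at the 3-digit ICD-10-CA level, which is taken here to mean the leading three-character category. The abstract does not define the exact formula, so the sketch below assumes one plausible definition: for each linked record, the share of DAD categories that also appear in the CC problem list. The function names and the two sample records are hypothetical and are not study data.

```python
def three_char(codes):
    """Reduce ICD-10-CA codes to their 3-character category (e.g. 'I21.4' -> 'I21')."""
    return {code.replace(".", "").upper()[:3] for code in codes}

def record_agreement(cc_codes, dad_codes):
    """Share of DAD categories also present in the CC problem list (assumed metric)."""
    cc, dad = three_char(cc_codes), three_char(dad_codes)
    if not dad:
        return None  # nothing to compare against
    return len(cc & dad) / len(dad)

# Hypothetical linked records, for illustration only (not study data)
records = [
    {"cc": ["I21.4", "E11.9"], "dad": ["I21.0", "E11.2", "N18.3"]},
    {"cc": ["J18.9"], "dad": ["J44.1"]},
]
scores = [record_agreement(r["cc"], r["dad"]) for r in records]
mean_agreement = sum(scores) / len(scores)
print(f"mean agreement: {mean_agreement:.1%}")
```

Truncating before comparing means that codes differing only in their subcategory still count as agreeing, which is why a 3-digit comparison is more lenient than a full-code match.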
DOAJ Open Access 2024
Ways of Development of the National Electronic Scientific Information System as a Tool for Implementing the State Open Science Policy in Ukraine

A. H. ZHARINOVA, S. S. ZHARINOV, Y. V. RYBALKO

Objective. Since 2020, Ukraine has been developing the National Electronic Scientific Information System (URIS). The creation and development of URIS were initially guided by a specific Concept, whose implementation timeline has now concluded. However, the Cabinet of Ministers of Ukraine mandated that work on URIS should continue. Consequently, there is a need to develop a new concept and identify the future directions for this scientific information system. Current research has emphasized the necessity of implementing the open science paradigm and highlighted the role of Current Research Information Systems (CRIS) in this process. There is a specific need to establish the development directions for URIS as a unique type of such system. The aim is to explore the prospective directions for the development of the National Electronic Scientific Information System "URIS," detailing the pathways for its growth and integrating new functions to fully implement the principles of open science in Ukraine. Methods. The study employed methods of theoretical generalization of normative and analytical data, as well as statistical and comparative analysis of the obtained scientific information. Results. The study identified seven directions with specific implementation paths: developing new functional modules for URIS, ensuring the comprehensive inclusion of priority information resources within URIS, providing Ukrainian scientists, research institutions, and higher education institutions with a digital tool, enabling ongoing communication with the scientific community and business representatives, identifying shortcomings in existing legal acts, creating conditions to overcome the dispersion of financial resources, accounting for losses in Ukrainian research infrastructure due to Russian military aggression. The URIS serves as a multifunctional platform for collecting, processing, and disseminating data related to scientific activities in Ukraine. 
Developed by the State Scientific and Technical Library of Ukraine, URIS aims to integrate and present aggregated data from various research institutions and universities, thereby enhancing the visibility of Ukrainian research on a global scale. This paper discusses the importance of URIS in supporting academic and research libraries, emphasizing the role of the library's experts in implementing the project. Conclusions. The development and improvement of URIS should lead to a significant reduction in the use of paper for various documents, offering interested individuals and organizations a wide range of digital tools for quick and continuous access to research data, information, and services related to the field of science. The full implementation of this project will enhance the visibility of Ukrainian scientists and research infrastructures both within Ukraine and for interested researchers and institutions in Europe and globally. Scope of Application of Research Results. The development of URIS will support the Ministry of Education and Science of Ukraine in fulfilling the national plan for open science and accelerating Ukraine's integration into the European Research Area. The successful implementation of URIS demonstrates the vital role of academic and research libraries in the digital age. The expertise of the State Scientific and Technical Library of Ukraine has been instrumental in advancing the goals of URIS, which aims to enhance the quality and visibility of Ukrainian research both nationally and internationally.

Bibliography. Library science. Information resources
S2 Open Access 2015
Towards Data Science

Yangyong Zhu, Yun Xiong

Currently, a huge amount of data is being rapidly generated in cyberspace. Datanature (all data in cyberspace) is forming due to a data explosion. Exploring the patterns and rules in datanature is necessary but difficult. A new discipline called Data Science is emerging. It provides a type of novel research method (a data-intensive method) for natural and social sciences and goes beyond computer science in researching data. This paper presents the challenges posed by data and discusses what differentiates data science from the established sciences, data technologies, and big data. Our goal is to encourage data-related researchers to transfer their focus towards this new science.

294 citations en Computer Science

Page 14 of 2234962