Results for "data science"

Showing 20 of ~44,761,596 results · from CrossRef, DOAJ, Semantic Scholar

S2 Open Access 2016
The BIG Data Center: from deposition to integration to translation

Wenming Zhao, Jingfa Xiao

Biological data are generated at unprecedentedly exponential rates, posing considerable challenges in big data deposition, integration and translation. The BIG Data Center, established at Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, provides a suite of database resources, including (i) Genome Sequence Archive, a data repository specialized for archiving raw sequence reads, (ii) Gene Expression Nebulas, a data portal of gene expression profiles based entirely on RNA-Seq data, (iii) Genome Variation Map, a comprehensive collection of genome variations for featured species, (iv) Genome Warehouse, a centralized resource housing genome-scale data with particular focus on economically important animals and plants, (v) Methylation Bank, an integrated database of whole-genome single-base resolution methylomes and (vi) Science Wikis, a central access point for biological wikis developed for community annotations. The BIG Data Center is dedicated to constructing and maintaining biological databases through big data integration and value-added curation, conducting basic research to translate big data into big knowledge and providing freely open access to a variety of data resources in support of worldwide research activities in both academia and industry. All of these resources are publicly available and can be found at http://bigd.big.ac.cn.

444 citations en Medicine, Computer Science
S2 Open Access 2016
Where are human subjects in Big Data research? The emerging ethics divide

Jacob Metcalf, K. Crawford

There are growing discontinuities between the research practices of data science and established tools of research ethics regulation. Some of the core commitments of existing research ethics regulations, such as the distinction between research and practice, cannot be cleanly exported from biomedical research to data science research. Such discontinuities have led some data science practitioners and researchers to move toward rejecting ethics regulations outright. These shifts occur at the same time as a proposal for major revisions to the Common Rule—the primary regulation governing human-subjects research in the USA—is under consideration for the first time in decades. We contextualize these revisions in long-running complaints about regulation of social science research and argue data science should be understood as continuous with social sciences in this regard. The proposed regulations are more flexible and scalable to the methods of non-biomedical research, yet problematically largely exclude data science methods from human-subjects regulation, particularly uses of public datasets. The ethical frameworks for Big Data research are highly contested and in flux, and the potential harms of data science research are unpredictable. We examine several contentious cases of research harms in data science, including the 2014 Facebook emotional contagion study and the 2016 use of geographical data techniques to identify the pseudonymous artist Banksy. To address disputes about application of human-subjects research ethics in data science, critical data studies should offer a historically nuanced theory of “data subjectivity” responsive to the epistemic methods, harms and benefits of data science and commerce.

349 citations en Sociology, Computer Science
S2 Open Access 2017
Spatio-Temporal Data Mining

G. Atluri, A. Karpatne, Vipin Kumar

Large volumes of spatio-temporal data are increasingly collected and studied in diverse domains, including climate science, social sciences, neuroscience, epidemiology, transportation, mobile health, and Earth sciences. Spatio-temporal data differ from relational data, for which computational approaches have been developed in the data-mining community for multiple decades, in that both spatial and temporal attributes are available in addition to the actual measurements/attributes. The presence of these attributes introduces additional challenges that need to be dealt with. Approaches for mining spatio-temporal data have been studied for over a decade in the data-mining community. In this article, we present a broad survey of this relatively young field of spatio-temporal data mining. We discuss different types of spatio-temporal data and the relevant data-mining questions that arise in the context of analyzing each of these datasets. Based on the nature of the data-mining problem studied, we classify literature on spatio-temporal data mining into six major categories: clustering, predictive learning, change detection, frequent pattern mining, anomaly detection, and relationship mining. We discuss the various forms of spatio-temporal data-mining problems in each of these categories.

299 citations en Computer Science, Mathematics
S2 Open Access 2020
Dimensions: Bringing down barriers between scientometricians and data

C. Herzog, D. Hook, Stacy Konkiel

Until recently, comprehensive scientometrics data has been made available only in siloed, subscription-based tools that are inaccessible to researchers who lack institutional support and resources. As a result of limited data access, research evaluation practices have focused upon basic indicators that only take publications and their citation rates into account. This has blocked innovation on many fronts. Dimensions is a database that links and contextualizes different research information objects. It brings together data describing and linking awarded grants, clinical trials, patents, and policy documents, as well as altmetric information, alongside traditional publications and citations data. This article describes the approach that Digital Science is taking to support the scientometric community, together with the various Dimensions tools available to researchers who wish to use Dimensions data in their research at no cost.

194 citations en Computer Science, Political Science
S2 Open Access 2012
Ecoinformatics: supporting ecology as a data-intensive science.

W. Michener, Matthew B. Jones

Ecology is evolving rapidly and increasingly changing into a more open, accountable, interdisciplinary, collaborative and data-intensive science. Discovering, integrating and analyzing massive amounts of heterogeneous data are central to ecology as researchers address complex questions at scales from the gene to the biosphere. Ecoinformatics offers tools and approaches for managing ecological data and transforming the data into information and knowledge. Here, we review the state-of-the-art and recent advances in ecoinformatics that can benefit ecologists and environmental scientists as they tackle increasingly challenging questions that require voluminous amounts of data across disciplines and scales of space and time. We also highlight the challenges and opportunities that remain.

452 citations en Medicine, Computer Science
DOAJ Open Access 2025
An Examination of Concord Errors in the Academic Writings of Students at a Technical University in Ghana

Gifty Serwah Mensah, Ernest Kwesi Klu, Ndishunwani Vincent Demana et al.

This study explored the teachers’ challenges in using local history projects to develop learners’ historical skills. Despite the importance of local history projects in developing learners’ historical skills and knowledge, there is evidence that teachers are struggling to teach and administer them effectively. The study used critical pedagogy as a critical framework, along with an interpretive paradigm, to guide the qualitative approach to achieve the aim and objective of the study. The study purposefully sampled five participants from five different schools in the Motheo Education District, Free State Province. Semi-structured interviews were used as a data collection strategy. Thematic analysis was used to make sense of the data. The researchers’ findings show that social science teachers who participated in the study faced many challenges in teaching local history projects to develop learners’ historical skills. To mitigate these challenges, the Department of Education should develop strategies to organize workshops to empower teachers, specifically in teaching local history projects, using available resources. The paper also recommends a collaborative effort of relevant stakeholders to come together to assist the schools. The study contributes to research on social sciences education by helping teachers to be aware of challenges that impact the teaching of local history projects. It is anticipated that, with this knowledge, Social Science teachers will be able to minimise or circumvent these challenges.

DOAJ Open Access 2025
A novel approach for target deconvolution from phenotype-based screening using knowledge graph

Xiaohong Wang, Meifang Zhang, Jianliang Xu et al.

Deconvoluting drug targets is crucial in modern drug development, yet both traditional and artificial intelligence (AI)-driven methods face challenges in terms of completeness, accuracy, and efficiency. Identifying drug targets, especially within complex systems such as the p53 pathway, remains a formidable task. The regulation of this pathway by myriad stress signals and regulatory elements adds layers of complexity to the discovery of effective p53 pathway activators. Recent insights into p53 activation have led to two main screening strategies for p53 activators. The target-based approach focuses on p53 and its regulators (MDM2, MDMX, USP7, Sirt proteins), but requires separate systems for each target and may miss multi-target compounds. Phenotype-based screening can reveal new targets but involves a lengthy process to elucidate mechanisms and targets, hindering drug development. Knowledge graphs have emerged as powerful tools that offer strengths in link prediction and knowledge inference to address these issues. In this study, we constructed a protein-protein interaction knowledge graph (PPIKG) and pioneered an integrated drug target deconvolution system that combines AI with molecular docking techniques. Analysis based on the PPIKG narrowed down candidate proteins from 1088 to 35, significantly saving time and cost. Subsequent molecular docking led us to pinpoint USP7 as a direct target for the p53 pathway activator UNBS5162. Leveraging knowledge graphs and a multidisciplinary approach allows us to streamline the laborious and expensive process of reverse targeting drug discovery through phenotype screening. Our findings have the potential to revolutionize drug screening and open new avenues in pharmacological research, increasing the speed and efficiency of pursuing novel therapeutics. The code is available at https://github.com/Xiong-Jing/PPIKG.

Medicine, Science
DOAJ Open Access 2025
Risk prediction models for ovarian hyperstimulation syndrome: a systematic review and meta-analysis

Jinghui Liu, Fangli Liu, Wenqi Xu et al.

Background: Ovarian hyperstimulation syndrome (OHSS) is a serious complication of controlled ovarian stimulation (COS). The main clinical manifestation of OHSS is increased ovarian volume. OHSS can cause local and systemic tissue oedema, electrolyte disturbances, cardiorespiratory dysfunction, coagulation dysfunction, and other symptoms. These symptoms greatly affect patients’ quality of life. As infertility rates rise and assisted reproductive technology (ART) becomes more common, the risk of OHSS increases. Therefore, early identification of high-risk patients and timely intervention are crucial. Methods: The PubMed, Embase, Cochrane Library, Web of Science, CINAHL, China National Knowledge Internet (CNKI), Wanfang, China Science and Technology Journal Database (VIP), and China Biology Medicine (CBM) databases were systematically searched from inception to March 30, 2025. Two researchers independently screened the literature, extracted data, and evaluated the quality of included studies using the updated prediction model risk of bias assessment tool (PROBAST + AI). We conducted a meta-analysis of predictors from the developed models using Stata 15.0 software. Results: A total of 16 studies were included, comprising 29 OHSS risk prediction models. The area under the curve (AUC) ranged from 0.628 to 0.998, with 23 models demonstrating AUC > 0.700. Model calibration was performed in 10 studies, internal validation in 14 studies, and 2 studies conducted both internal and external validation. The PROBAST + AI assessment identified a high risk of bias across the included studies, primarily in the research design and statistical analysis domains. The most common predictors identified across the models included: antral follicle count (AFC), estrogen (E2) levels on the day of human chorionic gonadotrophin (hCG) injection, number of oocytes retrieved, polycystic ovary syndrome (PCOS), age, anti-mullerian hormone (AMH), gonadotropin (Gn) days, initial dose of Gn, and body mass index (BMI). Conclusions: Our findings indicate substantial variation in OHSS incidence. The results should be interpreted with caution due to the limitations of the current evidence. Current OHSS risk prediction models remain under development and require further refinement. Future efforts to build and improve these models should focus on key areas, including research design, sample size, handling of missing data, model calibration and validation, and detailed reporting. Trial registration: PROSPERO CRD420251025876.

Gynecology and obstetrics

Page 28 of 2,238,080