Results for "data science"

S2 Open Access 2014

Next Steps for Citizen Science

R. Bonney, Jennifer Shirk, T. Phillips et al.

1083 citations en Medicine, Geography

S2 Open Access 2014

Data Preprocessing in Data Mining

S. García, Julián Luengo, F. Herrera

1040 citations en Computer Science

S2 Open Access 2013

Next Generation Science Standards

P. Adams

1376 citations en Psychology, Engineering

S2 Open Access 2013

The inevitable application of big data to health care.

T. Murdoch, A. Detsky

1611 citations en Medicine

S2 Open Access 2013

The Electric and Magnetic Field Instrument Suite and Integrated Science (EMFISIS) on RBSP

C. Kletzing, W. Kurth, Mario H. Acuña et al.

The Electric and Magnetic Field Instrument and Integrated Science (EMFISIS) investigation on the NASA Radiation Belt Storm Probes (now named the Van Allen Probes) mission provides key wave and very low frequency magnetic field measurements to understand radiation belt acceleration, loss, and transport. The key science objectives and the contribution that EMFISIS makes to providing measurements as well as theory and modeling are described. The key components of the instrument suite, both electronics and sensors, including key functional parameters, calibration, and performance, demonstrate that EMFISIS provides the needed measurements for the science of the RBSP mission. The EMFISIS operational modes and data products, along with online availability and data tools, provide the radiation belt science community with one of the most complete sets of data ever collected.

1102 citations en Physics

S2 Open Access 2011

Statistics for High-Dimensional Data: Methods, Theory and Applications

Peter Bühlmann, S. Geer

1684 citations en Computer Science

S2 Open Access 2011

Global multi-resolution terrain elevation data 2010 (GMTED2010)

J. Danielson, D. Gesch

1284 citations en Geology

S2 Open Access 2010

Citizen Science as an Ecological Research Tool: Challenges and Benefits

J. Dickinson, B. Zuckerberg, David N. Bonter

1817 citations en Geography

S2 Open Access 2008

Reducing the Dimensionality of Data with Neural

Geoffrey E. Hinton

1808 citations en Computer Science

S2 Open Access 2007

Mars Reconnaissance Orbiter's High Resolution Imaging Science Experiment (HiRISE)

A. S. McEwen, N. Thomas, HiRISE Team

1625 citations en Geology

S2 Open Access 2001

Fuzzy-Set Social Science

Charles C. Ragin

1732 citations en Political Science, Computer Science

S2 Open Access 2005

Data streams: algorithms and applications

S. Muthukrishnan

1880 citations en Computer Science

S2 Open Access 2017

Taxonomic bias in biodiversity data and societal preferences

Julien Troudet, P. Grandcolas, A. Blin et al.

Studying and protecting each and every living species on Earth is a major challenge of the 21st century. Yet, most species remain unknown or unstudied, while others attract most of the public, scientific and government attention. Although known to be detrimental, this taxonomic bias continues to be pervasive in the scientific literature, but is still poorly studied and understood. Here, we used 626 million occurrences from the Global Biodiversity Information Facility (GBIF), the biggest biodiversity data portal, to characterize the taxonomic bias in biodiversity data. We also investigated how societal preferences and taxonomic research relate to biodiversity data gathering. For each species belonging to 24 taxonomic classes, we used the number of publications from Web of Science and the number of web pages from Bing searches to approximate research activity and societal preferences. Our results show that societal preferences, rather than research activity, strongly correlate with taxonomic bias, which leads us to assert that scientists should advertise less charismatic species and develop societal initiatives (e.g. citizen science) that specifically target neglected organisms. Ensuring that biodiversity is representatively sampled while this is still possible is an urgent prerequisite for achieving efficient conservation plans and a global understanding of our surrounding environment.

380 citations en Geography

S2 Open Access 2019

Predicting Materials Properties with Little Data Using Shotgun Transfer Learning

H. Yamada, Chang Liu, Stephen Wu et al.

There is a growing demand for the use of machine learning (ML) to derive fast-to-evaluate surrogate models of materials properties. In recent years, a broad array of materials property databases have emerged as part of a digital transformation of materials science. However, recent technological advances in ML are not fully exploited because of the insufficient volume and diversity of materials data. An ML framework called “transfer learning” has considerable potential to overcome the problem of limited amounts of materials data. Transfer learning relies on the concept that various property types, such as physical, chemical, electronic, thermodynamic, and mechanical properties, are physically interrelated. For a given target property to be predicted from a limited supply of training data, models of related proxy properties are pretrained using sufficient data; these models capture common features relevant to the target task. Repurposing of such machine-acquired features on the target task yields outstanding prediction performance even with exceedingly small data sets, as if highly experienced human experts can make rational inferences even for considerably less experienced tasks. In this study, to facilitate widespread use of transfer learning, we develop a pretrained model library called XenonPy.MDL. In this first release, the library comprises more than 140 000 pretrained models for various properties of small molecules, polymers, and inorganic crystalline materials. Along with these pretrained models, we describe some outstanding successes of transfer learning in different scenarios such as building models with only dozens of materials data, increasing the ability of extrapolative prediction through a strategic model transfer, and so on. Remarkably, transfer learning has autonomously identified rather nontrivial transferability across different properties transcending the different disciplines of materials science; for example, our analysis has revealed underlying bridges between small molecules and polymers and between organic and inorganic chemistry.

303 citations en Medicine

S2 Open Access 2013

If We Share Data, Will Anyone Use Them? Data Sharing and Reuse in the Long Tail of Science and Technology

J. Wallis, E. Rolando, C. Borgman

Research on practices to share and reuse data will inform the design of infrastructure to support data collection, management, and discovery in the long tail of science and technology. These are research domains in which data tend to be local in character, minimally structured, and minimally documented. We report on a ten-year study of the Center for Embedded Network Sensing (CENS), a National Science Foundation Science and Technology Center. We found that CENS researchers are willing to share their data, but few are asked to do so, and in only a few domain areas do their funders or journals require them to deposit data. Few repositories exist to accept data in CENS research areas. Data sharing tends to occur only through interpersonal exchanges. CENS researchers obtain data from repositories, and occasionally from registries and individuals, to provide context, calibration, or other forms of background for their studies. Neither CENS researchers nor those who request access to CENS data appear to use external data for primary research questions or for replication of studies. CENS researchers are willing to share data if they receive credit and retain first rights to publish their results. Practices of releasing, sharing, and reusing of data in CENS reaffirm the gift culture of scholarship, in which goods are bartered between trusted colleagues rather than treated as commodities.

450 citations en Business, Medicine

CrossRef Open Access 2026

Machine Learning for Data Science

en

S2 Open Access 2014

Understanding the paradigm shift to computational social science in the presence of big data

Ray M. Chang, R. Kauffman, YoungOk Kwon

The era of big data has created new opportunities for researchers to achieve high relevance and impact amid changes and transformations in how we study social science phenomena. With the emergence of new data collection technologies, advanced data mining and analytics support, there seems to be fundamental changes that are occurring with the research questions we can ask, and the research methods we can apply. The contexts include social networks and blogs, political discourse, corporate announcements, digital journalism, mobile telephony, home entertainment, online gaming, financial services, online shopping, social advertising, and social commerce. The changing costs of data collection and the new capabilities that researchers have to conduct research that leverages micro-level, meso-level and macro-level data suggest the possibility of a scientific paradigm shift toward computational social science. The new thinking related to empirical regularities analysis, experimental design, and longitudinal empirical research further suggests that these approaches can be tailored for rapid acquisition of big data sets. This will allow business analysts and researchers to achieve frequent, controlled and meaningful observations of real-world phenomena. We discuss how our philosophy of science should be changing in step with the times, and illustrate our perspective with comparisons between earlier and current research inquiry. We argue against the assertion that theory no longer matters and offer some new research directions.

367 citations en Computer Science

DOAJ Open Access 2025

Evaluating accuracy and reproducibility of large language model performance on critical care assessments in pharmacy education

Huibo Yang, Mengxuan Hu, Amoreena Most et al.

Background: Large language models (LLMs) have demonstrated impressive performance on medical licensing and diagnosis-related exams. However, comparative evaluations to optimize LLM performance and ability in the domain of comprehensive medication management (CMM) are lacking. The purpose of this evaluation was to test various LLM performance optimization strategies and performance on critical care pharmacotherapy questions used in the assessment of Doctor of Pharmacy students. Methods: In a comparative analysis using 219 multiple-choice pharmacotherapy questions, five LLMs (GPT-3.5, GPT-4, Claude 2, Llama2-7b and 2-13b) were evaluated. Each LLM was queried five times to evaluate the primary outcome of accuracy (i.e., correctness). Secondary outcomes included variance, the impact of prompt engineering techniques (e.g., chain-of-thought, CoT) and training of a customized GPT on performance, and comparison to third-year Doctor of Pharmacy students on knowledge recall vs. knowledge application questions. Accuracy and variance were compared with Student's t-test to compare performance under different model settings. Results: ChatGPT-4 exhibited the highest accuracy (71.6%), while Llama2-13b had the lowest variance (0.070). All LLMs performed more accurately on knowledge recall vs. knowledge application questions (e.g., ChatGPT-4: 87% vs. 67%). When applied to ChatGPT-4, few-shot CoT across five runs improved accuracy (77.4% vs. 71.5%) with no effect on variance. Self-consistency and the custom-trained GPT demonstrated similar accuracy to ChatGPT-4 with few-shot CoT. Overall pharmacy student accuracy was 81%, compared to an optimal overall LLM accuracy of 73%. Comparing question types, six of the LLMs demonstrated equivalent or higher accuracy than pharmacy students on knowledge recall questions (e.g., self-consistency vs. students: 93% vs. 84%), but pharmacy students achieved higher accuracy than all LLMs on knowledge application questions (e.g., self-consistency vs. students: 68% vs. 80%). Conclusion: ChatGPT-4 was the most accurate LLM on critical care pharmacy questions and few-shot CoT improved accuracy the most. Average student accuracy was similar to LLMs overall, and higher on knowledge application questions. These findings support the need for future assessment of customized training for the type of output needed. Reliance on LLMs is only supported with recall-based questions.

Electronic computers. Computer science

DOAJ Open Access 2025

Precipitation prediction over the upper Indus Basin from large-scale circulation patterns using Gaussian processes

Kenza Tazi, Andrew Orr, J. Scott Hosking et al.

Water resources from the Indus Basin sustain over 270 million people. However, water security in this region is threatened by climate change. This is especially the case for the upper Indus Basin, where most frozen water reserves are expected to decrease significantly by the end of the century, leaving rainfall as the main driver of river flow. However, future precipitation estimates from global climate models differ greatly for this region. To address this uncertainty, this paper explores the feasibility of using probabilistic machine learning to map large-scale circulation fields, better represented by global climate models, to local precipitation over the upper Indus Basin. More specifically, Gaussian processes are trained to predict monthly ERA5 precipitation data over a 15-year horizon. This paper also explores different Gaussian process model designs, including a non-stationary covariance function to learn complex spatial relationships in the data. Going forward, this approach could be used to make more accurate predictions from global climate model outputs and better assess the probability of future precipitation extremes.

Environmental sciences, Electronic computers. Computer science

DOAJ Open Access 2025

Conceptual frameworks, competencies, contents and teaching methods in planetary health education for health students and professionals: a global systematic scoping review

Carme Carrion, Camilla Alay Llamas, Eka Dian Safitri et al.

Background: Planetary Health studies the impact of the global environmental crisis on health. Urgent transdisciplinary, intersectoral, and holistic solutions adapted to local realities are needed. Designing training programs attuned to contextual needs of diverse groups and geographical areas is crucial. Planetary health programs are emerging worldwide, but little is known about their scope and learning outcomes. A systematic scoping review is needed to shed light on the state of planetary health education. Objectives: This review aims to identify existing frameworks, competencies, content, and teaching methods in planetary health education. Methods: Following PRISMA Extension for Scoping Reviews (PRISMA-ScR) guidelines, we included studies targeting undergraduate and postgraduate students, focusing on skills, knowledge, and abilities related to planetary health, published in English or Spanish. No exclusions were made based on geographic area, study design, or publication period. Databases consulted were MEDLINE via PubMed, Scopus, Web of Science, and ProQuest. Selection and data extraction processes were conducted systematically. Results: We included 73 articles, with 88% from high-income countries and 49% focused on health professionals. Conceptual frameworks identified include "One Health," "Sustainable Development Goals," and the "Planetary Health Education Framework." Transversal skills (complex problem-solving, systemic thinking, collaboration, interdisciplinary) and specific competencies (understanding health interactions with climate change, pollution) were outlined in 45% of studies. Half of the studies described 23 general topics and 93 specific content areas. Teaching methods included in-person (59%), virtual (12%), and hybrid models (29%). Conclusions: This review highlights the heterogeneity in conceptual frameworks, competencies, content, and teaching methods in planetary health education for health professionals. Future research should focus on developing and evaluating evidence-based educational models to address the evolving challenges of planetary health. Recommendations include enhancing collaboration among stakeholders and integrating innovative teaching methods to improve planetary health education. Trial registration: The protocol has been registered in the Open Science Framework database (registration number: osf.io/h2b3j, March 2024). Clinical trial number: not applicable.

Special aspects of education, Medicine