Results for "data science"

S2 Open Access 2015

The Extent and Consequences of P-Hacking in Science

M. Head, L. Holman, R. Lanfear et al.

A focus on novel, confirmatory, and statistically significant results leads to substantial bias in the scientific literature. One type of bias, known as “p-hacking,” occurs when researchers collect or select data or statistical analyses until nonsignificant results become significant. Here, we use text-mining to demonstrate that p-hacking is widespread throughout science. We then illustrate how one can test for p-hacking when performing a meta-analysis and show that, while p-hacking is probably common, its effect seems to be weak relative to the real effect sizes being measured. This result suggests that p-hacking probably does not drastically alter scientific consensuses drawn from meta-analyses.

1100 citations en Medicine, Biology
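
As a rough illustration of the kind of test the abstract above mentions, the sketch below applies a one-sided binomial test to counts of significant p-values in two adjacent bins just below 0.05. The bin boundaries, the helper name `binomial_upper_tail`, and the counts are illustrative assumptions, not figures or code taken from the paper.

```python
from math import comb

def binomial_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical counts of reported p-values in two adjacent bins (assumed data).
lower_bin = 40  # 0.040 < p <= 0.045
upper_bin = 60  # 0.045 < p < 0.050

# If p-values are not being pushed just under 0.05, the upper bin should hold
# no more than about half of the values; an excess there is the warning sign.
p_value = binomial_upper_tail(upper_bin, lower_bin + upper_bin)
print(f"one-sided binomial p-value: {p_value:.4f}")
```

A small p-value here would only indicate an excess of results just under the threshold in these assumed counts; it says nothing about the size of the resulting bias.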

S2 Open Access 2014

Explaining Fixed Effects: Random Effects Modeling of Time-Series Cross-Sectional and Panel Data*

Andrew Bell, K. Jones

This article challenges Fixed Effects (FE) modeling as the ‘default’ for time-series-cross-sectional and panel data. Understanding different within and between effects is crucial when choosing modeling strategies. The downside of Random Effects (RE) modeling—correlated lower-level covariates and higher-level residuals—is omitted-variable bias, solvable with Mundlak's (1978a) formulation. Consequently, RE can provide everything that FE promises and more, as confirmed by Monte-Carlo simulations, which additionally show problems with Plümper and Troeger's FE Vector Decomposition method when data are unbalanced. As well as incorporating time-invariant variables, RE models are readily extendable, with random coefficients, cross-level interactions and complex variance functions. We argue not simply for technical solutions to endogeneity, but for the substantive importance of context/heterogeneity, modeled using RE. The implications extend beyond political science to all multilevel datasets. However, omitted variables could still bias estimated higher-level variable effects; as with any model, care is required in interpretation.

1434 citations en Mathematics
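
The within and between effects that the abstract above distinguishes can be made concrete with a small sketch. It splits a covariate into a unit mean (the between part) and the deviation from that mean (the within part), which is the decomposition a Mundlak-style formulation relies on. The panel data and the function name `within_between` are hypothetical, and no model is fitted here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical panel: (unit id, covariate value); not data from the article.
rows = [("A", 1.0), ("A", 3.0), ("B", 4.0), ("B", 8.0), ("B", 6.0)]

def within_between(rows):
    """Split each x into its deviation from the unit mean (within) and the unit mean (between)."""
    by_unit = defaultdict(list)
    for unit, x in rows:
        by_unit[unit].append(x)
    unit_mean = {unit: mean(xs) for unit, xs in by_unit.items()}
    # Each output row is (unit, within component, between component).
    return [(unit, x - unit_mean[unit], unit_mean[unit]) for unit, x in rows]

parts = within_between(rows)
for unit, within, between in parts:
    print(unit, within, between)
```

Entering the two components as separate regressors in a random effects model allows their coefficients to differ, which is the usual motivation for this kind of split.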

S2 Open Access 2014

Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references

L. Bornmann, Rüdiger Mutz

Many studies (in information science) have looked at the growth of science. In this study, we reexamine the question of the growth of science. To do this we (a) use current data up to publication year 2012 and (b) analyze the data across all disciplines and also separately for the natural sciences and for the medical and health sciences. Furthermore, the data were analyzed with an advanced statistical technique—segmented regression analysis—which can identify specific segments with similar growth rates in the history of science. The study is based on two different sets of bibliometric data: (a) the number of publications held as source items in the Web of Science (WoS, Thomson Reuters) per publication year and (b) the number of cited references in the publications of the source items per cited reference year. We looked at the rate at which science has grown since the mid‐1600s. In our analysis of cited references we identified three essential growth phases in the development of science, which each led to growth rates tripling in comparison with the previous phase: from less than 1% up to the middle of the 18th century, to 2 to 3% up to the period between the two world wars, and 8 to 9% to 2010.

1391 citations en Geography, Computer Science
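
To give a feel for the growth rates quoted in the abstract above, the sketch below converts an annual growth rate into a doubling time. It assumes constant compound growth within each phase, which is a simplification; the function name `doubling_time` is illustrative and not from the paper.

```python
from math import log

def doubling_time(annual_rate: float) -> float:
    """Years needed to double at a constant compound annual growth rate."""
    return log(2) / log(1 + annual_rate)

# Phase bounds quoted in the abstract: under 1%, 2 to 3%, and 8 to 9% per year.
for rate in (0.01, 0.02, 0.03, 0.08, 0.09):
    print(f"{rate:.0%} per year: doubles in about {doubling_time(rate):.1f} years")
```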

S2 Open Access 2014

Data Mining and Analysis: Fundamental Concepts and Algorithms

Mohammed J. Zaki

995 citations en Computer Science

S2 Open Access 2011

JENDL-4.0: A New Library for Nuclear Science and Engineering

K. Shibata, O. Iwamoto, T. Nakagawa et al.

1976 citations en Chemistry

S2 Open Access 2015

Pegasus, a workflow management system for science automation

E. Deelman, K. Vahi, G. Juve et al.

865 citations en Computer Science

S2 Open Access 2009

A new dawn for citizen science.

J. Silvertown

2137 citations en Medicine

S2 Open Access 1967

Color Science, Concepts and Methods. Quantitative Data and Formulas

W. D. Wright

1203 citations en Engineering

S2 Open Access 1999

Estimating dynamic panel data models: a guide for

Ruth A. Judson, Ann L. Owen

2202 citations en Computer Science

S2 Open Access 2001

Linear Mixed Models for Longitudinal Data

G. Verbeke, G. Molenberghs

2416 citations en Medicine

S2 Open Access 1996

The KDD process for extracting useful knowledge from volumes of data

U. Fayyad, G. Piatetsky-Shapiro, Padhraic Smyth

2204 citations en Computer Science

S2 Open Access 1997

Functional Data Analysis

J. Ramsay, Bernard Walter Silverman

2183 citations en Mathematics, Computer Science

S2 Open Access 2002

Global products of vegetation leaf area and fraction absorbed PAR from year one of MODIS data

R. Myneni, S. Hoffman, Y. Knyazikhin et al.

1963 citations en

S2 Open Access 1989

The Analysis of Social Science Data with Missing Values

R. Little, D. Rubin

1113 citations en Computer Science

S2 Open Access 2006

Citation indexes for science; a new dimension in documentation through association of ideas.

E. Garfield

2100 citations en Medicine

S2 Open Access 2016

The View from Above: Applications of Satellite Data in Economics

D. Donaldson, Adam Storeygard

552 citations en

S2 Open Access 2020

Mapping citizen science contributions to the UN sustainable development goals

D. Fraisl, Jillian Campbell, L. See et al.

The UN Sustainable Development Goals (SDGs) are a vision for achieving a sustainable future. Reliable, timely, comprehensive, and consistent data are critical for measuring progress towards, and ultimately achieving, the SDGs. Data from citizen science represent one new source of data that could be used for SDG reporting and monitoring. However, information is still lacking regarding the current and potential contributions of citizen science to the SDG indicator framework. Through a systematic review of the metadata and work plans of the 244 SDG indicators, as well as the identification of past and ongoing citizen science initiatives that could directly or indirectly provide data for these indicators, this paper presents an overview of where citizen science is already contributing and could contribute data to the SDG indicator framework. The results demonstrate that citizen science is “already contributing” to the monitoring of 5 SDG indicators, and that citizen science “could contribute” to 76 indicators, which, together, equates to around 33%. Our analysis also shows that the greatest inputs from citizen science to the SDG framework relate to SDG 15 Life on Land, SDG 11 Sustainable Cities and Communities, SDG 3 Good Health and Wellbeing, and SDG 6 Clean Water and Sanitation. Realizing the full potential of citizen science requires demonstrating its value in the global data ecosystem, building partnerships around citizen science data to accelerate SDG progress, and leveraging investments to enhance its use and impact.

317 citations en Business
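
The "around 33%" in the abstract above follows from the indicator counts it quotes, as this short check shows. The variable names are illustrative.

```python
# Counts quoted in the abstract: indicators where citizen science is
# "already contributing", where it "could contribute", and the total reviewed.
already, could, total = 5, 76, 244

share = (already + could) / total
print(f"{already + could} of {total} indicators = {share:.1%}")
```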

S2 Open Access 2018

The Ames Stereo Pipeline: NASA's Open Source Software for Deriving and Processing Terrain Data

R. Beyer, O. Alexandrov, S. McMichael

The NASA Ames Stereo Pipeline is a suite of free and open source automated geodesy and stereogrammetry tools designed for processing stereo images captured from satellites (around Earth and other planets), robotic rovers, aerial cameras, and historical images, with and without accurate camera pose information. It produces cartographic products, including digital terrain models, ortho‐projected images, 3‐D models, and bundle‐adjusted networks of cameras. Ames Stereo Pipeline's data products are suitable for science analysis, mission planning, and public outreach.

373 citations en Computer Science

S2 Open Access 2019

RESTplus: an improved toolkit for resting-state functional magnetic resonance imaging data processing.

Xize Jia, Jue Wang, Hai Sun et al.

Center for Cognition and Brain Disorders, Institutes of Psychological Sciences, Hangzhou Normal University, Hangzhou 311121, China; Zhejiang Key Laboratory for Research in Assessment of Cognitive Impairments, Hangzhou 311121, China; Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27517, USA; School of Life Science and Technology, Center for Information in BioMedicine, University of Electronic Science and Technology of China, Chengdu 610054, China; CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing 100101, China; Department of Psychology, University of Chinese Academy of Sciences, Beijing 100101, China; Preclinical Pharmacology Section, National Institute on Drug Abuse, Baltimore 21224, USA

319 citations en Medicine

S2 Open Access 2021

Jupyter: Thinking and Storytelling With Code and Data

B. Granger, Fernando Pérez

Project Jupyter is an open-source project for interactive computing widely used in data science, machine learning, and scientific computing. We argue that even though Jupyter helps users perform complex, technical work, Jupyter itself solves problems that are fundamentally human in nature. Namely, Jupyter helps humans to think and tell stories with code and data. We illustrate this by describing three dimensions of Jupyter: 1) interactive computing; 2) computational narratives; and 3) the idea that Jupyter is more than software. We illustrate the impact of these dimensions on a community of practice in earth and climate science.

167 citations en Computer Science
