CrossRef Open Access 2024

Data Science Using OpenAI: Testing Their New Capabilities Focused on Data Science

Jorge Guerra Pires

Abstract

Introduction: Statistics is ubiquitous across academic disciplines, including the life sciences, yet many researchers who are not statistically trained struggle to apply statistical analysis correctly, which leads to fundamental errors in their work. Given the complexity and importance of statistics in scientific research, researchers from varied backgrounds need a tool that lets them conduct sound statistical analysis without being experts in the field. This paper introduces OpenAI's latest API, known as the "coder interpreter," and evaluates its potential to fulfill this need.

Methods: The coder interpreter API is designed to comprehend human commands, process CSV data files, and perform statistical analyses by intelligently selecting appropriate methods and libraries. Unlike traditional statistical software, this API simplifies the analysis process by requiring minimal input from the user, often just a straightforward question or command. We tested the API with actual datasets to demonstrate its capabilities, focusing on ease of use for non-statisticians and investigating its potential to improve research output, particularly in evidence-based medicine.

Results: The coder interpreter API effectively used open-source Python libraries, renowned for their extensive data science resources, to accurately execute statistical analyses on the provided datasets. Practical examples, including a study involving diabetic patients, showcased the API's proficiency in helping non-expert researchers interpret and use data in their research.

Discussion: Integrating AI-based tools such as OpenAI's coder interpreter API into the research process can revolutionize how scientific data is analyzed. By lowering the barrier to conducting advanced statistics, it enables researchers to focus on substantive research questions, including those in fields such as evidence-based medicine, where practitioners are often also medical doctors. This paper highlights the potential for these tools to be adopted broadly by novices and experts alike, thereby improving the overall quality of statistical analysis in scientific research. We advocate wider implementation of this technology as a step towards democratizing access to sophisticated statistical inference and data analysis capabilities.
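
As a rough illustration of the kind of analysis the Methods describe (reading a CSV file and choosing a suitable test), the sketch below compares two groups with Welch's t-test using only the Python standard library. It is not the paper's code and not output from the API: the column names `group` and `glucose`, the helper functions `load_groups` and `welch_t`, and the data values are all hypothetical and invented for illustration.

```python
import csv
import io
import math
import statistics

# Synthetic toy data, invented purely for illustration (not from the paper).
CSV_TEXT = """group,glucose
diabetic,150
diabetic,160
diabetic,170
control,90
control,100
control,110
"""


def load_groups(csv_text, group_col, value_col):
    """Read CSV text and collect the numeric values under each group label."""
    groups = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups.setdefault(row[group_col], []).append(float(row[value_col]))
    return groups


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    n_a, n_b = len(a), len(b)
    se2 = var_a / n_a + var_b / n_b  # squared standard error of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1))
    return t, df


groups = load_groups(CSV_TEXT, "group", "glucose")
t_stat, dof = welch_t(groups["diabetic"], groups["control"])
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

Welch's test is used here because it does not assume equal variances in the two groups; a real analysis would also report a p-value (for example via a statistics library) and check the test's assumptions against the data.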

Authors (1)

Jorge Guerra Pires

Citation Format

Pires, J.G. (2024). Data Science Using OpenAI: Testing Their New Capabilities Focused on Data Science. https://doi.org/10.32388/76qmhb.2

Quick Access

View at source: doi.org/10.32388/76qmhb.2
Journal Information
Year of Publication: 2024
Language: en
Source Database: CrossRef
DOI: 10.32388/76qmhb.2
Access: Open Access ✓