DOAJ Open Access 2024

Framework for evaluating code generation ability of large language models

Sangyeop Yeo, Yu-Seung Ma, Sang Cheol Kim, Hyungkook Jun, Taeho Kim

Abstract

Large language models (LLMs) have revolutionized various applications in natural language processing and exhibited proficiency in generating programming code. We propose a framework for evaluating the code generation ability of LLMs and introduce a new metric, pass-ratio@n, which captures the granularity of accuracy according to the pass rate of test cases. The framework is intended to be fully automatic to handle the repetitive work involved in generating prompts, conducting inferences, and executing the generated code. A preliminary evaluation focusing on the prompt detail, problem publication date, and difficulty level demonstrates the successful integration of our framework with the LeetCode coding platform and highlights the applicability of the pass-ratio@n metric.
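
The abstract does not state the formula for pass-ratio@n, so the following is only a minimal sketch of the idea it describes: scoring each generated solution by the fraction of test cases it passes, then aggregating over n generated solutions. The function names, the `(passed, total)` input format, and the optional `exponent` parameter are assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence, Tuple


def pass_ratio(passed: int, total: int) -> float:
    """Fraction of test cases that a single generated solution passes."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= passed <= total:
        raise ValueError("passed must be between 0 and total")
    return passed / total


def pass_ratio_at_n(
    results: Sequence[Tuple[int, int]], exponent: float = 1.0
) -> float:
    """Aggregate per-solution pass ratios over n generated solutions.

    `results` holds one (passed, total) pair per generated solution.
    ASSUMPTION: the aggregate is the mean of the per-solution pass ratios,
    each raised to `exponent`. An exponent above 1 rewards nearly complete
    solutions more strongly; the paper's exact definition may differ.
    """
    if not results:
        raise ValueError("results must contain at least one solution")
    scores = [pass_ratio(p, t) ** exponent for p, t in results]
    return sum(scores) / len(scores)
```

For example, three generated solutions that pass 10, 5, and 0 of 10 test cases score 0.5 under this sketch with the default exponent, whereas an all-or-nothing criterion would count only the first solution as correct.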

Topics & Keywords

Telecommunication; Electronics

Citation

Yeo, S., Ma, Y., Kim, S.C., Jun, H., Kim, T. (2024). Framework for evaluating code generation ability of large language models. https://doi.org/10.4218/etrij.2023-0357

Journal Information

Publication Year: 2024
Source Database: DOAJ
DOI: 10.4218/etrij.2023-0357
Access: Open Access