arXiv Open Access 2023

DMOps: Data Management Operation and Recipes

Eujeong Choi, Chanjun Park

Abstract

Data-centric AI has shed light on the significance of data within the machine learning (ML) pipeline. Recognizing this, academia, industry, and government departments have proposed various NLP data research initiatives. While the ability to utilize existing data is essential, the ability to build a dataset has become more critical than ever, especially in industry. In consideration of this trend, we propose "Data Management Operations and Recipes" (DMOps) to guide the industry in optimizing the building of datasets for NLP products. This paper presents the concept of DMOps, which is derived from real-world experiences with NLP data management and aims to streamline data operations by offering a baseline.

Authors (2)

Eujeong Choi

Chanjun Park

Citation Format

Choi, E., Park, C. (2023). DMOps: Data Management Operation and Recipes. https://arxiv.org/abs/2301.01228

Journal Information
Publication Year
2023
Language
en
Source Database
arXiv
Access
Open Access ✓