Semantic Scholar Open Access 2025

Empirical Evaluation of Big Data Stacks: Performance and Design Analysis of Hadoop, Modern, and Cloud Architectures

Widad Elouataoui, Youssef Gahi

Abstract

The proliferation of big data applications across industries has driven a paradigm shift in data architecture, with traditional approaches giving way to more agile and scalable frameworks. The evolution of big data architecture began with the emergence of the Hadoop-based data stack, which leverages technologies like Hadoop Distributed File System (HDFS) and Apache Spark for efficient data processing. Recent years, however, have seen a shift towards modern data stacks, which offer flexibility and diverse toolsets tailored to specific use cases. Concurrently, cloud computing has revolutionized big data management by providing high scalability and integration capabilities. Despite their benefits, these data stack paradigms can be challenging to navigate. While existing literature offers valuable insights into individual paradigms, few studies offer practical, in-depth comparisons of them across the entire big data value chain. To address this gap, this paper examines three main big data stack paradigms: the Hadoop data stack, the modern data stack, and the cloud-based data stack. We conduct an exhaustive architectural comparison of these stacks, covering the entire big data value chain from data acquisition to exposition. The study also extends beyond architectural considerations to include end-to-end use case implementations for a comprehensive evaluation of each stack: using a large dataset of Amazon reviews, different data stack scenarios are implemented and compared. Finally, the paper explores critical factors such as data integration, implementation costs, and ease of deployment. The aim is to provide researchers and practitioners with a relevant and up-to-date reference for navigating the complex landscape of big data technologies and making informed decisions about data strategies.

Authors (2)

Widad Elouataoui

Youssef Gahi

Citation Format

Elouataoui, W., Gahi, Y. (2025). Empirical Evaluation of Big Data Stacks: Performance and Design Analysis of Hadoop, Modern, and Cloud Architectures. https://doi.org/10.3390/bdcc10010007

Journal Information

Publication Year: 2025
Language: English
Source Database: Semantic Scholar
DOI: 10.3390/bdcc10010007
Access: Open Access