Empirical Evaluation of Big Data Stacks: Performance and Design Analysis of Hadoop, Modern, and Cloud Architectures
Abstrak
The proliferation of big data applications across various industries has led to a paradigm shift in data architecture, with traditional approaches giving way to more agile and scalable frameworks. The evolution of big data architecture began with the emergence of the Hadoop-based data stack, leveraging technologies like Hadoop Distributed File System (HDFS) and Apache Spark for efficient data processing. However, recent years have seen a shift towards modern data stacks, offering flexibility and diverse toolsets tailored to specific use cases. Concurrently, cloud computing has revolutionized big data management, providing unparalleled scalability and integration capabilities. Despite their benefits, navigating these data stack paradigms can be challenging. While existing literature offers valuable insights into individual data stack paradigms, there remains a dearth of studies that offer practical, in-depth comparisons of these paradigms across the entire big data value chain. To address this gap in the field, this paper examines three main big data stack paradigms: the Hadoop data stack, modern data stack, and cloud-based data stack. Indeed, we conduct in this study an exhaustive architectural comparison of these stacks covering the entire big data value chain from data acquisition to exposition. Moreover, this study extends beyond architectural considerations to include end-to-end use case implementations for a comprehensive evaluation of each stack. Using a large dataset of Amazon reviews, different data stack scenarios are implemented and compared. Furthermore, the paper explores critical factors such as data integration, implementation costs, and ease of deployment to provide researchers and practitioners with a relevant and up-to-date reference for navigating the complex landscape of big data technologies and making informed decisions about data strategies.
Topik & Kata Kunci
Penulis (2)
Widad Elouataoui
Youssef Gahi
Akses Cepat
- Tahun Terbit
- 2025
- Bahasa
- en
- Sumber Database
- Semantic Scholar
- DOI
- 10.3390/bdcc10010007
- Akses
- Open Access ✓