DOAJ Open Access 2025

GANs for data augmentation with stacked CNN models and XAI for interpretable maize yield prediction

Ishaan Seshukumar Pothapragada, Sujatha R

Abstract

A robust maize yield prediction framework is proposed to address major problems in agri-tech analytics: data scarcity, class imbalance, redundant features, and limited model interpretability. The research is motivated by the need for accurate crop forecasting to ensure food security amid climate change and population growth. The methodology follows a structured five-stage process designed to improve both accuracy and explainability. To address data scarcity, generative adversarial networks (GANs) with a 200-dimensional latent space were used to generate 20,000 synthetic samples, substantially enlarging the dataset. Data preprocessing included IQR-based outlier removal and class balancing. Feature selection combined 14 statistical, tree-based, bio-inspired, and regularization methods so that only the most relevant features were retained for modelling. The predictive framework is an ensemble of one-dimensional convolutional neural networks (CNNs) trained on the selected features: three parallel branches process the features selected by the Decision Tree, XGBoost, and Lasso methods, followed by a stacked refinement stage with residual connections. This two-stage approach improves both the accuracy and the robustness of prediction. Transparency and interpretability are a central focus of the work. Explainable AI (XAI) tools such as SHAP and LIME are used to explain which features contribute to the prediction [21] [33]. The combination of stacked modelling and model interpretability is a notable advance for agricultural analytics, giving farmers actionable insights aimed at increasing crop production. The framework was validated on maize data from Sevur farm. The model achieved an R² of 0.9165 and a mean squared error (MSE) of 0.6893, outperforming the baseline methods and conventional approaches for optimizing production in variable growing conditions.
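
The abstract mentions IQR-based outlier removal without giving details. As a minimal sketch only, the idea can be illustrated as follows. The 1.5 multiplier, the quartile method (the default of Python's `statistics.quantiles`), the function names, and the sample values are all assumptions for illustration and are not taken from the paper.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR.

    k=1.5 is the conventional choice; the paper's value is not stated.
    """
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_iqr_outliers(values, k=1.5):
    """Keep only the values that lie inside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if lower <= v <= upper]


# Hypothetical yield readings; the last one is an obvious outlier.
yields = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0]
cleaned = remove_iqr_outliers(yields)
print(cleaned)
```

In a tabular setting the same fences would typically be computed per feature column, with a row dropped when any of its values falls outside them; whether the authors filtered per column or on the target only is not stated in the abstract.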

Authors (2)

Ishaan Seshukumar Pothapragada

Sujatha R

Citation Format

Pothapragada, I.S., R, S. (2025). GANs for data augmentation with stacked CNN models and XAI for interpretable maize yield prediction. https://doi.org/10.1016/j.atech.2025.100992

View at Source: doi.org/10.1016/j.atech.2025.100992
Journal Information
Year of Publication
2025
Source Database
DOAJ
DOI
10.1016/j.atech.2025.100992
Access
Open Access ✓