Logical Reasoning Based on Residual Attention Multi-scale Relation Network
Abstract
Logical reasoning is the ability to perceive patterns and connections between visual elements. Endowing computers with human-like reasoning ability is a critical area of research; state-of-the-art deep neural networks have achieved superhuman performance in image processing and other fields. However, logical reasoning over images requires further research. To address the insufficient feature extraction and generalization of the Multi-scale Relation Network (MRNet), an improved logical reasoning method, called the Residual Attention Multi-scale Relation Network (ResAMRNet), is proposed. In the backbone network, shallow features are integrated into the deep network's training process through residual structures that combine short and long skip connections. This reduces the loss of feature information and improves the feature extraction capability of the model. In the reasoning module, the channel attention mechanism is combined with residual connections to detect relational features between the image rows. This allows the model to differentiate the significance of each feature channel, adaptively learn attention weights, and extract key features. In this study, a Double-pooled Efficient Channel Attention (DECA) mechanism is proposed, which incorporates global max pooling to capture additional feature information about objects and to improve generalization. Experimental results on two representative logical reasoning datasets, Relational and Analogical Visual rEasoNing (RAVEN) and Improved RAVEN (I-RAVEN), show that the accuracy of the proposed method exceeds that of MRNet by 8.3 and 18.1 percentage points, respectively. The method therefore demonstrates strong logical reasoning capabilities.
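The abstract describes DECA as an efficient-channel-attention variant that fuses a global-max-pooled channel descriptor with the usual global-average-pooled one before deriving per-channel weights. The paper's exact architecture is not given here, so the following is only a minimal numpy sketch of that general idea, assuming an ECA-style shared 1D convolution across channels and a sigmoid gate; the function and parameter names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1d_same(x, kernel):
    # 1D convolution across the channel dimension with zero "same" padding,
    # mimicking ECA's local cross-channel interaction.
    k = len(kernel)
    pad = k // 2
    xp = np.pad(x, pad)
    return np.array([np.dot(xp[i:i + k], kernel) for i in range(len(x))])

def deca(feature_map, kernel):
    # feature_map: (C, H, W) array; kernel: shared 1D conv weights.
    avg_desc = feature_map.mean(axis=(1, 2))  # global average pooling -> (C,)
    max_desc = feature_map.max(axis=(1, 2))   # global max pooling     -> (C,)
    # Fuse both descriptors, then gate each channel with a sigmoid weight.
    weights = sigmoid(conv1d_same(avg_desc, kernel) + conv1d_same(max_desc, kernel))
    return feature_map * weights[:, None, None]
```

The double pooling is the hedged interpretation of "combining global maximum pooling": max pooling preserves salient object responses that average pooling smooths away, and both descriptors share the same lightweight 1D convolution so almost no parameters are added.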
Topics & Keywords
Authors
XIONG Zhongmin, ZENG Qi, LU Peng, WANG Zhenhua, ZHENG Zongsheng
Quick Access
- Year of Publication: 2023
- Source Database: DOAJ
- DOI: 10.19678/j.issn.1000-3428.0064591
- Access: Open Access ✓