arXiv Open Access 2024

A Mechanistic Explanatory Strategy for XAI

Marcin Rabiza

Abstract

Despite significant advancements in XAI, scholars continue to note a persistent lack of robust conceptual foundations and integration with broader discourse on scientific explanation. In response, emerging XAI research increasingly draws on explanatory strategies from various scientific disciplines and the philosophy of science to address these gaps. This paper outlines a mechanistic strategy for explaining the functional organization of deep learning systems, situating recent developments in AI explainability within a broader philosophical context. According to the mechanistic approach, explaining opaque AI systems involves identifying the mechanisms underlying decision-making processes. For deep neural networks, this means discerning functionally relevant components - such as neurons, layers, circuits, or activation patterns - and understanding their roles through decomposition, localization, and recomposition. Proof-of-principle case studies from image recognition and language modeling align this theoretical framework with recent research from OpenAI and Anthropic. The findings suggest that pursuing mechanistic explanations can uncover elements that traditional explainability techniques may overlook, ultimately contributing to more thoroughly explainable AI.

Topics & Keywords

cs.LG cs.AI

Authors (1)

Marcin Rabiza

Citation Format

Rabiza, M. (2024). A Mechanistic Explanatory Strategy for XAI. https://arxiv.org/abs/2411.01332

Journal Information

Year of Publication: 2024
Language: en
Database Source: arXiv
Access: Open Access ✓