Semantic Scholar Open Access 2020 255 citations

Four Principles of Explainable Artificial Intelligence

P. Phillips Carina A. Hahn Peter C. Fontana David A. Broniatowski Mark A. Przybocki

Abstract

We introduce four principles for explainable artificial intelligence (AI) that comprise fundamental properties for explainable AI systems. We propose that explainable AI systems deliver accompanying evidence or reasons for outcomes and processes; provide explanations that are understandable to individual users; provide explanations that correctly reflect the system’s process for generating the output; and operate only under conditions for which they were designed and when they reach sufficient confidence in their output. We term these four principles explanation, meaningful, explanation accuracy, and knowledge limits, respectively. The principles were developed through significant stakeholder engagement to encompass the multidisciplinary nature of explainable AI, including the fields of computer science, engineering, and psychology. Because one-size-fits-all explanations do not exist, different users will require different types of explanations. We present five categories of explanation and summarize theories of explainable AI. We give an overview of the algorithms in the field that cover the major classes of explainable algorithms. As a baseline comparison, we assess how well explanations provided by people follow our four principles. This assessment provides insight into the challenges of designing explainable AI systems.

Authors (5)

P. Phillips

Carina A. Hahn

Peter C. Fontana

David A. Broniatowski

Mark A. Przybocki

Citation Format

Phillips, P., Hahn, C.A., Fontana, P.C., Broniatowski, D.A., Przybocki, M.A. (2020). Four Principles of Explainable Artificial Intelligence. https://doi.org/10.6028/nist.ir.8312-draft

Quick Access

View at Source doi.org/10.6028/nist.ir.8312-draft
Journal Information
Year Published
2020
Language
en
Total Citations
255×
Database Source
Semantic Scholar
DOI
10.6028/nist.ir.8312-draft
Access
Open Access ✓