arXiv Open Access 2026

Learning Rewards, Not Labels: Adversarial Inverse Reinforcement Learning for Machinery Fault Detection

Dhiraj Neupane, Richard Dazeley, Mohamed Reda Bouadjenek, Sunil Aryal

Abstract

Reinforcement learning (RL) offers significant promise for machinery fault detection (MFD). However, most existing RL-based MFD approaches do not fully exploit RL's sequential decision-making strengths, often treating MFD as a simple guessing game (Contextual Bandits). To bridge this gap, we formulate MFD as an offline inverse reinforcement learning problem, where the agent learns the reward dynamics directly from healthy operational sequences, thereby bypassing the need for manual reward engineering and fault labels. Our framework employs Adversarial Inverse Reinforcement Learning to train a discriminator that distinguishes between normal (expert) and policy-generated transitions. The discriminator's learned reward serves as an anomaly score, indicating deviations from normal operating behaviour. When evaluated on three run-to-failure benchmark datasets (HUMS2023, IMS, and XJTU-SY), the model consistently assigns low anomaly scores to normal samples and high scores to faulty ones, enabling early and robust fault detection. By aligning RL's sequential reasoning with MFD's temporal structure, this work opens a path toward RL-based diagnostics in data-driven industrial settings.
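The core mechanism the abstract describes — a discriminator trained to separate normal (expert) transitions from policy-generated ones, whose learned reward then doubles as an anomaly score — can be illustrated with a minimal sketch. Everything below is an illustrative assumption, not the paper's implementation: the data are synthetic Gaussian "transitions", and a plain logistic-regression discriminator stands in for the AIRL discriminator network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data (not from the paper): healthy (expert) transitions
# cluster near zero; a stand-in "policy" produces shifted, noisier ones.
expert = rng.normal(0.0, 0.5, size=(256, 4))   # normal (expert) transitions
policy = rng.normal(2.0, 1.0, size=(256, 4))   # policy-generated transitions

# Linear discriminator D(x) = sigmoid(w.x + b); its logit plays the role of
# the learned reward, so the negative logit serves as an anomaly score.
w = np.zeros(4)
b = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Train by gradient descent on the logistic loss (expert = 1, policy = 0).
lr = 0.1
X = np.vstack([expert, policy])
y = np.concatenate([np.ones(len(expert)), np.zeros(len(policy))])
for _ in range(500):
    p = sigmoid(X @ w + b)
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

def anomaly_score(x):
    """Negative discriminator logit: low for normal, high for faulty."""
    return -(x @ w + b)

normal_score = anomaly_score(rng.normal(0.0, 0.5, size=4))
faulty_score = anomaly_score(rng.normal(2.0, 1.0, size=4))
```

On this toy data the score separates the two regimes as the abstract describes: samples resembling healthy transitions receive low anomaly scores, while out-of-distribution (faulty-like) samples receive high ones. In the actual framework the discriminator is trained adversarially against a policy, rather than against a fixed synthetic sample as here.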



Citation Format

Neupane, D., Dazeley, R., Bouadjenek, M.R., Aryal, S. (2026). Learning Rewards, Not Labels: Adversarial Inverse Reinforcement Learning for Machinery Fault Detection. https://arxiv.org/abs/2602.22297

Journal Information
Publication Year: 2026
Language: en
Source Database: arXiv
Access: Open Access ✓