DOAJ Open Access 2025

Can “consciousness” be observed from large language model (LLM) internal states? Dissecting LLM representations obtained from Theory of Mind test with Integrated Information Theory and Span Representation analysis

Jingkai Li

Abstract

Integrated Information Theory (IIT) provides a quantitative framework for explaining the phenomenon of consciousness, positing that conscious systems comprise elements integrated through causal properties. We apply IIT 3.0 and 4.0, the latest iterations of this framework, to sequences of Large Language Model (LLM) representations, analyzing data derived from existing Theory of Mind (ToM) test results. Our study systematically investigates whether differences in ToM test performance, when present in the LLM representations, can be revealed by IIT estimates, i.e., Φmax (IIT 3.0), Φ (IIT 4.0), Conceptual Information (IIT 3.0), and Φ-structure (IIT 4.0). Furthermore, we compare these metrics with Span Representations, which are independent of any estimate of consciousness. This additional comparison aims to differentiate between potential “consciousness” phenomena and inherent separations within the LLM representational space. We conduct comprehensive experiments examining variations across LLM transformer layers and linguistic spans from the stimuli. Our results suggest that sequences of contemporary Transformer-based LLM representations lack statistically significant indicators of observed “consciousness” phenomena but exhibit intriguing patterns under spatio-permutational analyses.

Authors (1)

Jingkai Li

Citation Format

Li, J. (2025). Can “consciousness” be observed from large language model (LLM) internal states? Dissecting LLM representations obtained from Theory of Mind test with Integrated Information Theory and Span Representation analysis. https://doi.org/10.1016/j.nlp.2025.100163

Journal Information
Year of Publication
2025
Source Database
DOAJ
DOI
10.1016/j.nlp.2025.100163
Access
Open Access ✓