arXiv Open Access 2025

OASST-ETC Dataset: Alignment Signals from Eye-tracking Analysis of LLM Responses

Angela Lopez-Cardona, Sebastian Idesis, Miguel Barreda-Ángeles, Sergi Abadal, Ioannis Arapakis

Abstract

While Large Language Models (LLMs) have significantly advanced natural language processing, aligning them with human preferences remains an open challenge. Although current alignment methods rely primarily on explicit feedback, eye-tracking (ET) data offers insights into real-time cognitive processing during reading. In this paper, we present OASST-ETC, a novel eye-tracking corpus capturing the reading patterns of 24 participants as they evaluated LLM-generated responses from the OASST1 dataset. Our analysis reveals distinct reading patterns between preferred and non-preferred responses, which we compare with synthetic eye-tracking data. Furthermore, we examine the correlation between human reading measures and attention patterns from various transformer-based models, and find stronger correlations for preferred responses. This work introduces a unique resource for studying human cognitive processing in LLM evaluation and suggests promising directions for incorporating eye-tracking data into alignment methods. The dataset and analysis code are publicly available.

Citation Format

Lopez-Cardona, A., Idesis, S., Barreda-Ángeles, M., Abadal, S., Arapakis, I. (2025). OASST-ETC Dataset: Alignment Signals from Eye-tracking Analysis of LLM Responses. https://arxiv.org/abs/2503.10927

Journal Information
Publication year: 2025
Language: en
Source database: arXiv
Access: Open Access