arXiv Open Access 2024

Style Vectors for Steering Generative Large Language Model

Kai Konen, Sophie Jentzsch, Diaoulé Diallo, Peer Schütt, Oliver Bensch, +3 others

Abstract

This research explores strategies for steering the output of large language models (LLMs) towards specific styles, such as sentiment, emotion, or writing style, by adding style vectors to the activations of hidden layers during text generation. We show that style vectors can be simply computed from recorded layer activations for input texts in a specific style in contrast to more complex training-based approaches. Through a series of experiments, we demonstrate the effectiveness of activation engineering using such style vectors to influence the style of generated text in a nuanced and parameterisable way, distinguishing it from prompt engineering. The presented research constitutes a significant step towards developing more adaptive and effective AI-empowered interactive systems.
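The abstract describes two steps: deriving a style vector from recorded layer activations, and adding it to hidden-layer activations during generation. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the style vector is the mean activation over texts in the target style minus the mean activation over contrast texts (the abstract does not state the exact formula), and it uses random NumPy arrays in place of activations recorded from a real model. The names `compute_style_vector`, `steer`, and `strength` are hypothetical.

```python
import numpy as np


def compute_style_vector(style_acts: np.ndarray, contrast_acts: np.ndarray) -> np.ndarray:
    """Assumed formula: mean activation of target-style texts minus
    mean activation of contrast texts. Inputs have shape (n_texts, hidden_size)."""
    return style_acts.mean(axis=0) - contrast_acts.mean(axis=0)


def steer(hidden: np.ndarray, style_vector: np.ndarray, strength: float) -> np.ndarray:
    """Add the scaled style vector to every position of a layer's activations.
    `hidden` has shape (n_positions, hidden_size); `strength` is the tunable scale."""
    return hidden + strength * style_vector


# Stand-in data: in practice these would be activations recorded at one
# hidden layer while the model reads texts of each style.
rng = np.random.default_rng(0)
hidden_size = 8
style_acts = rng.normal(loc=1.0, size=(32, hidden_size))     # e.g. "positive" texts
contrast_acts = rng.normal(loc=0.0, size=(32, hidden_size))  # e.g. all other texts

style_vector = compute_style_vector(style_acts, contrast_acts)

# Activations of the same layer during generation (5 token positions).
hidden = rng.normal(size=(5, hidden_size))
steered = steer(hidden, style_vector, strength=0.5)
```

Varying `strength` is what makes the steering parameterisable in this sketch: a value of 0 leaves the activations unchanged, and larger values push them further along the style direction.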

Topics & Keywords

cs.CL

Authors (8)

Kai Konen

Sophie Jentzsch

Diaoulé Diallo

Peer Schütt

Oliver Bensch

Roxanne El Baff

Dominik Opitz

Tobias Hecking

Citation Format

Konen, K., Jentzsch, S., Diallo, D., Schütt, P., Bensch, O., El Baff, R., Opitz, D., & Hecking, T. (2024). Style Vectors for Steering Generative Large Language Model. arXiv. https://arxiv.org/abs/2402.01618

Journal Information

Publication Year: 2024
Language: en
Source Database: arXiv
Access: Open Access ✓