arXiv Open Access 2024

Cheap Ways of Extracting Clinical Markers from Texts

Anastasia Sandu, Teodor Mihailescu, Sergiu Nisioi

Abstract

This paper describes the work of the UniBuc Archaeology team for the CLPsych 2024 Shared Task, which involved finding evidence within a text that supports its assigned suicide risk level. Two types of evidence were required: highlights (relevant spans extracted from the text) and summaries (evidence aggregated into a synthesis). Our work focuses on evaluating Large Language Models (LLMs) against an alternative method that is much more memory and resource efficient. The first approach employs a good old-fashioned machine learning (GOML) pipeline consisting of a tf-idf vectorizer with a logistic regression classifier, whose representative features are used to extract relevant highlights. The second, more resource-intensive approach uses an LLM to generate the summaries and is guided by chain-of-thought prompting to provide sequences of text indicating clinical markers.
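The GOML idea in the abstract can be illustrated with a minimal, standard-library-only sketch: fit tf-idf weights, train a logistic regression on them, then use the learned per-token weights to pick out sentences. Everything here is an assumption made for illustration, not the authors' implementation: the function names (`fit_idf`, `train_logreg`, `extract_highlights`), the invented toy posts and labels, the hyperparameters, and in particular the rule of scoring each sentence by the sum of its token weights. In practice one would more likely use a library such as scikit-learn (`TfidfVectorizer`, `LogisticRegression`) instead of hand-written training code.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and keep alphabetic tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def fit_idf(token_lists):
    """Smoothed inverse document frequency for every training token."""
    n_docs = len(token_lists)
    doc_freq = Counter(tok for toks in token_lists for tok in set(toks))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

def tfidf(tokens, idf):
    """L2-normalised tf-idf vector as a sparse dict; unseen tokens are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def train_logreg(vectors, labels, epochs=200, lr=0.5, l2=0.01):
    """Binary logistic regression trained with plain SGD and L2 regularisation."""
    weights, bias = defaultdict(float), 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            z = bias + sum(weights[t] * v for t, v in vec.items())
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            bias -= lr * err
            for t, v in vec.items():
                weights[t] -= lr * (err * v + l2 * weights[t])
    return weights, bias

def extract_highlights(text, weights, top_k=2):
    """Return up to top_k sentences whose summed token weights are positive.

    Assumption: summing token weights per sentence is one simple way to turn
    representative features into highlights; the paper may do this differently.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    scored = [(sum(weights.get(t, 0.0) for t in tokenize(s)), s) for s in sentences]
    positive = [(score, s) for score, s in scored if score > 0]
    return [s for _, s in sorted(positive, key=lambda p: p[0], reverse=True)[:top_k]]

# Invented toy data (label 1 = higher assigned risk, 0 = lower); not from the shared task.
posts = [
    ("I feel hopeless and cannot sleep anymore", 1),
    ("Everything feels hopeless and I am a burden", 1),
    ("I cannot sleep and feel like a burden to everyone", 1),
    ("Went hiking today and the weather was great", 0),
    ("Had a great dinner with friends today", 0),
    ("The new book I bought was great", 0),
]
token_lists = [tokenize(text) for text, _ in posts]
labels = [label for _, label in posts]
idf = fit_idf(token_lists)
vectors = [tfidf(toks, idf) for toks in token_lists]
weights, bias = train_logreg(vectors, labels)

highlights = extract_highlights(
    "The weather was great today. Lately everything feels hopeless.", weights
)
print(highlights)
```

On this toy data, tokens that occur only in the higher-risk posts (such as "hopeless") end up with positive weights and tokens that occur only in the lower-risk posts (such as "great") with negative ones, so only the second sentence is returned as a highlight. The appeal of this approach, as the abstract notes, is that it needs far less memory and compute than an LLM.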

Citation Format

Sandu, A., Mihailescu, T., Nisioi, S. (2024). Cheap Ways of Extracting Clinical Markers from Texts. https://arxiv.org/abs/2403.11227

Journal Information
Year Published: 2024
Language: en
Source Database: arXiv
Access: Open Access