arXiv Open Access 2022

verBERT: Automating Brazilian Case Law Document Multi-label Categorization Using BERT

Felipe R. Serras Marcelo Finger
Lihat Sumber

Abstrak

In this work, we carried out a study about the use of attention-based algorithms to automate the categorization of Brazilian case law documents. We used data from the Kollemata Project to produce two distinct datasets with adequate class systems. Then, we implemented a multi-class and multi-label version of BERT and fine-tuned different BERT models with the produced datasets. We evaluated several metrics, adopting the micro-averaged F1-Score as our main metric for which we obtained a performance value of F1-micro=0.72 corresponding to gains of 30 percent points over the tested statistical baseline. In this work, we carried out a study about the use of attention-based algorithms to automate the categorization of Brazilian case law documents. We used data from the \textit{Kollemata} Project to produce two distinct datasets with adequate class systems. Then, we implemented a multi-class and multi-label version of BERT and fine-tuned different BERT models with the produced datasets. We evaluated several metrics, adopting the micro-averaged F1-Score as our main metric for which we obtained a performance value of $\langle \mathcal{F}_1 \rangle_{micro}=0.72$ corresponding to gains of 30 percent points over the tested statistical baseline.

Topik & Kata Kunci

Penulis (2)

F

Felipe R. Serras

M

Marcelo Finger

Format Sitasi

Serras, F.R., Finger, M. (2022). verBERT: Automating Brazilian Case Law Document Multi-label Categorization Using BERT. https://arxiv.org/abs/2203.06224

Akses Cepat

Lihat di Sumber
Informasi Jurnal
Tahun Terbit
2022
Bahasa
en
Sumber Database
arXiv
Akses
Open Access ✓