arXiv
Open Access
2022
Extracting Similar Questions From Naturally-occurring Business Conversations
Xiliang Zhu
David Rossouw
Shayna Gardiner
Simon Corston-Oliver
Abstract
Pre-trained contextualized embedding models such as BERT are a standard building block in many natural language processing systems. We demonstrate that the sentence-level representations produced by some off-the-shelf contextualized embedding models have a narrow distribution in the embedding space, and thus perform poorly for the task of identifying semantically similar questions in real-world English business conversations. We describe a method that uses appropriately tuned representations and a small set of exemplars to group questions of interest to business users in a visualization that can be used for data exploration or employee coaching.
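The "narrow distribution" claim in the abstract can be illustrated with a small numerical sketch. It is not the authors' method: the vectors are synthetic rather than BERT outputs, mean-centering stands in for whatever tuning the paper actually applies, and `assign_to_exemplars` with its `threshold` parameter is a hypothetical helper showing one plausible way to group questions around exemplars.

```python
import numpy as np


def cosine_matrix(x):
    """Pairwise cosine similarities between the rows of x."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    return unit @ unit.T


def mean_offdiag(m):
    """Average of the off-diagonal entries of a square matrix."""
    n = m.shape[0]
    return (m.sum() - np.trace(m)) / (n * (n - 1))


def assign_to_exemplars(question_vecs, exemplar_vecs, threshold=0.5):
    """Hypothetical helper: index of the most similar exemplar per question,
    or -1 when no exemplar reaches the cosine threshold."""
    q = question_vecs / np.linalg.norm(question_vecs, axis=1, keepdims=True)
    e = exemplar_vecs / np.linalg.norm(exemplar_vecs, axis=1, keepdims=True)
    sims = q @ e.T
    best = sims.argmax(axis=1)
    return np.where(sims.max(axis=1) >= threshold, best, -1)


# Synthetic "embeddings": a large shared offset plus small per-sentence variation.
# The offset is an assumption used to mimic a narrow distribution.
rng = np.random.default_rng(0)
dim, n = 64, 200
embeddings = 10.0 * np.ones(dim) + rng.normal(size=(n, dim))

# With the shared offset, almost every pair looks highly similar.
raw_similarity = mean_offdiag(cosine_matrix(embeddings))

# Removing the mean vector spreads the similarities out again.
centered = embeddings - embeddings.mean(axis=0)
centered_similarity = mean_offdiag(cosine_matrix(centered))

# Toy exemplar grouping on hand-written 2-D vectors.
questions = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
exemplars = np.array([[1.0, 0.1], [0.1, 1.0]])
groups = assign_to_exemplars(questions, exemplars)
```

When nearly all pairs score close to 1, a similarity threshold cannot separate related questions from unrelated ones, which is the failure mode the abstract describes. After centering, the average pairwise similarity falls to roughly zero and a threshold becomes meaningful again.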
Journal Information
- Year published: 2022
- Language: en
- Source database: arXiv
- Access: Open Access