Pegasus-v1 Technical Report
Abstract
This technical report introduces Pegasus-1, a multimodal language model specialized in video content understanding and interaction through natural language. Pegasus-1 is designed to address the unique challenges posed by video data, such as interpreting spatiotemporal information, to offer nuanced comprehension of videos of various lengths. The report gives an overview of Pegasus-1's architecture, training strategies, and performance on benchmarks for video conversation, zero-shot video question answering, and video summarization. We also explore qualitative characteristics of Pegasus-1, demonstrating its capabilities as well as its limitations, in order to provide readers with a balanced view of its current state and its future direction.
Authors (44)
Raehyuk Jung
Hyojun Go
Jaehyuk Yi
Jiho Jang
Daniel Kim
Jay Suh
Aiden Lee
Cooper Han
Jae Lee
Jeff Kim
Jin-Young Kim
Junwan Kim
Kyle Park
Lucas Lee
Mars Ha
Minjoon Seo
Abraham Jo
Ed Park
Hassan Kianinejad
SJ Kim
Tony Moon
Wade Jeong
Andrei Popescu
Esther Kim
EK Yoon
Genie Heo
Henry Choi
Jenna Kang
Kevin Han
Noah Seo
Sunny Nguyen
Ryan Won
Yeonhoo Park
Anthony Giuliani
Dave Chung
Hans Yoon
James Le
Jenny Ahn
June Lee
Maninder Saini
Meredith Sanders
Soyoung Lee
Sue Kim
Travis Couture
Quick Access
- Year Published: 2024
- Language: en
- Source Database: arXiv
- Access: Open Access ✓