arXiv Open Access 2024

Pegasus-v1 Technical Report

Raehyuk Jung, Hyojun Go, Jaehyuk Yi, Jiho Jang, Daniel Kim, +39 others

Abstract

This technical report introduces Pegasus-1, a multimodal language model specialized in video content understanding and interaction through natural language. Pegasus-1 is designed to address the unique challenges posed by video data, such as interpreting spatiotemporal information, to offer nuanced video content comprehension across various lengths. This technical report overviews Pegasus-1's architecture, training strategies, and its performance in benchmarks on video conversation, zero-shot video question answering, and video summarization. We also explore qualitative characteristics of Pegasus-1, demonstrating its capabilities as well as its limitations, in order to provide readers a balanced view of its current state and its future direction.

Authors (44)

Raehyuk Jung
Hyojun Go
Jaehyuk Yi
Jiho Jang
Daniel Kim
Jay Suh
Aiden Lee
Cooper Han
Jae Lee
Jeff Kim
Jin-Young Kim
Junwan Kim
Kyle Park
Lucas Lee
Mars Ha
Minjoon Seo
Abraham Jo
Ed Park
Hassan Kianinejad
SJ Kim
Tony Moon
Wade Jeong
Andrei Popescu
Esther Kim
EK Yoon
Genie Heo
Henry Choi
Jenna Kang
Kevin Han
Noah Seo
Sunny Nguyen
Ryan Won
Yeonhoo Park
Anthony Giuliani
Dave Chung
Hans Yoon
James Le
Jenny Ahn
June Lee
Maninder Saini
Meredith Sanders
Soyoung Lee
Sue Kim
Travis Couture

Citation Format

Jung, R., Go, H., Yi, J., Jang, J., Kim, D., Suh, J. et al. (2024). Pegasus-v1 Technical Report. https://arxiv.org/abs/2404.14687

Journal Information
Publication Year: 2024
Language: en
Source Database: arXiv
Access: Open Access ✓