arXiv Open Access 2023

A Twitter Dataset for Pakistani Political Discourse

Ehsan-Ul Haq, Haris Bin Zia, Reza Hadi Mogavi, Gareth Tyson, Yang K. Lu, +2 others
Abstract

We share the largest dataset for the Pakistani Twittersphere, consisting of over 49 million tweets collected during one of the most politically active periods in the country. We collected the data after the government was removed by a no-confidence vote in April 2022. This large-scale dataset can support several downstream tasks, such as studying political bias, bot detection, trolling behavior, mis- and disinformation, and censorship related to Pakistani Twitter users. In addition, the dataset provides a large collection of tweets in Urdu and Roman Urdu that can be used to improve language processing tasks.

Authors (7)

Ehsan-Ul Haq
Haris Bin Zia
Reza Hadi Mogavi
Gareth Tyson
Yang K. Lu
Tristan Braud
Pan Hui

Citation Format

Haq, E., Zia, H.B., Mogavi, R.H., Tyson, G., Lu, Y.K., Braud, T. et al. (2023). A Twitter Dataset for Pakistani Political Discourse. https://arxiv.org/abs/2301.06316

Journal Information
Publication Year: 2023
Language: en
Source Database: arXiv
Access: Open Access