arXiv
Open Access
2024
3DLNews: A Three-decade Dataset of US Local News Articles
Gangani Ariyarathne
Alexander C. Nwala
Abstract
We present 3DLNews, a novel dataset with local news articles from the United States spanning the period from 1996 to 2024. It contains almost 1 million URLs (with HTML text) from over 14,000 local newspapers, TV, and radio stations across all 50 states, and provides a broad snapshot of the US local news landscape. The dataset was collected by scraping Google and Twitter search results. We employed a multi-step filtering process to remove non-news article links and enriched the dataset with metadata such as the names and geo-coordinates of the source news media organizations, article publication dates, etc. Furthermore, we demonstrated the utility of 3DLNews by outlining four applications.
Journal Information
- Publication Year: 2024
- Language: English
- Source Database: arXiv
- Access: Open Access ✓