Information Pathways in Online Science Communication: The Role of Platform Actors and News Media
Alexandros Efstratiou, Giuseppe Russo, Luca Luceri
Online discussions of science involve complex interactions among experts, news media, and social media users as they interpret and disseminate scientific findings. While prior work has examined these actors in isolation, their interplay in shaping science communication remains poorly understood. Using the COVID-19 pandemic as a case study, we analyze 1.24M tweets and 211k news articles that reference pandemic-related scientific papers. We find that the most influential Twitter accounts in this discourse are predominantly individuals with medical or research credentials. However, we also identify a coordinated network that disproportionately amplifies a small set of prominent credentialed experts who advance contrarian, anti-consensus positions on vaccines, lockdowns, and related topics. The papers promoted by these influential actors substantially overlap with those covered by news media, but with key differences: pro-consensus experts primarily engage with studies featured by mainstream and medical outlets, whereas contrarian experts align more closely with papers promoted by low-quality, pseudoscientific, or conspiratorial sources. Notably, news outlets tend to report on scientific studies after they have been highlighted by social media superspreaders. Together, these findings reveal multi-level pathways of information flow and coordinated amplification structures that shape science communication across social media and news, offering new insights into the dynamics of the broader information ecosystem.
SOCIAL FEMININE NOUNS IN DAILY NEWSPAPERS IN THE REPUBLIC OF SRPSKA
Mijana Č. Kuburić Macura
This paper analyzes the frequency and competition between social feminine and (generic) social masculine nouns in a sample of daily newspapers in the Republic of Srpska, focusing on fluctuations in their usage. The study also examines the recorded forms from a neological perspective by checking their presence in Serbian dictionaries. The findings provide a snapshot of the current linguistic situation, showing that social feminine nouns are widely used in print media and can be regarded as the dominant lexical choice for denoting women’s professions and social roles. The most frequent and stable forms appear in the sports lexicon, but numerous examples are also found in references to occupations, functions, and roles in other areas of social life. While most of these feminine forms belong to standardized lexis, the use of unregistered forms is not uncommon, indicating a dynamic process in which social changes are closely mirrored in language. These results highlight the importance of continued, longitudinal research on the spread and acceptance of social feminine nouns, as well as the active role of language-planning institutions in evaluating individual solutions.
Social-Media Based Personas Challenge: Hybrid Prediction of Common and Rare User Actions on Bluesky
Benjamin White, Anastasia Shimorina
Understanding and predicting user behavior on social media platforms is crucial for content recommendation and platform design. While existing approaches focus primarily on common actions like retweeting and liking, the prediction of rare but significant behaviors remains largely unexplored. This paper presents a hybrid methodology for social media user behavior prediction that addresses both frequent and infrequent actions across a diverse action vocabulary. We evaluate our approach on a large-scale Bluesky dataset containing 6.4 million conversation threads spanning 12 distinct user actions across 25 persona clusters. Our methodology combines four complementary approaches: (i) a lookup database system based on historical response patterns; (ii) persona-specific LightGBM models with engineered temporal and semantic features for common actions; (iii) a specialized hybrid neural architecture fusing textual and temporal representations for rare action classification; and (iv) generation of text replies. Our persona-specific models achieve an average macro F1-score of 0.64 for common action prediction, while our rare action classifier achieves 0.56 macro F1-score across 10 rare actions. These results demonstrate that effective social media behavior prediction requires tailored modeling strategies recognizing fundamental differences between action types. Our approach achieved first place in the SocialSim: Social-Media Based Personas challenge organized at the Social Simulation with LLMs workshop at COLM 2025.
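For readers unfamiliar with the reported metric: macro F1 is the unweighted mean of per-class F1 scores, so a rare action counts as much as a common one. The sketch below computes it in plain Python on invented toy labels; the action names are illustrative assumptions, not the challenge's actual vocabulary, and this is not the authors' evaluation code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: one common action and two rarer ones.
y_true = ["like", "like", "like", "reply", "block"]
y_pred = ["like", "like", "reply", "reply", "like"]

print(round(macro_f1(y_true, y_pred), 3))
```

On this toy data accuracy is 0.6, but the missed "block" class pulls macro F1 lower, which is why the metric suits evaluations that care about rare actions.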
Video Design Motion Graphic Based Company Profile as Publication Media Pt. Muliaoffset Packindo Semarang City: Case study: PT. Muliaoffset Packindo Semarang
Lukas Dwi Santoso, Sarwo Nugroho
Competition in the business world is getting tougher by the day. Growing market demand for products has triggered the emergence of new companies that continue to grow. Good, communicative marketing is therefore needed to introduce a company to consumers, so that the business becomes widely known and can keep up with its competitors. Publication media are increasingly varied and engaging. Technological advances indirectly force graphic designers to keep developing and producing work that is fresh and up to date. Innovations and significant developments in the world of publications keep accumulating over time. Unlike in the past, today's publication media are dominated by non-print media. Before the widespread use of the internet and smartphones, publication media were dominated by print media such as brochures, posters, billboards, and banners. Ease of access is one of the main factors behind the rise of non-physical publication media: information about a company can be accessed more easily and practically than through print media, which are starting to be replaced. The company profile video contains, among other things, the company background, address, telephone number, email, website, and other information that needs to be displayed about PT. Muliaoffset Packindo. A motion graphic-based company profile video has a relatively short duration but is packaged in an engaging way through the movement animations and transitions applied to each frame. After seeing the company profile video, the audience is expected to be interested in the information presented and to get to know more about the company.
Drawing. Design. Illustration
Bundling Digital Journalism: Exploring the Potential of Subscription-Based Product Bundles
Lukas Erbrich, Christian-Mathias Wellbrock, Frank Lobigs
et al.
This study explores the potential of cross-publisher bundled offers as a strategy for increasing subscription sales in digital journalism. While innovative forms of bundling are an integral part of media distribution in music (e.g., Spotify) and film (e.g., Netflix), their adoption in digital journalism has been limited, despite research showing that bundled access to products can increase consumers’ willingness to pay, especially in younger target groups. Against this background, we conduct a choice-based conjoint analysis using data from a representative survey of the German online population (n = 1,542). Results show that bundling digital journalism has the potential to raise publisher revenues and subscription sales in digital markets. In particular, they highlight that a comprehensive, cross-publisher bundled offer, available at a fixed monthly rate, has the potential to stimulate digital journalism sales among different consumer groups in a relatively balanced way, including those who are typically more reluctant towards journalism. These findings align with the principles of information goods economics, which posit that maximising the size of digital content bundles often tends to be the most profitable distribution strategy. However, it is crucial to examine these findings in the context of the potential negative effects associated with this emerging business model in digital journalism, such as the cannibalisation of print subscriptions, diminished brand identification, and a possible imbalanced distribution of revenues.
Communication. Mass media
Leveraging the teamwork model for effective integration of interactive materials on mobile devices in visual media communication, innovation, and impact on society
Luis Ochoa Siguencia
The issue of waste in printed materials can be effectively addressed by leveraging technology, especially mobile devices, in language learning initiatives. This research focuses on integrating interactive materials through the TEAMWORK model, with the ELENE project serving as a case study. The overarching goal is to assess how the TEAMWORK model can streamline planning and execution in mobile-assisted language learning endeavours. By evaluating effectiveness, identifying challenges and best practices, and offering practical recommendations, this study aims to improve language education outcomes through technology integration while reducing the environmental impact associated with traditional printed materials. The research methodology involves a thorough examination of integrating interactive materials on mobile devices using the TEAMWORK model, with a specific focus on the ELENE project. It commences with a comprehensive literature review and progresses to qualitative data collection through interviews and observations. The TEAMWORK model serves as a guiding framework for planning and execution, with data gathered through surveys and usage analytics. Analysis techniques are employed to evaluate the model’s impact on project outcomes and to derive practical recommendations for stakeholders, all aimed at enhancing language education practices and technology integration.
The research results highlight the effectiveness of integrating interactive materials on mobile devices using the TEAMWORK model, as evidenced by the ELENE project. Qualitative analysis revealed that the structured approach provided by the TEAMWORK model significantly improved planning and execution phases, leading to better project outcomes and increased learner engagement. Quantitative data further corroborated these findings, demonstrating measurable enhancements in language proficiency and user satisfaction. Challenges such as technical constraints and resource limitations were effectively managed through proactive risk mitigation strategies outlined by the model. Overall, the results underscore the value of the TEAMWORK model in optimizing mobile-assisted language learning initiatives and offer valuable insights for stakeholders involved in similar endeavours. The integration of interactive materials on mobile devices, guided by the TEAMWORK model within the ELENE project, has significantly improved language learning outcomes. Through structured planning and execution, the TEAMWORK model effectively enhanced project outcomes and increased learner engagement. Practical implications include actionable recommendations for stakeholders involved in similar projects, offering insights for maximizing the effectiveness of technology integration in language education.
Communication. Mass media, Print media
The MediaSpin Dataset: Post-Publication News Headline Edits Annotated for Media Bias
Preetika Verma, Kokil Jaidka
The editability of online news content has become a significant factor in shaping public perception, as social media platforms introduce new affordances for dynamic and adaptive news framing. Edits to news headlines can refocus audience attention, add or remove emotional language, and shift the framing of events in subtle yet impactful ways. What types of media bias are editorialized in and out of news headlines, and how can they be systematically identified? This study introduces the MediaSpin dataset, the first to characterize the bias in how prominent news outlets editorialize news headlines after publication. The dataset includes 78,910 pairs of headlines annotated with 13 distinct types of media bias, using human-supervised LLM labeling. We discuss the linguistic insights it affords and show its applications for bias prediction and user behavior analysis.
Nigeria Centre for Disease Control, awareness creation and risk communication of Covid‑19 pandemic amongst non‑literate population in South‑West Nigeria: Lessons for future health campaign
Rachael Ojeka-John, Bernice O. Sanusi, Omowale T. Adelabu
et al.
Risk communication about the Covid‑19 pandemic in Nigeria appeared to be urban‑centered, with the dominant use of social media, print communication and other controlled media. In such public health emergencies, non‑literate populations can be vulnerable because of their limited understanding of the nature of the health risk. The study therefore investigates the extent to which the Nigeria Centre for Disease Control (NCDC) communicated the risk of Covid‑19 to the non‑literate population in its public health campaign during the pandemic in South‑West Nigeria. The study adopts risk communication theory, which sets out the approach communication should take during public health emergencies. Using a descriptive cross‑sectional mixed‑methods research design, a sample of 420 respondents was purposively selected from 6 towns in the rural areas of Lagos, Oyo and Osun states to examine the level of awareness of the Covid‑19 pandemic among non‑literates. In addition, NCDC risk communication on Covid‑19 for the non‑literate population was analyzed from 3 jingles in the Yoruba language and 9 flyers designed for Covid‑19 from the NCDC websites. Results showed that NCDC awareness creation on Covid‑19 for non‑literates in the South‑West achieved significant success as a result of the medium used. Specifically, radio was rated highest by the majority of respondents (60.4%), followed by health workers (19.8%), as the channels that delivered understandable messages on Covid‑19 safety protocols. Further findings on jingle content revealed that all Covid‑19 safety protocols were communicated in the Yoruba language for the South‑West populace. However, the NCDC fell short in communicating Covid‑19 risk effectively to non‑literates in the South‑West, as the jingles only reinforced the Covid‑19 safety protocols and symptoms and the need to comply, without educating the public about the dreadful nature of the disease and its dynamics.
Although the flyers designed by the NCDC communicated risk to an extent, the graphics and symbols on Covid‑19 were complemented by words in English only, which could be difficult for non‑literates to decipher. Based on the findings, the study recommends that public health agencies educate non‑literate populations about the nature of a disease, rather than only creating awareness of its outbreak, and that such education be strategic, context‑specific, and evidence‑based.
Public aspects of medicine
Coordinated Information Campaigns on Social Media: A Multifaceted Framework for Detection and Analysis
Kin Wai Ng, Adriana Iamnitchi
The prevalence of coordinated information campaigns on social media platforms has significant negative consequences across various domains, including social, political, and economic processes. This paper proposes a multifaceted framework for detecting and analysing coordinated message promotion on social media. By simultaneously considering features related to content, time, and network dimensions, our framework can capture the diverse nature of coordinated activity and identify anomalous user accounts that likely engaged in suspicious behaviour. Unlike existing solutions that rely on specific constraints, our approach is more flexible as it employs specialised components to extract the significant structures within a network and to detect the most unusual interactions. We demonstrate the effectiveness of our framework using two Twitter datasets: the Russian Internet Research Agency (IRA) and long-term discussions on Data Science topics. The results demonstrate our framework's ability to isolate unusual activity from expected normal behaviour and provide valuable insights for further qualitative investigation.
A Multi-Platform Collection of Social Media Posts about the 2022 U.S. Midterm Elections
Rachith Aiyappa, Matthew R. DeVerna, Manita Pote
et al.
Social media are utilized by millions of citizens to discuss important political issues. Politicians use these platforms to connect with the public and broadcast policy positions. Therefore, data from social media has enabled many studies of political discussion. While most analyses are limited to data from individual platforms, people are embedded in a larger information ecosystem spanning multiple social networks. Here we describe and provide access to the Indiana University 2022 U.S. Midterms Multi-Platform Social Media Dataset (MEIU22), a collection of social media posts from Twitter, Facebook, Instagram, Reddit, and 4chan. MEIU22 links to posts about the midterm elections based on a comprehensive list of keywords and tracks the social media accounts of 1,011 candidates from October 1 to December 25, 2022. We also publish the source code of our pipeline to enable similar multi-platform research projects.
Classification of the Relevant Soft Skills of the Modern Editor
Світлана Борисівна Фіялка
Editing as a professional activity spans various kinds of social interaction. The modern editor not only works with the language and style of texts but also plans media activity, writes texts in various genres, organizes the preparation of materials, collaborates with creators of multimedia content, analyzes data on traffic sources, detects elements of hidden or unintentional advertising in published pieces, takes part in promoting the media product, and so on. Employers today therefore need a versatile worker whose professional standards have naturally expanded to include so-called soft skills.
The aim of the article is to identify, based on an analysis of editorial job vacancies, the range of employers' soft-skill requirements for job seekers in this profession and to classify these requirements.
The study distinguished the following blocks of skills: 1. Adherence to the principles of professional ethics. 2. Leadership qualities and managerial skills. 3. Creativity and innovativeness. 4. The desire and ability to learn and improve oneself. 5. Public-speaking skills, and the ability to negotiate, reach agreement, and build professional contacts. 6. Stress resistance. 7. Adaptability to media format and to change; multitasking. 8. Time-management skills. 9. A sense of belonging to the organization, and the ability to fit into and work in a team. 10. Critical thinking and the ability to make decisions under uncertainty. 11. Systems thinking, the ability to navigate the socio-political life of the country and the world, and to organize turnkey content preparation. 12. The ability to follow instructions and standards; high concentration. 13. Emotional intelligence, tolerance, multicultural adaptability, etc. Editorial professions are socially oriented and dynamic, so to find a place in the labor market and apply their specifically professional competences, modern editors must possess a broad set of soft skills, the need for which is determined by the conditions of content creation and the specifics of the media product.
The Image of Chelyabinsk in the 20th century British Media Discourse (1901-1950)
Olga A. Solopova, Natalya N. Koshkarova, Igor V. Sibiriakov
The paper studies the evolution of the image of Chelyabinsk in 20th-century British media discourse. The research is relevant because it combines linguistic and historical analyses; it aims at a retrospective study of how the image of a foreign city evolved in British media discourse over a long time span. A wide range of methods is employed: comparative, diachronic, cognitive-matrix, and cognitive-discursive methods, source study, and content analysis. The data source is a digitized archive of British historical media texts. The authors recorded nine variants of the city name. The frequency with which the image of Chelyabinsk is modelled varies: it is rather high at the beginning of the century, declines in the second decade, reaches its minimum in 1921-1930, and rises again in the subsequent decades, which is explained by the British media's interest in industrialization and the events of World War II. Most of the newspapers and magazines that modelled the image of Chelyabinsk were published in the capitals and large industrial centres, which is explained by the peculiarities of British print media, the higher level of education among residents of large cities, and Britain's economic interests in Russia / the Soviet Union. The significant difference between the images of Chelyabinsk over time lies in their emotive load: the negative images of the beginning of the century contrast with the positive images generated in the latest time span.
Language. Linguistic theory. Comparative grammar, Semantics
Is Twitter Enough? Investigating Situational Awareness in Social and Print Media during the Second COVID-19 Wave in India
Ishita Vohra, Meher Shashwat Nigam, Aryan Sakaria
et al.
The pandemic required efficient allocation of public resources and the transformation of existing societal functions. To manage any crisis, governments and public health researchers exploit the information available to them in order to make informed decisions, a capacity known as situational awareness. Gathering situational awareness from social media has proved useful in managing epidemics. Previous research focused on using discussions during periods of epidemic crises on social media platforms like Twitter, Reddit, or Facebook and developing NLP techniques to filter out relevant discussions from a huge corpus of messages and posts. Social media usage varies with internet penetration and other socioeconomic factors, which might induce disparity in analyzing discussions across different geographies. However, print media is a ubiquitous information source, irrespective of geography. Further, topics discussed in news articles are already newsworthy, while on social media newsworthiness is a product of techno-social processes. Building on this fundamental difference, we study Twitter data during the second wave in India, focusing on six high-population cities with varied macroeconomic factors. Through a mixture of qualitative and quantitative methods, we further analyze two Indian newspapers during the same period and compare topics from both Twitter and the newspapers to evaluate situational awareness around the second phase of COVID on each of these platforms. We conclude that factors like internet penetration and GDP in a specific city influence the discourse surrounding situational updates on social media. Thus, augmenting information from newspapers with information extracted from social media would provide a more comprehensive perspective in resource-deficit cities.
Hidden behind the obvious: misleading keywords and implicitly abusive language on social media
Wenjie Yin, Arkaitz Zubiaga
While social media offers freedom of self-expression, abusive language carries significant negative social impact. Driven by the importance of the issue, research in the automated detection of abusive language has witnessed growth and improvement. However, these detection models display a reliance on strongly indicative keywords, such as slurs and profanity. This means that they can falsely (1a) miss abuse without such keywords or (1b) flag non-abuse with such keywords, and that (2) they perform poorly on unseen data. Despite the recognition of these problems, gaps and inconsistencies remain in the literature. In this study, we analyse the impact of keywords from dataset construction to model behaviour in detail, with a focus on how models make mistakes on (1a) and (1b), and how (1a) and (1b) interact with (2). Through the analysis, we provide suggestions for future research to address all three problems.
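The two keyword-related failure modes, (1a) and (1b), can be illustrated with a deliberately naive detector. The sketch below is not the authors' model: the keyword list and example posts are invented placeholders, and a pure keyword matcher is assumed only as the extreme case of keyword reliance.

```python
import re

# Hypothetical placeholder lexicon; real lexicons contain slurs and profanity.
KEYWORDS = {"idiot", "trash"}


def keyword_flag(text):
    """Flag a post as abusive if it contains any listed keyword."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return bool(tokens & KEYWORDS)


# (text, is_abusive) pairs, invented for illustration.
examples = [
    ("You are an idiot.", True),                                 # explicit abuse
    ("People like you should not be allowed to speak.", True),   # implicit abuse
    ("I took out the trash this morning.", False),               # benign keyword use
]


def error_types(labelled_posts):
    """Label each prediction as correct ('ok') or as error type '1a' / '1b'."""
    outcomes = []
    for text, abusive in labelled_posts:
        flagged = keyword_flag(text)
        if abusive and not flagged:
            outcomes.append("1a")  # missed abuse that has no keyword
        elif flagged and not abusive:
            outcomes.append("1b")  # flagged non-abuse that has a keyword
        else:
            outcomes.append("ok")
    return outcomes


print(error_types(examples))
```

A learned model is less brittle than this matcher, but the abstract's point is that it can drift towards the same two error patterns when its training data make keywords the dominant signal.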
Random matrix theory of polarized light scattering in disordered media
Niall Byrnes, Matthew R. Foreman
In this work we present a method for generating random matrices describing electromagnetic scattering from disordered media containing dielectric particles with prescribed single particle scattering characteristics. Resulting scattering matrices automatically satisfy the physical constraints of unitarity, reciprocity and time reversal, whilst also incorporating the polarization properties of electromagnetic waves and scattering anisotropy. Our technique therefore enables statistical study of a variety of polarization phenomena, including depolarization rates and polarization-dependent scattering by chiral particles. In this vein, we perform numerical simulations for media containing isotropic and chiral spherical particles of different sizes for thicknesses ranging from the single to multiple scattering regime and discuss our results, drawing comparisons to established theory.
physics.optics, cond-mat.dis-nn
Universal and nonuniversal statistics of transmission in thin random layered media
Jongchul Park, Matthieu Davy, Victor A. Gopar
et al.
The statistics of transmission through random 1D media are generally presumed to be universal and to depend only upon a single dimensionless parameter, the ratio of the sample length and the mean free path, s = L/l. Here, we show in numerical simulations and optical measurements of random binary systems, and most prominently in systems for which s is less than unity, that the statistics of the logarithm of transmission, ln T, are universal for transmission near the upper cutoff of unity and depend distinctively upon the reflectivity of the layer interfaces and their number near a lower cutoff. The universal segment of the probability distribution function of the logarithm of transmission, P(ln T), is manifested with as few as three binary layers. For a given value of s, P(ln T) evolves towards a universal distribution as the number of layers increases. Optical measurements in stacks of 5 and 20 glass coverslips exhibit statistics at low and moderate values of transmission that are close to those found in simulations for 1D layered media, while differences appear at higher transmission where the transmission time in the medium is longer and the wave explores the transverse nonuniformity of the sample.
The Anti-Cult Discourse of Print Media: Problematization of the Role of the Anti-Cult Movement
Vladimir A. Martinovich
This article is devoted to analyzing the anti-cult discourse in the Republic of Belarus in 1996–2000. The print media and the anti-cult movement are selected as objects of research because of their significant role in this discourse. The main features when it comes to covering the topic of new religious movements by both actors are investigated by method of standardized survey of texts on a sample of 521 anti-cult articles from 57 Belarusian newspapers. The range of variability of religious organizations identified as new religious movements is revealed, and their distribution by type of structure is analyzed. The results are compared to the estimated population universe of new religions of the Republic of Belarus. The frequency of their mentions is established, as well as a group of organizations that are criticized by actors, but have never operated in the country. The range of variability and frequency of use of special terminology is disclosed. The influence of the anti-cult discourse on changes in the evaluative connotations of special terms is analyzed. Different facts from the history and modern practice of the anti-cult movement are examined, all of which are particularly important in terms of understanding the specifics of its representatives’ attitudes towards non-specialized print media. The ambivalent nature of the coverage of the topic of new religions in the press and its influence on the anti-cult movement is noted. Special care is taken defining the place and role of print media and the anti-cult movement in the complex system of society’s anti-cult discourse. Based on the data obtained, the dominant theory of the unilateral influence of the anti-cult movement on print media is criticized. An alternative hypothesis on the complex genesis of anti-cult discourse is proposed, in which the specifics of its main features as perceived by each subject are influenced by many different factors.
Two methodological problems related to searching for and recording materials relevant for analyzing this discourse and verifying this theory are identified.
System-Technical Analysis of Shrink Label Manufacturing Technologies
Світлана Федорівна Гавенко, Олена Георгіївна Котмальова, Марта Тарасівна Лабецька
The shrink label is today one of the leading innovative solutions in the field of packaging labeling. Heat-shrinkable film has mechanical strength, elasticity and moisture-proof properties, and readily changes its linear dimensions under the influence of temperature, which allows it to take the form of the packaged product, prevents unauthorized opening and forgery, and attracts buyers through original design solutions. Heat-shrinkable labelling is used in the pharmaceutical, cosmetics, food, dairy, and confectionery industries.
To preserve the product's proper appearance until the end of its service life, and to allow it to be decorated, printing is performed on the inner side of the shrink label. Gravure and flexographic printing technologies are mainly used to apply images to shrink labels. The specifics of the shrink-label manufacturing process determine the use of materials with different physical properties: transparency, gloss, thickness, strength, friction and tensile coefficients, and shrinkage temperature, percentage, and time, all of which significantly affect the final cost. The most widely used films are PET and PVC. PVC film deforms at lower temperatures and is more resistant to the external environment, while PET film achieves better thermal shrinkage and, accordingly, higher print quality. Water-based inks can be used to apply information to heat-shrinkable film, but the image may then blur when the package passes through the steam furnace. As an alternative for printing on such films, UV-curable inks are used, which make it possible to apply complex multi-color images to any type of material, but this method is quite expensive owing to the high cost of the inks and printing equipment.
It is therefore important to conduct a system-technical and economic analysis and to determine the most cost-effective and efficient technology for producing shrink labels.
Counter-examples to the high-order version and strong version of the generalized Eshelby conjecture for anisotropic media
Tianyu Yuan, Kefu Huang, Jianxiang Wang
In this work, we prove that in anisotropic media possessing cubic, transversely isotropic, orthotropic, and monoclinic symmetries, there exist non-ellipsoidal inclusions that can transform particular quadratic eigenstrains into quadratic elastic strain fields in them. Further, we prove that in these anisotropic media, there exist non-ellipsoidal inclusions that can transform particular polynomial eigenstrains of even degrees into polynomial elastic strain fields of the same even degrees in them. A sufficient condition for the existence of those counter-examples is provided. These results constitute counter-examples, in the strong sense, to the generalized high-order Eshelby conjecture (inverse problem of Eshelby's polynomial conservation theorem) for polynomial eigenstrains in both anisotropic media and the isotropic medium (quadratic eigenstrain only). In addition, we also show that there are counter-examples to the strong version of the generalized Eshelby conjecture for uniform eigenstrains in these anisotropic media. These findings reveal striking richness of the uniformity between the eigenstrains and the correspondingly induced elastic strains in inclusions in anisotropic media beyond the canonical ellipsoidal inclusion.
Media Cloud: Massive Open Source Collection of Global News on the Open Web
Hal Roberts, Rahul Bhargava, Linas Valiukas
et al.
We present the first full description of Media Cloud, an open source platform based on crawling hyperlink structure in operation for over 10 years, that for many uses will be the best way to collect data for studying the media ecosystem on the open web. We document the key choices behind what data Media Cloud collects and stores, how it processes and organizes these data, and its open API access as well as user-facing tools. We also highlight the strengths and limitations of the Media Cloud collection strategy compared to relevant alternatives. We give an overview of two sample datasets generated using Media Cloud and discuss how researchers can use the platform to create their own datasets.