“Who are these people?” Evaluating the demographic characteristics and political preferences of MTurk survey respondents
C. Huff, D. Tingley
As Amazon’s Mechanical Turk (MTurk) has surged in popularity throughout political science, scholars have increasingly challenged the external validity of inferences drawn from MTurk samples. At workshops and conferences, experimental and survey-based researchers hear questions about the demographic characteristics, political preferences, occupation, and geographic location of MTurk respondents. In this paper we answer these questions and present a number of novel results. By introducing a new benchmark comparison for MTurk surveys, the Cooperative Congressional Election Survey (CCES), we compare the joint distributions of age, gender, and race among MTurk respondents within the United States. In addition, we compare political, occupational, and geographical information about respondents from MTurk and CCES. Throughout the paper we show several ways that political scientists can use the strengths of MTurk to attract respondents with specific characteristics of interest to best answer their substantive research questions.
Beyond Continuity: Institutional Change in Advanced Political Economies
W. Streeck, K. Thelen
After Method: Mess in Social Science Research
J. Law
The Two-party System and Duverger's Law: An Essay on the History of Political Science
W. Riker
797 citations
en
Political Science
Twitter and Facebook are not representative of the general population: Political attitudes and demographics of British social media users
Jonathan Mellon, Christopher Prosser
A growing social science literature has used Twitter and Facebook to study political and social phenomena, including election forecasting and the tracking of political conversations. This research note uses a nationally representative probability sample of the British population to examine how Twitter and Facebook users differ from the general population in terms of demographics, political attitudes and political behaviour. We find that Twitter and Facebook users differ substantially from the general population on many politically relevant dimensions including vote choice, turnout, age, gender, and education. On average social media users are younger and better educated than non-users, and they are more liberal and pay more attention to politics. Despite paying more attention to politics, social media users are less likely to vote than non-users, but they are more likely to support the left-leaning Labour Party when they do vote. However, we show that these apparent differences mostly arise due to the demographic composition of social media users. After controlling for age, gender, and education, no statistically significant differences arise between social media users and non-users on political attention, values or political behaviour.
Political diversity will improve social psychological science.
José L. Duarte, Jarret T. Crawford, Charlotta Stern et al.
425 citations
en
Psychology, Medicine
Opportunities in AI/ML for the Rubin LSST Dark Energy Science Collaboration
LSST Dark Energy Science Collaboration, Eric Aubourg, Camille Avestruz et al.
The Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST) will produce unprecedented volumes of heterogeneous astronomical data (images, catalogs, and alerts) that challenge traditional analysis pipelines. The LSST Dark Energy Science Collaboration (DESC) aims to derive robust constraints on dark energy and dark matter from these data, requiring methods that are statistically powerful, scalable, and operationally reliable. Artificial intelligence and machine learning (AI/ML) are already embedded across DESC science workflows, from photometric redshifts and transient classification to weak lensing inference and cosmological simulations. Yet their utility for precision cosmology hinges on trustworthy uncertainty quantification, robustness to covariate shift and model misspecification, and reproducible integration within scientific pipelines. This white paper surveys the current landscape of AI/ML across DESC's primary cosmological probes and cross-cutting analyses, revealing that the same core methodologies and fundamental challenges recur across disparate science cases. Since progress on these cross-cutting challenges would benefit multiple probes simultaneously, we identify key methodological research priorities, including Bayesian inference at scale, physics-informed methods, validation frameworks, and active learning for discovery. With an eye on emerging techniques, we also explore the potential of the latest foundation model methodologies and LLM-driven agentic AI systems to reshape DESC workflows, provided their deployment is coupled with rigorous evaluation and governance. Finally, we discuss critical software, computing, data infrastructure, and human capital requirements for the successful deployment of these new methodologies, and consider associated risks and opportunities for broader coordination with external actors.
en
astro-ph.IM, astro-ph.CO
You Can't Get There From Here: Redefining Information Science to address our sociotechnical futures
Scott Humr, Mustafa Canan
Current definitions of Information Science are inadequate both to describe comprehensively the nature of its field of study and to address the problems arising from intelligent technologies. The ubiquitous rise of artificial intelligence applications and their impact on society demand that the field of Information Science acknowledge the sociotechnical nature of these technologies. Previous definitions of Information Science over the last six decades have inadequately addressed the environmental, human, and social aspects of these technologies. This perspective piece advocates for an expanded definition of Information Science that fully includes the sociotechnical impacts information has on the conduct of research in this field. Proposing an expanded definition of Information Science that includes the sociotechnical aspects of this field should both stimulate conversation and widen the interdisciplinary lens necessary to address how intelligent technologies may be incorporated into society and our lives more fairly.
What Does Information Science Offer for Data Science Research?: A Review of Data and Information Ethics Literature
Brady D. Lund, Ting Wang
This paper reviews literature pertaining to the development of data science as a discipline, current issues with data bias and ethics, and the role that the discipline of information science may play in addressing these concerns. Information science research and researchers have much to offer for data science, owing to their background as transdisciplinary scholars who apply human-centered and social-behavioral perspectives to issues within natural science disciplines. Information science researchers have already contributed to a humanistic approach to data ethics within the literature and an emphasis on data science within information schools all but ensures that this literature will continue to grow in coming decades. This review article serves as a reference for the history, current progress, and potential future directions of data ethics research within the corpus of information science literature.
Depth and Autonomy: A Framework for Evaluating LLM Applications in Social Science Research
Ali Sanaei, Ali Rajabzadeh
Large language models (LLMs) are increasingly utilized by researchers across a wide range of domains, and qualitative social science is no exception; however, this adoption faces persistent challenges, including interpretive bias, low reliability, and weak auditability. We introduce a framework that situates LLM usage along two dimensions, interpretive depth and autonomy, thereby offering a straightforward way to classify LLM applications in qualitative research and to derive practical design recommendations. We present the state of the literature with respect to these two dimensions, based on all published social science papers available on Web of Science that use LLMs as a tool and not strictly as the subject of study. Rather than granting models expansive freedom, our approach encourages researchers to decompose tasks into manageable segments, much as they would when delegating work to capable undergraduate research assistants. By maintaining low levels of autonomy and selectively increasing interpretive depth only where warranted and under supervision, one can plausibly reap the benefits of LLMs while preserving transparency and reliability.
LLM-Based Data Science Agents: A Survey of Capabilities, Challenges, and Future Directions
Mizanur Rahman, Amran Bhuiyan, Mohammed Saidul Islam et al.
Recent advances in large language models (LLMs) have enabled a new class of AI agents that automate multiple stages of the data science workflow by integrating planning, tool use, and multimodal reasoning across text, code, tables, and visuals. This survey presents the first comprehensive, lifecycle-aligned taxonomy of data science agents, systematically analyzing and mapping forty-five systems onto the six stages of the end-to-end data science process: business understanding and data acquisition, exploratory analysis and visualization, feature engineering, model building and selection, interpretation and explanation, and deployment and monitoring. In addition to lifecycle coverage, we annotate each agent along five cross-cutting design dimensions: reasoning and planning style, modality integration, tool orchestration depth, learning and alignment methods, and trust, safety, and governance mechanisms. Beyond classification, we provide a critical synthesis of agent capabilities, highlight strengths and limitations at each stage, and review emerging benchmarks and evaluation practices. Our analysis identifies three key trends: most systems emphasize exploratory analysis, visualization, and modeling while neglecting business understanding, deployment, and monitoring; multimodal reasoning and tool orchestration remain unresolved challenges; and over 90% lack explicit trust and safety mechanisms. We conclude by outlining open challenges in alignment stability, explainability, governance, and robust evaluation frameworks, and propose future research directions to guide the development of robust, trustworthy, low-latency, transparent, and broadly accessible data science agents.
NATIONAL POLICE DATABASES AND THEIR INCREASING IMPORTANCE IN 21ST CENTURY POLICING
Amanda Blakeman
On November 14, 2023, Chief Constable Amanda Blakeman presented “National Police Databases and Their Increasing Importance in 21st Century Policing” at the 2023 West Coast Security Conference. The key points discussed were the creation, functions, and use cases of the Police National Database (PND).
Received: 12-18-2023
Revised: 01-26-2024
Data clustering: a fundamental method in data science and management
Tai Dinh, Wong Hauchi, Daniil Lisik et al.
This paper explores the critical role of data clustering in data science, emphasizing its methodologies, tools, and diverse applications. Traditional techniques, such as partitional and hierarchical clustering, are analyzed alongside advanced approaches such as data stream, density-based, graph-based, and model-based clustering for handling complex structured datasets. The paper highlights key principles underpinning clustering, outlines widely used tools and frameworks, introduces the workflow of clustering in data science, discusses challenges in practical implementation, and examines various applications of clustering. By focusing on these foundations and applications, the discussion underscores clustering's transformative potential. The paper concludes with insights into future research directions, emphasizing clustering's role in driving innovation and enabling data-driven decision-making.
A Skin Not a Sweater: Ontology and Epistemology in Political Science
P. Furlong, D. Marsh
Education and Political Participation
Claire Willeck, Tali Mendelberg
Whether education affects political participation is a long-standing and central question in political philosophy and political science. In this review, we provide an overview of the three main theoretical models that explain different causal pathways. We then synthesize the surge in research using causal inference strategies and show that this literature has generated mixed results about the causal impact of education, even when using similar methods and data. These findings do not provide clear support for any of the three theories. Our next section covers research on civic education and political participation. The quantity of civic education matters little for political participation, but how civic education is taught does matter. Namely, strategies falling under the rubric of active learning show promise. These strategies seem especially effective for historically marginalized students. Our final section calls for more research on how civic education is taught. Expected final online publication date for the Annual Review of Political Science, Volume 25 is May 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
System Science in Politics -- Europe and the War in Ukraine
Juergen Mimkes
Peace means order, and war brings disorder and chaos to any society. But order and disorder are not observed only in wars; in many systems they are the dominant property. Understanding order and disorder enables us to understand the structure of systems. Order and disorder are also part of the Lagrange Principle, and as statistics is valid in all systems, we may regard Lagrange statistics as a mathematical basis of system science. Two systems, one from natural science and one from social science, are compared: materials of trillions of atoms and politics of millions of people. Lagrange statistics leads to three phases of homogeneous systems. In materials we have the states solid, liquid, and gas, depending on two Lagrange parameters: temperature T (the mean energy of atoms) and pressure p. In politics we have the states autocratic, democratic, and global, depending on two Lagrange parameters: standard of living T (the mean capital of people) and political pressure p. The three phases of each system are compared in the p-T phase diagram. Different phases of one system cannot coexist as nearest neighbors: water will dissolve ice by exchange of atoms and heat. This leads to the present climate crisis. Democracies will dissolve autocracies by exchange of goods, ideas, and people such as guest workers. This is the peaceful history of the EU and has led to the aggressive reaction of Russia in the GDR, Hungary, ČSR, and now in Ukraine. At the end of the war, peaceful coexistence will not be possible between Russia and Ukraine. Only separation by a new Iron Curtain guaranteed by NATO can lead to a long-term armistice.
en
physics.soc-ph, stat.AP
Protoplanetary Disk Science with the Orbiting Astronomical Satellite Investigating Stellar Systems (OASIS) Observatory
Kamber Schwarz, Joan Najita, Jennifer Bergner et al.
The Orbiting Astronomical Satellite for Investigating Stellar Systems (OASIS) is a NASA Astrophysics MIDEX-class mission concept, with the stated goal of following water from galaxies, through protostellar systems, to Earth's oceans. This paper details the protoplanetary disk science achievable with OASIS. OASIS's suite of heterodyne receivers allows for simultaneous, high spectral resolution observations of water emission lines spanning a large range of physical conditions within protoplanetary disks. These observations will allow us to map the spatial distribution of water vapor in disks across evolutionary stages and assess the importance of water, particularly the location of the midplane water snowline, to planet formation. OASIS will also detect the H2 isotopologue HD in 100+ disks, allowing for the most accurate determination of total protoplanetary disk gas mass to date. When combined with the contemporaneous water observations, the HD detection will also allow us to trace the evolution of water vapor across evolutionary stages. These observations will enable OASIS to characterize the time development of the water distribution and the role water plays in the process of planetary system formation.
en
astro-ph.EP, astro-ph.IM
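Diskursiv avpolitisering av demokratiet: Å forstå autoritær konsolidering i Russland gjennom Jacques Rancières tenkning [Discursive depoliticization of democracy: Understanding authoritarian consolidation in Russia through the thought of Jacques Rancière]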
Anni Roth Hjermann
This article examines the role of discourse in the consolidation of authoritarian regimes. By establishing a dialogue between Jacques Rancière's work on politics and depoliticization and poststructuralist discourse analysis, the article argues that discursive depoliticization helps autocracies entrench themselves, and shows that authoritarian consolidation often takes place at the intersection of national and international politics. The article directs particular attention to Rancière's concept of gaps as the stage of politics, and theorizes how such gaps are neutralized in depoliticization. The article then proposes a method for analysing discursive depoliticization empirically by conceptualizing Rancière's logics as ideal-typical depoliticization discourses, and illustrates this analytical strategy by applying it to Russian official discourse in recent years (2015–2020). In this way the article explains how discursive constructions have entrenched Russia as an autocracy: it shows that authoritarian consolidation in Russia under Putin is made possible by deep-rooted depoliticizing discourses that are (re)produced and reinforced in an interwoven field of domestic and foreign policy. The article advances the concept of discursive depoliticization as a new perspective on scholarly debates about the challenges to the liberal world order and so-called hybrid regimes.
A survey study of success factors in data science projects
Iñigo Martinez, Elisabeth Viles, Igor G. Olaizola
In recent years, the data science community has pursued excellence and made significant research efforts to develop advanced analytics, focusing on solving technical problems at the expense of organizational and socio-technical challenges. According to previous surveys on the state of data science project management, there is a significant gap between technical and organizational processes. In this article we present new empirical data from a survey of 237 data science professionals on the use of project management methodologies for data science. We provide additional profiling of the survey respondents' roles and their priorities when executing data science projects. Based on this survey study, the main findings are: (1) The Agile data science lifecycle is the most widely used framework, but only 25% of the survey participants report following a data science project methodology. (2) The most important success factors are precisely describing stakeholders' needs, communicating the results to end-users, and team collaboration and coordination. (3) Professionals who adhere to a project methodology place greater emphasis on the project's potential risks and pitfalls, version control, the deployment pipeline to production, and data security and privacy.
Improving Stance Detection by Leveraging Measurement Knowledge from Social Sciences: A Case Study of Dutch Political Tweets and Traditional Gender Role Division
Qixiang Fang, Anastasia Giachanou, Ayoub Bagheri
Stance detection concerns automatically determining the viewpoint (i.e., in favour of, against, or neutral) of a text's author towards a target. Stance detection has been applied to many research topics, among which the detection of stances behind political tweets is an important one. In this paper, we apply stance detection to a dataset of tweets from official party accounts in the Netherlands between 2017 and 2021, with a focus on stances towards traditional gender role division, a dividing issue between (some) Dutch political parties. To implement and improve stance detection of traditional gender role division, we propose to leverage an established survey instrument from social sciences, which has been validated for the purpose of measuring attitudes towards traditional gender role division. Based on our experiments, we show that using such a validated survey instrument helps to improve stance detection performance.