Results for "data science"

Showing 20 of ~44,761,281 results · from DOAJ, CrossRef, Semantic Scholar

S2 Open Access 2014
Outlier Detection for Temporal Data: A Survey

Manish Gupta, Jing Gao, C. Aggarwal et al.

In the statistics community, outlier detection for time series data has been studied for decades. Recently, with advances in hardware and software technology, there has been a large body of work on temporal outlier detection from a computational perspective within the computer science community. In particular, advances in hardware technology have enabled the availability of various forms of temporal data collection mechanisms, and advances in software technology have enabled a variety of data management mechanisms. This has fueled the growth of different kinds of data sets such as data streams, spatio-temporal data, distributed streams, temporal networks, and time series data, generated by a multitude of applications. There arises a need for an organized and detailed study of the work done in the area of outlier detection with respect to such temporal datasets. In this survey, we provide a comprehensive and structured overview of a large set of interesting outlier definitions for various forms of temporal data, novel techniques, and application scenarios in which specific definitions and techniques have been widely used.
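As a concrete illustration of the simplest detector family the survey covers (this sketch is not a method from the paper), a point outlier in a univariate time series can be flagged by scoring each value against a trailing window; the window size and threshold below are arbitrary choices for the example:

```python
import statistics

def rolling_zscore_outliers(series, window=5, threshold=3.0):
    """Flag indices whose value deviates from the trailing-window mean
    by more than `threshold` standard deviations."""
    outliers = []
    for i in range(window, len(series)):
        ctx = series[i - window:i]          # trailing context, excludes point i
        mu = statistics.fmean(ctx)
        sigma = statistics.stdev(ctx)
        if sigma > 0 and abs(series[i] - mu) / sigma > threshold:
            outliers.append(i)
    return outliers

data = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 25.0, 10.2, 9.8]
print(rolling_zscore_outliers(data))  # flags index 6, the spike
```

Methods for streams, spatio-temporal data, and temporal networks surveyed in the paper generalize this idea with far more sophisticated notions of context and deviation.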

1018 citations en Computer Science
S2 Open Access 2009
A Qualitative Framework for Collecting and Analyzing Data in Focus Group Research

A. Onwuegbuzie, W. Dickinson, N. Leech et al.

Despite the abundance of published material on conducting focus groups, scant specific information exists on how to analyze focus group data in social science research. Thus, the authors provide a new qualitative framework for collecting and analyzing focus group data. First, they identify types of data that can be collected during focus groups. Second, they identify the qualitative data analysis techniques best suited for analyzing these data. Third, they introduce what they term micro-interlocutor analysis, wherein meticulous information about which participant responds to each question, the order in which each participant responds, response characteristics, the nonverbal communication used, and the like is collected, analyzed, and interpreted. They conceptualize how conversation analysis offers great potential for analyzing focus group data. They believe that their framework goes far beyond analyzing only the verbal communication of focus group participants, thereby increasing the rigor of focus group analyses in social science research.

1686 citations en Psychology
S2 Open Access 2007
A correlated topic model of Science

D. Blei, J. Lafferty

Topic models, such as latent Dirichlet allocation (LDA), can be useful tools for the statistical analysis of document collections and other discrete data. The LDA model assumes that the words of each document arise from a mixture of topics, each of which is a distribution over the vocabulary. A limitation of LDA is the inability to model topic correlation even though, for example, a document about genetics is more likely to also be about disease than X-ray astronomy. This limitation stems from the use of the Dirichlet distribution to model the variability among the topic proportions. In this paper we develop the correlated topic model (CTM), where the topic proportions exhibit correlation via the logistic normal distribution [J. Roy. Statist. Soc. Ser. B 44 (1982) 139--177]. We derive a fast variational inference algorithm for approximate posterior inference in this model, which is complicated by the fact that the logistic normal is not conjugate to the multinomial. We apply the CTM to the articles from Science published from 1990--1999, a data set that comprises 57M words. The CTM gives a better fit of the data than LDA, and we demonstrate its use as an exploratory tool of large document collections.
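The key modeling change the abstract describes is replacing LDA's Dirichlet prior on topic proportions with a logistic normal: draw a Gaussian vector, then map it onto the simplex via the softmax. The sketch below (not the authors' code) shows only that generative step; a real CTM uses a dense covariance matrix, whose off-diagonal entries capture topic correlation, while a diagonal covariance is used here purely to keep the example short:

```python
import math
import random

def logistic_normal_proportions(mean, var_diag, rng):
    """Draw topic proportions: sample eta ~ N(mean, diag(var_diag)),
    then softmax eta onto the probability simplex.
    (The full CTM uses a dense covariance so topics can co-vary.)"""
    eta = [rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, var_diag)]
    z = max(eta)                              # subtract max for numerical stability
    exps = [math.exp(e - z) for e in eta]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
theta = logistic_normal_proportions([0.0, 1.0, -1.0], [0.5, 0.5, 0.5], rng)
print(theta)  # three positive proportions summing to ~1
```

Because the softmax couples all components, the logistic normal is not conjugate to the multinomial likelihood over words, which is exactly why the paper needs the variational inference algorithm it derives.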

1563 citations en Computer Science, Mathematics
S2 Open Access 2017
The importance of open data and software: Is energy research lagging behind?

S. Pfenninger, J. DeCarolis, Lion Hirth et al.

Energy policy often builds on insights gained from quantitative energy models and their underlying data. As climate change mitigation and economic concerns drive a sustained transformation of the energy sector, transparent and well-founded analyses are more important than ever. We assert that models and their associated data must be openly available to facilitate higher quality science, greater productivity through less duplicated effort, and a more effective science-policy boundary. There are also valid reasons why data and code are not open: ethical and security concerns, unwanted exposure, additional workload, and institutional or personal inertia. Overall, energy policy research ostensibly lags behind other fields in promoting more open and reproducible science. We take stock of the status quo and propose actionable steps forward for the energy research community to ensure that it can better engage with decision-makers and continues to deliver robust policy advice in a transparent and reproducible way.

331 citations en Engineering
DOAJ Open Access 2025
A Systematic Literature Review of Artificial Intelligence in Prehospital Emergency Care

Omar Elfahim, Kokou Laris Edjinedja, Johan Cossus et al.

<i>Background:</i> The emergency medical services (EMS) sector, as a complex system, presents substantial hurdles in providing excellent treatment while operating within limited resources, prompting greater adoption of artificial intelligence (AI) as a tool for improving operational efficiency. While AI models have proved beneficial in healthcare operations, they often offer limited explainability and interpretability, and there is a lack of data on their application and technological advancement. <i>Methods:</i> The scoping review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines for scoping reviews, using PubMed, IEEE Xplore, and Web of Science, with a procedure of double screening and extraction. The search included articles published from 2018 to the beginning of 2025. Studies were excluded if they did not explicitly identify an artificial intelligence (AI) component, lacked relevance to emergency department (ED) or prehospital contexts, failed to report measurable outcomes or evaluations, or did not exploit real-world data. We analyzed the data sources used, clinical subclasses, AI domains, ML algorithms, their performance, as well as potential roles for large language models (LLMs) in future applications. <i>Results:</i> A comprehensive PRISMA-guided methodology was used to search academic databases, finding 1181 papers on prehospital emergency treatment from 2018 to 2025, with 65 articles identified after an extensive screening procedure. The results reveal a significant increase in AI publications. Notable technological advancements in the application of AI in EMS using different types of data were explored. <i>Conclusions:</i> These findings highlight that AI and ML have emerged as revolutionary innovations with huge potential in the fields of healthcare and medicine.
There are several promising AI interventions that can improve prehospital emergency care, particularly for out-of-hospital cardiac arrest and triage prioritization scenarios. <i>Implications for EMS Practice:</i> Integrating AI methods into prehospital care can optimize the use of available resources, as well as triage and dispatch efficiency. LLMs may have the potential to improve understanding and assist in decision-making under pressure in emergency situations by combining various forms of recorded data. However, there is a need to emphasize continued research and strong collaboration between AI experts and EMS physicians to ensure the safe, ethical, and effective integration of AI into EMS practice.

DOAJ Open Access 2025
Young People's mental health service utilization

Corine Driessens, Peter W.F. Smith, Kim Markham-Jones et al.

Objective: This project aimed to uncover key factors that shape young people’s (YP) mental health care utilization. Andersen’s Behavioral Model of Health Care Utilization was adapted in co-production, providing a framework for the predisposing characteristics, enabling resources, and perceived/evaluated need factors hypothesized to influence young people’s mental health care utilization. Methods: The project is a secondary data analysis project with a strong emphasis on YP involvement. YoungMinds and young researchers facilitated the co-production of an analysis plan with YP who had lived experience. This analysis plan was used to analyse existing data (Longitudinal Study of YP in England, also known as NEXT STEPS). Cohort data were linked to administrative health care data (Hospital Episode Statistics) to obtain objective measures of mental health care utilization. As this cohort is subject to mental-health-related attrition, logistic regression models in combination with missing-not-at-random methodology were used to determine factors impacting mental health service utilization. Results: The insights and experiences shared by YP in three workshops were captured in the YP’s model of secondary health care utilization for common mental health problems. NEXT STEPS-HES linked data showed that approximately 10% of the participants reporting common mental health problems between the ages of 14 and 25 accessed secondary mental health services. The main predictor of utilization of secondary mental health services between the ages of 17 and 25 is having a common mental health diagnosis before age 17. Interaction with social services and educational welfare at age 16 also facilitates utilization of mental health care services. Findings align with existing literature showing that women are more likely to utilize secondary mental health care services compared to men.
Conclusion: Despite growing recognition of mental health challenges among young people (YP), the findings indicate that only one in ten YP reporting common mental health problems utilized services. Secondary mental health care use was not only influenced by perceived mental wellbeing but also by societal perceptions and expectations.

Demography. Population. Vital events
DOAJ Open Access 2025
Developing interactive VR-based digital therapeutics for Acceptance and Commitment Therapy (ACT): a structured framework for the digital transformation integrating gamification and multimodal arts

Hyungsook Kim, Yoonyoung Choi

Introduction: Digital therapeutics (DTx) require structured methodologies to translate evidence-based psychotherapy into immersive digital formats. In response to this need, this study proposes a practical framework for the digital transformation of Acceptance and Commitment Therapy (ACT) into an interactive virtual reality (VR) system. Methods: DTx-ACT, designed as a therapeutic intervention for depression, is a VR-based system that delivers ACT through an immersive virtual experience. Its development followed five structured phases: preliminary research, design, development, advancement, and commercialization. The original ACT protocol was modularized into VR environments using the Session Structuring System (SSS) model. To enhance user engagement, gamification and multimodal arts strategies were incorporated. As part of the development process, evaluation metrics were defined to assess both clinical effectiveness and user interaction. Results: The final system comprises five immersive VR sessions, each lasting 6 to 12 minutes. These modules incorporate ACT metaphors, interactive tasks, and multisensory feedback to enhance therapeutic engagement. To support the digital transformation of ACT, three core components were established: (1) an evidence-based therapeutic protocol, (2) interactive VR elements—including gamification and multimodal arts-based guidance, and (3) a data-driven evaluation framework. Evaluation metrics, derived from a pilot study, were integrated into the system, which collects clinical and interaction data—such as real-time behavioral patterns and sensor-based information—to enable comprehensive evaluation. Discussion: Based on this development process, we propose a practical framework for designing interactive VR-based DTx. This framework bridges clinical structure, creative engagement, and real-time evaluation to support personalized and scalable applications in digital mental healthcare.
It contributes to the standardization of digital transformation in evidence-based therapy and offers a transferable model for future therapeutic content development.

S2 Open Access 2020
Online Citizen Science: A Systematic Review of Effects on Learning and Scientific Literacy

M. Aristeidou, C. Herodotou

Participation in online citizen science is increasingly popular, yet studies that examine the impact on participants’ learning are limited. The aims of this paper are to identify the learning impact on volunteers who participate in online citizen science projects and to explore the methods used to study the impact. The ten empirical studies, examined in this systematic review, report learning impacts on citizens’ attitudes towards science, on their understanding of the nature of science, on topic-specific knowledge, on science knowledge, and on generic knowledge. These impacts were measured using self-reports, content analysis of contributed data and of forum posts, accuracy checks of contributed data, science and project-specific quizzes, and instruments for measuring scientific attitudes and beliefs. The findings highlight that certain technological affordances in online citizen science projects can cultivate citizens’ knowledge and skills, and they point to unexplored areas, including the lack of experimental and long-term studies, and studies in formal education settings.

155 citations en Sociology
S2 Open Access 2018
Data infrastructure literacy

J. Gray, C. Gerlitz, Liliana Bounegru

A recent report from the UN makes the case for “global data literacy” in order to realise the opportunities afforded by the “data revolution”. Here and in many other contexts, data literacy is characterised in terms of a combination of numerical, statistical and technical capacities. In this article, we argue for an expansion of the concept to include not just competencies in reading and working with datasets but also the ability to account for, intervene around and participate in the wider socio-technical infrastructures through which data is created, stored and analysed – which we call “data infrastructure literacy”. We illustrate this notion with examples of “inventive data practice” from previous and ongoing research on open data, online platforms, data journalism and data activism. Drawing on these perspectives, we argue that data literacy initiatives might cultivate sensibilities not only for data science but also for data sociology, data politics as well as wider public engagement with digital data infrastructures. The proposed notion of data infrastructure literacy is intended to make space for collective inquiry, experimentation, imagination and intervention around data in educational programmes and beyond, including how data infrastructures can be challenged, contested, reshaped and repurposed to align with interests and publics other than those originally intended.

210 citations en Sociology, Computer Science
S2 Open Access 2017
Data politics

E. Ruppert, E. Isin, D. Bigo

The commentary raises political questions about the ways in which data has been constituted as an object vested with certain powers, influence, and rationalities. We place the emergence and transformation of professional practices such as ‘data science’, ‘data journalism’, ‘data brokerage’, ‘data mining’, ‘data storage’, and ‘data analysis’ as part of the reconfiguration of a series of fields of power and knowledge in the public and private accumulation of data. Data politics asks questions about the ways in which data has become such an object of power and explores how to critically intervene in its deployment as an object of knowledge. It is concerned with the conditions of possibility of data that involve things (infrastructures of servers, devices, and cables), language (code, programming, and algorithms), and people (scientists, entrepreneurs, engineers, information technologists, designers) that together create new worlds. We define ‘data politics’ as both the articulation of political questions about these worlds and the ways in which they provoke subjects to govern themselves and others by making rights claims. We contend that without understanding these conditions of possibility – of worlds, subjects and rights – it would be difficult to intervene in or shape data politics if by that it is meant the transformation of data subjects into data citizens.

243 citations en Computer Science
S2 Open Access 2021
LamaH-CE: LArge-SaMple DAta for Hydrology and Environmental Sciences for Central Europe

Christoph Klingler, K. Schulz, M. Herrnegger

Abstract. Very large and comprehensive datasets are increasingly used in the field of hydrology. Large-sample studies provide insights into the hydrological cycle that might not be available with small-scale studies. LamaH-CE (LArge-SaMple DAta for Hydrology and Environmental Sciences for Central Europe, LamaH for short; the geographical extension “-CE” is omitted in the text and the dataset) is a new dataset for large-sample studies and comparative hydrology in Central Europe. It covers the entire upper Danube to the state border of Austria–Slovakia, as well as all other Austrian catchments including their foreign upstream areas. LamaH covers an area of about 170,000 km² in nine countries, ranging from lowland regions characterized by a continental climate to high alpine zones dominated by snow and ice. Consequently, a wide diversity of properties is present in the individual catchments. We represent this variability in 859 gauged catchments with over 60 catchment attributes, covering topography, climatology, hydrology, land cover, vegetation, soil and geological properties. LamaH further contains a collection of runoff time series as well as meteorological time series. These time series are provided with a daily and hourly resolution. All meteorological and the majority of runoff time series cover a span of over 35 years, which enables long-term analyses with a high temporal resolution. The runoff time series are classified by over 20 attributes including information about human impacts and indicators for data quality and completeness. The structure of LamaH is based on the well-known CAMELS (Catchment Attributes and MEteorology for Large-sample Studies) datasets. In contrast, however, LamaH considers not only independent basins covering the full upstream area: intermediate catchments are covered as well, which, together with novel attributes, allows the hydrological network and river topology to be considered in applications.
We not only describe the basic datasets used and the methodology of data preparation but also focus on possible limitations and uncertainties. LamaH additionally contains the results of a conceptual hydrological baseline model for checking the plausibility of the inputs as well as for benchmarking. Potential applications of LamaH are outlined as well, since it is intended to serve as a uniform data basis for further research. LamaH is available at https://doi.org/10.5281/zenodo.4525244 (Klingler et al., 2021).

102 citations en Environmental Science
DOAJ Open Access 2024
Methodology of Assessment of Athletes' Jumping Skills Using Electronic Equipment

Soyib Tajibaev, Shukhratulla Allamuratov, Bekir Erhan et al.

Background and Study. Vertical jump performance is a critical factor in volleyball, significantly influencing actions like spiking, blocking, and serving. Accurate assessment of jump height is essential for optimizing training strategies, especially at the elite level. Purpose: The aim of this study was to evaluate the validity and reliability of a novel computerized diagnostic equipment (CDE-A, Patent No. 001144) designed for precise measurement of vertical jump height in volleyball players. Materials and Methods. The study involved the development and validation of the CDE-A system to assess vertical jump performance. Participants included elite volleyball players from the Uzbekistan national team, various club teams, and students from the State University of Physical Education and Sports of Uzbekistan. The system's accuracy and reliability were tested through rigorous procedures, including data storage and analysis capabilities for maximum jump height and functional performance. The research involved developing and testing a specialized device (CDE-A) to evaluate elite volleyball players' vertical jump capabilities. Participants included 18 athletes aged 13–14, 16 athletes aged 15–16, and 50 students from the Uzbekistan State University of Physical Education and Sports. Measurements were conducted across age groups and educational levels. A pedagogical experiment compared traditional training (control group, CG) with specialized exercises for agility and jumping endurance (experimental group, EG). Results. The CDE-A device demonstrated high reliability and precision in measuring vertical jump height. Key features include the capability to store maximal jump data in computer memory and analyze its functional significance for training and performance evaluation. The device enables coaches to monitor and enhance athletes' jump performance with greater efficiency and accuracy. Conclusions. 
This research highlights the utility of the CDE-A system for assessing and improving vertical jump capabilities in volleyball players across all age groups. The study underscores its potential to revolutionize training methodologies by providing coaches with reliable, evidence-based insights into athletes' performance. The findings offer a foundation for further advancements in jump height measurement technologies and their application in sports science. This study establishes the CDE-A as a valuable tool for sports performance evaluation, with implications extending to volleyball and other sports requiring explosive jump abilities.

Page 27 of 2,238,065