Results for "Computer Science"
Showing 20 of ~22,579,101 results · from CrossRef, DOAJ, arXiv, Semantic Scholar
Meixiang Gao, Xiujuan Yan, Xin Li et al.
Abstract The field of soil science has seen significant advancements in recent years, largely due to the integration of computational tools and statistical methods. Among these resources, the programming language R has emerged as a powerful and versatile platform for soil scientists, aiding in a spectrum of tasks from data analysis and modeling to visualization. Nonetheless, the broader trends and specific patterns of R usage in soil research have not been thoroughly documented. Our study investigated the prevalence of R and its package usage in 25,888 research articles published in 10 leading soil science journals over a decade, from 2014 to 2023. A considerable number of these articles, 7899 (or 30.5%), named R as their primary data analysis tool. The use of R has followed a steady linear growth pattern, rising from 13.9% in 2014 to 46.5% in 2023. The most commonly used R packages were “vegan,” “ggplot2,” “lme4,” “nlme,” and “randomForest,” with each journal showcasing unique research focuses, resulting in varying frequencies of R package applications across different publications. Furthermore, there was a notable increase in the average number of R packages used per article throughout the study period. This research highlights the pivotal role of R, armed with its robust statistical and visualization capabilities, in enabling soil scientists to conduct comprehensive analyses and gain in‐depth insights into the complex dimensions of soil science.
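The linear growth the abstract reports can be sanity-checked with a short calculation. This is an illustrative sketch using only the two endpoint figures quoted above (13.9% in 2014, 46.5% in 2023); the per-year slope is an interpolation, not a figure from the study itself.

```python
# Sketch: average annual growth in R adoption implied by a linear trend
# between the two endpoint percentages reported in the abstract.
def annual_growth(start_pct: float, end_pct: float, years: int) -> float:
    """Average percentage-point increase per year under a linear trend."""
    return (end_pct - start_pct) / years

slope = annual_growth(13.9, 46.5, 2023 - 2014)
print(f"~{slope:.2f} percentage points per year")
```

A steady rise of roughly 3.6 percentage points per year is consistent with the "steady linear growth pattern" the study describes.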
Rok Rajher, Mila Marinković, Polona Rus Prelog et al.
Abstract Schizophrenia is a chronic and severe mental disorder that still relies on time-intensive, clinician-administered assessments. Although several automated approaches have been proposed to support diagnosis, these systems often lack the level of explainability necessary for informed clinical decision-making. In this study, we present a fully automated and explainable pipeline for detecting schizophrenia from audio recordings of verbal fluency tests, collected from 126 Slovene-speaking participants (68 healthy controls, 58 individuals diagnosed with schizophrenia), leveraging recent advancements in automatic speech recognition (ASR) and large language model (LLM) systems. We evaluated three ASR models (Truebar, Whisper, and Soniox) for transcription quality, and selected the best-performing system for further processing. We semantically enriched the transcriptions using the generative capabilities of LLMs and extracted both verbal and non-verbal features grounded in established diagnostic criteria. We assessed the relevance of these features using a Bayesian statistical framework and trained multiple classical machine learning models for automatic classification. Our best-performing model, an Explainable Boosting Machine, achieved a classification accuracy of 0.82 and an AUC of 0.90. We further generated visual explanations for the model’s predictions, establishing the first fully automated and explainable schizophrenia detection framework developed for the Slovene language. Our approach prioritizes explainability through model-transparent outputs, while still achieving performance comparable to existing automated systems for speech-based schizophrenia detection.
Wenqi Fan, Yi Zhou, Shijie Wang et al.
Considering the significance of proteins, computational protein science has always been a critical scientific field, dedicated to revealing knowledge and developing applications within the protein sequence-structure-function paradigm. In the last few decades, Artificial Intelligence (AI) has made significant impacts in computational protein science, leading to notable successes in specific protein modeling tasks. However, those previous AI models still face limitations, such as the difficulty in comprehending the semantics of protein sequences, and the inability to generalize across a wide range of protein modeling tasks. Recently, LLMs have emerged as a milestone in AI due to their unprecedented language processing and generalization capability. They can drive comprehensive progress across entire fields rather than solving individual tasks. As a result, researchers have actively introduced LLM techniques in computational protein science, developing protein Language Models (pLMs) that skillfully grasp the foundational knowledge of proteins and can be effectively generalized to solve a diverse range of sequence-structure-function reasoning problems. Amid these prosperous developments, it is necessary to present a systematic overview of computational protein science empowered by LLM techniques. First, we summarize existing pLMs into categories based on their mastered protein knowledge, i.e., underlying sequence patterns, explicit structural and functional information, and external scientific languages. Second, we introduce the utilization and adaptation of pLMs, highlighting their remarkable achievements in promoting protein structure prediction, protein function prediction, and protein design studies. Then, we describe the practical application of pLMs in antibody design, enzyme design, and drug discovery. Finally, we specifically discuss the promising future directions in this fast-growing field.
Youngsoo Choi, Siu Wun Cheung, Youngkyu Kim et al.
The widespread success of foundation models in natural language processing and computer vision has inspired researchers to extend the concept to scientific machine learning and computational science. However, this position paper argues that as the term "foundation model" is an evolving concept, its application in computational science is increasingly used without a universally accepted definition, potentially creating confusion and diluting its precise scientific meaning. In this paper, we address this gap by proposing a formal definition of foundation models in computational science, grounded in the core values of generality, reusability, and scalability. We articulate a set of essential and desirable characteristics that such models must exhibit, drawing parallels with traditional foundational methods, like the finite element and finite volume methods. Furthermore, we introduce the Data-Driven Finite Element Method (DD-FEM), a framework that fuses the modular structure of classical FEM with the representational power of data-driven learning. We demonstrate how DD-FEM addresses many of the key challenges in realizing foundation models for computational science, including scalability, adaptability, and physics consistency. By bridging traditional numerical methods with modern AI paradigms, this work provides a rigorous foundation for evaluating and developing novel approaches toward future foundation models in computational science.
Zhu Lan
This article has been retracted. Please see the Retraction Notice for more detail: https://doi.org/10.1186/s13635-024-00152-9
Sheng Ren, Rui Cao, Wenxue Tan et al.
Single-image super-resolution based on deep learning has achieved extraordinary performance. However, due to inevitable environmental or technological limitations, some images have not only low resolution but also low brightness. Existing super-resolution methods applied to low-light input may produce results with low brightness and many missing details. In this paper, we propose a semantic-aware guided low-light image super-resolution method. Initially, we present a semantic perception guided super-resolution framework that utilizes the rich semantic prior knowledge of the semantic network module. Through the semantic-aware guidance module, reference semantic features and target image features are fused in a quantitative attention manner, guiding low-light image features to maintain semantic consistency during the reconstruction process. Second, we design a self-calibrated light adjustment module to constrain the convergence consistency of each illumination estimation block by a self-calibrated block, improving the stability and robustness of output brightness enhancement features. Third, we design a lightweight super-resolution module based on spatial and channel reconstruction convolution, which uses an attention module to further enhance the super-resolution reconstruction capability. Our proposed model surpasses methods such as RDN, RCAN, and NLSN in both qualitative and quantitative analysis of low-light image super-resolution reconstruction. Experiments demonstrate the efficiency and effectiveness of our method.
Badar Almarri, Gaurav Gupta, Ravinder Kumar et al.
Kazuki Nakajima, Yuya Sasaki, Sohei Tokuno et al.
The number of citations received by papers often exhibits imbalances in terms of author attributes such as country of affiliation and gender. While recent studies have quantified citation imbalance in terms of the authors' gender in journal papers, the computer science discipline, where researchers frequently present their work at conferences, may exhibit unique patterns in gendered citation imbalance. Additionally, understanding how network properties in citations influence citation imbalances remains challenging due to a lack of suitable reference models. In this paper, we develop a family of reference models for citation networks and investigate gender imbalance in citations between papers published in computer science conferences. By deploying these reference models, we found that homophily in citations is strongly associated with gendered citation imbalance in computer science, whereas heterogeneity in the number of citations received per paper has a relatively minor association with it. Furthermore, we found that the gendered citation imbalance is most pronounced in papers published in the highest-ranked conferences, is present across different subfields, and extends to citation-based rankings of papers. Our study provides a framework for investigating associations between network properties and citation imbalances, aiming to enhance our understanding of the structure and dynamics of citations between research publications.
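The reference-model idea in the abstract can be illustrated with a simple label-permutation null model: shuffle gender labels across papers while holding the citation network fixed, then compare the observed same-gender citation rate against the shuffled baseline. This is a generic sketch of a homophily test, not the authors' actual family of reference models; all names and the toy data are hypothetical.

```python
import random

def same_gender_fraction(edges, gender):
    """Fraction of citations (u cites v) where both papers share a gender label."""
    same = sum(1 for u, v in edges if gender[u] == gender[v])
    return same / len(edges)

def permutation_baseline(edges, gender, trials=1000, seed=0):
    """Expected same-gender fraction when labels are shuffled across papers
    and the citation network itself is held fixed."""
    rng = random.Random(seed)
    nodes = list(gender)
    labels = [gender[n] for n in nodes]
    total = 0.0
    for _ in range(trials):
        rng.shuffle(labels)
        total += same_gender_fraction(edges, dict(zip(nodes, labels)))
    return total / trials

# Toy citation network with a strongly homophilous pattern
edges = [(0, 1), (1, 0), (2, 3), (3, 2)]
gender = {0: "F", 1: "F", 2: "M", 3: "M"}
observed = same_gender_fraction(edges, gender)
baseline = permutation_baseline(edges, gender)
print(observed, round(baseline, 2))  # observed well above the shuffled baseline
```

An observed fraction well above the baseline is the signature of homophily; the paper's contribution is a family of richer reference models that also control for network properties such as citation heterogeneity.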
Ty Feng, Sa Liu, Dipak Ghosal
The growing enrollments in computer science courses and increase in class sizes necessitate scalable, automated tutoring solutions to adequately support student learning. While Large Language Models (LLMs) like GPT-4 have demonstrated potential in assisting students through question-answering, educators express concerns over student overreliance, miscomprehension of generated code, and the risk of inaccurate answers. Rather than banning these tools outright, we advocate for a constructive approach that harnesses the capabilities of AI while mitigating potential risks. This poster introduces CourseAssist, a novel LLM-based tutoring system tailored for computer science education. Unlike generic LLM systems, CourseAssist uses retrieval-augmented generation, user intent classification, and question decomposition to align AI responses with specific course materials and learning objectives, thereby ensuring pedagogical appropriateness of LLMs in educational settings. We evaluated CourseAssist against a baseline of GPT-4 using a dataset of 50 question-answer pairs from a programming languages course, focusing on the criteria of usefulness, accuracy, and pedagogical appropriateness. Evaluation results show that CourseAssist significantly outperforms the baseline, demonstrating its potential to serve as an effective learning assistant. We have also deployed CourseAssist in 6 computer science courses at a large public R1 research university reaching over 500 students. Interviews with 20 student users show that CourseAssist improves computer science instruction by increasing the accessibility of course-specific tutoring help and shortening the feedback loop on their programming assignments. Future work will include extensive pilot testing at more universities and exploring better collaborative relationships between students, educators, and AI that improve computer science learning experiences.
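The pipeline the abstract names (intent classification, question decomposition, retrieval-augmented generation over course materials) can be sketched at a high level. This scaffold is illustrative only: CourseAssist's actual components are not described in detail here, and the toy keyword-overlap retriever and rule-based intent classifier stand in for real learned models.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercased word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z]+", text.lower()))

def classify_intent(question: str) -> str:
    """Very rough stand-in for a learned intent classifier."""
    return "debugging" if tokens(question) & {"error", "bug", "crash"} else "concept"

def retrieve(question: str, course_docs: list[str], k: int = 1) -> list[str]:
    """Rank course documents by keyword overlap with the question."""
    q = tokens(question)
    return sorted(course_docs, key=lambda d: -len(q & tokens(d)))[:k]

def build_prompt(question: str, course_docs: list[str]) -> str:
    """Ground the LLM's answer in retrieved course material."""
    context = "\n".join(retrieve(question, course_docs))
    return (f"[intent: {classify_intent(question)}]\n"
            f"Answer using only this course material:\n{context}\n"
            f"Question: {question}")

docs = ["Recursion: a function calling itself on a smaller input.",
        "Loops: for and while statements repeat a block of code."]
print(build_prompt("How does recursion work?", docs))
```

Constraining the prompt to retrieved course material is what lets such a system keep responses aligned with specific learning objectives rather than answering from the LLM's general knowledge.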
Keenan Jones, Fatima Zahrah, Jason R. C. Nurse
Privacy is a human right. It ensures that individuals are free to engage in discussions, participate in groups, and form relationships online or offline without fear of their data being inappropriately harvested, analyzed, or otherwise used to harm them. Preserving privacy has emerged as a critical factor in research, particularly in the computational social science (CSS), artificial intelligence (AI) and data science domains, given their reliance on individuals' data for novel insights. The increasing use of advanced computational models stands to exacerbate privacy concerns because, if inappropriately used, they can quickly infringe privacy rights and lead to adverse effects for individuals -- especially vulnerable groups -- and society. We have already witnessed a host of privacy issues emerge with the advent of large language models (LLMs), such as ChatGPT, which further demonstrate the importance of embedding privacy from the start. This article contributes to the field by discussing the role of privacy and the issues that researchers working in CSS, AI, data science and related domains are likely to face. It then presents several key considerations for researchers to ensure participant privacy is best preserved in their research design, data collection and use, analysis, and dissemination of research results.
Li-Hong Jiang, Hong-Yu Wu, Peng Dong et al.
Many two-place physical problems can be explicitly formulated as related-events models known as Alice–Bob systems. In this paper, an integrable Alice–Bob Boussinesq system is introduced via the Boussinesq equation with parameters, which may satisfy the symmetry transformations P̂_s^x (parity with a shift) and T̂_d^t (time reversal with a delay). After constructing a Bäcklund transformation, the system yields rich symmetry solutions with the aid of auxiliary functions. The structures of the obtained soliton solutions, such as breathers, lumps and their hybrids, all satisfy the P̂_s^x or T̂_d^t symmetry. To illustrate this symmetric characteristic, some lower-order solutions and the related dynamic structures are explicitly presented. The residual symmetry and its finite transformation for this system are also verified.
Jing Liu, Xuesong Hai, Keqin Li
Massive amounts of data drive the performance of deep learning models, but in practice, data resources are often highly dispersed and bound by data privacy and security concerns, making it difficult for multiple data sources to share their local data directly. Data resources are difficult to aggregate effectively, resulting in a lack of support for model training. How data sources can collaborate to aggregate the value of their data resources is therefore an important research question. However, existing distributed-collaborative-learning architectures still face serious challenges in collaborating between nodes that lack mutual trust, with security and trust issues seriously affecting the confidence and willingness of data sources to participate in collaboration. Blockchain technology provides trusted distributed storage and computing, and combining it with collaboration between data sources to build trusted distributed-collaborative-learning architectures is a highly valuable applied research direction. We propose a trusted distributed-collaborative-learning mechanism based on blockchain smart contracts. Firstly, the mechanism uses blockchain smart contracts to define and encapsulate collaborative behaviours, relationships and norms between distributed collaborative nodes. Secondly, we propose a model-fusion method based on feature fusion, which replaces the direct sharing of local data resources with distributed-model collaborative training and organises distributed data resources for distributed collaboration to improve model performance. Finally, in order to verify the trustworthiness and usability of the proposed mechanism, on the one hand, we implement formal modelling and verification of the smart contract by using Coloured Petri Nets and prove that the mechanism satisfies the expected trustworthiness properties by verifying the formal model of the smart contract associated with the mechanism.
On the other hand, the model-fusion method based on feature fusion is evaluated in different datasets and collaboration scenarios, while a typical collaborative-learning case is implemented for a comprehensive analysis and validation of the mechanism. The experimental results show that the proposed mechanism can provide a trusted and fair collaboration infrastructure for distributed-collaboration nodes that lack mutual trust and organise decentralised data resources for collaborative model training to develop effective global models.
Vladimir Barannik, Serhii Sidchenko, Dmitriy Barannik et al.
The subject of research in this article is the compression and encryption of video images in the management of critically important objects. The goal is to develop a method for compressing video images based on floating positional coding with non-uniform codegram lengths, simultaneously ensuring information reliability and confidentiality during transmission with a given time delay. The objectives are: to analyze existing approaches to ensuring the confidentiality of video images; to develop a method for compressing video images based on floating positional coding with non-uniform codegram lengths; and to evaluate the effectiveness of the developed method. The methods used are: digital image processing, digital image compression, image encryption and scrambling, structural-combinatorial coding, and statistical analysis. The following results are obtained. A technology for floating encoding of a non-uniform sequence of blocks is proposed. Code values are formed from elements of different video image blocks. To this end, a scheme has been developed for linearizing the coordinates of an image point from its four-dimensional representation on the plane into a one-dimensional element coordinate in a vector. The four-dimensional coordinate of an element on the plane comprises the coordinates of the image block and the coordinates of the element within that block. Code values are formed while controlling the length of their binary representation. At the same time, coding is implemented over an indeterminate number of video image elements; the number of elements depends on the length of the code word. Accordingly, codegrams of indeterminate length are formed, their length depending on the service data generated during the encoding process. The service data acts as a key element. Conclusions. The one-stage polyadic image encoding method in a differentiated basis has been further improved.
The developed encoding method provides image compression without loss of information quality. The compression of original image volumes is 3–20% better than the TIFF data presentation format and 4–15% better than the PNG format. The overhead amounts to less than 2.5% of the entire codestream size.
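The linearization scheme the abstract describes, mapping a four-dimensional element coordinate (block position plus position within the block) to a one-dimensional vector index, can be sketched as follows. The row-major traversal order and all parameter names are assumptions for illustration; the paper's actual scheme may differ.

```python
def linearize(bi: int, bj: int, y: int, x: int,
              blocks_per_row: int, block_h: int, block_w: int) -> int:
    """Map a 4-D coordinate (block row bi, block column bj, row y and
    column x within the block) to a single index in a flat vector."""
    block_index = bi * blocks_per_row + bj      # which block, row-major
    within = y * block_w + x                    # position inside the block
    return block_index * (block_h * block_w) + within

# Example: 8x8 blocks, 4 blocks per image row
idx = linearize(bi=1, bj=2, y=3, x=5, blocks_per_row=4, block_h=8, block_w=8)
print(idx)
```

Such a mapping lets code values draw elements from different blocks by simply striding through the one-dimensional vector, which is what allows codegrams to span an indeterminate number of elements.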
Helmi Temimi
In this paper, we present an innovative approach to solve a system of boundary value problems (BVPs), using the newly developed discontinuous Galerkin (DG) method, which eliminates the need for auxiliary variables. This work is the first in a series of papers on DG methods applied to partial differential equations (PDEs). By consecutively applying the DG method to each space variable of the PDE using the method of lines, we transform the problem into a system of ordinary differential equations (ODEs). We investigate the convergence criteria of the DG method on systems of ODEs and generalize the error analysis to PDEs. Our analysis demonstrates that the DG error’s leading term is determined by a combination of specific Jacobi polynomials in each element. Thus, we prove that DG solutions are superconvergent at the roots of these polynomials, with an order of convergence of O(h^{p+2}).
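The superconvergence claim can be stated schematically. As a hedged sketch, with $C_k$ and $Q_{p+1}$ as generic placeholders rather than the paper's notation, a DG error expansion of the form

```latex
% Schematic leading-term error expansion on element I_k (placeholder notation)
e_h(x)\big|_{I_k} = C_k\, h^{p+1}\, Q_{p+1}(\xi) + \mathcal{O}(h^{p+2}),
\qquad \xi \in [-1, 1],
```

where $Q_{p+1}$ is the combination of Jacobi polynomials identified by the analysis, implies that the error drops from $\mathcal{O}(h^{p+1})$ to $\mathcal{O}(h^{p+2})$ precisely at the roots of $Q_{p+1}$, which is the superconvergence result the abstract states.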
Journal of Control Science and Engineering
Page 13 of 1,128,956