Results for "Cybernetics"

Showing 20 of ~134503 results · from arXiv, DOAJ, Semantic Scholar, CrossRef

arXiv Open Access 2025
EGSTalker: Real-Time Audio-Driven Talking Head Generation with Efficient Gaussian Deformation

Tianheng Zhu, Yinfeng Yu, Liejun Wang et al.

This paper presents EGSTalker, a real-time audio-driven talking head generation framework based on 3D Gaussian Splatting (3DGS). Designed to enhance both speed and visual fidelity, EGSTalker requires only 3-5 minutes of training video to synthesize high-quality facial animations. The framework comprises two key stages: static Gaussian initialization and audio-driven deformation. In the first stage, a multi-resolution hash triplane and a Kolmogorov-Arnold Network (KAN) are used to extract spatial features and construct a compact 3D Gaussian representation. In the second stage, we propose an Efficient Spatial-Audio Attention (ESAA) module to fuse audio and spatial cues, while KAN predicts the corresponding Gaussian deformations. Extensive experiments demonstrate that EGSTalker achieves rendering quality and lip-sync accuracy comparable to state-of-the-art methods, while significantly outperforming them in inference speed. These results highlight EGSTalker's potential for real-time multimedia applications.

en cs.SD, cs.AI
arXiv Open Access 2025
NMPC-Lander: Nonlinear MPC with Barrier Function for UAV Landing on a Mobile Platform

Amber Batool, Faryal Batool, Roohan Ahmed Khan et al.

Quadcopters are versatile aerial robots gaining popularity in numerous critical applications. However, their operational effectiveness is constrained by limited battery life and restricted flight range. To address these challenges, autonomous drone landing on stationary or mobile charging and battery-swapping stations has become an essential capability. In this study, we present NMPC-Lander, a novel control architecture that integrates Nonlinear Model Predictive Control (NMPC) with Control Barrier Functions (CBF) to achieve precise and safe autonomous landing on both static and dynamic platforms. Our approach employs NMPC for accurate trajectory tracking and landing, while simultaneously incorporating CBF to ensure collision avoidance with static obstacles. Experimental evaluations on the real hardware demonstrate high precision in landing scenarios, with an average final position error of 9.0 cm and 11 cm for stationary and mobile platforms, respectively. Notably, NMPC-Lander outperforms the B-spline combined with the A* planning method by nearly threefold in terms of position tracking, underscoring its superior robustness and practical effectiveness.

en cs.RO
arXiv Open Access 2025
Conjugated Capabilities: Interrelations of Elementary Human Capabilities and Their Implication on Human-Machine Task Allocation and Capability Testing Procedures

Nils Mandischer, Larissa Füller, Torsten Alles et al.

Human and automation capabilities are the foundation of every human-autonomy interaction and interaction pattern. Therefore, machines need to understand the capacity and performance of human activity and adapt their own behavior accordingly. In this work, we address the concept of conjugated capabilities, i.e., capabilities that are dependent or interrelated and between which effort can be distributed. These may be used to overcome human limitations by shifting effort from a deficient capability to a conjugated capability with performative resources. For example, a limited arm's reach may be compensated by tilting the torso forward. We analyze the interrelation between elementary capabilities within the IMBA standard to uncover potential conjugation, and show evidence in data of post-rehabilitation patients. From the conjugated capabilities, within the example application of stationary manufacturing, we create a network of interrelations. This graph enables a variety of potential uses. We showcase the graph's usage in optimizing IMBA test design to accelerate data recordings, and discuss implications of conjugated capabilities on task allocation between the human and an autonomy.

en cs.HC, cs.MA
arXiv Open Access 2025
Krasovskiĭ Stability Theorem for FDEs in the Extended Sense

Qian Feng, Wilfrid Perruquetti

The analysis of the stability of systems' equilibria plays a central role in the study of dynamical systems and control theory. This note establishes an extension of the celebrated Krasovskiĭ stability theorem for functional differential equations (FDEs) in the extended sense. Namely, the FDEs hold for $t \geq t_0$ almost everywhere with respect to the Lebesgue measure. The existence and uniqueness of solutions to such FDEs were briefly discussed in J. K. Hale's classical treatise on FDEs, yet a corresponding stability theorem was not provided. A key step in proving the proposed stability theorem was to use an alternative strategy instead of relying on the mean value theorem for differentiable functions. The proposed theorem can be useful in the stability analysis of cybernetic systems, which are often subject to noise and glitches that have a countably infinite number of jumps. To demonstrate the usefulness of the proposed theorem, we provide examples of linear systems with time-varying delays in which the FDEs cannot be defined in the conventional sense.

en math.OC
arXiv Open Access 2025
HSFN: Hierarchical Selection for Fake News Detection building Heterogeneous Ensemble

Sara B. Coutinho, Rafael M. O. Cruz, Francimaria R. S. Nascimento et al.

Psychological biases, such as confirmation bias, make individuals particularly vulnerable to believing and spreading fake news on social media, leading to significant consequences in domains such as public health and politics. Machine learning-based fact-checking systems have been widely studied to mitigate this problem. Among them, ensemble methods are particularly effective in combining multiple classifiers to improve robustness. However, their performance heavily depends on the diversity of the constituent classifiers; selecting genuinely diverse models remains a key challenge, especially when models tend to learn redundant patterns. In this work, we propose a novel automatic classifier selection approach that prioritizes diversity, also extended by performance. The method first computes pairwise diversity between classifiers and applies hierarchical clustering to organize them into groups at different levels of granularity. A HierarchySelect then explores these hierarchical levels to select one pool of classifiers per level, each representing a distinct intra-pool diversity. From these, the most diverse pool is identified and selected for ensemble construction. The selection process incorporates an evaluation metric reflecting each classifier's performance to ensure the ensemble also generalises well. We conduct experiments with 40 heterogeneous classifiers across six datasets from different application domains and with varying numbers of classes. Our method is compared against the Elbow heuristic and state-of-the-art baselines. Results show that our approach achieves the highest accuracy on two of six datasets. The implementation details are available on the project's repository: https://github.com/SaraBCoutinho/HSFN .

en cs.CL, cs.AI
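
The diversity-first selection described in the HSFN abstract above can be illustrated with a minimal sketch. The abstract does not name the diversity measure, so pairwise disagreement is an assumption here; the classifier names, predictions, and candidate pools are hypothetical, and the hierarchical clustering step that would produce the pools is omitted. This is not the authors' implementation (see their repository for that).

```python
from itertools import combinations


def disagreement(preds_a, preds_b):
    """Fraction of samples on which two classifiers predict different labels."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def pool_diversity(predictions, pool):
    """Mean pairwise disagreement among the classifiers named in `pool`."""
    pairs = list(combinations(pool, 2))
    if not pairs:
        return 0.0
    total = sum(disagreement(predictions[a], predictions[b]) for a, b in pairs)
    return total / len(pairs)


def most_diverse_pool(predictions, candidate_pools):
    """Return the candidate pool with the highest mean pairwise disagreement."""
    return max(candidate_pools, key=lambda pool: pool_diversity(predictions, pool))


# Hypothetical validation-set predictions from four classifiers.
predictions = {
    "svm": [0, 1, 1, 0, 1, 0],
    "knn": [0, 1, 1, 0, 1, 1],
    "tree": [1, 0, 1, 0, 0, 0],
    "nb": [0, 1, 0, 1, 1, 0],
}
# Hypothetical pools, e.g. one per level of a cluster hierarchy.
candidate_pools = [
    ("svm", "knn"),
    ("svm", "tree", "nb"),
    ("svm", "knn", "tree", "nb"),
]
best = most_diverse_pool(predictions, candidate_pools)
```

A fuller version would also weigh each classifier's individual performance, as the abstract notes, rather than ranking pools on diversity alone.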
arXiv Open Access 2025
Cybernetic Marionette: Channeling Collective Agency Through a Wearable Robot in a Live Dancer-Robot Duet

Anup Sathya, Jiasheng Li, Zeyu Yan et al.

We describe DANCE^2, an interactive dance performance in which audience members channel their collective agency into a dancer-robot duet by voting on the behavior of a wearable robot affixed to the dancer's body. At key moments during the performance, the audience is invited to either continue the choreography or override it, shaping the unfolding interaction through real-time collective input. While post-performance surveys revealed that participants felt their choices meaningfully influenced the performance, voting data across four public performances exhibited strikingly consistent patterns. This tension between what audience members do, what they feel, and what actually changes highlights a complex interplay between agentive behavior, the experience of agency, and power. We reflect on how choreography, interaction design, and the structure of the performance mediate this relationship, offering a live analogy for algorithmically curated digital systems where agency is felt, but not exercised.

en cs.HC, cs.RO
arXiv Open Access 2025
MDD-Net: Multimodal Depression Detection through Mutual Transformer

Md Rezwanul Haque, Md. Milon Islam, S M Taslim Uddin Raju et al.

Depression is a major mental health condition that severely impacts the emotional and physical well-being of individuals. The simple nature of data collection from social media platforms has attracted significant interest in properly utilizing this information for mental health research. A Multimodal Depression Detection Network (MDD-Net), utilizing acoustic and visual data obtained from social media networks, is proposed in this work where mutual transformers are exploited to efficiently extract and fuse multimodal features for efficient depression detection. The MDD-Net consists of four core modules: an acoustic feature extraction module for retrieving relevant acoustic attributes, a visual feature extraction module for extracting significant high-level patterns, a mutual transformer for computing the correlations among the generated features and fusing these features from multiple modalities, and a detection layer for detecting depression using the fused feature representations. The extensive experiments are performed using the multimodal D-Vlog dataset, and the findings reveal that the developed multimodal depression detection network surpasses the state-of-the-art by up to 17.37% for F1-Score, demonstrating the greater performance of the proposed system. The source code is accessible at https://github.com/rezwanh001/Multimodal-Depression-Detection.

en cs.CV, cs.LG
DOAJ Open Access 2025
ANTI-ALIASING FILTERS WITH COMPLEX AMPLITUDE-FREQUENCY RESPONSES

Anatoly R. Gaiduk, Darya Y. Denisenko, Dmitry V. Kuznestov et al.

A method for designing anti-aliasing filters for use in information and control systems is proposed. A distinctive feature of the filters under consideration is that their amplitude-frequency response has transmission zeros at critical (from the point of view of interference suppression) frequencies and a specified attenuation in the rest of the frequency range. This allows the use of the filters in question, both for the suppression of undesirable harmonics of the network (50 Hz and 100 Hz), and as spectrum limiters at the inputs of analog-to-digital converters and, in particular, at the inputs of discrete-analog filters made on switchable capacitors. It is shown that the found transfer function of the filter can be implemented on the basis of cascade or multi-loop structures, and the circuitry of such filters can be made on the basis of RC circuits and op-amps. The calculation of the parameters of the elements of the circuit diagrams of filters is carried out according to the coefficients of the specified transfer function. Cascade implementation is based on the principle of internal models, which ensures the greatest stability of the filter's critical frequencies.

Information technology, Information theory
DOAJ Open Access 2025
Sea level forecasting using deep recurrent neural networks with high-resolution hydrodynamic model

Saeed Rajabi-Kiasari, Artu Ellmann, Nicole Delpeche-Ellmann

Changes in climate, along with increasing marine activities in coastal and offshore regions, highlight the need for effective sea level forecasting methods. In recent years, forecasting techniques, especially those utilizing machine learning/deep learning methods (ML/DL), have shown promising capabilities. However, sea level forecasting is often limited in accuracy and spatiotemporal coverage, primarily due to the challenges posed by available observational data, which complicates the assessment of existing ML/DL techniques in complex and dynamic regions like the Baltic Sea. This study addresses these challenges by utilizing a high-resolution spatiotemporal framework that integrates high-resolution hydrodynamic and marine geoid models available to Baltic countries, enabling further capabilities to be explored in terms of sea level accuracy and validation. Specifically, it examines short-term sea level forecasting in the eastern Baltic Sea and the potential of utilizing two recurrent neural network-based models, the Long Short-Term Memory network (LSTM) and the Gated Recurrent Unit (GRU), along with high-resolution input data sources. These models were chosen for their expected capabilities with time series data and their ability to learn both short- and long-term connections in the input datasets. To achieve this, a multivariate multistep-ahead (3, 6, 9, 12, and 24 h) forecasting framework was developed. The DL models' input components are high-resolution sea level data obtained from a bias-corrected hydrodynamic model, wind speed, surface pressure, and sea surface temperature. Results for various time steps (from 3 h to 24 h ahead) during the test period revealed that the two DL models generally showed similar performance, with slightly superior results for the GRU model. For instance, GRU and LSTM showed an averaged root mean square error (RMSE) of 4.96 cm and 5.3 cm and a coefficient of determination (R²) of 0.93 and 0.92, respectively. Investigations of the time series forecasting performance at selected locations also demonstrated the superiority of the GRU model for all time steps, with Willmott's index (WI) values generally above 0.9 and high reliability as reflected in Prediction Interval Coverage Probability (PICP) values mostly exceeding 90 %. The results, however, were not always perfect; both the GRU and LSTM models encountered limitations in forecasting the sea level maxima. Further examination of the spatial discrepancies also reveals some problematic areas in the eastern Gulf of Finland. This may have been influenced by the exclusion of some input components such as river discharge, salinity and meridional winds, further enhanced by complex hydrodynamics, extreme sea level variations, strong local currents, resonance-induced seiches and seasonal ice cover. In addition, an external validation of the GRU results was performed using along-track satellite altimetry from the Sentinel 3A and 3B missions. For most of the satellite tracks, the discrepancy was better than 5 cm, demonstrating the model's generalization capabilities. These findings hold significant implications for advancing our comprehension of oceanic dynamics, enhancing maritime safety, and benefiting a wide range of applications that are dependent on accurate sea level forecasting.

Ocean engineering
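
The four evaluation metrics named in the sea level abstract above (RMSE, R², Willmott's index, PICP) can be sketched in a few lines. This is a minimal illustration using their common textbook definitions, which are assumed to match what the authors used; in particular, R² is computed here as 1 minus the ratio of residual to total sum of squares, and the sample values are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def willmott_index(obs, pred):
    """Willmott's index of agreement (1 means perfect agreement)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den


def picp(obs, lower, upper):
    """Share of observations that fall inside their prediction interval."""
    inside = sum(lo <= o <= hi for o, lo, hi in zip(obs, lower, upper))
    return inside / len(obs)


# Hypothetical sea levels in cm, not data from the paper.
obs = [10.0, 12.0, 14.0, 16.0]
pred = [11.0, 12.0, 13.0, 17.0]
lower = [9.0, 11.0, 15.0, 15.0]
upper = [11.0, 13.0, 17.0, 17.0]

scores = {
    "rmse": rmse(obs, pred),
    "r2": r_squared(obs, pred),
    "wi": willmott_index(obs, pred),
    "picp": picp(obs, lower, upper),
}
```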
DOAJ Open Access 2025
NeuroFusionNet: A Multi-Modal Graph Transformer with Contrastive Alignment and Evidential Uncertainty for Epileptic Seizure Detection

Jabiulla Riyazulla Rahman, Pasha Afroz, Prasad Pinnepalli Sadhashiviah et al.

Reliable epileptic seizure detection remains challenging due to the heterogeneity of modalities and poor interpretability in existing models. To address these issues, this research proposes NeuroFusionNet, a unified multi-modal framework that jointly leverages electroencephalogram (EEG) and functional Magnetic Resonance Imaging (fMRI) signals through modality-specific graph encoders and a Cross-Modal Graph Transformer (CMGT). The CMGT architecture captures both temporal and spatial-functional dynamics, enabling robust feature learning across modalities. Additionally, a modality-wise contrastive alignment objective is employed to ensure latent consistency, and an evidential uncertainty head is incorporated to estimate clinical reliability for calibrated confidence. The model demonstrates strong generalization across CHB-MIT, resting-state (rs)-fMRI from UW–Madison, and 7 T fMRI datasets. Finally, the proposed NeuroFusionNet achieved 99.22% accuracy, 99.89% precision, and 99.85% recall, outperforming the existing TriSeizureDualNet model. These results indicate that the proposed NeuroFusionNet is interpretable and trustworthy for seizure detection.

arXiv Open Access 2024
Matching Input and Output Devices and Physical Disabilities for Human-Robot Workstations

Carlo Weidemann, Nils Mandischer, Burkhard Corves

As the labor shortage is rising at an alarming rate, it is imperative to enable all people to work, particularly people with disabilities and elderly people. Robots are often used as a universal tool to assist people with disabilities. However, for such human-robot workstations, universal design fails. We mitigate the challenges of selecting an individualized set of input and output devices by matching devices required by the work process and individual disabilities, adhering to the Convention on the Rights of Persons with Disabilities passed by the United Nations. The objective is to facilitate economically viable workstations with just the required devices, hence lowering the overall cost of corporate inclusion and of workplace redesign. Our work focuses on developing an efficient approach to filter input and output devices based on a person's disabilities, resulting in a tailored list of usable devices. The methodology enables an automated assessment of devices compatible with specific disabilities defined in the International Classification of Functioning, Disability and Health. In a mock-up, we showcase the synthesis of input and output devices from disabilities, thereby providing a practical tool for selecting devices for individuals with disabilities.

en cs.RO, cs.HC
arXiv Open Access 2024
Unbridled Icarus: A Survey of the Potential Perils of Image Inputs in Multimodal Large Language Model Security

Yihe Fan, Yuxin Cao, Ziyu Zhao et al.

Multimodal Large Language Models (MLLMs) demonstrate remarkable capabilities that increasingly influence various aspects of our daily lives, constantly defining the new boundary of Artificial General Intelligence (AGI). Image modalities, enriched with profound semantic information and a more continuous mathematical nature compared to other modalities, greatly enhance the functionalities of MLLMs when integrated. However, this integration serves as a double-edged sword, providing attackers with expansive vulnerabilities to exploit for highly covert and harmful attacks. The pursuit of reliable AI systems like powerful MLLMs has emerged as a pivotal area of contemporary research. In this paper, we endeavor to demonstrate the multifaceted risks associated with the incorporation of image modalities into MLLMs. Initially, we delineate the foundational components and training processes of MLLMs. Subsequently, we construct a threat model, outlining the security vulnerabilities intrinsic to MLLMs. Moreover, we analyze and summarize existing scholarly discourses on MLLMs' attack and defense mechanisms, culminating in suggestions for future research on MLLM security. Through this comprehensive analysis, we aim to deepen the academic understanding of MLLM security challenges and propel forward the development of trustworthy MLLM systems.

en cs.CR, cs.CV
DOAJ Open Access 2024
Predicting Day-Ahead Electricity Market Prices through the Integration of Macroeconomic Factors and Machine Learning Techniques

Adela Bâra, Simona-Vasilica Oprea

Several events in the last years changed to some extent the common understanding of the electricity day-ahead market (DAM). The shape of the electricity price curve has been altered as some factors that underpinned the electricity price forecast (EPF) lost their importance and new influential factors emerged. In this paper, we aim to showcase the changes in EPF, understand the effects of uncertainties and propose a forecasting method using machine learning (ML) algorithms to cope with random events such as the COVID-19 pandemic and the conflict in the Black Sea region. By adjusting the training period according to the standard deviation that reflects the price volatility, applying feature engineering, and using two regressors for weighing the results, significant improvements in the performance of the EPF are achieved. One of the contributions of the proposed method consists in adjusting the training period considering the price variation. Thus, we introduce a rule-based approach given an empirical observation that for days with a higher growth in prices the training interval should be shortened, capturing the sharp variations of prices. The results of several cutting-edge ML algorithms represent the input for a predictive meta-model to obtain the best forecasting solution. The input dataset spans from Jan. 2019 to Aug. 2022, testing the proposed EPF method for both stable and more tumultuous intervals and proving its robustness. This analysis provides decision makers with an understanding of the price trends and suggests measures to combat spikes. Numerical findings indicate that on average mean absolute error (MAE) improved by 48% and root mean squared error (RMSE) improved by 44% compared to the baseline model (without feature engineering/adjusting training). When the output of the ML algorithms is weighted using the proposed meta-model, MAE further improved by 2.3% in 2020 and 5.14% in 2022. Fewer errors are recorded in stable years like 2019 and 2020 (MAE = 6.71, RMSE = 14.67) compared to 2021 and 2022 (MAE = 9.45, RMSE = 20.64).

Electronic computers. Computer science
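
The rule-based idea in the electricity price abstract above, shortening the training interval when prices are volatile, might look roughly like the sketch below. The abstract gives no thresholds or window lengths, so `base_days`, `short_days`, and `volatility_threshold` are hypothetical values chosen only for illustration, and the function is not the authors' rule.

```python
import statistics


def choose_training_days(recent_prices, base_days=60, short_days=14,
                         volatility_threshold=25.0):
    """Pick a training-window length from recent price volatility.

    All three default parameters are hypothetical; the source abstract
    states only that higher price variation calls for a shorter window.
    """
    volatility = statistics.pstdev(recent_prices)  # population standard deviation
    return short_days if volatility > volatility_threshold else base_days


# Hypothetical day-ahead prices, not data from the paper.
calm_prices = [50.0, 52.0, 51.0, 49.0, 50.0]
volatile_prices = [50.0, 120.0, 40.0, 200.0, 90.0]

calm_window = choose_training_days(calm_prices)
volatile_window = choose_training_days(volatile_prices)
```

A shorter window lets the model track sharp recent price moves at the cost of less training data, which is the trade-off the abstract describes.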
arXiv Open Access 2023
Resilient Clock Synchronization Architecture for Industrial Time-Sensitive Networking

Yafei Sun, Qimin Xu, Cailian Chen et al.

Time-Sensitive Networking (TSN) is a promising industrial Internet of Things technology. Clock synchronization provides a unified time reference, which is critical to the deterministic communication of TSN. However, changes in the internal network status and the external work environments of devices both degrade practical synchronization performance. This paper proposes a temperature-resilient architecture considering delay asymmetry (TACD) to enhance the timing accuracy under the impacts of internal delay and external thermal changes. In TACD, an anti-delay-asymmetry method is developed, which employs a partial variational Bayesian algorithm to promote adaptability to non-stationary delay variation. An optimized skew estimator is further proposed, fusing the temperature skew model for ambiance perception with the traditional linear clock model to compensate for nonlinear error caused by temperature changes. Theoretical derivation of the skew estimation lower bound proves the promotion of optimal accuracy after the fusion of clock models. Evaluations based on measured delay data demonstrate accuracy advantages regardless of internal or external influences.

en eess.SY
