Results for "Acoustics. Sound"

Showing 20 of ~1425011 results · from CrossRef, arXiv, DOAJ, Semantic Scholar

arXiv Open Access 2026
Reciprocal Latent Fields for Precomputed Sound Propagation

Hugo Seuté, Pranai Vasudev, Etienne Richan et al.

Realistic sound propagation is essential for immersion in a virtual scene, yet physically accurate wave-based simulations remain computationally prohibitive for real-time applications. Wave coding methods address this limitation by precomputing and compressing impulse responses of a given scene into a set of scalar acoustic parameters, which can reach unmanageable sizes in large environments with many source-receiver pairs. We introduce Reciprocal Latent Fields (RLF), a memory-efficient framework for encoding and predicting these acoustic parameters. The RLF framework employs a volumetric grid of trainable latent embeddings decoded with a symmetric function, ensuring acoustic reciprocity. We study a variety of decoders and show that leveraging Riemannian metric learning leads to a better reproduction of acoustic phenomena in complex scenes. Experimental validation demonstrates that RLF maintains replication quality while reducing the memory footprint by several orders of magnitude. Furthermore, a MUSHRA-like subjective listening test indicates that sound rendered via RLF is perceptually indistinguishable from ground-truth simulations.

en cs.SD, cs.LG
arXiv Open Access 2025
ESDD 2026: Environmental Sound Deepfake Detection Challenge Evaluation Plan

Han Yin, Yang Xiao, Rohan Kumar Das et al.

Recent advances in audio generation systems have enabled the creation of highly realistic and immersive soundscapes, which are increasingly used in film and virtual reality. However, these audio generators also raise concerns about potential misuse, such as generating deceptive audio content for fake videos and spreading misleading information. Existing datasets for environmental sound deepfake detection (ESDD) are limited in scale and audio types. To address this gap, we have proposed EnvSDD, the first large-scale curated dataset designed for ESDD, consisting of 45.25 hours of real and 316.7 hours of fake sound. Based on EnvSDD, we are launching the Environmental Sound Deepfake Detection Challenge. Specifically, we present two different tracks: ESDD in Unseen Generators and Black-Box Low-Resource ESDD, covering various challenges encountered in real-life scenarios. The challenge will be held in conjunction with the 2026 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2026).

en cs.SD
arXiv Open Access 2025
Context-Aware Query Refinement for Target Sound Extraction: Handling Partially Matched Queries

Ryo Sato, Chiho Haruta, Nobuhiko Hiruma et al.

Target sound extraction (TSE) is the task of extracting a target sound specified by a query from an audio mixture. Much prior research has focused on the problem setting under the Fully Matched Query (FMQ) condition, where the query specifies only active sounds present in the mixture. However, in real-world scenarios, queries may include inactive sounds that are not present in the mixture. This leads to scenarios such as the Fully Unmatched Query (FUQ) condition, where only inactive sounds are specified in the query, and the Partially Matched Query (PMQ) condition, where both active and inactive sounds are specified. Among these conditions, the performance degradation under the PMQ condition has been largely overlooked. To achieve robust TSE under the PMQ condition, we propose context-aware query refinement. This method eliminates inactive classes from the query during inference based on the estimated sound class activity. Experimental results demonstrate that while conventional methods suffer from performance degradation under the PMQ condition, the proposed method effectively mitigates this degradation and achieves high robustness under diverse query conditions.

en eess.AS, cs.SD
arXiv Open Access 2025
Multizone sound field reproduction with direction-of-arrival-distribution-based regularization and its application to binaural-centered mode-matching

Ryo Matsuda, Makoto Otani

In higher-order Ambisonics, a framework for sound field reproduction, secondary-source driving signals are generally obtained by regularized mode matching. The authors have proposed a regularization technique based on direction-of-arrival (DoA) distribution of wavefronts in the primary sound field. Such DoA-distribution-based regularization enables a suppression of excessively large driving signal gains for secondary sources that are in the directions far from the primary source direction. This improves the reproduction accuracy at regions away from the reproduction center. First, this study applies the DoA-distribution-based regularization to a multizone sound field reproduction based on the addition theorem. Furthermore, the regularized multizone sound field reproduction is extended to a binaural-centered mode matching (BCMM), which produces two reproduction points, one at each ear, to avoid a degraded reproduction accuracy due to a shrinking sweet spot at higher frequencies. Free-field and binaural simulations were numerically performed to examine the effectiveness of the DoA-distribution-based regularization on the multizone sound field reproduction and the BCMM.

en eess.AS, cs.SD
arXiv Open Access 2025
SonicGauss: Position-Aware Physical Sound Synthesis for 3D Gaussian Representations

Chunshi Wang, Hongxing Li, Yawei Luo

While 3D Gaussian representations (3DGS) have proven effective for modeling the geometry and appearance of objects, their potential for capturing other physical attributes, such as sound, remains largely unexplored. In this paper, we present a novel framework dubbed SonicGauss for synthesizing impact sounds from 3DGS representations by leveraging their inherent geometric and material properties. Specifically, we integrate a diffusion-based sound synthesis model with a PointTransformer-based feature extractor to infer material characteristics and spatial-acoustic correlations directly from Gaussian ellipsoids. Our approach supports spatially varying sound responses conditioned on impact locations and generalizes across a wide range of object categories. Experiments on the ObjectFolder dataset and real-world recordings demonstrate that our method produces realistic, position-aware auditory feedback. The results highlight the framework's robustness and generalization ability, offering a promising step toward bridging 3D visual representations and interactive sound synthesis. Project page: https://chunshi.wang/SonicGauss

en cs.SD, cs.MM
DOAJ Open Access 2025
Failure Detection of Powertrain Components in Motor Vehicles Using Vibroacoustic Methods

Balázs József KRISTON, Károly JÁLICS

Although noise and vibration measurements are widespread in machine diagnostics, they are not used in the diagnostics of the powertrain of motor vehicles. Our research aims to investigate the possibilities, advantages, and drawbacks of using noise and vibration diagnostics for motor vehicles. In this paper, we attempt to use vibroacoustic signals from a motor vehicle for diagnostic purposes. Ordinary audible malfunctions, for example, misfiring in a passenger car, were artificially created. The differences between the normal and faulty operating conditions were examined to identify evidence of failure in the vibration signal. Primarily, evaluation through Fourier transformation was performed to provide a visual correlation between the fault and the vibration behavior of the car. Detailed conclusions from the measurements and future research plans are discussed.

Acoustics. Sound
DOAJ Open Access 2025
A novel ultrasound-responsive cluster bomb system for efficient siRNA delivery in brain

Tianyu Guo, Feihong Dong, Jingyi Yin et al.

RNA-based therapeutics using RNA interference have become a research hotspot for brain tumors and neurodegenerative diseases with the advancement of nanocarrier delivery technology. However, even with specific modifications, RNA-loaded nanoparticles face significant challenges in effectively crossing the blood-brain barrier (BBB) to achieve precise delivery of therapeutic agents to the brain. Focused ultrasound combined with microbubbles and nanodroplets has emerged as a promising approach for temporarily opening the BBB. However, the low drug loading capacity and fixed stimulation focus of these methods limit their integration with current nano-drug delivery systems. Herein, we introduced a fluorinated surfactant and developed an ultrasound-responsive siRNA delivery carrier that contains nanodroplets loaded with siRNA-carrying nanoparticles (siRNA@NP@ND), termed the “ultrasound-responsive ‘cluster bomb’ nanoplatform”. Under precise and flexible guidance and stimulation through a programmable diagnostic ultrasound, siRNA@NP@ND demonstrated over a seventy-fold increase in efficiency for delivering siRNA to the mouse brain. Additionally, Evans blue staining and hematological analysis indicated that ultrasound-triggered cavitation could reversibly open the BBB for up to 48 h without causing significant immune or inflammatory responses. The minor intracranial hemorrhage resulting from this process was also shown to be recoverable. Our research provides an advanced and controllable delivery platform for gene therapy of intracranial central nervous system diseases.

Chemistry, Acoustics. Sound
DOAJ Open Access 2025
Acoustic virtual sensors for industrial process monitoring using non-negative matrix factorization

Clara Luzon-Alvarez, Maximo Cobos, Jesus Lopez-Ballester et al.

In modern industrial environments, efficient and non-invasive monitoring of machinery operations is crucial for ensuring process optimization and early fault detection. Traditional physical sensors, while effective, can be costly and impractical to deploy extensively across complex systems. This paper introduces an innovative approach leveraging non-negative matrix factorization (NMF) to create acoustic virtual sensors that analyze sound spectrograms for real-time industrial process monitoring. By decomposing acoustic signals captured from machinery into distinct spectral components, the proposed method enables the detection of specific operational phases and potential anomalies. While the methodology is demonstrated using a plastic injection molding machine, it is designed to be adaptable to a wide range of industrial processes where machinery generates distinct acoustic signatures. The approach involves capturing high-fidelity acoustic data, applying NMF to extract activation matrices that represent unique acoustic patterns, and using clustering techniques to ensure robust identification of operational states across different environments. This generalizable framework allows for scalable monitoring solutions across various industrial applications, from manufacturing lines to heavy machinery operations. This study highlights the potential of acoustic virtual sensors as a cost-effective, scalable solution for industrial monitoring, offering new possibilities for predictive maintenance and anomaly detection in diverse manufacturing environments.

Acoustics. Sound, Electronic computers. Computer science
DOAJ Open Access 2025
Ultrasonic-assisted soldering of 7075 Al alloy joint using Ni mesh reinforced SAC305 composite solder: microstructure, bonding ratio, and mechanical properties

Dan Li, Yong Xiao, Yu Zhang et al.

Suppressing solder overflow has significant implications for promoting the application of ultrasonic-assisted soldering. In this work, an innovative strategy of adding metal mesh into Sn-based solder was utilized. 7075 Al alloy joints were ultrasonically soldered with Ni mesh reinforced SAC305 composite solders. The microstructure, bonding ratio, and shear properties of joints were systematically explored. Results showed that solder seams primarily consisted of Ni mesh, SAC305 solder, α-Al phase, Ag3Al2 phase, (Ni, Cu)3Sn4 phase, Al3(Ni, Cu)2 phase, and dispersed fine particles. The bonding interface between Ni mesh and Al substrate could be divided into contact and non-contact regions. A polycrystalline Al3(Ni, Cu)2 phase and a Cu-Al-O amorphous layer were formed at the contact regions. The bonding ratio of joints was mainly affected by the cavitation effects within non-contact regions. Adding Ni meshes could enhance the acoustic pressure and accelerate the flow of local solder in the solder seams. The decrease in the bonding ratio was attributed to the excessive solder flow, which induced the formation of defects. Benefiting from the intrinsic strengthening of Ni mesh and metallurgical reaction strengthening, Al/250#Ni-SAC/Al joints exhibited a shear strength of 71.87 MPa.

Chemistry, Acoustics. Sound
S2 Open Access 2022
Metamaterial-based real-time communication with high information density by multipath twisting of acoustic wave

Kai Wu, Jingjing Liu, Yu-jiang Ding et al.

Speeding up the transmission of information carried by waves is of fundamental interest for wave physics, with pivotal significance for underwater communications. To overcome the current limitations in information transfer capacity, here we propose and experimentally validate a mechanism using multipath sound twisting to realize real-time high-capacity communication free of signal processing or sensor scanning. The undesired channel crosstalk, conventionally reduced via time-consuming postprocessing, is virtually suppressed by using a metamaterial layer as a purely passive demultiplexer with high spatial selectivity. Furthermore, the compactness of the system ensures high information density crucial for acoustics-based applications. A distinct example of complicated image transmission is experimentally demonstrated, showing as many independent channels as the path number multiplied by the vortex mode number and an extremely low bit error rate of nearly 1/10 of the forward error correction limit. Our strategy opens an avenue to a metamaterial-based high-capacity communication paradigm compatible with the conventional multiplexing mechanisms, with far-reaching impact on acoustics and other domains. Here, the authors demonstrate multipath twisting of acoustic waves with a thin metamaterial layer enabling high-speed transfer of information with no time-consuming post-processing or sensor scanning, showing important application potential in underwater communication.

85 citations en Medicine
S2 Open Access 2022
INRAS: Implicit Neural Representation for Audio Scenes

Kun Su, Mingfei Chen, Eli Shlizerman

The spatial acoustic information of a scene, i.e., how sounds emitted from a particular location in the scene are perceived in another location, is key for immersive scene modeling. A robust representation of a scene’s acoustics can be formulated through a continuous field formulation along with impulse responses varied by emitter-listener locations. The impulse responses are then used to render sounds perceived by the listener. While such a representation is advantageous, parameterization of impulse responses for generic scenes presents itself as a challenge. Indeed, traditional pre-computation methods have only implemented parameterization at discrete probe points and require large storage, while other existing methods such as geometry-based sound simulations still suffer from an inability to simulate all wave-based sound effects. In this work, we introduce a novel neural network for light-weight Implicit Neural Representation for Audio Scenes (INRAS), which can render high-fidelity time-domain impulse responses at arbitrary emitter-listener positions by learning a continuous implicit function. INRAS disentangles the scene’s geometry features with three modules to generate independent features for the emitter, the geometry of the scene, and the listener respectively. These lead to an efficient reuse of scene-dependent features and support effective multi-condition training for multiple scenes. Our experimental results show that INRAS outperforms existing approaches for representation and rendering of sounds for varying emitter-listener locations in all aspects, including the impulse response quality, inference speed, and storage requirements.

81 citations en Computer Science
arXiv Open Access 2024
Leveraging Sound Source Trajectories for Universal Sound Separation

Donghang Wu, Xihong Wu, Tianshu Qu

Existing methods utilizing spatial information for sound source separation require prior knowledge of the direction of arrival (DOA) of the source or utilize estimated but imprecise localization results, which impairs the separation performance, especially when the sound sources are moving. In fact, sound source localization and separation are interconnected problems, that is, sound source localization facilitates sound separation while sound separation contributes to refined source localization. This paper proposes a method utilizing the mutual facilitation mechanism between sound source localization and separation for moving sources. The proposed method comprises three stages. The first stage is initial tracking, which tracks each sound source from the audio mixture based on the source signal envelope estimation. These tracking results may lack sufficient accuracy. The second stage involves mutual facilitation: Sound separation is conducted using preliminary sound source tracking results. Subsequently, sound source tracking is performed on the separated signals, thereby refining the tracking precision. The refined trajectories further improve separation performance. This mutual facilitation process can be iterated multiple times. In the third stage, a neural beamformer estimates precise single-channel separation results based on the refined tracking trajectories and multi-channel separation outputs. Simulation experiments conducted under reverberant conditions and with moving sound sources demonstrate that the proposed method can achieve more accurate separation based on refined tracking results.

en eess.AS, cs.SD
DOAJ Open Access 2024
Acoustic waves in gas-filled structured porous media: Asymptotic tortuosity/compliability and characteristic-lengths reevaluated to incorporate the influence of spatial dispersion

Lafarge D.

This study extends efforts to incorporate spatial dispersion into Biot-Allard’s theory, with a focus on poroelastic media with intricate microgeometries where spatial dispersion effects play a significant role. While preserving Biot’s small-scale quasi-“en-bloc” frame motion to keep the structure of Biot-Allard’s theory intact, the paper challenges Biot’s quasi-incompressibility of fluid motion at that scale by introducing structurations in the form of Helmholtz’s resonators. Consequently, Biot-Allard’s theory undergoes a significant augmentation, marked by the arising of non-local dynamic tortuosity and compliability, which are associated with potentially resonant fluid behavior. Building on an acoustic-electromagnetic analogy, the study defines these non-local responses and suggests simplifying them into pseudo-local ones, now potentially resonant and reminiscent of Veselago-type phenomena. In the high-frequency limit of small boundary layers and as an extension of the classical Johnson-Allard’s findings, simple field-averaged formulas are demonstrated for pseudo-local ideal-fluid tortuosity and compliability (complex frequency-dependent) and viscous and thermal characteristic lengths (positive frequency-dependent). These formulations are grounded in the Umov-Heaviside-Poynting thermodynamic macroscopic acoustic stress concept, suggested by the analogy. Future computational investigations, spanning various fundamental microgeometries, are planned to assess assumed pseudo-local simplifications, encompass low- and intermediate frequencies, and unveil potential behavioral outcomes resulting from the incorporation of spatial dispersion effects.

Acoustics in engineering. Acoustical engineering, Acoustics. Sound
DOAJ Open Access 2024
Acoustic arrival predictions using oceanographic measurements and models in the Beaufort Sea

Jessica B. Desrochers, Lora J. Van Uffelen, Sarah E. Webster

Acoustic propagation in the Beaufort Sea is particularly sensitive to upper-ocean sound-speed structure due to the presence of a subsurface duct known as the Beaufort duct. Comparisons of acoustic predictions based on existing Arctic models with predictions based on in situ data collected by Seaglider vehicles in the summer of 2017 show differences in the strength, depth, and number of ducts, highlighting the importance of in situ data. These differences have a significant effect on the later, more intense portion of the acoustic time front referred to as reverse geometric dispersion, where lower-order modes arrive prior to the final cutoff.

Acoustics. Sound
DOAJ Open Access 2024
Method development and validation for the extraction and quantification of sesquiterpene lactones in Dolomiaea costus

Mohammed Aldholmi

Dolomiaea costus, commonly known as Indian costus, is a medicinal plant from the Asteraceae family. The root and powder of costus have been widely used to treat various health conditions. The primary bioactive compounds in this plant are sesquiterpene lactones, particularly costunolide and dehydrocostus lactone. This study aimed to establish a rapid, environmentally friendly, and cost-effective method for the high-throughput extraction and quantification of sesquiterpene lactones in Indian costus. Ultrasonic bath (UB) and UPLC/MS-MS were employed to extract and analyse 49 Indian costus samples. Aqueous ethanol was identified as the most effective solvent system for extracting and analysing sesquiterpene lactones. The extraction efficiency of the ultrasonic bath was comparable to that of the ultrasonic homogeniser while shaking showed the lowest efficiency. The environmentally friendly UPLC/MS-MS analysis revealed mean concentrations (±SD; μg/100 μg) of 1.00 (±0.39) for costunolide and 0.70 (±0.25) for dehydrocostus lactone. An inverse correlation was observed between sesquiterpene lactone content and sample colour. Most samples contained costunolide levels above the minimum limit (0.6 %) specified by the Chinese monograph, but only a few met the 1.8 % threshold for total sesquiterpene lactones. Given the importance of bioactive sesquiterpene lactones for medicinal efficacy, insufficient levels may result in diminished therapeutic value. Therefore, standardising Indian costus products is crucial to ensure quality and appropriate dosing. This study contributes to the standardisation of Indian costus, a vital step towards ensuring the efficacy and safety of herbal products.

Chemistry, Acoustics. Sound
arXiv Open Access 2023
Measuring Acoustics with Collaborative Multiple Agents

Yinfeng Yu, Changan Chen, Lele Cao et al.

As humans, we hear sound every second of our life. The sound we hear is often affected by the acoustics of the environment surrounding us. For example, a spacious hall leads to more reverberation. Room Impulse Responses (RIR) are commonly used to characterize environment acoustics as a function of the scene geometry, materials, and source/receiver locations. Traditionally, RIRs are measured by setting up a loudspeaker and microphone in the environment for all source/receiver locations, which is time-consuming and inefficient. We propose to let two robots measure the environment's acoustics by actively moving and emitting/receiving sweep signals. We also devise a collaborative multi-agent policy where these two robots are trained to explore the environment's acoustics while being rewarded for wide exploration and accurate prediction. We show that the robots learn to collaborate and move to explore environment acoustics while minimizing the prediction error. To the best of our knowledge, we present the very first problem formulation and solution to the task of collaborative environment acoustics measurements with multiple agents.

en cs.AI, cs.MA
arXiv Open Access 2023
SSL-Net: A Synergistic Spectral and Learning-based Network for Efficient Bird Sound Classification

Yiyuan Yang, Kaichen Zhou, Niki Trigoni et al.

Efficient and accurate bird sound classification is important for ecology, habitat protection and scientific research, as it plays a central role in monitoring the distribution and abundance of species. However, prevailing methods typically demand extensively labeled audio datasets and have highly customized frameworks, imposing substantial computational and annotation loads. In this study, we present an efficient and general framework called SSL-Net, which combines spectral and learned features to identify different bird sounds. Encouraging empirical results gleaned from a standard field-collected bird audio dataset validate the efficacy of our method in extracting features efficiently and achieving heightened performance in bird sound classification, even when working with limited sample sizes. Furthermore, we present three feature fusion strategies, aiding engineers and researchers in their selection through quantitative analysis.

en cs.SD, eess.AS

Page 14 of 71251