Results for "Acoustics. Sound"

DOAJ Open Access 2026

Advances in ultrasound-activated nano-sonosensitizers for cancer treatment: a systematic review and meta-analysis

Yasin Ayyami, Masoumeh Dastgir, Maedeh Yektamanesh et al.

Ultrasound (US)-activated nanotherapies represent a transformative frontier in oncology, leveraging the noninvasive deep penetration of acoustic energy for targeted tumor destruction. However, this field lacks a quantitative synthesis to guide the rational design of sono-sensitizing nanoparticles (NPs) and the optimization of therapeutic protocols. To address this issue, we performed a systematic review and meta-analysis of 144 recent research reports, establishing an evidence-based design hierarchy for nano-sonosensitizers. A meta-analysis of 86 in vitro studies revealed a profound synergistic reduction in cell viability when NPs were activated by US (pooled standardized mean difference, SMD = −13.16, 95% CI [−14.67, −11.64], p < 0.001). NP size was the most influential design factor: particles smaller than 50 nm showed the greatest effect in vitro (SMD = −13.32) and the strongest tumor reduction in vivo (SMD = 5), a finding consistent with optimal exploitation of the enhanced permeability and retention (EPR) effect for tumor accumulation. Specific roles of materials were identified: polymeric NPs excelled in drug delivery (SMD = −19.36 versus US alone), while inorganic NPs served as direct sonocatalysts. This work provides a definitive quantitative framework to advance US-activated nanotherapy from exploratory discovery to clinical precision. To realize this potential, we recommend the adoption of standardized acoustic dosimetry and material characterization and the inclusion of safety studies (ISO 10993). We hope that this review will accelerate the development of fundamental studies and therapeutic US applications for sonosensitizing NPs.
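The pooled SMD and confidence interval quoted in this abstract come from combining per-study effect sizes. As a rough illustration only, the sketch below uses simple inverse-variance (fixed-effect) pooling. The abstract does not say which pooling model the authors used, and the per-study values here are hypothetical, not taken from the paper.

```python
import math

def pool_fixed_effect(effects, variances):
    """Inverse-variance (fixed-effect) pooled estimate with a 95% CI."""
    weights = [1.0 / v for v in variances]           # precise studies weigh more
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    se = math.sqrt(1.0 / total)                      # standard error of the pooled estimate
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)

# Hypothetical per-study SMDs and variances, for illustration only.
smd = [-12.0, -14.0, -13.0]
var = [1.0, 1.0, 2.0]
pooled, ci = pool_fixed_effect(smd, var)
print(f"pooled SMD = {pooled:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
```

A random-effects model would additionally estimate between-study variance and add it to each study's variance before weighting.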

Chemistry, Acoustics. Sound

DOAJ Open Access 2026

Impact of crystalline type on starch-polyphenol complexation under sequential gelatinization and ultrasonication

Sandra Kusumawardani, Naphatrapi Luangsakul

Inhibiting starch digestion is an effective approach to regulating postprandial blood glucose levels, and ultrasound-assisted modification has emerged as a promising non-thermal strategy to tailor starch structure and functionality. In this study, sequential gelatinization and ultrasonication at varying amplitudes was employed to assess how starches of different crystalline types (A, B, and C type) interact with red rice bran polyphenols. The results demonstrated that a moderate amplitude of 60% provided the optimal ultrasound condition, significantly enhancing polyphenol binding and promoting molecular rearrangement across all starch types. B type starch exhibited the highest polyphenol binding capacity. In contrast, A and C type starches showed more pronounced increases in crystalline order and thermal stability, indicating stronger molecular reorganization under ultrasound stress. X-ray diffraction revealed no new peaks in any starch type, indicating that complexation occurred via non-V-type interactions. Ultrasound at 60% amplitude effectively reduced rapidly digestible starch while increasing the slowly digestible and resistant starch fractions, resulting in a lower estimated glycemic index within the medium range (65–68). These findings elucidate the role of ultrasonic cavitation in modulating starch-polyphenol interactions and demonstrate how starch crystalline structure governs complexation behavior and digestion resistance. Overall, the present study contributes a promising strategy for the development of functional food ingredients, particularly for slowing starch digestion.

Chemistry, Acoustics. Sound

arXiv Open Access 2025

Segmenting Collision Sound Sources in Egocentric Videos

Kranti Kumar Parida, Omar Emara, Hazel Doughty et al.

Humans excel at multisensory perception and can often recognise object properties from the sound of their interactions. Inspired by this, we propose the novel task of Collision Sound Source Segmentation (CS3), where we aim to segment the objects responsible for a collision sound in visual input (i.e. video frames from the collision clip), conditioned on the audio. This task presents unique challenges. Unlike isolated sound events, a collision sound arises from interactions between two objects, and the acoustic signature of the collision depends on both. We focus on egocentric video, where sounds are often clear, but the visual scene is cluttered, objects are small, and interactions are brief. To address these challenges, we propose a weakly-supervised method for audio-conditioned segmentation, utilising foundation models (CLIP and SAM2). We also incorporate egocentric cues, i.e. objects in hands, to find acting objects that can potentially be collision sound sources. Our approach outperforms competitive baselines by $3\times$ and $4.7\times$ in mIoU on two benchmarks we introduce for the CS3 task: EPIC-CS3 and Ego4D-CS3.

en cs.CV, cs.SD

arXiv Open Access 2025

Spike Encoding for Environmental Sound: A Comparative Benchmark

Andres Larroza, Javier Naranjo-Alcazar, Vicent Ortiz et al.

Spiking Neural Networks (SNNs) offer energy efficient processing suitable for edge applications, but conventional sensor data must first be converted into spike trains for neuromorphic processing. Environmental sound, including urban soundscapes, poses challenges due to variable frequencies, background noise, and overlapping acoustic events, while most spike based audio encoding research has focused on speech. This paper analyzes three spike encoding methods, Threshold Adaptive Encoding (TAE), Step Forward (SF), and Moving Window (MW) across three datasets: ESC10, UrbanSound8K, and TAU Urban Acoustic Scenes. Our multiband analysis shows that TAE consistently outperforms SF and MW in reconstruction quality, both per frequency band and per class across datasets. Moreover, TAE yields the lowest spike firing rates, indicating superior energy efficiency. For downstream environmental sound classification with a standard SNN, TAE also achieves the best performance among the compared encoders. Overall, this work provides foundational insights and a comparative benchmark to guide the selection of spike encoders for neuromorphic environmental sound processing.
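To make the encoders compared in this abstract more concrete, here is a minimal sketch of one common formulation of Step Forward (SF) encoding and its reconstruction. The paper's exact variant, thresholds, and multiband front end are not given in the abstract, so the function names, the sample signal, and the threshold below are all illustrative assumptions.

```python
def step_forward_encode(signal, threshold):
    """Emit +1/-1 spikes when the signal moves more than `threshold` from a running baseline."""
    base = signal[0]
    spikes = []
    for x in signal:
        if x > base + threshold:
            spikes.append(1)
            base += threshold
        elif x < base - threshold:
            spikes.append(-1)
            base -= threshold
        else:
            spikes.append(0)
    return spikes

def step_forward_decode(spikes, threshold, start):
    """Rebuild a staircase approximation by accumulating the signed spikes."""
    level = start
    out = []
    for s in spikes:
        level += s * threshold
        out.append(level)
    return out

# Hypothetical samples and threshold, for illustration only.
signal = [0.0, 0.3, 0.6, 0.9, 0.5, 0.1]
spikes = step_forward_encode(signal, threshold=0.25)
recon = step_forward_decode(spikes, threshold=0.25, start=signal[0])
firing_rate = sum(1 for s in spikes if s != 0) / len(spikes)
print(spikes, recon, firing_rate)
```

Reconstruction error between `signal` and `recon`, together with `firing_rate`, gives the kind of quality-versus-spike-count trade-off that the benchmark evaluates.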

en cs.SD, cs.ET

arXiv Open Access 2025

Description and Discussion on DCASE 2025 Challenge Task 4: Spatial Semantic Segmentation of Sound Scenes

Masahiro Yasuda, Binh Thien Nguyen, Noboru Harada et al.

Spatial Semantic Segmentation of Sound Scenes (S5) aims to enhance technologies for sound event detection and separation from multi-channel input signals that mix multiple sound events with spatial information. This is a fundamental basis of immersive communication. The ultimate goal is to separate sound event signals with 6 Degrees of Freedom (6DoF) information into dry sound object signals and metadata describing the object type (sound event class) and spatial information, including direction. However, because several existing challenge tasks already provide some of the subset functions, this year's task focuses on detecting and separating sound events from multi-channel spatial input signals. This paper outlines the S5 task setting of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2025 Challenge Task 4 and the DCASE2025 Task 4 Dataset, newly recorded and curated for this task. We also report experimental results for an S5 system trained and evaluated on this dataset. The full version of this paper will be published after the challenge results are made public.

en cs.SD, eess.AS

arXiv Open Access 2025

Transmission of High-Amplitude Sound through Leakages of Ill-fitting Earplugs

Haocheng Yu, Krishan K. Ahuja, Lakshmi N. Sankar et al.

High sound pressure levels (SPL) pose notable risks in loud environments, particularly due to noise-induced hearing loss. Ill-fitting earplugs often lead to sound leakage, a phenomenon this study seeks to investigate. To validate our methodology, we first obtained computational and experimental acoustic transmission data for stand-alone slit resonators and orifices, for which extensive published data are readily available for comparison. We then examined the frequency-dependent acoustic power absorption coefficient and transmission loss (TL) across various leakage geometries, modeled using different orifice diameters. Experimental approaches spanned a frequency range of 1--5 kHz under SPL conditions of 120--150 dB. Key findings reveal that unsealed silicone rubber earplugs demonstrate an average TL reduction of approximately 18 dB at an overall incident SPL (OISPL) of 120 dB. Direct numerical simulations further highlight SPL-dependent acoustic dissipation mechanisms, showing the conversion of acoustic energy into vorticity in ill-fitting earplug models at an OISPL of 150 dB. These results highlight the role of earplug design for high-sound-pressure-level environments.
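For readers less familiar with the decibel quantities in this abstract, the sketch below assumes the standard power-ratio definition of transmission loss, TL = 10 log10(W_incident / W_transmitted). The abstract does not spell out the definition the authors used, and the function names are illustrative.

```python
import math

def transmission_loss_db(incident_power, transmitted_power):
    """TL in dB under the assumed power-ratio definition."""
    return 10.0 * math.log10(incident_power / transmitted_power)

def power_ratio_from_db(delta_db):
    """Linear power ratio that corresponds to a level difference in dB."""
    return 10.0 ** (delta_db / 10.0)

tl_example = transmission_loss_db(100.0, 1.0)   # 1% of incident power transmitted
leak_ratio = power_ratio_from_db(18.0)          # effect of an 18 dB drop in TL
print(tl_example, leak_ratio)
```

Under this definition, an 18 dB reduction in TL means roughly 63 times more acoustic power passes the earplug than in the sealed case.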

en cs.SD, eess.AS

arXiv Open Access 2025

HergNet: a Fast Neural Surrogate Model for Sound Field Predictions via Superposition of Plane Waves

Matteo Calafà, Yuanxin Xia, Cheol-Ho Jeong

We present a novel neural network architecture for the efficient prediction of sound fields in two and three dimensions. The network is designed to automatically satisfy the Helmholtz equation, ensuring that the outputs are physically valid. Therefore, the method can effectively learn solutions to boundary-value problems in various wave phenomena, such as acoustics, optics, and electromagnetism. Numerical experiments show that the proposed strategy can potentially outperform state-of-the-art methods in room acoustics simulation, in particular in the range of mid to high frequencies.
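The key property named in the title, that a superposition of plane waves satisfies the Helmholtz equation by construction, can be checked numerically. The sketch below is not the HergNet architecture: it uses fixed, hypothetical amplitudes and directions in two dimensions where a network would presumably supply them, and it verifies that the finite-difference residual of ∇²p + k²p is close to zero.

```python
import cmath
import math

def plane_wave_field(x, y, k, angles, amplitudes):
    """Sum of 2D plane waves a_j * exp(i k (x cos t_j + y sin t_j))."""
    return sum(a * cmath.exp(1j * k * (math.cos(t) * x + math.sin(t) * y))
               for t, a in zip(angles, amplitudes))

def helmholtz_residual(x, y, k, angles, amplitudes, h=1e-3):
    """Five-point finite-difference estimate of laplacian(p) + k^2 p at (x, y)."""
    def f(u, v):
        return plane_wave_field(u, v, k, angles, amplitudes)
    lap = (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h) - 4 * f(x, y)) / h**2
    return lap + k**2 * f(x, y)

# Hypothetical wavenumber, directions, and complex amplitudes.
k = 2.0
angles = [0.0, 1.0, 2.5]
amplitudes = [1.0, 0.5 - 0.2j, 0.3j]
residual = helmholtz_residual(0.4, -0.7, k, angles, amplitudes)
print(abs(residual))  # small; limited by finite-difference error
```

Because each term satisfies the equation exactly, any choice of amplitudes yields a physically valid field, so training only has to fit the boundary conditions.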

en cs.SD, cs.CE

DOAJ Open Access 2025

A chaotic behavior and stability analysis on quasi-zero stiffness vibration isolators with multi-control methodologies

Taher A Bahnasy, TS Amer, A Almahalawy et al.

Quasi-zero stiffness vibration isolators (QZSVIs) are used in applications such as precision instruments, aerospace, microelectronics manufacturing, and seismic isolation to protect sensitive equipment from low-frequency vibrations. Their key advantage lies in achieving near-zero stiffness, allowing for highly effective vibration attenuation while maintaining system stability. These passive systems are cost-effective and reliable, offering superior vibration isolation without the need for external power or active control. This work proposes the use of negative displacement, velocity, and cubic velocity feedback control techniques to enhance the QZSVI’s isolation performance. We found that the composite negative velocity and cubic velocity control (NVFC + NCVFC) is more effective, at low cost, than the other types of controller (its effectiveness is about 94.8%). The approximate solutions (AS) of the governing system of equations of motion (EOM) are obtained using a multiple-scales procedure (MSP) up to the second order and are subsequently validated numerically with the fourth-order Runge–Kutta method (RKM). Modulation equations (ME) are obtained by exploring resonance cases and solvability conditions. Time history graphs and frequency response curves, generated via MATLAB and Wolfram Mathematica 13.2, are presented to analyze stability and steady-state solutions. The effect of varying the parameters on the system amplitude is investigated. Poincaré maps, Lyapunov exponent spectra (LEs), and bifurcation diagrams are presented to illustrate the system’s diverse behavior patterns. Furthermore, the transmissibility of force, displacement, and acceleration is computed and displayed. A QZSVI minimizes low-frequency vibrations, making it well suited to precision applications in metrology, automotive, aerospace, civil engineering, medical equipment, and renewable energy. It achieves superior damping, ensuring high stability and precision.
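The numerical validation step mentioned in this abstract can be sketched with a classical fourth-order Runge–Kutta integrator. The equation of motion below is a hypothetical stand-in, not the paper's model: a free oscillator with purely cubic stiffness (the quasi-zero stiffness idealization), light passive damping `c`, and added negative velocity and cubic velocity feedback with assumed gains `g1` and `g3`.

```python
def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def make_isolator(c, gamma, g1, g3):
    """Hypothetical model: x'' + c x' + gamma x^3 = -g1 x' - g3 x'^3."""
    def rhs(t, y):
        x, v = y
        return [v, -c * v - gamma * x**3 - g1 * v - g3 * v**3]
    return rhs

def energy(y, gamma):
    """Kinetic plus quartic potential energy of the state [x, v]."""
    x, v = y
    return 0.5 * v * v + 0.25 * gamma * x**4

def simulate(rhs, y0, h=0.01, steps=2000):
    y, t = list(y0), 0.0
    for _ in range(steps):
        y = rk4_step(rhs, t, y, h)
        t += h
    return y

gamma = 1.0
y0 = [1.0, 0.0]
passive = simulate(make_isolator(0.05, gamma, 0.0, 0.0), y0)
controlled = simulate(make_isolator(0.05, gamma, 0.5, 0.5), y0)
print(energy(passive, gamma), energy(controlled, gamma))
```

With the feedback terms switched on, the residual energy after the same simulated time is much lower, which is the qualitative effect the velocity-feedback controllers are meant to deliver.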

Control engineering systems. Automatic machinery (General), Acoustics. Sound

arXiv Open Access 2024

The role of direct sound spherical harmonics representation in externalization using binaural reproduction

Eran Miller, Boaz Rafaely

The importance of the information in the direct sound to human perception of spatial sound sources is an ongoing research topic. The classification between direct sound and diffuse or reverberant sound forms the basis of numerous studies in the field of spatial audio. In particular, parametric spatial audio representation methods use this classification and employ signal processing in order to enhance the audio quality at reproduction. However, current literature does not provide information concerning the impact of ideal direct sound representation on externalization, in the context of Ambisonics. This paper aims to assess the importance of the spatial information in the direct sound in the externalization of a sound field when using binaural reproduction. This is done in the spherical harmonics (SH) domain, where an ideal direct sound representation within an otherwise Ambisonics signal is simulated, and its perceived externalization is evaluated in a formal listening test. This investigation leads to the conclusion that externalization of a first order Ambisonics signal may be significantly improved by enhancing the direct sound component, up to a level similar to a third order Ambisonics signal.

en eess.AS, cs.SD

arXiv Open Access 2024

Timbre Difference Capturing in Anomalous Sound Detection

Tomoya Nishida, Harsh Purohit, Kota Dohi et al.

This paper proposes a framework for explaining anomalous machine sounds in the context of anomalous sound detection (ASD). While ASD has been extensively explored, identifying how anomalous sounds differ from normal sounds is also beneficial for machine condition monitoring. However, existing sound difference captioning methods require anomalous sounds for training, which is impractical in typical machine condition monitoring settings where such sounds are unavailable. To solve this issue, we propose a new strategy for explaining anomalous differences that does not require anomalous sounds for training. Specifically, we introduce a framework that explains differences in predefined timbre attributes instead of using free-form text captions. Objective metrics of timbre attributes can be computed using timbral models developed through psycho-acoustical research, enabling the estimation of which timbre attributes have changed from normal sounds, and how, without training machine learning models. Additionally, to accurately determine timbre differences regardless of variations in normal training data, we developed a method that jointly conducts anomalous sound detection and timbre difference estimation based on a k-nearest neighbors method in an audio embedding space. Evaluation using the MIMII DG dataset demonstrated the effectiveness of the proposed method.
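One plausible reading of the joint k-nearest-neighbors step is sketched below: the anomaly score is the mean distance from a query embedding to its k nearest normal embeddings, and the timbre difference is the query's attribute value minus the mean over those same neighbors. The abstract gives no implementation details, so the function name, the attribute name `sharpness`, and all data are hypothetical.

```python
import math

def knn_score_and_timbre_diff(q_emb, q_timbre, normal_embs, normal_timbres, k=2):
    """Return (anomaly score, per-attribute timbre difference) from the k nearest normal clips."""
    nearest = sorted(range(len(normal_embs)),
                     key=lambda i: math.dist(q_emb, normal_embs[i]))[:k]
    score = sum(math.dist(q_emb, normal_embs[i]) for i in nearest) / k
    diff = {name: q_timbre[name] - sum(normal_timbres[i][name] for i in nearest) / k
            for name in q_timbre}
    return score, diff

# Hypothetical embeddings and timbre metrics for three normal clips and one query.
normal_embs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
normal_timbres = [{"sharpness": 1.0}, {"sharpness": 2.0}, {"sharpness": 3.0}]
score, diff = knn_score_and_timbre_diff((3.0, 4.0), {"sharpness": 4.5},
                                        normal_embs, normal_timbres)
print(score, diff)
```

Comparing against the nearest normal clips, instead of a global average, is what would make the difference estimate robust to variation within the normal training data.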

en eess.AS, cs.SD

DOAJ Open Access 2024

Response analysis of the vibro-impact system under fractional-order joint random excitation

Jun Wang, Zijian Yang, Wanqi Sun et al.

Viscoelastic materials have good damping properties and are widely used in machinery, civil engineering, and other fields. In this paper, the viscoelasticity of the system is described by fractional differentiation. The dynamic response of a unilateral vibro-impact system with a viscoelastic oscillator under joint random excitation is studied, where the joint random excitation is composed of additive and multiplicative white noise. The fractional-order derivative is calculated based on Caputo’s definition and is replaced by an equivalent linear damping force and linear restoring force. As a result, a new random system without fractional-order terms is obtained. A non-smooth transformation is then introduced to convert the original system into an equivalent system without a velocity jump. The steady-state probability density functions of fractional-order vibro-impact systems under joint random excitation are solved by using the stochastic averaging method and the non-smooth transformation. In addition, the effects of parameters on the steady-state response of the system are analyzed.

Control engineering systems. Automatic machinery (General), Acoustics. Sound

DOAJ Open Access 2024

A novel ultrasound-assisted enzyme extraction method of total flavonoids from Viticis Fructus and processed Viticis Fructus: Comparison of in vitro antioxidant activity

Yuman Li, Qing Zhang, Qi Fang et al.

This study is the first to use Viticis Fructus (VF) as the raw material for extracting total flavonoids with the ultrasound-assisted enzyme extraction (UAE) method. Response surface methodology was employed to determine the optimal extraction parameters. The optimal conditions were as follows: 60 % ethanol solution as the extraction solvent, material–liquid ratio of 1:25, pH value of 4, enzyme addition amount of 1.5 %, enzymatic hydrolysis time of 30 min, enzymatic hydrolysis temperature of 40 ℃, and ultrasonic time of 50 min. Comparing the total flavonoid yield of VF and processed VF (PVF) extracted using different methods, it was observed that UAE resulted in a higher total flavonoid yield than traditional ultrasound extraction and enzyme extraction. Additionally, the total flavonoid yield of PVF extracted by all three methods was generally higher than that of VF. The PVF solution extracted by UAE also demonstrated better in vitro antioxidant activity than that of VF. These results suggest that UAE is an effective method to enhance the activity of natural total flavonoids. The study of the physicochemical properties and in vitro antioxidant activity of VF and PVF showed that the total flavonoid yield and antioxidant activity increased significantly after stir-frying VF, indicating that its efficacy can also be enhanced by processing.

Chemistry, Acoustics. Sound

S2 Open Access 2022

Visual Acoustic Matching

Changan Chen, Ruohan Gao, P. Calamia et al.

We introduce the visual acoustic matching task, in which an audio clip is transformed to sound like it was recorded in a target environment. Given an image of the target environment and a waveform for the source audio, the goal is to re-synthesize the audio to match the target room acoustics as suggested by its visible geometry and materials. To address this novel task, we propose a cross-modal transformer model that uses audio-visual attention to inject visual properties into the audio and generate realistic audio output. In addition, we devise a self-supervised training objective that can learn acoustic matching from in-the-wild Web videos, despite their lack of acoustically mismatched audio. We demonstrate that our approach successfully translates human speech to a variety of real-world environments depicted in images, outperforming both traditional acoustic matching and more heavily supervised baselines.

66 citations en Computer Science, Engineering

arXiv Open Access 2023

Adaptive Representations of Sound for Automatic Insect Recognition

Marius Faiß, Dan Stowell

Insect population numbers and biodiversity have been rapidly declining, and monitoring these trends has become increasingly important for conservation measures to be effectively implemented. But monitoring methods are often invasive, time- and resource-intensive, and prone to various biases. Many insect species produce characteristic sounds that can easily be detected and recorded without large cost or effort. Using deep learning methods, insect sounds from field recordings could be automatically detected and classified to monitor biodiversity and species distribution ranges. We implement this using recently published datasets of insect sounds (Orthoptera and Cicadidae) and machine learning methods and evaluate their potential for acoustic insect monitoring. We compare the performance of the conventional spectrogram-based audio representation against LEAF, a new adaptive and waveform-based frontend. LEAF achieved better classification performance than the mel-spectrogram frontend by adapting its feature extraction parameters during training. This result is encouraging for future implementations of deep learning technology for automatic insect sound recognition, especially as larger datasets become available.

en cs.SD, eess.AS

arXiv Open Access 2023

Kernel Interpolation of Incident Sound Field in Region Including Scattering Objects

Shoichi Koyama, Masaki Nakada, Juliano G. C. Ribeiro et al.

A method for estimating the incident sound field inside a region containing scattering objects is proposed. The sound field estimation method has various applications, such as spatial audio capturing and spatial active noise control; however, most existing methods do not take into account the presence of scatterers within the target estimation region. Although several techniques exist that employ knowledge or measurements of the properties of the scattering objects, it is usually difficult to obtain them precisely in advance, and their properties may change during the estimation process. Our proposed method is based on the kernel ridge regression of the incident field, with a separation from the scattering field represented by a spherical wave function expansion, thus eliminating the need for prior modeling or measurements of the scatterers. Moreover, we introduce a weighting matrix to induce smoothness of the scattering field in the angular direction, which alleviates the effect of the truncation order of the expansion coefficients on the estimation accuracy. Experimental results indicate that the proposed method achieves a higher level of estimation accuracy than the kernel ridge regression without separation.
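The baseline this abstract compares against, kernel ridge regression of the sound field without separating the scattered component, can be sketched as follows. It assumes the kernel commonly used for 3D fields that satisfy the Helmholtz equation, sin(kr)/(kr) between two positions at distance r. The microphone layout, wavenumber, regularization value, and plane-wave test field are hypothetical, and the proposed method's scattering-field expansion and weighting matrix are not modeled here.

```python
import cmath
import math

def sinc_kernel(r1, r2, k):
    """Assumed kernel: zeroth-order spherical Bessel function of k * distance."""
    d = math.dist(r1, r2)
    return 1.0 if d == 0.0 else math.sin(k * d) / (k * d)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit(mics, pressures, k, lam):
    """Weights alpha = (K + lam I)^-1 s for measured pressures s."""
    gram = [[sinc_kernel(p, q, k) + (lam if i == j else 0.0)
             for j, q in enumerate(mics)] for i, p in enumerate(mics)]
    return solve(gram, pressures)

def estimate(point, mics, alpha, k):
    """Interpolated pressure at an arbitrary point."""
    return sum(a * sinc_kernel(point, m, k) for a, m in zip(alpha, mics))

# Hypothetical setup: four microphones sampling a unit plane wave travelling along x.
k = 5.0
mics = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.5)]
pressures = [cmath.exp(1j * k * m[0]) for m in mics]
alpha = fit(mics, pressures, k, lam=1e-9)
at_mic = estimate(mics[1], mics, alpha, k)
print(abs(at_mic - pressures[1]))  # near zero: the fit reproduces the measurements
```

With a very small regularization value the interpolant passes through the measurements; larger values trade that exactness for robustness to noise.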

en cs.SD, eess.AS

arXiv Open Access 2023

Cross-domain Sound Recognition for Efficient Underwater Data Analysis

Jeongsoo Park, Dong-Gyun Han, Hyoung Sul La et al.

This paper presents a novel deep learning approach for analyzing massive underwater acoustic data by leveraging a model trained on a broad spectrum of non-underwater (aerial) sounds. Recognizing the challenge in labeling vast amounts of underwater data, we propose a two-fold methodology to accelerate this labor-intensive procedure. The first part of our approach involves PCA and UMAP visualization of the underwater data using the feature vectors of an aerial sound recognition model. This enables us to cluster the data in a two dimensional space and listen to points within these clusters to understand their defining characteristics. This innovative method simplifies the process of selecting candidate labels for further training. In the second part, we train a neural network model using both the selected underwater data and the non-underwater dataset. We conducted a quantitative analysis to measure the precision, recall, and F1 score of our model for recognizing airgun sounds, a common type of underwater sound. The F1 score achieved by our model exceeded 84.3%, demonstrating the effectiveness of our approach in analyzing underwater acoustic data. The methodology presented in this paper holds significant potential to reduce the amount of labor required in underwater data analysis and opens up new possibilities for further research in the field of cross-domain data analysis.
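The evaluation metrics named in this abstract follow the usual definitions. The short sketch below computes them from detection counts; the counts are hypothetical and are not the paper's airgun results.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Hypothetical counts for one sound class.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(precision, recall, f1)
```

F1 is the harmonic mean of precision and recall, so it stays low unless both are reasonably high.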

en cs.SD, cs.LG

DOAJ Open Access 2023

Elastic and thermo-elastic characterizations of thin resin films using colored picosecond acoustics and spectroscopic ellipsometry

A. Devos, F. Chevreux, C. Licitra et al.

Colored Picosecond Acoustics (CPA) and Spectroscopic Ellipsometry (SE) are combined to measure the elastic and thermoelastic properties of polymer thin-film resins deposited on 300 mm wafers. Film thickness and refractive index are measured using SE. Sound velocity and thickness are then measured using CPA, given the refractive index. Comparing the two thicknesses provides a consistency check between the two approaches. The same combination is then applied at various temperatures from 19 °C to 180 °C. As the sample is heated, both thickness and sound velocity change. By monitoring these contributions separately, the Temperature Coefficient of sound Velocity (TCV) and the Coefficient of Thermal Expansion are deduced. The protocol is applied to five industrial samples made of different thin-film resins currently used by the microelectronics industry. Young’s modulus varies from resin to resin by up to 20%. TCV is large for each resin and varies from one resin to another by up to 57%.

Physics, Acoustics. Sound

DOAJ Open Access 2023

Design and Experiments of A New Internal Cone Type Traveling Wave Ultrasonic Motor

Ye CHEN, Junlin YANG, Liang LI et al.

To simplify the motor structure, reduce the difficulty of applying rotor pre-pressure, and obtain better output performance, a new internal cone type rotating traveling wave ultrasonic motor is proposed. A parametric model of the internal cone type ultrasonic motor was established in the ANSYS finite element software. The ultrasonic motor consists of an internal cone type vibrator and a tapered rotor. A dynamic analysis of the motor vibrator is carried out, and two orthogonal in-plane third-order bending modes with the same frequency are selected as the working modes. Another advantage of this motor is that pre-pressure can be imposed by the weight of the rotor. A prototype was manufactured and experimentally tested for its vibration characteristics and output performance. When the excitation frequency is 22260.0 Hz, the pre-pressure is 0.1 N, and the peak-to-peak excitation voltage is 300 V, the maximum output torque of the prototype is 1.06 N · mm, and the maximum no-load speed reaches 441.2 rpm. The optimal pre-pressure force under different loads is studied, and the influence of the pre-pressure force on the mechanical properties of the ultrasonic motor is analyzed. The results offer guidance for the practical application of this ultrasonic motor.

Acoustics. Sound

DOAJ Open Access 2023

The effect of natural antioxidants, pH, and green solvents upon catechins stability during ultrasonic extraction from green tea leaves (Camellia sinensis)

Rizwan Ahmad, Mohammed Aldholmi, Aljawharah Alqathama et al.

Background: This is the first report to evaluate the effect of natural antioxidants, pH, and green solvents on catechin yield and stability during the active process of extraction from green tea leaves. Methodology: Green solvents (model-A) augmented with piperine (PPN) and quercetin (QT) as natural antioxidants (model-B) at different pH values of 2–6 (model-C) were used to extract catechins from green tea leaves using an ultrasonic extraction process (USE). For quantification of catechins (EC, epicatechin; ECG, epicatechin gallate; and EGCG, epigallocatechin gallate), a green and sensitive UHPLC-MS/MS method was developed and validated. Results: The UHPLC-MS/MS method showed an accuracy of 98.3–102.6 % within the linearity range of 1–500 ppb for EC (m/z) 289 → 245 → 109, ECG (m/z) 441.2 → 169 → 289, and EGCG (m/z) 457.1 → 169 → 125.1. The yield (ppb) ranges and sums (N = 180) were 0.06–157.80 and 6696.83 for EC, 0.04–316.93 and 12632.60 for ECG, and 0.12–584.11 and 26144.83 for EGCG. Model-C gave the highest catechin yield at the lowest pH (pH 2), with individual catechin yields of EGCG (584.11) > ECG (316.93) > EC (157.80) in CW2. In terms of stability, EGCG was the most unstable catechin, whereas catechins extracted in model-B exhibited more stability (%recovery of 14.70 for EC, 10.55 for ECG, and 5.36 for EGCG in BEP). Moreover, model-B showed the least degradation of catechins, within the range of 11.81–94.64 (BEP); even EGCG, the most degradable catechin, showed the smallest %loss of 11.81–94.64 at 24–70 h, compared with a loss of > 95 % in model-A and model-C. The ANOVA results for catechin yield were F(11, 168) = 61.06 (EC), F(11, 168) = 66.53 (ECG), and F(11, 168) = 48.92 (EGCG) (P = 0.00), with mean scores of (M = 94.63, SD = 25.46) for EC, (M = 194.87, SD = 51.41) for ECG, and (M = 357.57, SD = 96.80) for EGCG in CE2.
Conclusion: A significant effect on catechin yield and stability was observed with the use of natural antioxidants and the lowest pH (pH 2).

Chemistry, Acoustics. Sound

arXiv Open Access 2022

Denoising Induction Motor Sounds Using an Autoencoder

Thanh Tran, Sebastian Bader, Jan Lundgren

Denoising is the process of removing noise from sound signals while improving the quality and adequacy of the sound signals. Denoising sound has many applications in speech processing, sound event classification, and machine failure detection systems. This paper describes a method for creating an autoencoder to map noisy machine sounds to clean sounds for denoising purposes. There are several types of noise in sounds, for example, environmental noise and frequency-dependent noise generated by signal processing methods. Noise generated by environmental activities is environmental noise. In a factory, environmental noise can be created by vehicles, drilling, people working or talking in the survey area, wind, and flowing water. These noises appear as spikes in the sound recording. In the scope of this paper, we demonstrate the removal of generated noise with a Gaussian distribution and of environmental noise, using the specific example of water sink faucet noise, from induction motor sounds. The proposed method was trained and verified on 49 normal function sounds and 197 horizontal misalignment fault sounds from the Machinery Fault Database (MAFAULDA). The mean square error (MSE) was used as the assessment criterion to evaluate the similarity between the sounds denoised by the proposed autoencoder and the original sounds in the test set. The MSE is below or equal to 0.14 when denoising both types of noise on 15 testing sounds of the normal function category. The MSE is below or equal to 0.15 when denoising 60 testing sounds of the horizontal misalignment fault category. The low MSE shows that both the generated Gaussian noise and the environmental noise were almost entirely removed from the original sounds by the trained autoencoder.

en cs.SD, cs.AI
