Results for "cs.SD"

Showing 20 of ~139085 results · from CrossRef, arXiv

arXiv Open Access 2026
Echoes of Ideology: Toward an Audio Analysis Pipeline to Unveil Character Traits in Historical Nazi Propaganda Films

Nicolas Ruth, Manuel Burghardt

This study investigates the use of computational audio analysis to examine ideological narratives in Nazi propaganda films. Employing a three-step pipeline, speaker diarization, audio transcription and psycholinguistic analysis, it reveals ideological patterns in characters. Despite current issues with speaker diarization, the methodology provides insights into character traits and propaganda narratives, suggesting scalable applications.

en cs.SD, eess.AS
arXiv Open Access 2026
A Mamba-Based Model for Automatic Chord Recognition

Chunyu Yuan, Johanna Devaney

In this work, we propose BMACE (Bidirectional Mamba-based network for Automatic Chord Estimation), an efficient Mamba-based model that uses selective structured state-space models in a bidirectional Mamba layer to model temporal dependencies effectively. Our model achieves prediction performance comparable to state-of-the-art models while requiring fewer parameters and lower computational resources.

en cs.SD
CrossRef Open Access 2024
Environmental sustainability and green logistics: Evidence from BRICS and Gulf countries by cross‐sectionally augmented autoregressive distributed lag (CS‐ARDL) approach

Manel Ouni, Khaled Ben Abdallah

Abstract. The logistics sector plays a crucial role in supporting various aspects of the economy, making it an essential part of a nation's development. However, this sector also contributes to environmental pollution through various emissions. The adoption of environmentally friendly logistics practices presents a promising solution to mitigate adverse environmental impacts. This study aims to investigate the influence of economic growth, green innovation, foreign direct investment, transport emissions, renewable energy, and trade openness on green logistics in both the BRICS countries (Brazil, Russia, India, China, and South Africa) and the Gulf countries from 1992 to 2020. This study used an advanced panel approach to obtain robust results, considering cross‐sectional dependency and slope heterogeneity. The cross‐sectionally augmented autoregressive distributed lag method was employed to analyze long and short‐run estimations. Our findings reveal that in Gulf countries, both transport emissions and foreign direct investment have a negative impact on green logistics. In the BRICS countries, economic growth, transport emissions, trade openness, renewable energy, and green innovation have a positive impact on green logistics. The study proposes several recommendations to improve logistics development in both groups of nations and promote sustainability. To achieve carbon neutrality, it is important to adopt green logistics, promote green investments, and support renewable energy, innovation, and sustainable growth.

arXiv Open Access 2024
An Exploratory Study of Multimodal Physiological Data in Jazz Improvisation Using Basic Machine Learning Techniques

Yawen Zhang

Our study delves into the "Embodied Musicking Dataset," exploring the intertwined relationships and correlations between physiological and psychological dimensions during improvisational music performances. The primary objective is to ascertain the presence of a definitive causal or correlational relationship between these states and comprehend their manifestation in musical compositions. This rich dataset provides a perspective on how musicians coordinate their physicality with sonic events in real-time improvisational scenarios, emphasizing the concept of "Embodied Musicking."

en cs.SD, eess.AS
arXiv Open Access 2022
TimbreCLIP: Connecting Timbre to Text and Images

Nicolas Jonason, Bob L. T. Sturm

We present work in progress on TimbreCLIP, an audio-text cross modal embedding trained on single instrument notes. We evaluate the models with a cross-modal retrieval task on synth patches. Finally, we demonstrate the application of TimbreCLIP on two tasks: text-driven audio equalization and timbre to image generation.

en cs.SD, cs.LG
CrossRef Open Access 2020
Semi-equilibrated global sea-level change projections for the next 10 000 years

Jonas Van Breedam, Heiko Goelzer, Philippe Huybrechts

Abstract. The emphasis for informing policy makers on future sea-level rise has been on projections by the end of the 21st century. However, due to the long lifetime of atmospheric CO2, the thermal inertia of the climate system and the slow equilibration of the ice sheets, global sea level will continue to rise on a multi-millennial timescale even when anthropogenic CO2 emissions cease completely during the coming decades to centuries. Here we present global sea-level change projections due to the melting of land ice combined with steric sea effects during the next 10 000 years calculated in a fully interactive way with the Earth system model of intermediate complexity LOVECLIMv1.3. The greenhouse forcing is based on the Extended Concentration Pathways defined until 2300 CE with no carbon dioxide emissions thereafter, equivalent to a cumulative CO2 release of between 460 and 5300 GtC. We performed one additional experiment for the highest-forcing scenario with the inclusion of a methane emission feedback where methane is slowly released due to a strong increase in surface and oceanic temperatures. After 10 000 years, the sea-level change rate drops below 0.05 m per century and a semi-equilibrated state is reached. The Greenland ice sheet is found to nearly disappear for all forcing scenarios. The Antarctic ice sheet contributes only about 1.6 m to sea level for the lowest forcing scenario with a limited retreat of the grounding line in West Antarctica. For the higher-forcing scenarios, the marine basins of the East Antarctic Ice Sheet also become ice free, resulting in a sea-level rise of up to 27 m. The global mean sea-level change after 10 000 years ranges from 9.2 to more than 37 m. For the highest-forcing scenario, the model uncertainty does not exclude the complete melting of the Antarctic ice sheet during the next 10 000 years.

arXiv Open Access 2019
Multitask Learning for Polyphonic Piano Transcription, a Case Study

Rainer Kelz, Sebastian Böck, Gerhard Widmer

Viewing polyphonic piano transcription as a multitask learning problem, where we need to simultaneously predict onsets, intermediate frames and offsets of notes, we investigate the performance impact of additional prediction targets, using a variety of suitable convolutional neural network architectures. We quantify performance differences of additional objectives on the large MAESTRO dataset.

en cs.SD, eess.AS
arXiv Open Access 2018
Statistical Speech Model Description with VMF Mixture Model

Zhanyu Ma, Arne Leijon

In this paper, we represent the LSF parameters in unit-vector form, which has directional characteristics. The underlying distribution of this unit vector variable is modeled by a von Mises-Fisher mixture model (VMM). With the high rate theory, the optimal inter-component bit allocation strategy is proposed and the distortion-rate (D-R) relation is derived for the VMM-based VQ (VVQ). Experimental results show that the VVQ outperforms our recently introduced DVQ and the conventional GVQ.

en cs.SD, eess.AS
arXiv Open Access 2018
SpeechPy - A Library for Speech Processing and Recognition

Amirsina Torfi

SpeechPy is an open-source Python package that contains speech preprocessing techniques, speech features, and important post-processing operations. It provides the most frequently used speech features, including MFCCs and filterbank energies, alongside the log-energy of filterbanks. The aim of the package is to provide researchers with a simple tool for speech feature extraction and processing in applications such as Automatic Speech Recognition and Speaker Verification.

en cs.SD, eess.AS
arXiv Open Access 2017
Modeling temporal constraints for a system of interactive scores

Mauricio Toro, Myriam Desainte-Catherine, Antoine Allombert

In this chapter we briefly explain the fundamentals of the interactive scores formalism. Then we develop a solution for implementing the ECO machine by mixing Petri nets and constraint propagation. We also present another solution for implementing the ECO machine using concurrent constraint programming. Finally, we present an extension of interactive scores with conditional branching.

en cs.SD, cs.LO
arXiv Open Access 2017
Machine listening intelligence

C. E. Cella

This manifesto paper will introduce machine listening intelligence, an integrated research framework for acoustic and musical signals modelling, based on signal processing, deep learning and computational musicology.

en cs.SD, cs.LG
CrossRef Open Access 2016
The enigma of intricately fitted beach boulders near Raglan, New Zealand

CS Nelson, SD Hood

Abstract. An intertidal rocky platform tucked in behind a rocky headland on open‐ocean Gibson Beach, near Raglan, supports an agglomeration of cobble‐ to large‐boulder‐sized clasts of Cenozoic sandstone and limestone. Rather than exhibiting just point contacts, many larger clasts are tightly interlocked and fitted with their neighbours and/or the underlying platform bedrock. Clast interface geometry relates to the strength contrast between adjacent rock types, linked to their calcite (cement) content. The end‐product is an armoured, highly stable framework of boulder clasts resembling a giant three‐dimensional jigsaw puzzle. While the direct impact of breaking waves likely plays a role in in situ jostling of boulders, we speculate that mechanical abrasion and fitting between larger clasts may also be promoted and maintained by in situ microvibration of the boulders as a consequence of wave‐induced microseismic shaking within the cliff‐backed rocky platform and headland, especially during major storm wave assault from the southwest.

2 citations en
arXiv Open Access 2016
Automatic Determination of Chord Roots

Samuel Rupprechter

Even though chord roots constitute a fundamental concept in music theory, existing models do not explain and determine them to full satisfaction. We present a new method which takes sequential context into account to resolve ambiguities and detect nonharmonic tones. We extract features from chord pairs and use a decision tree to determine chord roots. This leads to a quantitative improvement in correctness of the predicted roots in comparison to other models. All this raises the question of how much harmonic and nonharmonic tones actually contribute to the perception of chord roots.

en cs.SD
arXiv Open Access 2016
Breath Activity Detection Algorithm

Eric E. Hamke, Ramiro Jordan, Manel Ramon-Martinez

This report describes the use of a support vector machine with a novel kernel to determine the breathing rate and inhalation duration of a firefighter wearing a Self-Contained Breathing Apparatus. With this information, an incident commander can monitor the firefighters under his command for exhaustion and rotate personnel in time to ensure overall firefighter safety.

en cs.SD
CrossRef Open Access 2015
Impact of ice sheet meltwater fluxes on the climate evolution at the onset of the Last Interglacial

H. Goelzer, P. Huybrechts, M.-F. Loutre et al.

Abstract. Large climate perturbations occurred during Termination II when the ice sheets retreated from their glacial configuration. Here we investigate the impact of ice sheet changes and associated freshwater fluxes on the climate evolution at the onset of the Last Interglacial. The period from 135 to 120 kyr BP is simulated with the Earth system model of intermediate complexity LOVECLIM v.1.3 with prescribed evolution of the Antarctic ice sheet, the Greenland ice sheet and the other Northern Hemisphere ice sheets. Variations in meltwater fluxes from the Northern Hemisphere ice sheets lead to North Atlantic temperature changes and modifications of the strength of the Atlantic meridional overturning circulation. By means of the interhemispheric see-saw effect, variations in the Atlantic meridional overturning circulation also give rise to temperature changes in the Southern Hemisphere, which are modulated by the direct impact of Antarctic meltwater fluxes into the Southern Ocean. Freshwater fluxes from the melting Antarctic ice sheet lead to a millennial time scale oceanic cold event in the Southern Ocean with expanded sea ice as evidenced in some ocean sediment cores, which may be used to constrain the timing of ice sheet retreat.

arXiv Open Access 2015
Transformée en scattering sur la spirale temps-chroma-octave

Vincent Lostanlen, Stéphane Mallat

We introduce a scattering representation for the analysis and classification of sounds. It is locally translation-invariant, stable to deformations in time and frequency, and has the ability to capture harmonic structures. The scattering representation can be interpreted as a convolutional neural network which cascades a wavelet transform in time and along a harmonic spiral. We study its application for the analysis of the deformations of the source-filter model.

en cs.SD
arXiv Open Access 2015
Speech Dereverberation in the STFT Domain

Richard Stanton, Mike Brookes

Reverberation is damaging to both the quality and the intelligibility of a speech signal. We propose a novel single-channel method of dereverberation based on a linear filter in the Short Time Fourier Transform domain. Each enhanced frame is constructed from a linear sum of nearby frames based on the channel impulse response. The results show that the method can resolve any reverberant signal with knowledge of the impulse response to a non-reverberant signal.

en cs.SD
arXiv Open Access 2015
Deep Denoising Auto-encoder for Statistical Speech Synthesis

Zhenzhou Wu, Shinji Takaki, Junichi Yamagishi

This paper proposes a deep denoising auto-encoder technique to extract better acoustic features for speech synthesis. The technique allows us to automatically extract low-dimensional features from high dimensional spectral features in a non-linear, data-driven, unsupervised way. We compared the new stochastic feature extractor with conventional mel-cepstral analysis in analysis-by-synthesis and text-to-speech experiments. Our results confirm that the proposed method increases the quality of synthetic speech in both experiments.

en cs.SD, cs.LG

Page 2 of 6955