Results for "q-fin.PM"

Showing 20 of ~1,535,124 results · from CrossRef, arXiv, Semantic Scholar

S2 Open Access 2014
Ultrasensitive terahertz sensing with high-Q Fano resonances in metasurfaces

Ranjan Singh, W. Cao, I. Al-Naib et al.

High quality factor resonances are extremely promising for designing ultrasensitive, label-free refractive-index sensors, since they allow intense interaction between electromagnetic waves and the analyte material. Metamaterial and plasmonic sensing have recently attracted a lot of attention due to the subwavelength confinement of electromagnetic fields in the resonant structures. However, exciting high quality factor resonances in these systems has been a challenge. We excite an order of magnitude higher quality factor resonances in planar terahertz metamaterials that we exploit for ultrasensitive sensing. The low-loss quadrupole and Fano resonances with extremely narrow linewidths enable us to measure the minute spectral shift caused by the smallest change in the refractive index of the surrounding media. We achieve sensitivity levels of 7.75 × 10³ nm per refractive index unit (RIU) with the quadrupole resonance and 5.7 × 10⁴ nm/RIU with the Fano resonance, which could be further enhanced by using thinner substrates. These findings would facilitate the design of ultrasensitive real-time chemical and biomolecular sensors in the fingerprint region of the terahertz regime.

610 citations en Physics
S2 Open Access 2024
From $r$ to $Q^*$: Your Language Model is Secretly a Q-Function

Rafael Rafailov, Joey Hejna, Ryan Park et al.

Reinforcement Learning From Human Feedback (RLHF) has been critical to the success of the latest generation of generative AI models. In response to the complex nature of the classical RLHF pipeline, direct alignment algorithms such as Direct Preference Optimization (DPO) have emerged as an alternative approach. Although DPO solves the same objective as the standard RLHF setup, there is a mismatch between the two approaches. Standard RLHF deploys reinforcement learning in a specific token-level MDP, while DPO is derived as a bandit problem in which the whole response of the model is treated as a single arm. In this work we rectify this difference. We theoretically show that we can derive DPO in the token-level MDP as a general inverse Q-learning algorithm, which satisfies the Bellman equation. Using our theoretical results, we provide three concrete empirical insights. First, we show that because of its token level interpretation, DPO is able to perform some type of credit assignment. Next, we prove that under the token level formulation, classical search-based algorithms, such as MCTS, which have recently been applied to the language generation space, are equivalent to likelihood-based search on a DPO policy. Empirically we show that a simple beam search yields meaningful improvement over the base DPO policy. Finally, we show how the choice of reference policy causes implicit rewards to decline during training. We conclude by discussing applications of our work, including information elicitation in multi-turn dialogue, reasoning, agentic applications and end-to-end training of multi-model systems.

247 citations en Computer Science
S2 Open Access 2023
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions

Yevgen Chebotar, Q. Vuong, A. Irpan et al.

In this work, we present a scalable reinforcement learning method for training multi-task policies from large offline datasets that can leverage both human demonstrations and autonomously collected data. Our method uses a Transformer to provide a scalable representation for Q-functions trained via offline temporal difference backups. We therefore refer to the method as Q-Transformer. By discretizing each action dimension and representing the Q-value of each action dimension as separate tokens, we can apply effective high-capacity sequence modeling techniques for Q-learning. We present several design decisions that enable good performance with offline RL training, and show that Q-Transformer outperforms prior offline RL algorithms and imitation learning techniques on a large diverse real-world robotic manipulation task suite. The project's website and videos can be found at https://qtransformer.github.io

144 citations en Computer Science
arXiv Open Access 2026
Understanding the Long-Only Minimum Variance Portfolio

Nick L. Gunther, Alec N. Kercheval, Ololade Sowunmi

For a covariance matrix coming from a factor model of returns, we investigate the relationship between the long-only global minimum variance portfolio and the asset exposures to the factors. In the case of a 1-factor model, we provide a rigorous and explicit description of the long-only solution in terms of the parameters of the covariance matrix. For $q>1$ factors, we provide a description of the long-only portfolio in geometric terms. The results are illustrated with empirical daily returns of US stocks.
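The paper derives a closed-form description of the long-only solution; as a generic illustration (not the authors' construction), the long-only global minimum variance portfolio can be computed numerically for a toy 1-factor covariance matrix, with all parameters below made up:

```python
import numpy as np
from scipy.optimize import minimize

# Toy 1-factor covariance: Sigma = var_f * beta beta^T + diag(idio).
# (Illustrative parameters, not taken from the paper.)
beta = np.array([0.5, 1.0, 1.5, 2.5])   # factor exposures
var_f = 0.04                            # factor variance
idio = np.full(4, 0.02)                 # idiosyncratic variances
Sigma = var_f * np.outer(beta, beta) + np.diag(idio)
n = len(beta)

def variance(w):
    return w @ Sigma @ w

# Long-only GMV: minimise w' Sigma w subject to sum(w) = 1, w >= 0.
res = minimize(
    variance,
    x0=np.full(n, 1.0 / n),
    constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
    bounds=[(0.0, None)] * n,
)
w = res.x
```

With the long-only constraint active, high-exposure assets are driven toward zero weight, which is the qualitative behaviour the paper characterises exactly.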

en q-fin.MF, q-fin.PM
CrossRef Open Access 2025
Pengaruh Emosional, Kualitas Produk, dan Harga terhadap Kepuasan Konsumen Pada Toko PM Collection Pekanbaru (The Influence of Emotional Factors, Product Quality, and Price on Consumer Satisfaction at the PM Collection Pekanbaru Store)

Sephia Amanda, Mashur Fadli

The fashion industry is currently growing very rapidly, as reflected in rising market demand. PM Collection Pekanbaru, which operates in the fashion sector, therefore needs to adapt to changing trends and compete effectively in order to survive in the market. This study aims to examine the influence of emotional factors, product quality, and price on consumer satisfaction at PM Collection Pekanbaru. The study uses a quantitative method with a sample of 99 respondents. Descriptive statistics were applied, with measurements processed in IBM SPSS 29 (Statistical Package for the Social Sciences); the data source is primary data collected by distributing questionnaires to PM Collection Pekanbaru consumers. Sampling followed a non-probability approach using accidental sampling. The results show that emotional factors have a significant effect on consumer satisfaction, product quality has a significant positive effect on consumer satisfaction, and price also has a significant positive effect on consumer satisfaction. Taken simultaneously, emotional factors, product quality, and price all affect consumer satisfaction. It can be concluded that emotional factors, product quality, and price are important factors in increasing consumer satisfaction: the better the emotional experience, product quality, and pricing offered by PM Collection Pekanbaru, the higher consumer satisfaction with PM Collection Pekanbaru will be.

arXiv Open Access 2025
Distributionally Robust Deep Q-Learning

Chung I Lu, Julian Sester, Aijia Zhang

We propose a novel distributionally robust $Q$-learning algorithm for the non-tabular case accounting for continuous state spaces where the state transition of the underlying Markov decision process is subject to model uncertainty. The uncertainty is taken into account by considering the worst-case transition from a ball around a reference probability measure. To determine the optimal policy under the worst-case state transition, we solve the associated non-linear Bellman equation by dualising and regularising the Bellman operator with the Sinkhorn distance, which is then parameterized with deep neural networks. This approach allows us to modify the Deep Q-Network algorithm to optimise for the worst-case state transition. We illustrate the tractability and effectiveness of our approach through several applications, including a portfolio optimisation task based on S&P 500 data.

en cs.LG, math.OC
arXiv Open Access 2025
Deep Reinforcement Learning for Optimal Asset Allocation Using DDPG with TiDE

Rongwei Liu, Jin Zheng, John Cartlidge

The optimal asset allocation between risky and risk-free assets is a persistent challenge due to the inherent volatility in financial markets. Conventional methods rely on strict distributional assumptions or non-additive reward ratios, which limit their robustness and applicability to investment goals. To overcome these constraints, this study formulates the optimal two-asset allocation problem as a sequential decision-making task within a Markov Decision Process (MDP). This framework enables the application of reinforcement learning (RL) mechanisms to develop dynamic policies based on simulated financial scenarios, regardless of prerequisites. We use the Kelly criterion to balance immediate reward signals against long-term investment objectives, and we take the novel step of integrating the Time-series Dense Encoder (TiDE) into the Deep Deterministic Policy Gradient (DDPG) RL framework for continuous decision-making. We compare DDPG-TiDE with a simple discrete-action Q-learning RL framework and a passive buy-and-hold investment strategy. Empirical results show that DDPG-TiDE outperforms Q-learning and generates higher risk adjusted returns than buy-and-hold. These findings suggest that tackling the optimal asset allocation problem by integrating TiDE within a DDPG reinforcement learning framework is a fruitful avenue for further exploration.
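The Kelly criterion mentioned in the abstract has a standard closed form for a single risky asset in continuous time, $f^* = (\mu - r)/\sigma^2$; a minimal sketch with made-up parameters (this is the textbook result, not the paper's DDPG-TiDE reward design):

```python
# Continuous-time Kelly fraction for a risky asset with drift mu,
# volatility sigma, and risk-free rate r: f* = (mu - r) / sigma**2.
# Parameters are illustrative, not from the paper.
def kelly_fraction(mu: float, r: float, sigma: float) -> float:
    return (mu - r) / sigma ** 2

f = kelly_fraction(mu=0.08, r=0.02, sigma=0.2)
# f exceeds 1 here, i.e. the unconstrained Kelly bet is leveraged;
# for a long-only two-asset split one would cap it to [0, 1].
f_capped = min(max(f, 0.0), 1.0)
```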

en q-fin.PM, cs.AI
arXiv Open Access 2025
Solving dynamic portfolio selection problems via score-based diffusion models

Ahmad Aghapour, Erhan Bayraktar, Fengyi Yuan

In this paper, we tackle the dynamic mean-variance portfolio selection problem in a model-free manner, based on (generative) diffusion models. We propose using data sampled from the real model $\mathbb P$ (which is unknown) with limited size to train a generative model $\mathbb Q$ (from which we can easily and adequately sample). With adaptive training and sampling methods that are tailor-made for time series data, we obtain quantification bounds between $\mathbb P$ and $\mathbb Q$ in terms of the adapted Wasserstein metric $\mathcal{AW}_2$. Importantly, the proposed adapted sampling method also facilitates conditional sampling. In the second part of this paper, we provide the stability of the mean-variance portfolio optimization problems in $\mathcal{AW}_2$. Then, combined with the error bounds and the stability result, we propose a policy gradient algorithm based on the generative environment, in which our innovative adapted sampling method provides approximate scenario generators. We illustrate the performance of our algorithm on both simulated and real data. For real data, the algorithm based on the generative environment produces portfolios that beat several important baselines, including the Markowitz portfolio, the equal weight (naive) portfolio, and the S&P 500.

en q-fin.PM, stat.ML
arXiv Open Access 2025
Latent Variable Estimation in Bayesian Black-Litterman Models

Thomas Y. L. Lin, Jerry Yao-Chieh Hu, Paul W. Chiou et al.

We revisit the Bayesian Black-Litterman (BL) portfolio model and remove its reliance on subjective investor views. Classical BL requires an investor "view": a forecast vector $q$ and its uncertainty matrix $\Omega$ that describe how much a chosen portfolio should outperform the market. Our key idea is to treat $(q, \Omega)$ as latent variables and learn them from market data within a single Bayesian network. Consequently, the resulting posterior estimation admits closed-form expression, enabling fast inference and stable portfolio weights. Building on these, we propose two mechanisms to capture how features interact with returns: shared-latent parametrization and feature-influenced views; both recover classical BL and Markowitz portfolios as special cases. Empirically, on 30-year Dow-Jones and 20-year sector-ETF data, we improve Sharpe ratios by 50% and cut turnover by 55% relative to Markowitz and the index baselines. This work turns BL into a fully data-driven, view-free, and coherent Bayesian framework for portfolio optimization.
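For context, the classical BL posterior mean that this paper builds on combines the equilibrium prior $\pi$ with the views $(q, \Omega)$; a sketch of that standard formula with toy inputs (the paper's contribution, learning $q$ and $\Omega$ as latents, is not reproduced here):

```python
import numpy as np

# Classical Black-Litterman posterior mean:
#   E[R] = [(tau*Sigma)^-1 + P' Omega^-1 P]^-1 [(tau*Sigma)^-1 pi + P' Omega^-1 q]
# All numbers below are made-up toy inputs for two assets and one view.
def bl_posterior_mean(pi, Sigma, P, q, Omega, tau=0.05):
    A = np.linalg.inv(tau * Sigma)          # precision of the prior
    Oinv = np.linalg.inv(Omega)             # precision of the views
    return np.linalg.solve(A + P.T @ Oinv @ P, A @ pi + P.T @ Oinv @ q)

pi = np.array([0.04, 0.06])                 # equilibrium returns
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])            # asset covariance
P = np.array([[1.0, -1.0]])                 # one relative view: asset 1 vs asset 2
q = np.array([0.02])                        # view: asset 1 outperforms by 2%
Omega = np.array([[0.001]])                 # view uncertainty
mu_bl = bl_posterior_mean(pi, Sigma, P, q, Omega)
```

The posterior spread between the two assets lands between the prior spread (−2%) and the stated view (+2%), weighted by the two precisions.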

en q-fin.PM, cs.LG
arXiv Open Access 2024
Kullback-Leibler cluster entropy to quantify volatility correlation and risk diversity

L. Ponta, A. Carbone

The Kullback-Leibler cluster entropy $\mathcal{D_{C}}[P \| Q]$ is evaluated for the empirical and model probability distributions $P$ and $Q$ of the clusters formed in the realized volatility time series of five assets (S&P 500, NASDAQ, DJIA, DAX, FTSE MIB). The Kullback-Leibler functional $\mathcal{D_{C}}[P \| Q]$ provides complementary perspectives about the stochastic volatility process compared to the Shannon functional $\mathcal{S_{C}}[P]$. While $\mathcal{D_{C}}[P \| Q]$ is maximum at the short time scales, $\mathcal{S_{C}}[P]$ is maximum at the large time scales, leading to complementary optimization criteria tracing back respectively to the maximum and minimum relative entropy evolution principles. The realized volatility is modelled as a time-dependent fractional stochastic process characterized by power-law decaying distributions with positive correlation ($H>1/2$). As a case study, a multiperiod portfolio is built on diversity indexes derived from the Kullback-Leibler entropy measure of the realized volatility. The portfolio is robust and exhibits better performance over the horizon periods. A comparison with the portfolio built either according to the uniform distribution or in the framework of the Markowitz theory is also reported.
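The core quantity is a discrete Kullback-Leibler divergence between an empirical cluster distribution and a model one; a minimal sketch with toy histograms (the paper evaluates it on realized-volatility cluster statistics, which are not reproduced here):

```python
import numpy as np

# Discrete Kullback-Leibler divergence D[P || Q] = sum_i p_i log(p_i / q_i).
# p and q below are made-up probability vectors standing in for the
# empirical and model cluster distributions.
def kl_divergence(p, q):
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0                      # terms with p_i = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = np.array([0.5, 0.3, 0.2])        # empirical
q = np.array([0.4, 0.4, 0.2])        # model
d = kl_divergence(p, q)
```

By Gibbs' inequality `d` is non-negative and vanishes only when the two distributions coincide, which is what makes it usable as a diversity measure.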

en q-fin.ST, physics.data-an
arXiv Open Access 2024
Beyond Monte Carlo: Harnessing Diffusion Models to Simulate Financial Market Dynamics

Andrew Lesniewski, Giulio Trigila

We propose a highly efficient and accurate methodology for generating synthetic financial market data using a diffusion model approach. The synthetic data produced by our methodology align closely with observed market data in several key aspects: (i) they pass the two-sample Cramer - von Mises test for portfolios of assets, and (ii) Q - Q plots demonstrate consistency across quantiles, including in the tails, between observed and generated market data. Moreover, the covariance matrices derived from a large set of synthetic market data exhibit significantly lower condition numbers compared to the estimated covariance matrices of the observed data. This property makes them suitable for use as regularized versions of the latter. For model training, we develop an efficient and fast algorithm based on numerical integration rather than Monte Carlo simulations. The methodology is tested on a large set of equity data.

en q-fin.CP, cs.AI
arXiv Open Access 2024
Strict universality of the square-root law in price impact across stocks: a complete survey of the Tokyo stock exchange

Yuki Sato, Kiyoshi Kanazawa

Universal power laws have been scrutinised in physics and beyond, and a long-standing debate exists in econophysics regarding the strict universality of the nonlinear price impact, commonly referred to as the square-root law (SRL). The SRL posits that the average price impact $I$ follows a power law with respect to transaction volume $Q$, such that $I(Q) \propto Q^\delta$ with $\delta \approx 1/2$. Some researchers argue that the exponent $\delta$ should be system-specific, without universality. Conversely, others contend that $\delta$ should be exactly $1/2$ for all stocks across all countries, implying universality. However, resolving this debate requires high-precision measurements of $\delta$ with errors of around $0.1$ across hundreds of stocks, which has been extremely challenging due to the scarcity of large microscopic datasets -- those that enable tracking the trading behaviour of all individual accounts. Here we conclusively support the universality hypothesis of the SRL by a complete survey of all trading accounts for all liquid stocks on the Tokyo Stock Exchange (TSE) over eight years. Using this comprehensive microscopic dataset, we show that the exponent $\delta$ is equal to $1/2$ within statistical errors at both the individual stock level and the individual trader level. Additionally, we reject two prominent models supporting the nonuniversality hypothesis: the Gabaix-Gopikrishnan-Plerou-Stanley and the Farmer-Gerig-Lillo-Waelbroeck models (Nature 2003, QJE 2006, and Quant. Finance 2013). Our work provides exceptionally high-precision evidence for the universality hypothesis in social science and could prove useful in evaluating the price impact by large investors -- an important topic even among practitioners.
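The exponent measurement the abstract describes amounts to fitting $\delta$ in $I(Q) \propto Q^\delta$ on log-log axes; a sketch on synthetic data generated with $\delta = 0.5$ (purely illustrative, not the paper's TSE account-level estimator):

```python
import numpy as np

# Synthetic impact data: I(Q) = Q^0.5 times multiplicative log-normal noise.
rng = np.random.default_rng(0)
Q = np.logspace(0, 4, 200)                           # transaction volumes
I = Q ** 0.5 * np.exp(rng.normal(0.0, 0.05, Q.size)) # noisy square-root impact

# A power law is linear in log-log coordinates, so ordinary least squares
# on (log Q, log I) recovers delta as the slope.
delta_hat, intercept = np.polyfit(np.log(Q), np.log(I), 1)
# delta_hat is close to 0.5 for this noise level and sample size
```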

en q-fin.TR, cond-mat.stat-mech
S2 Open Access 2011
Deterministic design of wavelength scale, ultra-high Q photonic crystal nanobeam cavities.

Q. Quan, M. Lončar

Photonic crystal nanobeam cavities are versatile platforms of interest for optical communications, optomechanics, optofluidics, cavity QED, etc. In a previous work [Appl. Phys. Lett. 96, 203102 (2010)], we proposed a deterministic method to achieve ultrahigh Q cavities. This follow-up work provides systematic analysis and verifications of the deterministic design recipe and further extends the discussion to air-mode cavities. We demonstrate designs of dielectric-mode and air-mode cavities with Q > 10⁹, as well as dielectric-mode nanobeam cavities with both ultrahigh-Q (> 10⁷) and ultrahigh on-resonance transmissions (T > 95%).

430 citations en Physics, Medicine

Page 1 of 76,757