Existing reinforcement learning (RL) approaches treat large language models (LLMs) as a unified policy, overlooking their internal mechanisms. In this paper, we decompose the LLM-based policy into Internal Layer Policies and Internal Modular Policies via the Transformer's residual stream. Our entropy analysis of these internal policies reveals distinct patterns: (1) universally, policies evolve from high-entropy exploration in early layers to deterministic refinement in top layers; and (2) Qwen exhibits a progressive, human-like reasoning structure, contrasting with the abrupt final-layer convergence in Llama. Furthermore, we discover that optimizing internal layers induces feature refinement, forcing lower layers to capture high-level reasoning representations early. Motivated by these findings, we propose Bottom-up Policy Optimization (BuPO), a novel RL paradigm that reconstructs the LLM's reasoning foundation from the bottom up by optimizing internal layers in early stages. Extensive experiments on complex reasoning benchmarks demonstrate the effectiveness of BuPO. Our code is available at https://github.com/Trae1ounG/BuPO.
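As a rough illustration of the entropy analysis described above, the sketch below reads a token distribution off each layer's residual-stream state and computes its entropy. It assumes an internal layer policy can be approximated by applying the unembedding matrix directly to a layer's hidden state and taking a softmax; that definition, the names `hidden_states` and `unembed`, and the omission of any final normalization layer are illustrative assumptions, not the paper's stated formulation.

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def policy_entropy(probs):
    # Shannon entropy in nats; the small constant guards against log(0).
    return -(probs * np.log(probs + 1e-12)).sum(axis=-1)

def layer_entropies(hidden_states, unembed):
    """Entropy of the token distribution read off each layer.

    hidden_states: array of shape (num_layers, d_model), one residual-stream
                   vector per layer at a single position (hypothetical input).
    unembed:       array of shape (d_model, vocab_size) (hypothetical input).
    """
    return np.array([policy_entropy(softmax(h @ unembed)) for h in hidden_states])
```

Under these assumptions, a layer whose logits are all equal yields the maximum entropy log(vocab_size), while a sharply peaked layer yields an entropy near zero.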
This paper studies the welfare effects of self-preferencing by Airbnb, a practice where Airbnb utilizes its pricing algorithm to prioritize maximizing platform-wide commission revenue rather than optimizing individual host revenues. To examine this welfare implication, I construct a Bertrand competition model with differentiated products between Airbnb hosts and hotels. Using unique data from Tokyo's 23 wards, I estimate the model and conduct counterfactual simulations to evaluate the welfare effects of self-preferencing. Counterfactual simulations reveal that self-preferencing reduces social welfare by 5.08% on average, equivalent to an annual loss of about 14.90% of Tokyo's vacation rental market size in 2023 while increasing Airbnb's commission revenue by an average of 37.73%. These findings highlight the significant trade-offs between platform-driven revenue optimization and market efficiency, emphasizing the urgent need for competition policy reforms and greater transparency and accountability in platform practices.
The article presents the author's point of view on the disasters involving the American nuclear submarines Thresher and Scorpion in the 1960s and the Indonesian diesel submarine Nanggala-420 in April 2021. A possible role of giant internal waves in such tragic events is discussed. Parameters of large-amplitude internal solitary waves, including their shapes and propagation speeds, are presented. The author's reconstruction of the disasters is given.
Megan C. Engel, Flavio Romano, Ard A. Louis
et al.
We present a new method for calculating internal forces in DNA structures using coarse-grained models and demonstrate its utility with the oxDNA model. The instantaneous forces on individual nucleotides are explored and related to model potentials, and using our framework, internal forces are calculated for two simple DNA systems and for a recently published nanoscopic force clamp. Our results highlight some pitfalls associated with conventional methods for estimating internal forces, which are based on elastic polymer models, and emphasise the importance of carefully considering secondary structure and ionic conditions when modelling the elastic behaviour of single-stranded DNA. Beyond its relevance to the DNA nanotechnological community, we expect our approach to be broadly applicable to calculations of internal force in a variety of structures -- from DNA to protein -- and across other coarse-grained simulation models.
Marchenko equation-based methods for internal multiple attenuation have seen many developments in recent years. Starting from a wave-equation-based method that required a smooth velocity model, there are now Marchenko equation-based methods that do not require any model information or user input. In principle, these methods accurately predict internal multiples. Therefore, the role of the adaptive filter has changed for these methods. Rather than needing an aggressive adaptive filter to compensate for inaccurate internal multiple predictions, only a conservative adaptive filter is needed to compensate for minor amplitude and/or phase errors in the internal multiple predictions caused by imperfect acquisition and preprocessing of the input data. We demonstrate that a conservative adaptive filter can be used to improve the attenuation of internal multiples when applying a Marchenko multiple elimination (MME) method to a 2D line of streamer data. In addition, we suggest that an adaptive filter can be used as a feedback mechanism to improve the preprocessing of the input data.
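To make the idea of a conservative adaptive filter concrete, the sketch below performs a generic least-squares adaptive subtraction on a single trace with a very short matching filter. The abstract does not specify the filter design, so the function name `adaptive_subtract`, the causal filter with `filter_length` taps, and the single-trace setting are all assumptions chosen for illustration; a short filter is used here only because it can correct small amplitude and time-shift errors without much freedom to damage primaries.

```python
import numpy as np

def adaptive_subtract(data, predicted, filter_length=3):
    """Subtract a least-squares-matched version of `predicted` from `data`.

    data:          recorded trace (primaries plus internal multiples).
    predicted:     predicted internal multiples for the same trace.
    filter_length: number of filter taps; small values keep the filter
                   conservative (hypothetical default).
    Returns the residual trace and the estimated filter.
    """
    data = np.asarray(data, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    n = len(data)
    # Each column holds the prediction delayed by k samples (k = 0..L-1).
    shifted = np.zeros((n, filter_length))
    for k in range(filter_length):
        shifted[k:, k] = predicted[:n - k]
    # Filter taps that minimise ||data - shifted @ taps||^2.
    taps, *_ = np.linalg.lstsq(shifted, data, rcond=None)
    return data - shifted @ taps, taps
```

If the prediction is off only by a constant amplitude factor, the estimated filter reduces to that factor on the first tap and the residual contains no multiple energy.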
We consider the revenue maximization problem of a monopolist via a non-Myersonian approach that could generalize to multiple items and multiple buyers. Although such an approach does not lead to any closed-form solution of the problem, it does provide some insights into this problem from different angles. In particular, we consider both Bayesian (Bayesian Incentive Compatible + Bayesian Individually Rational) and Dominant-Strategy (Dominant-Strategy Incentive Compatible + ex-post Individually Rational) implementations, where all the buyers have additive valuations and quasi-linear utilities and all the valuations are independent across buyers (not necessarily independent across items). The main technique of our approach is to formulate the problem as an LP (possibly of exponential size) and apply primal-dual analysis. We observe that any optimal solution of the dual program naturally defines the virtual value functions for the primal revenue maximization problem in the sense that any revenue-maximizing auction must be a virtual welfare maximizer (cf. Myerson's auction for a single item [Myerson, 1981]). Based on this observation, we characterize a necessary and sufficient condition for BIC = DSIC, i.e., for the optimal revenue of Bayesian implementations to equal the optimal revenue of dominant-strategy implementations (BRev = DRev). Specifically, BRev = DRev if and only if the optimal DSIC revenue DRev can be achieved by a DSIC and ex-post IR virtual welfare maximizer with buyer-independent virtual value functions (buyer i's virtual value is independent of other buyers' valuations). In light of this characterization, we further show that when all the valuations are i.i.d., the condition is equivalent to separate selling being optimal. In particular, this agrees with one result from the recent breakthrough work on the exact optimal solutions in the multi-item multi-buyer setting by Yao [2016].
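For readers unfamiliar with virtual values, the sketch below evaluates the classical single-item virtual value from Myerson [1981], phi(v) = v - (1 - F(v)) / f(v), which is the object the abstract compares against. It is not the dual-derived, multi-item virtual value function of the paper; the function names and the uniform example distribution are assumptions for illustration only.

```python
def virtual_value(v, cdf, pdf):
    # Myerson's single-item virtual value: phi(v) = v - (1 - F(v)) / f(v).
    return v - (1.0 - cdf(v)) / pdf(v)

def uniform_cdf(v):
    # CDF of the uniform distribution on [0, 1].
    return v

def uniform_pdf(v):
    # Density of the uniform distribution on [0, 1].
    return 1.0
```

For a valuation uniform on [0, 1] this gives phi(v) = 2v - 1, so the virtual value is non-negative exactly when v is at least 0.5, the familiar optimal reserve price in that case.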
We extend some fundamental definitions and constructions in the established generalisation of Lie theory involving Lie groupoids by reformulating them in terms of groupoids internal to a well-adapted model of synthetic differential geometry. In particular, we define internal counterparts of the definitions of source path connected and source simply connected groupoids and of the integration of $A$-paths. The main results of this paper show that if a classical Hausdorff Lie groupoid satisfies one of the classical connectedness conditions, it also satisfies the internal counterpart of that condition.
In this note, we define a moment of instability $m(c)$ for internal solitary waves in continuously stratified fluids, which seems not to have been done before. To underline the suitability of the proposed moment of instability, we identify the relation $m''(c)=0$ as a formal Fredholm condition, and we show that $m(c)$ displays a definite sign for small-amplitude waves.
The agreement of the fragments' internal and kinetic temperatures with the breakup temperature is investigated using a Statistical Multifragmentation Model which makes no a priori assumption on the relationship between them. We thus examine the conditions for obtaining such agreement and find that, in the framework of our model, this holds only in a relatively narrow range of excitation energy. The role played by the qualitative shape of the fragments' state densities is also examined. Our results suggest that the internal temperature of the light fragments may be affected by this quantity, whose behavior may lead to constant internal temperatures over a wide excitation energy range. This suggests that nuclear thermometry may provide valuable information on the nuclear state density.
We formulate a flexible micro-to-macro kinetic model which is able to explain the emergence of income profiles from the whole of individual economic interactions. The model is expressed by a system of several nonlinear differential equations which involve parameters defined by probabilities. Society is described as an ensemble of individuals divided into income classes; the individuals exchange money through binary and ternary interactions, leaving the total wealth unchanged. The ternary interactions represent taxation and redistribution effects. The dynamics are investigated through computational simulations, the focus being on the effects that different fiscal policies and differently weighted welfare policies have on the long-run income distributions. The model provides a tool which may contribute to the identification of the most effective actions towards a reduction of economic inequality. We find, for instance, that under certain hypotheses the Gini index is more affected by a policy of reducing welfare and subsidies for the rich classes than by an increase of the upper tax rate. Such a policy also has the effect of slightly increasing the total tax revenue.
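Since the comparison of policies above is stated in terms of the Gini index, a minimal sketch of that index for a population divided into income classes may be useful. It uses the standard mean-absolute-difference definition, G = sum_ij p_i p_j |x_i - x_j| / (2 * mean); the function name `gini`, the representation of classes by one income level each, and the optional `weights` argument for class population fractions are assumptions for illustration and are not taken from the model itself.

```python
import numpy as np

def gini(incomes, weights=None):
    """Gini index of incomes, optionally weighted by population share.

    incomes: income level of each individual or income class.
    weights: population of each class (any positive scale); equal
             weights are assumed when omitted.
    """
    x = np.asarray(incomes, dtype=float)
    if weights is None:
        p = np.full(x.shape, 1.0 / x.size)
    else:
        p = np.asarray(weights, dtype=float)
        p = p / p.sum()  # normalise to population fractions
    mean = np.sum(p * x)
    # Population-weighted mean absolute difference between all pairs.
    diffs = np.abs(x[:, None] - x[None, :])
    return float(np.sum(p[:, None] * p[None, :] * diffs) / (2.0 * mean))
```

A perfectly equal distribution gives 0, and concentrating all income in one of four equally sized classes gives 0.75, so a policy that lowers the index moves the long-run distribution towards equality.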
In this paper we consider a multidimensional Kaluza-Klein (KK) model with a Ricci-flat internal space, e.g., a Calabi-Yau manifold. We perturb this background metric by a system of gravitating masses, e.g., astrophysical objects such as our Sun. We suppose that these masses are pressureless in the external space but have relativistic pressure in the internal space. We show that the metric perturbations do not depend on the coordinates of the internal space and that the gravitating masses should be uniformly smeared over the internal space. This means, first, that KK modes corresponding to the metric fluctuations are absent and, second, that particles should be only in the ground quantum state with respect to the internal space. In our opinion, these results look very unnatural. According to statistical physics, any nonzero temperature should result in fluctuations, i.e., in KK modes. We also obtain formulae for the metric correction terms, which enable the calculation of the gravitational tests: the deflection of light, the time delay of radar echoes and the perihelion advance.
The performance of one type (Carnot) of Internal Combustion Engine (ICE) cycle is analyzed within the framework of thermodynamic free energies. ICE performance is different from that of an External Combustion Engine (ECE) which is dictated by Carnot's rule.
As fine templates for preparing microcapsules, multiple emulsions with complex internal structures have been generated through microfluidics. To study the effects of layers, inner droplets and asymmetry of internal structures on the rheology of multiple emulsions, a generalized boundary integral method is developed to investigate multiple emulsions with orderly structures of up to n layers and up to m_i droplets in the i-th layer. Under a modest extensional flow, the complication of internal structures and collisions among inner droplets subject the particle to stronger shear. However, the particle eases the added tension through the simplification of internal structures (destabilization), such as coalescence or release of inner droplets. Since the rheology of multiple emulsions is sensitive to internal structures and their changes, the practice of modelling them as core-shell droplets to obtain a viscosity equation should be modified by introducing time as a variable. Asymmetric internal structures induce oriented contact and merging of the outer and inner interfaces. The start time of the interface merging can be controlled by adjusting the viscosity ratio and enhancing the asymmetry, which is promising for the controlled release of inner droplets in targeted drug delivery.
A generalization of the Tangherlini solution for the case of n internal Ricci-flat spaces is obtained. It is shown that in the (2+d)-dimensional section a horizon exists only in the trivial case when the internal-space factors are constant. The p-adic analog of the solution is also considered.
We study the thermal equilibrium and stability of isobaric, spherical structures having a radiation source located at their center. The thermal conduction coefficient and the external heating and cooling rates are represented as power laws of the temperature. The internal heating decreases with distance r from the source approximately as exp(-tau)/r**2, where tau is the optical depth. We find that the influence of the radiation source is important only in the central region, but its effect is enough to make the system thermally unstable above a certain threshold central temperature. This threshold temperature decreases as the internal heating efficiency increases but otherwise does not depend on the structure size. Our results suggest that a solar-like star migrating into a diffuse interstellar region may destabilize the surrounding medium.
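The radial dependence of the internal heating quoted above can be evaluated directly. The sketch below assumes, purely for illustration, a uniform absorption coefficient so that the optical depth grows linearly with radius (tau = absorption * r); the abstract does not state how tau depends on r, and the names `internal_heating`, `absorption` and `h0` are hypothetical.

```python
import numpy as np

def internal_heating(r, absorption=1.0, h0=1.0):
    """Heating rate proportional to exp(-tau) / r**2 at distance r > 0.

    absorption: assumed uniform absorption coefficient, so tau = absorption * r.
    h0:         overall normalisation of the source (hypothetical parameter).
    """
    r = np.asarray(r, dtype=float)
    tau = absorption * r  # optical depth under the uniform-absorption assumption
    return h0 * np.exp(-tau) / r**2
```

With this profile the heating falls off faster than geometric dilution alone, which is consistent with the statement that the source matters only in the central region.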
J. J. Sławianowski, V. Kovalchuk, B. Gołubowska
et al.
We present a description of the kinematics and dynamics of material points with internal degrees of freedom moving in a Riemannian manifold. The models of internal degrees of freedom we concentrate on are based on the orthogonal and affine groups. Roughly speaking, we consider infinitesimal gyroscopes and homogeneously deformable gyroscopes (affinely-rigid bodies) in curved manifolds. We follow our earlier models of extended rigid and affinely-rigid bodies moving in a flat space. It is well known that in curved spaces there is, in general, no well-defined concept of an extended rigid or affinely-rigid body. Our infinitesimal models are mathematically well defined, and physically they may be interpreted as an approximate description of "small" rigid and affinely-rigid bodies. We derive equations of motion and show how internal degrees of freedom interact with the spatial geometry, first of all with the curvature but also with the torsion. Integrability and degeneracy problems are discussed.