Results for "Labor policy. Labor and the state"
Showing 20 of ~3,871,754 results · from DOAJ, arXiv, Semantic Scholar, CrossRef
Olga F. Alyokhina, Larisa M. Butova, Vera A. Akimenko et al.
The Russian labor market responds to Western sanctions, and timely monitoring of these responses makes it possible to take measures that eliminate imbalances between the supply of and demand for labor resources. As part of an economic and statistical study of the Russian labor market, the authors identified the effects of the economic sanctions of 2022–2024 and developed a set of measures that could offset their negative impact. They used methods of economic analysis, a logical approach, and comparison to conduct a dynamic analysis of the labor market in 2021–2024 and to reveal the causes of the current situation in the context of economic sanctions. The effects of the 2022–2024 economic sanctions on the Russian labor market were classified as negative, dual, or positive. The negative effects can be offset by institutional and organizational state policy measures. Despite the ongoing brain drain, demand for labor in manufacturing and high-tech areas is increasing due to the development of domestic production, which creates new jobs; as a result, the labor shortage and the labor market imbalance have become more acute. Instead of restraining the Russian economy, the sanctions have led to structural changes within it.
Nikolaos Tsagkas, Andreas Sochopoulos, Duolikun Danier et al.
The adoption of pre-trained visual representations (PVRs), leveraging features from large-scale vision models, has become a popular paradigm for training visuomotor policies. However, these powerful representations can encode a broad range of task-irrelevant scene information, making the resulting trained policies vulnerable to out-of-domain visual changes and distractors. In this work we address visuomotor policy feature pooling as a solution to the observed lack of robustness in perturbed scenes. We achieve this via Attentive Feature Aggregation (AFA), a lightweight, trainable pooling mechanism that learns to naturally attend to task-relevant visual cues, ignoring even semantically rich scene distractors. Through extensive experiments in both simulation and the real world, we demonstrate that policies trained with AFA significantly outperform standard pooling approaches in the presence of visual perturbations, without requiring expensive dataset augmentation or fine-tuning of the PVR. Our findings show that ignoring extraneous visual information is a crucial step towards deploying robust and generalisable visuomotor policies. Project Page: tsagkas.github.io/afa
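To make the pooling mechanism concrete, here is a minimal sketch of a learned attention-pooling layer in the spirit of AFA: a single learned query attends over frozen patch features so the policy input emphasizes task-relevant cues. The module name, head count, and ViT-style token shapes are illustrative assumptions, not the authors' implementation (see the project page for that).

```python
# Minimal sketch of learned attentive pooling over frozen PVR patch features
# (illustrative; not the authors' AFA implementation).
import torch
import torch.nn as nn

class AttentivePool(nn.Module):
    """Pools a set of patch features into one vector using a learned query."""
    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Parameter(torch.randn(1, 1, dim))  # learned task query
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, num_patches, dim) from a frozen pre-trained vision model
        q = self.query.expand(feats.size(0), -1, -1)
        pooled, _ = self.attn(q, feats, feats)  # attend to task-relevant patches
        return pooled.squeeze(1)                # (batch, dim) policy input

# Usage: pool frozen patch tokens before the policy head.
feats = torch.randn(8, 196, 384)                # e.g. ViT-S-like patch tokens
policy_input = AttentivePool(384)(feats)
print(policy_input.shape)                       # torch.Size([8, 384])
```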
Lee Kennedy-Shaffer, Alan Hamilton Kennedy
In the United States, firearm-related deaths and injuries are a major public health issue. Because of limited federal action, state policies are particularly important, and their evaluation informs the actions of other policymakers. The movement of firearms across state and local borders, however, can undermine the effectiveness of these policies and have statistical consequences for their empirical evaluation. This movement causes spillover and bypass effects of policies, wherein interventions affect nearby control states and the lack of intervention in nearby states reduces the effectiveness in the intervention states. While some causal inference methods exist to account for spillover effects and reduce bias, these do not necessarily align well with the data available for firearm research or with the most policy-relevant estimands. Integrated data infrastructure and new methods are necessary for a better understanding of the effects these policies would have if widely adopted. In the meantime, appropriately understanding and interpreting effect estimates from quasi-experimental analyses is crucial for ensuring that effective policies are not dismissed due to these statistical challenges.
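A toy simulation can illustrate the attenuation mechanism the authors describe: if a policy's effect leaks into a neighboring control state, a naive difference-in-differences estimate is biased toward zero. All effect sizes below are illustrative assumptions.

```python
# Toy simulation of spillover bias in difference-in-differences (DiD).
# Effect sizes are illustrative assumptions, not estimates from the paper.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
true_effect, spillover = -2.0, -0.8         # policy effect; leakage into control
pre_t, pre_c = rng.normal(10, 1, n), rng.normal(10, 1, n)
post_t = rng.normal(10 + true_effect, 1, n)  # treated state, post-policy
post_c = rng.normal(10 + spillover, 1, n)    # "control" state absorbs spillover
did = (post_t.mean() - pre_t.mean()) - (post_c.mean() - pre_c.mean())
print(f"true effect {true_effect}, DiD estimate {did:.2f}")  # biased toward zero
```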
Ri-Zhao Qiu, Shiqi Yang, Xuxin Cheng et al.
Training manipulation policies for humanoid robots with diverse data enhances their robustness and generalization across tasks and platforms. However, learning solely from robot demonstrations is labor-intensive, requiring expensive tele-operated data collection which is difficult to scale. This paper investigates a more scalable data source, egocentric human demonstrations, to serve as cross-embodiment training data for robot learning. We mitigate the embodiment gap between humanoids and humans from both the data and modeling perspectives. We collect an egocentric task-oriented dataset (PH2D) that is directly aligned with humanoid manipulation demonstrations. We then train a human-humanoid behavior policy, which we term Human Action Transformer (HAT). The state-action space of HAT is unified for both humans and humanoid robots and can be differentiably retargeted to robot actions. Co-trained with smaller-scale robot data, HAT directly models humanoid robots and humans as different embodiments without additional supervision. We show that human data improves both generalization and robustness of HAT with significantly better data collection efficiency. Code and data: https://human-as-robot.github.io/
Ziheng Cheng, Xin Guo, Yufei Zhang
The theory of continuous-time reinforcement learning (RL) has progressed rapidly in recent years. While the ultimate objective of RL is typically to learn deterministic control policies, most existing continuous-time RL methods rely on stochastic policies. Such approaches often require sampling actions at very high frequencies, and involve computationally expensive expectations over continuous action spaces, resulting in high-variance gradient estimates and slow convergence. In this paper, we introduce and develop deterministic policy gradient (DPG) methods for continuous-time RL. We derive a continuous-time policy gradient formula expressed as the expected gradient of an advantage rate function and establish a martingale characterization for both the value function and the advantage rate. These theoretical results provide tractable estimators for deterministic policy gradients in continuous-time RL. Building on this foundation, we propose a model-free continuous-time Deep Deterministic Policy Gradient (CT-DDPG) algorithm that enables stable learning for general reinforcement learning problems with continuous time-and-state. Numerical experiments show that CT-DDPG achieves superior stability and faster convergence compared to existing stochastic-policy methods, across a wide range of learning tasks with varying time discretizations and noise levels.
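For contrast with the continuous-time machinery, here is the discrete-time deterministic policy gradient (DDPG) actor update that CT-DDPG generalizes: the critic's action-gradient is backpropagated through a deterministic actor. Network sizes and the minibatch are illustrative assumptions; the paper's advantage-rate and martingale estimators are not reproduced here.

```python
# Schematic DDPG-style actor update (the discrete-time baseline that CT-DDPG
# extends to continuous time); nets and shapes are illustrative assumptions.
import torch
import torch.nn as nn

state_dim, act_dim = 4, 2
actor = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))
critic = nn.Sequential(nn.Linear(state_dim + act_dim, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(actor.parameters(), lr=1e-3)

states = torch.randn(32, state_dim)  # a minibatch, e.g. from a replay buffer
# Deterministic policy gradient: ascend E[Q(s, pi(s))] by backpropagating
# the critic's action-gradient through the deterministic actor.
actor_loss = -critic(torch.cat([states, actor(states)], dim=-1)).mean()
opt.zero_grad()
actor_loss.backward()
opt.step()
```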
Quenizia Vieira Lopes, Adriana Regina de Jesus Santos
This paper discusses Leontiev's Activity Theory, which is theoretically grounded in Vigotski's Cultural-Historical Theory and, consequently, in Karl Marx's historical-dialectical materialism. It presents a synthesis of the theory, covering its main points, such as the structure of activity. The work aims to help those interested in the topic assimilate its theoretical foundations. Methodologically, a qualitative approach based on documentary research was adopted. The study concludes that Activity Theory can be applied in the educational context. Keywords: Activity; Motive; Sense; Signification; Labor.
Qirui Mi, Zhiyu Zhao, Chengdong Ma et al.
Macroeconomic outcomes emerge from individuals' decisions, making it essential to model how agents interact with macro policy via consumption, investment, and labor choices. We formulate this as a dynamic Stackelberg game: the government (leader) sets policies, and agents (followers) respond by optimizing their behavior over time. Unlike static models, this dynamic formulation captures temporal dependencies and strategic feedback critical to policy design. However, as the number of agents increases, explicitly simulating all agent-agent and agent-government interactions becomes computationally infeasible. To address this, we propose the Dynamic Stackelberg Mean Field Game (DSMFG) framework, which approximates these complex interactions via agent-population and government-population couplings. This approximation preserves individual-level feedback while ensuring scalability, enabling DSMFG to jointly model three core features of real-world policymaking: dynamic feedback, asymmetry, and large scale. We further introduce Stackelberg Mean Field Reinforcement Learning (SMFRL), a data-driven algorithm that learns the leader's optimal policies while maintaining personalized responses for individual agents. Empirically, we validate our approach in a large-scale simulated economy, where it scales to 1,000 agents (vs. 100 in prior work) and achieves a fourfold increase in GDP over classical economic methods and a nineteenfold improvement over the static 2022 U.S. federal income tax policy.
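A toy fixed-point loop conveys the leader-population structure: the government picks a flat tax rate, agents best-respond to the population's mean labor supply, and the leader scores each rate on the induced aggregate. The quadratic utility and revenue objective below are illustrative assumptions, not the paper's economic model or the SMFRL algorithm.

```python
# Toy Stackelberg mean-field loop: leader sets a tax, followers best-respond
# to the population mean. Utility and objective are illustrative assumptions.
import numpy as np

def best_response(tax, m):
    # Agent FOC for utility (1-tax)*l - l**2/2 - 0.2*(l - m)**2 over labor l,
    # where m is the population's mean labor (the mean-field coupling).
    return ((1.0 - tax) + 0.4 * m) / 1.4

def mean_labor(tax):
    m = 0.5
    for _ in range(50):          # iterate the population response to a fixed point
        m = best_response(tax, m)
    return m

taxes = np.linspace(0.0, 0.9, 10)
revenue = [t * mean_labor(t) for t in taxes]  # leader's (toy) objective
print(taxes[int(np.argmax(revenue))])          # -> 0.5 for this toy model
```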
Joshua Hang Sai Ip, Georgios Makrygiorgos, Ali Mesbah
Deep neural networks are increasingly used as an effective parameterization of control policies in various learning-based control paradigms. For continuous-time optimal control problems (OCPs), which are central to many decision-making tasks, control policy learning can be cast as a neural ordinary differential equation (NODE) problem wherein state and control constraints are naturally accommodated. This paper presents a NODE approach to solving continuous-time OCPs for the case of stabilizing a known constrained nonlinear system around a target state. The approach, termed Lyapunov-NODE control (L-NODEC), uses a novel Lyapunov loss formulation that incorporates an exponentially-stabilizing control Lyapunov function to learn a state-feedback neural control policy, bridging the gap of solving continuous-time OCPs via NODEs with stability guarantees. The proposed Lyapunov loss allows L-NODEC to guarantee exponential stability of the controlled system, as well as its adversarial robustness to perturbations to the initial state. The performance of L-NODEC is illustrated in two problems, including a dose delivery problem in plasma medicine. In both cases, L-NODEC effectively stabilizes the controlled system around the target state despite perturbations to the initial state and reduces the inference time necessary to reach the target.
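The core of the approach is a penalty enforcing the exponential-stability condition $\dot{V}(x) + \kappa V(x) \le 0$ along trajectories. Here is a minimal sketch of such a penalty with a quadratic $V$; the dynamics and $\kappa$ below are illustrative assumptions, not the paper's plasma-medicine setup.

```python
# Sketch of an exponential-stability Lyapunov penalty of the kind L-NODEC
# builds its loss around: penalize dV/dt + kappa * V > 0 along trajectories.
# V, kappa, and the toy dynamics are illustrative assumptions.
import torch

def lyapunov_penalty(x, x_dot, x_target, kappa=1.0):
    """V(x) = ||x - x*||^2; enforce dV/dt <= -kappa * V along the flow."""
    v = ((x - x_target) ** 2).sum(dim=-1)
    v_dot = 2.0 * ((x - x_target) * x_dot).sum(dim=-1)  # chain rule
    return torch.relu(v_dot + kappa * v).mean()          # zero when satisfied

x = torch.randn(16, 3)
x_dot = -x                                  # toy dynamics, globally stable at 0
print(lyapunov_penalty(x, x_dot, torch.zeros(3)))  # 0: condition satisfied
```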
Madeleine Pollack, Lauren N. Steimle
Markov Decision Processes (MDPs) are mathematical models of sequential decision-making under uncertainty that have found applications in healthcare, manufacturing, logistics, and other domains. In these models, a decision-maker observes the state of a stochastic process and determines which action to take with the goal of maximizing the expected total discounted rewards received. In many applications, the state space of the true system is large and there may be limited observations out of certain states with which to estimate the transition probability matrix. To overcome this, modelers aggregate the true states into "superstates", resulting in a smaller state space. This aggregation improves computational tractability and increases the number of observations per superstate. Thus, the modeler's choice of state space leads to a trade-off in transition probability estimates: while a coarser discretization of the state space gives more observations in each state for estimating the transition probability matrix, this comes at the cost of precision in the state characterization and in the resulting policy recommendations. In this paper, we consider the implications of this modeling decision on the resulting policies from MDPs for which the true model is expected to have an optimal threshold policy. We analyze these MDPs and provide conditions under which the aggregated MDP will also have an optimal threshold policy. Using a simulation study, we explore the trade-offs between finer and coarser aggregation. We show that the potential for policy improvement is highest on larger state spaces, but that aggregated MDPs are preferable under limited data. We discuss the implications of our findings for modelers who must select which state space design to use.
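A small worked example makes the trade-off tangible: aggregate a four-state MDP into two superstates by averaging transition rows and rewards, then solve both models by value iteration. The chain MDP and the uniform within-superstate weights are illustrative assumptions, not the paper's simulation design.

```python
# Toy sketch of state aggregation into "superstates" (illustrative assumptions).
import numpy as np

def value_iteration(P, R, gamma=0.95, iters=500):
    """P: (A, S, S) transition tensor, R: (A, S) rewards -> greedy policy."""
    V = np.zeros(P.shape[1])
    for _ in range(iters):
        Q = R + gamma * (P @ V)       # (A, S) action values
        V = Q.max(axis=0)
    return Q.argmax(axis=0)

rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(4), size=(2, 4))   # 2 actions, 4 true states
R = np.array([[0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, 0.5, 0.0]])
# Aggregate states {0,1} and {2,3}, weighting merged rows uniformly.
agg = np.array([0, 0, 1, 1])
P_agg = np.zeros((2, 2, 2))
for a in range(2):
    for s in range(4):
        for t in range(4):
            P_agg[a, agg[s], agg[t]] += P[a, s, t] / 2.0
R_agg = R.reshape(2, 2, 2).mean(axis=2)      # average rewards within superstates
print(value_iteration(P, R), value_iteration(P_agg, R_agg))
```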
Julia Barnett, Kimon Kieslich, Nicholas Diakopoulos
The rapid advancement of AI technologies yields numerous future impacts on individuals and society. Policymakers are tasked to react quickly and establish policies that mitigate those impacts. However, anticipating the effectiveness of policies is a difficult task, as some impacts might only be observable in the future and respective policies might not be applicable to the future development of AI. In this work we develop a method for using large language models (LLMs) to evaluate the efficacy of a given piece of policy at mitigating specified negative impacts. We do so by using GPT-4 to generate scenarios both pre- and post-introduction of policy and translating these vivid stories into metrics based on human perceptions of impacts. We leverage an already established taxonomy of impacts of generative AI in the media environment to generate a set of scenario pairs both mitigated and non-mitigated by the transparency policy in Article 50 of the EU AI Act. We then run a user study (n=234) to evaluate these scenarios across four risk-assessment dimensions: severity, plausibility, magnitude, and specificity to vulnerable populations. We find that this transparency legislation is perceived to be effective at mitigating harms in areas such as labor and well-being, but largely ineffective in areas such as social cohesion and security. Through this case study we demonstrate the efficacy of our method as a tool to iterate on the effectiveness of policy for mitigating various negative impacts. We expect this method to be useful to researchers or other stakeholders who want to brainstorm the potential utility of different pieces of policy or other mitigation strategies.
Xin Chen, Yifan Hu, Minda Zhao
Policy gradient methods are widely used in reinforcement learning. Yet, the nonconvexity of policy optimization poses significant challenges in understanding the global convergence of policy gradient methods. For a class of finite-horizon Markov Decision Processes (MDPs) with general state and action spaces, we identify a set of structural properties to establish a benign nonconvex landscape, the Polyak-Łojasiewicz-Kurdyka (PŁK) condition of the policy optimization. Leveraging the PŁK condition, policy gradient methods converge to the globally optimal policy with a non-asymptotic rate despite nonconvexity. Our results apply to various control and operations models, including entropy-regularized tabular MDPs, Linear Quadratic Regulator problems, and both stochastic inventory models and stochastic cash balance problems with strongly convex costs. In these models, stochastic policy gradient methods obtain an $\epsilon$-optimal policy using a sample size of $\tilde{\mathcal{O}}(\epsilon^{-1})$ that is polynomial in the planning horizon. To the best of our knowledge, we provide the first sample-complexity guarantees for multi-period inventory systems with Markov-modulated demand and for stochastic cash balance problems. We complement the theory with numerical experiments showing that policy gradient methods outperform several benchmark algorithms from the literature across these operations models.
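The simplest instance of the entropy-regularized tabular setting is a two-armed bandit with a softmax policy; a REINFORCE-style stochastic policy gradient sketch follows. Reward means, temperature, and step size are illustrative assumptions, not the paper's algorithms or rates.

```python
# Stochastic policy gradient with entropy regularization on a two-armed bandit
# (a minimal instance of the tabular setting; parameters are assumptions).
import numpy as np

rng = np.random.default_rng(1)
theta = np.zeros(2)                         # softmax policy logits
mu, tau, lr = np.array([1.0, 0.2]), 0.1, 0.1
for _ in range(5000):
    p = np.exp(theta - theta.max()); p /= p.sum()
    a = rng.choice(2, p=p)
    r = mu[a] + 0.1 * rng.normal()
    score = -p; score[a] += 1.0              # grad of log p(a) w.r.t. theta
    H = -(p * np.log(p)).sum()
    grad_H = -p * (np.log(p) + H)            # grad of the entropy bonus
    theta += lr * (r * score + tau * grad_H)  # ascend E[r] + tau * H
print(p.round(3))                             # mass concentrates on arm 0
```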
Utkan Uluçay
United Nations Sustainable Development Goal 10 (UN SDG 10) seeks to reduce inequalities within and among countries. The root cause of these inequalities is unequal income distribution arising from the poor distribution of global wealth. Each country deals with this problem on its own terms, and factors such as differences in country conditions, historical background, and inconsistency of data make comparison challenging. This study examines policy alternatives for complex and highly interactive socioeconomic structures using a highly reproducible simulator package that does not require simulation experience. No research in the literature appears to have examined the effect of policy alternatives on the Gini coefficient in a simulated environment. Therefore, after reviewing methods for measuring income distribution inequality, this study evaluates basic policy components with a discrete-event/agent-based simulation. To reduce income distribution inequality, a schedule is proposed that includes a short-term shock program, a gradual increase of the wealth tax by 20%-50%, and a transfer of 20% of the funds obtained to the poorest 20%. Attention is drawn to questions useful to decision makers and to discussions on the political agenda.
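For reference, the Gini coefficient and the effect of one tax-and-transfer step of the kind the schedule proposes can be computed in a few lines. The lognormal wealth distribution and the 20% rates below are illustrative assumptions, not the study's simulator.

```python
# Worked sketch: Gini coefficient before and after a simple wealth tax plus
# transfer to the poorest 20% (distribution and rates are assumptions).
import numpy as np

def gini(w):
    w = np.sort(w)
    n = len(w)
    return (2 * np.arange(1, n + 1) - n - 1) @ w / (n * w.sum())

rng = np.random.default_rng(0)
wealth = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)
print(f"before: {gini(wealth):.3f}")

# Tax the richest 20% at 20% and split the proceeds among the poorest 20%.
order = np.argsort(wealth)
rich, poor = order[-2000:], order[:2000]
levy = 0.20 * wealth[rich]
wealth[rich] -= levy
wealth[poor] += levy.sum() / len(poor)
print(f"after:  {gini(wealth):.3f}")     # Gini falls after the transfer
```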
José Barata-Moura
--
Buket Ökten Sipahioğlu
In the forced migration context, the term "refugee" is mostly associated with negative issues because of prior refugee crisis experiences and the frighteningly increasing numbers of displaced people around the world. Although the term "crisis" mostly evokes negative connotations, it still makes sense for unresolved events. The refugee crisis still awaits solutions and thereby demands responses. Furthermore, not every refugee receives the same response from their host community or enjoys the same rights, as has recently been seen in the Syrian and Ukrainian refugee examples. Some are discriminated against on the basis of race and ethnicity, are "unseen" in their host community, and cannot raise their voices as needed. In this sense, this study points out, in a descriptive way, how refugees are "labeled" and therefore discriminated against and treated differently based on their race and ethnicity, taking Syrian and Ukrainian refugees as examples. Accordingly, the main argument of the study is that policymakers, as well as refugee advocates, can help change the current ‘anti-refugee’ perception and the execution of the law.
Weiye Zhao, Rui Chen, Yifan Sun et al.
Reinforcement Learning (RL) algorithms have shown tremendous success in simulation environments, but their application to real-world problems faces significant challenges, with safety being a major concern. In particular, enforcing state-wise constraints is essential for many challenging tasks such as autonomous driving and robot manipulation. However, existing safe RL algorithms under the framework of Constrained Markov Decision Process (CMDP) do not consider state-wise constraints. To address this gap, we propose State-wise Constrained Policy Optimization (SCPO), the first general-purpose policy search algorithm for state-wise constrained reinforcement learning. SCPO provides guarantees for state-wise constraint satisfaction in expectation. In particular, we introduce the framework of Maximum Markov Decision Process, and prove that the worst-case safety violation is bounded under SCPO. We demonstrate the effectiveness of our approach on training neural network policies for extensive robot locomotion tasks, where the agent must satisfy a variety of state-wise safety constraints. Our results show that SCPO significantly outperforms existing methods and can handle state-wise constraints in high-dimensional robotics tasks.
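The maximum-MDP bookkeeping can be sketched directly: track the running maximum of the state-wise cost along a rollout, so that bounding its expectation bounds every per-state violation. The one-dimensional dynamics, policy, and cost below are illustrative assumptions, not the SCPO algorithm itself.

```python
# Sketch of the "maximum cost" bookkeeping behind a Maximum MDP: augment each
# trajectory with the running max of the state-wise cost. Env, policy, and
# cost function are illustrative assumptions.
import numpy as np

def rollout_max_cost(policy, env_step, x0, horizon, cost):
    x, m = x0, 0.0
    for _ in range(horizon):
        x = env_step(x, policy(x))
        m = max(m, cost(x))       # M_t = max(M_{t-1}, c(x_t))
    return m                       # constraining E[m] bounds every per-state cost

rng = np.random.default_rng(0)
policy = lambda x: -0.5 * x                          # toy linear feedback
env_step = lambda x, u: x + u + 0.1 * rng.normal()   # toy 1-D dynamics
cost = lambda x: abs(x)                              # distance from the safe set
print(rollout_max_cost(policy, env_step, 1.0, 50, cost))
```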
Wenzhe Cai, Teng Wang, Guangran Cheng et al.
In recent years, learning-based approaches have demonstrated significant promise in addressing intricate navigation tasks. Traditional methods for training deep neural network navigation policies rely on meticulously designed reward functions or extensive teleoperation datasets as navigation demonstrations. However, the former is often confined to simulated environments, and the latter demands substantial human labor, making it a time-consuming process. Our vision is for robots to autonomously learn navigation skills and adapt their behaviors to environmental changes without any human intervention. In this work, we discuss the self-supervised navigation problem and present Dynamic Graph Memory (DGMem), which facilitates training only with on-board observations. With the help of DGMem, agents can actively explore their surroundings, autonomously acquiring a comprehensive navigation policy in a data-efficient manner without external feedback. Our method is evaluated in photorealistic 3D indoor scenes, and empirical studies demonstrate the effectiveness of DGMem.
M. Duggan, A. Guo, Andrew C. Johnston
Unemployment-insurance taxes are experience rated, penalizing firms that dismiss workers. We examine whether experience rating serves as an automatic stabilizer in the labor market. Taking advantage of the variation in layoff penalties across states, we utilize detailed data on state tax schedules and assess whether firms are less responsive to labor-demand shocks when facing higher layoff penalties. Our findings show that average layoff penalties from UI reduce firm adjustments to negative shocks by 11%. This indicates that experience rating contributes to labor market stabilization. For example, during the Great Recession, experience rating preserved nearly a million jobs.
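A toy reserve-ratio schedule shows the mechanism: layoffs draw down a firm's UI reserve account, which raises its future tax rate and thus penalizes dismissals. The schedule parameters below are illustrative assumptions, not any actual state's tax schedule.

```python
# Toy reserve-ratio experience rating (parameters are illustrative assumptions).
def ui_tax_rate(reserve, payroll, base=0.01, slope=0.05, lo=0.005, hi=0.09):
    ratio = reserve / payroll                 # firm's reserve ratio
    return min(max(base - slope * ratio, lo), hi)

payroll, reserve = 1_000_000.0, 50_000.0
print(f"before layoffs: {ui_tax_rate(reserve, payroll):.3%}")
reserve -= 40_000.0                           # benefit charges from a layoff
print(f"after layoffs:  {ui_tax_rate(reserve, payroll):.3%}")  # rate rises
```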
Vien The Giang, Vo Thi My Huong
The article analyzes and clarifies the position and role of business households in the system of business entities in Vietnam's market economy. Given their small scale and the restrictions on their rights to use labor and on their business locations, current Vietnamese law has established provisions to ensure equality of legal status, autonomy, and self-responsibility in the business transactions of business households. However, the business household is built and managed around the family, and its members are linked by both blood and economic relations. Therefore, traditional family cultural factors have a huge impact on the internal and business relations of household businesses. The traditional family relationship, the relationships among members of the business household in business relations involving asset liability, and the development-support policy of the State will form the pillar that promotes the development of household businesses into an important and indispensable part of the market economy and of international integration in Vietnam today.
Page 24 of 193,588