SILK: Smooth InterpoLation frameworK for motion in-betweening A Simplified Computational Approach
Elly Akhoundi, Hung Yu Ling, Anup Anand Deshmukh
et al.
Motion in-betweening is a crucial tool for animators, enabling intricate control over pose-level details in each keyframe. Recent machine learning solutions for motion in-betweening rely on complex models, incorporating skeleton-aware architectures or requiring multiple modules and training steps. In this work, we introduce a simple yet effective Transformer-based framework, employing a single Transformer encoder to synthesize realistic motions for motion in-betweening tasks. We find that data modeling choices play a significant role in improving in-betweening performance. Among others, we show that increasing data volume can yield equivalent or improved motion transitions, that the choice of pose representation is vital for achieving high-quality results, and that incorporating velocity input features enhances animation performance. These findings challenge the assumption that model complexity is the primary determinant of animation quality and provide insights into a more data-centric approach to motion interpolation. Additional videos and supplementary material are available at https://silk-paper.github.io.
Weakly and Self-Supervised Class-Agnostic Motion Prediction for Autonomous Driving
Ruibo Li, Hanyu Shi, Zhe Wang
et al.
Understanding motion in dynamic environments is critical for autonomous driving, thereby motivating research on class-agnostic motion prediction. In this work, we investigate weakly and self-supervised class-agnostic motion prediction from LiDAR point clouds. Outdoor scenes typically consist of mobile foregrounds and static backgrounds, allowing motion understanding to be associated with scene parsing. Based on this observation, we propose a novel weakly supervised paradigm that replaces motion annotations with fully or partially annotated (1%, 0.1%) foreground/background masks for supervision. To this end, we develop a weakly supervised approach utilizing foreground/background cues to guide the self-supervised learning of motion prediction models. Since foreground motion generally occurs in non-ground regions, non-ground/ground masks can serve as an alternative to foreground/background masks, further reducing annotation effort. Leveraging non-ground/ground cues, we propose two additional approaches: a weakly supervised method requiring fewer (0.01%) foreground/background annotations, and a self-supervised method without annotations. Furthermore, we design a Robust Consistency-aware Chamfer Distance loss that incorporates multi-frame information and robust penalty functions to suppress outliers in self-supervised learning. Experiments show that our weakly and self-supervised models outperform existing self-supervised counterparts, and our weakly supervised models even rival some supervised ones. This demonstrates that our approaches effectively balance annotation effort and performance.
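As a rough illustration of the idea behind a robust Chamfer-style loss, the sketch below computes a symmetric Chamfer distance between two small point sets and optionally passes each nearest-neighbour residual through a saturating penalty. The Geman-McClure penalty, the `scale` parameter, and all function names are assumptions chosen for illustration. The abstract does not specify the authors' penalty functions or their multi-frame consistency terms, which are not modelled here.

```python
def nearest_sq_dist(point, cloud):
    """Squared Euclidean distance from `point` to its nearest neighbour in `cloud`."""
    return min(sum((a - b) ** 2 for a, b in zip(point, q)) for q in cloud)


def geman_mcclure(sq_dist, scale=1.0):
    """Saturating penalty (illustrative choice): close to sq_dist / scale**2 for
    small residuals and bounded above by 1, so outliers contribute little."""
    return sq_dist / (sq_dist + scale ** 2)


def chamfer(src, dst, penalty=None):
    """Symmetric Chamfer distance; `penalty` maps a squared distance to a cost."""
    if penalty is None:
        penalty = lambda d: d  # plain squared-distance Chamfer
    forward = sum(penalty(nearest_sq_dist(p, dst)) for p in src) / len(src)
    backward = sum(penalty(nearest_sq_dist(q, src)) for q in dst) / len(dst)
    return forward + backward


# Two matching points plus one far-away outlier in the target cloud.
source = [(0.0, 0.0), (1.0, 0.0)]
target = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
plain_cost = chamfer(source, target)
robust_cost = chamfer(source, target, penalty=geman_mcclure)
```

In this toy case the single outlier dominates the plain cost, while the saturating penalty caps its contribution, which is the general motivation for robust penalties in self-supervised point-cloud losses.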
Collecting Human Motion Data in Large and Occlusion-Prone Environments using Ultra-Wideband Localization
Janik Kaden, Maximilian Hilger, Tim Schreiter
et al.
With robots increasingly integrating into human environments, understanding and predicting human motion is essential for safe and efficient interactions. Modern human motion and activity prediction approaches require large amounts of high-quality data for training and evaluation, usually collected from motion capture systems, onboard or stationary sensors. Setting up these systems is challenging due to the intricate setup of hardware components, extensive calibration procedures, occlusions, and substantial costs. These constraints make deploying such systems in new and large environments difficult and limit their usability for in-the-wild measurements. In this paper we investigate the possibility of applying the novel Ultra-Wideband (UWB) localization technology as a scalable alternative for human motion capture in crowded and occlusion-prone environments. We include additional sensing modalities such as eye-tracking, onboard robot LiDAR and radar sensors, and record motion capture data as ground truth for evaluation and comparison. The environment imitates a museum setup, with up to four active participants navigating toward random goals in a natural way, and offers more than 130 minutes of multi-modal data. Our investigation provides a step toward scalable and accurate motion data collection beyond vision-based systems, laying a foundation for evaluating sensing modalities like UWB in larger and complex environments like warehouses, airports, or convention centers.
Explicit solitary wave structure for the stochastic resonance nonlinear Schrödinger equation under Brownian motion with dynamical analysis
Sumaira Nawaz, Muhammad Ozair Ahmed, Muhammad Zafarullah Baber
et al.
This study analyzes explicit solitary wave and soliton solutions of the stochastic resonance nonlinear Schrödinger equation under Brownian motion. Schrödinger equations are mostly used to describe how light propagates through planar waveguides and nonlinear optical fibres. An analytical technique, the generalized exponential rational function method, is applied to obtain various solitary wave and soliton solutions of the resonance nonlinear Schrödinger equation. This approach yields several new trigonometric, exponential, and hyperbolic solutions under the noise. The method provides soliton solutions for nonlinear models and is efficient, accurate, capable, and trustworthy. Furthermore, by varying the parameters, a few graphs of the obtained solutions are shown to illustrate the physical behavior of the stochastic solutions. We anticipate that the obtained results will have significant potential applications in quantum mechanics, magneto-electrodynamics, optical fibres, and heavy ion collisions. Moreover, using the Galilean transformation, the dynamical system of the governing equation is obtained, and the theory of planar dynamical systems is used to carry out its sensitivity, chaos, and bifurcation analysis. By providing two- and three-dimensional phase portraits, the existence of chaotic behavior in the resonance nonlinear Schrödinger equation is examined by taking into account a perturbation term in the resulting dynamical system.
The State of Robot Motion Generation
Kostas E. Bekris, Joe Doerr, Patrick Meng
et al.
This paper reviews the large spectrum of methods for generating robot motion proposed over the 50 years of robotics research culminating in recent developments. It crosses the boundaries of methodologies, typically not surveyed together, from those that operate over explicit models to those that learn implicit ones. The paper discusses the current state-of-the-art as well as properties of varying methodologies, highlighting opportunities for integration.
Words in Motion: Extracting Interpretable Control Vectors for Motion Transformers
Omer Sahin Tas, Royden Wagner
Transformer-based models generate hidden states that are difficult to interpret. In this work, we analyze hidden states and modify them at inference, with a focus on motion forecasting. We use linear probing to analyze whether interpretable features are embedded in hidden states. Our experiments reveal high probing accuracy, indicating latent space regularities with functionally important directions. Building on this, we use the directions between hidden states with opposing features to fit control vectors. At inference, we add our control vectors to hidden states and evaluate their impact on predictions. Remarkably, such modifications preserve the feasibility of predictions. We further refine our control vectors using sparse autoencoders (SAEs). This leads to more linear changes in predictions when scaling control vectors. Our approach enables mechanistic interpretation as well as zero-shot generalization to unseen dataset characteristics with negligible computational overhead.
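The control-vector idea in the abstract above can be sketched as follows: take hidden states with opposing values of an interpretable feature, use the direction between them as a control vector, and add a scaled copy of it to a hidden state at inference. Fitting the vector as a difference of class means is an assumption made here for simplicity, and the function names, the toy two-dimensional states, and the "fast"/"slow" feature are hypothetical. The sparse-autoencoder refinement mentioned in the abstract is not modelled.

```python
def mean_vector(states):
    """Element-wise mean of a list of equally sized hidden-state vectors."""
    count = len(states)
    return [sum(column) / count for column in zip(*states)]


def fit_control_vector(states_with_feature, states_without_feature):
    """Direction from the 'without' class mean to the 'with' class mean
    (a simple difference-of-means fit, assumed for illustration)."""
    mu_pos = mean_vector(states_with_feature)
    mu_neg = mean_vector(states_without_feature)
    return [p - n for p, n in zip(mu_pos, mu_neg)]


def steer(hidden_state, control_vector, scale=1.0):
    """Add the scaled control vector to a hidden state at inference time."""
    return [h + scale * c for h, c in zip(hidden_state, control_vector)]


# Toy hidden states for a hypothetical 'fast' vs 'slow' feature.
fast_states = [[2.0, 0.0], [4.0, 0.0]]
slow_states = [[0.0, 0.0], [0.0, 2.0]]
control = fit_control_vector(fast_states, slow_states)
steered = steer([1.0, 1.0], control, scale=0.5)
```

Varying `scale` is what lets one study how predictions change as the control vector is strengthened or reversed.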
A Proposed Workflow for the Restoration of Image Artifacts in Forensic Applications
Fabrizio Argenti, Stefania Bellavia, Marco Fontani
et al.
Recent years have witnessed significant developments in image acquisition systems and in algorithms for extracting information from them. Nevertheless, in many scenarios, several factors can hinder the recovery of useful data from images. This is especially true and important in forensic applications, where images are often accidentally captured by an imaging system not engineered for that specific acquisition (for example, the surveillance system designed to monitor the entrance of a bank may accidentally capture the license plate of a vehicle passing outside the bank). Therefore, the acquired images often need to be processed to facilitate extracting information from them. When facing a combination of several impairment factors, such as blur and perspective distortion, several image restoration algorithms must be applied. It is then necessary to choose a restoration order, that is, the order in which the individual restoration algorithms are chained together to obtain the enhanced image. This study aims to understand whether such an order may impact the final result. Of course, there exists a wide variety of image impairments; in this study, we focus on the case of an image affected by a combination of optical/motion blur, perspective distortion, and additive noise, which are all widespread artifacts in forensic image applications. To answer the question about the importance of choosing one restoration workflow over another, we first model each considered defect and its restoration operator and then analyze and compare the effects of the composition of such operators on the restored output. Such a comparison is made from both a mathematical and an experimental point of view, using both images with synthetically generated impairments and pictures with real degradations. The results show that the restoration order can significantly affect the results, especially when the defects are severe.
Electrical engineering. Electronics. Nuclear engineering
"TÁR" (2022) de Todd Field. A Maestrina de Berlim
Teresa Norton Dias
TÁR (2022) portrays a phase in the life of a conductor, Lydia Tár, of the Berlin Symphony Orchestra: her attitude toward the art she masters, as an author; her profession, as a conductor; her personal life, as a woman; and her relationships with others. As Lydia Tár asserts herself within a vertical, essentially male structure, the screenwriter and director, Todd Field, presents viewers with a set of aspects that go beyond those of her profession as the leader of an orchestra, where leadership positions are also predominantly held by men. In this critical analysis we seek, with the help of authors such as Judith Butler, to draw a parallel between the reality worked through in this fiction and the reality that the twenty-first century offers us.
Visual arts, Motion pictures
Situational Adaptive Motion Prediction for Firefighting Squads in Indoor Search and Rescue
Nils Mandischer, Frederik Schicks, Burkhard Corves
Firefighting is a complex task with a low degree of automation. To mitigate ergonomic and safety-related risks for human operators, robots could be deployed in a collaborative approach. However, important fundamentals for enabling human-robot teams in firefighting are still missing. Among other aspects, the robot must predict human motion, as occlusion is ever-present. In this work, we propose a novel motion prediction pipeline for firefighter squads in indoor search and rescue. The squad paths are generated with an optimal graph-based planning approach representing firefighters' tactics. Paths are generated per room, which allows the path to be adapted locally and dynamically without global re-planning. The motion of individual agents is simulated using a modification of the headed social force model. We evaluate the pipeline for feasibility with a novel data set generated from real footage and show its computational efficiency.
MTR++: Multi-Agent Motion Prediction with Symmetric Scene Modeling and Guided Intention Querying
Shaoshuai Shi, Li Jiang, Dengxin Dai
et al.
Motion prediction is crucial for autonomous driving systems to understand complex driving scenarios and make informed decisions. However, this task is challenging due to the diverse behaviors of traffic participants and complex environmental contexts. In this paper, we propose Motion TRansformer (MTR) frameworks to address these challenges. The initial MTR framework utilizes a transformer encoder-decoder structure with learnable intention queries, enabling efficient and accurate prediction of future trajectories. By customizing intention queries for distinct motion modalities, MTR improves multimodal motion prediction while reducing reliance on dense goal candidates. The framework comprises two essential processes: global intention localization, identifying the agent's intent to enhance overall efficiency, and local movement refinement, adaptively refining predicted trajectories for improved accuracy. Moreover, we introduce an advanced MTR++ framework, extending the capability of MTR to simultaneously predict multimodal motion for multiple agents. MTR++ incorporates symmetric context modeling and mutually-guided intention querying modules to facilitate future behavior interaction among multiple agents, resulting in scene-compliant future trajectories. Extensive experimental results demonstrate that the MTR framework achieves state-of-the-art performance on the highly-competitive motion prediction benchmarks, while the MTR++ framework surpasses its precursor, exhibiting enhanced performance and efficiency in predicting accurate multimodal future trajectories for multiple agents.
A Clinical Dataset for the Evaluation of Motion Planners in Medical Applications
Inbar Fried, Jason A. Akulian, Ron Alterovitz
The prospect of using autonomous robots to enhance the capabilities of physicians and enable novel procedures has led to considerable efforts in developing medical robots and incorporating autonomous capabilities. Motion planning is a core component for any such system working in an environment that demands near perfect levels of safety, reliability, and precision. Despite the extensive and promising work that has gone into developing motion planners for medical robots, a standardized and clinically-meaningful way to compare existing algorithms and evaluate novel planners and robots is not well established. We present the Medical Motion Planning Dataset (Med-MPD), a publicly-available dataset of real clinical scenarios in various organs for the purpose of evaluating motion planners for minimally-invasive medical robots. Our goal is that this dataset serve as a first step towards creating a larger robust medical motion planning benchmark framework, advance research into medical motion planners, and lift some of the burden of generating medical evaluation data.
"You can't beat evil by doing evil":Buffy, Discursive Challenges, and Nuclear Weapons
Lee-Anne Broadhead
Age-Related Differences in Strategy in the Hand Mental Rotation Task
Izumi Nagashima, Kotaro Takeda, Yusuke Harada
et al.
Mental imagery of movement is a potentially valuable rehabilitation task, but its therapeutic efficacy may depend on the specific cognitive strategy employed. Individuals use two main strategies to perform the hand mental rotation task (HMRT), which involves determining whether a visual image depicts a left or right hand. One is the motor imagery (MI) strategy, which involves mentally simulating one’s own hand movements. In this case, task performance as measured by response time (RT) is subject to a medial–lateral effect wherein the RT is reduced when the fingertips are directed medially, presumably as the actual motion would be easier. The other strategy is to employ visual imagery (VI), which involves mentally rotating the picture and is not subject to this medial–lateral effect. The rehabilitative benefits of the HMRT are thought to depend on the MI strategy (mental practice), so it is essential to examine the effects of individual factors such as age, image perspective (e.g., palm or back of the hand), and innate ability (as indicated by baseline RT) on the strategy adopted. When presented with pictures of the palm, all subjects in the current study used the MI strategy, regardless of age and ability. In contrast, when subjects were presented with pictures of the back of the hand, the VI strategy predominated among the young age group regardless of performance, while the strategy used by middle-age and elderly groups depended on performance ability. In the middle-age and elderly groups, the VI approach predominated in those with high performance skill, whereas the MI strategy predominated among those with low performance skill. Thus, higher-skill middle-aged and elderly individuals may not necessarily form a motion image during the HMRT, potentially limiting rehabilitation efficacy.
Neurosciences. Biological psychiatry. Neuropsychiatry
Unpaired Motion Style Transfer from Video to Animation
Kfir Aberman, Yijia Weng, Dani Lischinski
et al.
Transferring the motion style from one animation clip to another, while preserving the motion content of the latter, has been a long-standing problem in character animation. Most existing data-driven approaches are supervised and rely on paired data, where motions with the same content are performed in different styles. In addition, these approaches are limited to transfer of styles that were seen during training. In this paper, we present a novel data-driven framework for motion style transfer, which learns from an unpaired collection of motions with style labels, and enables transferring motion styles not observed during training. Furthermore, our framework is able to extract motion styles directly from videos, bypassing 3D reconstruction, and apply them to the 3D input motion. Our style transfer network encodes motions into two latent codes, for content and for style, each of which plays a different role in the decoding (synthesis) process. While the content code is decoded into the output motion by several temporal convolutional layers, the style code modifies deep features via temporally invariant adaptive instance normalization (AdaIN). Moreover, while the content code is encoded from 3D joint rotations, we learn a common embedding for style from either 3D or 2D joint positions, enabling style extraction from videos. Our results are comparable to the state-of-the-art, despite not requiring paired training data, and outperform other methods when transferring previously unseen styles. To our knowledge, we are the first to demonstrate style transfer directly from videos to 3D animations - an ability which enables one to extend the set of style examples far beyond motions captured by MoCap systems.
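To make the role of adaptive instance normalization (AdaIN) in the abstract above more concrete, the sketch below normalizes each feature channel over the time axis and then applies a per-channel scale and shift. In a style transfer network these would be predicted from the style code; here `gamma` and `beta` are supplied by hand. The function name, the list-of-channels layout, and the `eps` value are assumptions for illustration, not the authors' implementation.

```python
import math


def adain(features, gamma, beta, eps=1e-5):
    """Apply AdaIN to `features`, a list of channels, each a list over time.

    Each channel is normalized to zero mean and (almost) unit variance over
    time, then scaled by gamma[c] and shifted by beta[c]. The same gamma and
    beta are used at every time step, so the modulation is temporally invariant.
    """
    output = []
    for channel, scale, shift in zip(features, gamma, beta):
        mean = sum(channel) / len(channel)
        variance = sum((x - mean) ** 2 for x in channel) / len(channel)
        std = math.sqrt(variance + eps)
        output.append([scale * (x - mean) / std + shift for x in channel])
    return output


# One channel of content features over three frames, restyled to
# roughly mean 5 and standard deviation 2.
styled = adain([[1.0, 2.0, 3.0]], gamma=[2.0], beta=[5.0])
```

Because the content's own per-channel statistics are removed before the style statistics are applied, the temporal shape of the signal is kept while its scale and offset follow the style.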
Learning for Advanced Motion Control
Tom Oomen
Iterative Learning Control (ILC) can achieve perfect tracking performance for mechatronic systems. The aim of this paper is to present an ILC design tutorial for industrial mechatronic systems. First, a preliminary analysis reveals the potential performance improvement of ILC prior to its actual implementation. Second, a frequency domain approach is presented, where fast learning is achieved through noncausal model inversion, and safe and robust learning is achieved by employing a contraction mapping theorem in conjunction with nonparametric frequency response functions. The approach is demonstrated on a desktop printer. Finally, a detailed analysis of industrial motion systems reveals several shortcomings that obstruct the widespread implementation of ILC algorithms. An overview is given of recently developed algorithms, including extensions using machine learning algorithms, that aim to facilitate broad industrial deployment.
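The learning-by-repetition principle behind ILC can be sketched with a deliberately simplified example: a static plant gain instead of a dynamic system, and a learning gain taken as the inverse of an imperfect model of that gain. These simplifications, the function name, and all numeric values are assumptions for illustration; the frequency-domain design with noncausal model inversion described in the abstract is not reproduced. In this scalar setting the error is multiplied by `1 - plant_gain * learning_gain` from one trial to the next, so the iteration is a contraction whenever that factor has magnitude below 1.

```python
def run_ilc(reference, plant_gain, learning_gain, trials):
    """Repeat the same task `trials` times, updating the feedforward signal
    after each trial from the measured tracking error."""
    feedforward = [0.0] * len(reference)
    peak_errors = []
    for _ in range(trials):
        # Run one trial: the (toy, static) plant output is plant_gain * input.
        error = [r - plant_gain * f for r, f in zip(reference, feedforward)]
        peak_errors.append(max(abs(e) for e in error))
        # Learning update: correct the feedforward using this trial's error.
        feedforward = [f + learning_gain * e for f, e in zip(feedforward, error)]
    return feedforward, peak_errors


# True plant gain 2.0; the model assumes 2.5, so the learning gain is 1 / 2.5.
feedforward, peak_errors = run_ilc(
    reference=[1.0, -2.0, 0.5], plant_gain=2.0, learning_gain=0.4, trials=6
)
```

Even with the model error, the peak tracking error shrinks by a constant factor each trial, which is the behaviour that a contraction-based robustness condition is meant to guarantee.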
Brownian motion and affine Kac-Moody algebras
Manon Defosseux
This is a summary (in French) of my work on Brownian motion and Kac-Moody algebras during the last seven years, presented towards the Habilitation degree.
Socially intelligent task and motion planning for human-robot interaction
Andrea Frank, Laurel Riek
As social beings, much human behavior is predicated on social context - the ambient social state that includes cultural norms, social signals, individual preferences, etc. In this paper, we propose a socially-aware task and motion planning algorithm that considers social context to generate appropriate and effective plans in human social environments (HSEs). The key strength of our proposed approach is that it explicitly models how potential actions not only affect objective cost, but also transform the social context in which it plans and acts. We investigate strategies to limit the complexity of our algorithm, so that our planner will remain tractable for mobile platforms in complex HSEs like hospitals and factories. The planner will also consider the relative importance and urgency of its tasks, which it uses to determine when it is and is not appropriate to violate social expectations to achieve its objective. This social awareness will allow robots to understand a fundamental rule of society: just because something makes your job easier does not make it the right thing to do! To our knowledge, the proposed work is the first task and motion planning approach that supports socially intelligent robot policy for HSEs. Through this ongoing work, robots will be able to understand, respect, and leverage social context to accomplish tasks both acceptably and effectively in HSEs.
Learning Manifolds for Sequential Motion Planning
Isabel M. Rayas Fernández, Giovanni Sutanto, Peter Englert
et al.
Motion planning with constraints is an important part of many real-world robotic systems. In this work, we study manifold learning methods to learn such constraints from data. We explore two methods for learning implicit constraint manifolds from data: Variational Autoencoders (VAE), and a new method, Equality Constraint Manifold Neural Network (ECoMaNN). With the aim of incorporating learned constraints into a sampling-based motion planning framework, we evaluate the approaches on their ability to learn representations of constraints from various datasets and on the quality of paths produced during planning.
A Rough Super-Brownian Motion
Nicolas Perkowski, Tommaso Cornelis Rosati
We study the scaling limit of a branching random walk in static random environment in dimension $d=1,2$ and show that it is given by a super-Brownian motion in a white noise potential. In dimension $1$ we characterize the limit as the unique weak solution to the stochastic PDE: \[\partial_t \mu = (\Delta + \xi)\mu + \sqrt{2\nu\mu}\,\tilde{\xi}\] for independent space white noise $\xi$ and space-time white noise $\tilde{\xi}$. In dimension $2$ the study requires paracontrolled theory and the limit process is described via a martingale problem. In both dimensions we prove persistence of this rough version of the super-Brownian motion.
As Favelas nos Documentários Brasileiros: A participação da comunidade local na representação da realidade
Lara Silva Fagundes
The documentary film can be considered the essence of cinema. As Penafria (1999) points out, the first moving images were intended only to record the events of life. The documentary takes on anthropological importance and social and cultural relevance through the exposure and representation of distinct realities and through reflection on the construction of identities around territories and the bodies that inhabit them. This research was carried out with the aim of investigating ten documentaries produced between 2010 and 2016 in Rio de Janeiro (Brazil) on the theme of "favelas", in order to develop a reflection and categorical analysis of the relevant aspects of each audiovisual production and to discuss approaches that represent the reality of these communities and their residents. The analysis made it possible to reflect on real actors in their cultural contexts, on the representations of reality themselves, and on the construction of identities. The films highlight the relationship of the protagonists with the films, their participation in the conception of the documentaries, the relationship between people and territory, the experiences of the residents, and their multiple roles. The participatory character of the productions makes it possible to reflect on the self-representation of the residents of the "favelas" and on the documentary as an instrument of social mobilization that goes beyond urban barriers and promotes [...].
Visual arts, Motion pictures