Deconstructing Paperlessness: Documentary, Mise-en-scène and Participation in Feminist and Decolonial Film Practices: The Case Study of LALA
Gabriele
This paper critically examines LALA, a hybrid documentary-fiction film rooted in feminist, decolonial, and participatory filmmaking practices. Using an autoethnographic lens, the project reflects on the ethical and creative challenges of representing “paperlessness”—a condition of legal and symbolic invisibility experienced by second-generation Roma youth in Italy. Drawing on bell hooks’ concept of the “politics of location,” the work situates personal and collective trauma as sites of cultural critique and transformation. The film’s participatory development, including workshops inspired by Augusto Boal, Paulo Freire, and Pina Bausch, fostered co-creation and embodied storytelling with marginalised teenagers. This paper explores how fiction, performance, and lived experience interweave to disrupt dominant narratives and reclaim agency for those rendered invisible by state structures. LALA thus emerges not only as a film but as a political and reparative process—one that reimagines representation through vulnerability, collaboration, and intersectional resistance.
A Self-Supervised Approach on Motion Calibration for Enhancing Physical Plausibility in Text-to-Motion
Gahyeon Shim, Soogeun Park, Hyemin Ahn
Generating semantically aligned human motion from textual descriptions has made rapid progress, but ensuring both semantic and physical realism in motion remains a challenge. In this paper, we introduce the Distortion-aware Motion Calibrator (DMC), a post-hoc module that refines physically implausible motions (e.g., foot floating) while preserving semantic consistency with the original textual description. Rather than relying on complex physical modeling, we propose a self-supervised and data-driven approach, whereby DMC learns to recover physically plausible motions when an intentionally distorted motion and the original textual description are given as inputs. We evaluate DMC as a post-hoc module to improve motions obtained from various text-to-motion generation models and demonstrate its effectiveness in improving physical plausibility while enhancing semantic consistency. The experimental results show that DMC reduces the FID score by 42.74% on T2M and 13.20% on T2M-GPT, while also achieving the highest R-Precision. When applied to high-quality models like MoMask, DMC improves the physical plausibility of motions by reducing penetration by 33.0% as well as adjusting floating artifacts closer to the ground-truth reference. These results highlight that DMC can serve as a promising post-hoc motion refinement framework for any kind of text-to-motion model by incorporating textual semantics and physical plausibility.
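The self-supervised recipe described above (train on intentionally distorted motions paired with the originals) can be illustrated with a toy sketch. Everything here is an assumption for illustration: the distortion is a uniform vertical "floating" offset, and the calibrator is a rule-based stand-in for the learned DMC network that simply restores ground contact.

```python
def distort(motion, float_offset=0.05):
    """Simulate a 'foot floating' artifact by lifting every joint of every
    frame by a constant vertical offset (y is taken as the up axis)."""
    return [[(x, y + float_offset, z) for (x, y, z) in frame] for frame in motion]

def calibrate(motion):
    """Rule-based stand-in for the learned calibrator: translate the whole
    clip vertically so its lowest joint touches the ground plane y = 0."""
    min_y = min(y for frame in motion for (_, y, _) in frame)
    return [[(x, y - min_y, z) for (x, y, z) in frame] for frame in motion]

def make_training_pair(clean_motion):
    """Self-supervised pair: the distorted clip is the input, the original
    clip is the reconstruction target."""
    return distort(clean_motion), clean_motion
```

In the actual method the calibrator would be a network trained on many such (distorted, clean) pairs and additionally conditioned on the text description to preserve semantics.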
Separate Motion from Appearance: Customizing Motion via Customizing Text-to-Video Diffusion Models
Huijie Liu, Jingyun Wang, Shuai Ma
et al.
Motion customization aims to adapt the diffusion model (DM) to generate videos with the motion specified by a set of video clips sharing the same motion concept. To realize this goal, the adaptation of the DM should capture the specified motion concept without compromising the ability to generate diverse appearances. Thus, the key to solving this problem lies in how to separate the motion concept from the appearance during the adaptation of the DM. Typical previous works explore different ways to represent and insert a motion concept into large-scale pretrained text-to-video diffusion models, e.g., learning a motion LoRA, using latent noise residuals, etc. While those methods can encode the motion concept, they also inevitably encode the appearance of the reference videos, resulting in weakened appearance generation capability. In this paper, we follow the typical way of learning a motion LoRA to encode the motion concept, but propose two novel strategies to enhance motion-appearance separation: temporal attention purification (TAP) and appearance highway (AH). Specifically, we assume that in the temporal attention module, the pretrained Value embeddings are sufficient to serve as the basic components needed to produce a new motion. Thus, in TAP, we reshape the temporal attention only with motion LoRAs so that the Value embeddings can be reorganized to produce a new motion. Further, in AH, we alter the starting point of each skip connection in the U-Net from the output of each temporal attention module to the output of each spatial attention module. Extensive experiments demonstrate that, compared to previous works, our method can generate videos whose appearance is more aligned with the text descriptions and whose motion is more consistent with the reference videos.
Image as an IMU: Estimating Camera Motion from a Single Motion-Blurred Image
Jerred Chen, Ronald Clark
In many robotics and VR/AR applications, fast camera motions lead to a high level of motion blur, causing existing camera pose estimation methods to fail. In this work, we propose a novel framework that leverages motion blur as a rich cue for motion estimation rather than treating it as an unwanted artifact. Our approach works by predicting a dense motion flow field and a monocular depth map directly from a single motion-blurred image. We then recover the instantaneous camera velocity by solving a linear least squares problem under the small motion assumption. In essence, our method produces an IMU-like measurement that robustly captures fast and aggressive camera movements. To train our model, we construct a large-scale dataset with realistic synthetic motion blur derived from ScanNet++v2 and further refine our model by training end-to-end on real data using our fully differentiable pipeline. Extensive evaluations on real-world benchmarks demonstrate that our method achieves state-of-the-art angular and translational velocity estimates, outperforming current methods like MASt3R and COLMAP.
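The least-squares recovery step can be sketched in miniature. This is not the authors' implementation: it assumes a pinhole camera with normalized image coordinates, known per-pixel depth, and, for brevity, a purely translational velocity (the paper also recovers angular velocity). Under the small-motion assumption, each pixel's flow gives two equations that are linear in the unknown velocity.

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))) / M[r][r]
    return x

def estimate_velocity(pixels, flows):
    """Recover translational camera velocity (vx, vy, vz) by linear least
    squares.  Each pixel (x, y) in normalized coordinates with depth Z obeys
    the small-motion flow model
        u = (-vx + x * vz) / Z,    w = (-vy + y * vz) / Z,
    i.e. two equations linear in the velocity.  We accumulate the normal
    equations A^T A v = A^T b and solve the resulting 3x3 system."""
    AtA = [[0.0] * 3 for _ in range(3)]
    Atb = [0.0] * 3
    for (x, y, Z), (u, w) in zip(pixels, flows):
        for a, rhs in (((-1.0 / Z, 0.0, x / Z), u),
                       ((0.0, -1.0 / Z, y / Z), w)):
            for i in range(3):
                Atb[i] += a[i] * rhs
                for j in range(3):
                    AtA[i][j] += a[i] * a[j]
    return solve3(AtA, Atb)
```

In the full method, the flow field and depth map come from a network run on the single blurred image, and the same linear structure extends to six unknowns when angular velocity is included.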
Não Sou Nada (2023): An Analysis of Media Representations
Lima, Teresa
The current research focused on collating national and international news related to the film Não Sou Nada – The Nothingness Club, by Portuguese filmmaker Edgar Pêra, with a time frame from July 2020 to April 2024. According to the initial motivation, we sought to understand, through the film, what kind of representations the media devoted to Fernando Pessoa and Edgar Pêra. There was also an interest in understanding the place the film would occupy in the public sphere, through the mediation of the mass media. The subsequent analysis of the data provided some answers to the questions raised and led us to other kinds of questions that were not initially anticipated. Thus, we have found that the majority of news information disseminated in Portugal is characterized by the uncritical reproduction of press releases or information from news agencies. Beyond the evidence listed, in this review we have tried to question the role of the media in the construction of new socio-cultural realities.
French literature - Italian literature - Spanish literature - Portuguese literature
Human Motion Synthesis: A Diffusion Approach for Motion Stitching and In-Betweening
Michael Adewole, Oluwaseyi Giwa, Favour Nerrise
et al.
Human motion generation is an important area of research in many fields. In this work, we tackle the problem of motion stitching and in-betweening. Current methods either require manual effort or are incapable of handling longer sequences. To address these challenges, we propose a diffusion model with a transformer-based denoiser to generate realistic human motion. Our method demonstrates strong performance in generating in-betweening sequences, transforming a variable number of input poses into smooth and realistic motion sequences of 75 frames at 15 fps, for a total duration of 5 seconds. We present a performance evaluation of our method using quantitative metrics such as Fréchet Inception Distance (FID), Diversity, and Multimodality, along with visual assessments of the generated outputs.
BAMM: Bidirectional Autoregressive Motion Model
Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Pu Wang
et al.
Generating human motion from text has been dominated by denoising motion models, either through diffusion or a generative masking process. However, these models face great limitations in usability by requiring prior knowledge of the motion length. Conversely, autoregressive motion models address this limitation by adaptively predicting motion endpoints, at the cost of degraded generation quality and editing capabilities. To address these challenges, we propose the Bidirectional Autoregressive Motion Model (BAMM), a novel text-to-motion generation framework. BAMM consists of two key components: (1) a motion tokenizer that transforms 3D human motion into discrete tokens in latent space, and (2) a masked self-attention transformer that autoregressively predicts randomly masked tokens via a hybrid attention masking strategy. By unifying generative masked modeling and autoregressive modeling, BAMM captures rich and bidirectional dependencies among motion tokens, while learning the probabilistic mapping from textual inputs to motion outputs with dynamically adjusted motion sequence length. This enables BAMM to simultaneously achieve high-quality motion generation with enhanced usability and built-in motion editability. Extensive experiments on the HumanML3D and KIT-ML datasets demonstrate that BAMM surpasses current state-of-the-art methods in both qualitative and quantitative measures. Our project page is available at https://exitudio.github.io/BAMM-page
Motion Guidance: Diffusion-Based Image Editing with Differentiable Motion Estimators
Daniel Geng, Andrew Owens
Diffusion models are capable of generating impressive images conditioned on text descriptions, and extensions of these models allow users to edit images at a relatively coarse scale. However, the ability to precisely edit the layout, position, pose, and shape of objects in images with diffusion models is still difficult. To this end, we propose motion guidance, a zero-shot technique that allows a user to specify dense, complex motion fields that indicate where each pixel in an image should move. Motion guidance works by steering the diffusion sampling process with the gradients through an off-the-shelf optical flow network. Specifically, we design a guidance loss that encourages the sample to have the desired motion, as estimated by a flow network, while also being visually similar to the source image. By simultaneously sampling from a diffusion model and guiding the sample to have low guidance loss, we can obtain a motion-edited image. We demonstrate that our technique works on complex motions and produces high quality edits of real and generated images.
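The guidance mechanism, stripped down to one dimension, looks like the following sketch. Here the "image" is a single object position, the "optical flow" relative to the source reduces to the displacement x - x_src, and the guidance-loss gradient is analytic; in the real method the gradient flows through an off-the-shelf differentiable optical-flow network in pixel space, alongside a visual-similarity term.

```python
import random

def guided_sampling(x_src, target_flow, steps=200, eta=0.1, seed=0):
    """Toy 1-D analogue of motion guidance.  The 'image' is one object
    position x, the 'flow' relative to the source is x - x_src, and the
    guidance loss is (flow - target_flow)^2.  Each step combines a gradient
    step on the guidance loss with a small, decaying noise term standing in
    for the stochastic diffusion sampling process."""
    rng = random.Random(seed)
    x = x_src + rng.gauss(0.0, 5.0)                  # noisy initial sample
    for t in range(steps):
        grad = 2.0 * ((x - x_src) - target_flow)     # analytic guidance gradient
        x = x - eta * grad + rng.gauss(0.0, 0.01) * (1.0 - t / steps)
    return x
```

The sample is pulled toward the configuration whose induced "flow" matches the user-specified target, which is the essence of steering the sampling process with flow gradients.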
Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing
Yi Zuo, Lingling Li, Licheng Jiao
et al.
Existing diffusion-based methods have achieved impressive results in human motion editing. However, these methods often exhibit significant ghosting and body distortion in unseen in-the-wild cases. In this paper, we introduce Edit-Your-Motion, a video motion editing method that tackles these challenges through one-shot fine-tuning on unseen cases. Specifically, we first use DDIM inversion to initialize the noise, preserving the appearance of the source video, and design a lightweight motion attention adapter module to enhance motion fidelity. DDIM inversion obtains implicit representations by estimating the prediction noise from the source video; these serve as a starting point for the sampling process and ensure appearance consistency between the source and edited videos. The Motion Attention Module (MA) enhances the model's motion editing ability by resolving the conflict between skeleton features and appearance features. Second, to effectively decouple the motion and appearance of the source video, we design a spatio-temporal two-stage learning strategy (STL). In the first stage, we focus on learning the temporal features of human motion and propose recurrent causal attention (RCA) to ensure consistency between video frames. In the second stage, we shift focus to learning the appearance features of the source video. With Edit-Your-Motion, users can edit the motion of humans in the source video, creating more engaging and diverse content. Extensive qualitative and quantitative experiments, along with user preference studies, show that Edit-Your-Motion outperforms other methods.
Understanding visual processing of motion: Completing the picture using experimentally driven computational models of MT
Parvin Zarei Eskikand, David B Grayden, Tatiana Kameneva
et al.
Computational modeling helps neuroscientists to integrate and explain experimental data obtained through neurophysiological and anatomical studies, thus providing a mechanism by which we can better understand and predict the principles of neural computation. Computational modeling of the neuronal pathways of the visual cortex has been successful in developing theories of biological motion processing. This review describes a range of computational models that have been inspired by neurophysiological experiments. Theories of local motion integration and pattern motion processing are presented, together with suggested neurophysiological experiments designed to test those hypotheses.
The MI-Motion Dataset and Benchmark for 3D Multi-Person Motion Prediction
Xiaogang Peng, Xiao Zhou, Yikai Luo
et al.
3D multi-person motion prediction is a challenging task that involves modeling individual behaviors and interactions between people. Despite the emergence of approaches for this task, comparing them is difficult due to the lack of standardized training settings and benchmark datasets. In this paper, we introduce the Multi-Person Interaction Motion (MI-Motion) Dataset, which includes skeleton sequences of multiple individuals collected by motion capture systems and refined and synthesized using a game engine. The dataset contains 167k frames of interacting people's skeleton poses and is categorized into 5 different activity scenes. To facilitate research in multi-person motion prediction, we also provide benchmarks to evaluate the performance of prediction methods in three settings: short-term, long-term, and ultra-long-term prediction. Additionally, we introduce a novel baseline approach that leverages graph and temporal convolutional networks, which has demonstrated competitive results in multi-person motion prediction. We believe that the proposed MI-Motion benchmark dataset and baseline will facilitate future research in this area, ultimately leading to better understanding and modeling of multi-person interactions.
Motion-R3: Fast and Accurate Motion Annotation via Representation-based Representativeness Ranking
Jubo Yu, Tianxiang Ren, Shihui Guo
et al.
In this paper, we follow a data-centric philosophy and propose a novel motion annotation method based on the inherent representativeness of motion data in a given dataset. Specifically, we propose a Representation-based Representativeness Ranking (R3) method that ranks all motion data in a given dataset according to their representativeness in a learned motion representation space. We further propose a novel dual-level motion contrastive learning method to learn the motion representation space in a more informative way. Thanks to its high efficiency, our method is particularly responsive to frequent requirement changes and enables agile development of motion annotation models. Experimental results on the HDM05 dataset against state-of-the-art methods demonstrate the superiority of our method.
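Once a motion representation space has been learned, a representativeness ranking could look like the sketch below. The mean-cosine-similarity score is an assumption for illustration; the paper's actual R3 ranking and its dual-level contrastive training of the embedding space are more involved.

```python
def rank_by_representativeness(embeddings):
    """Rank dataset items by representativeness in a given embedding space:
    an item's score is its mean cosine similarity to all other items, and
    the most central (most representative) items come first."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb)
    scores = []
    for i, e in enumerate(embeddings):
        sims = [cos(e, f) for j, f in enumerate(embeddings) if j != i]
        scores.append(sum(sims) / len(sims))
    return sorted(range(len(embeddings)), key=lambda i: scores[i], reverse=True)
```

Annotating the top-ranked (most representative) clips first is what makes the workflow responsive to changing annotation requirements: a few labels on central items cover much of the dataset's variation.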
Motion Gait: Gait Recognition via Motion Excitation
Yunpeng Zhang, Zhengyou Wang, Shanna Zhuang
et al.
Gait recognition, which can realize long-distance and contactless identification, is an important biometric technology. Recent gait recognition methods focus on learning the pattern of human movement or appearance during walking, and construct the corresponding spatio-temporal representations. However, different individuals have their own characteristic movement patterns, and simple spatio-temporal features struggle to describe the motion changes of human parts, especially when confounding variables such as clothing and carried items are included; this reduces the distinguishability of features. In this paper, we propose the Motion Excitation Module (MEM), which guides spatio-temporal features to focus on human parts with large dynamic changes. MEM learns the difference information between frames and between intervals to obtain a representation of temporal motion changes; notably, it can adapt to frame sequences of uncertain length and adds no additional parameters. Furthermore, we present the Fine Feature Extractor (FFE), which independently learns spatio-temporal representations of the human body for different horizontal parts of the individual. Benefiting from MEM and FFE, our method innovatively combines motion change information, significantly improving the performance of the model under cross-appearance conditions. On the popular CASIA-B dataset, our proposed Motion Gait outperforms existing gait recognition methods.
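A parameter-free frame-difference gating in the spirit of MEM might look like the sketch below. The exact MEM formulation is not reproduced here; the sigmoid gate over per-dimension frame differences is an illustrative assumption. Note that the sketch handles variable-length sequences and introduces no learned parameters, matching the properties claimed above.

```python
import math

def motion_excitation(features):
    """Parameter-free motion gating over a variable-length sequence of
    per-frame feature vectors: the magnitude of the frame-to-frame
    difference in each dimension is squashed through a sigmoid and used to
    re-weight that dimension, so dynamically changing parts are excited
    relative to static ones."""
    excited = []
    for t, frame in enumerate(features):
        prev = features[t - 1] if t > 0 else frame
        gates = [1.0 / (1.0 + math.exp(-abs(f - p))) for f, p in zip(frame, prev)]
        excited.append([f * g for f, g in zip(frame, gates)])
    return excited
```

Static dimensions receive the neutral gate 0.5, while dimensions with large temporal change are weighted more strongly, which is the intended "excitation" of dynamic body parts.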
Motion Deblurring with Real Events
Fang Xu, Lei Yu, Bishan Wang
et al.
In this paper, we propose an end-to-end learning framework for event-based motion deblurring in a self-supervised manner, where real-world events are exploited to alleviate the performance degradation caused by data inconsistency. To this end, optical flows are predicted from events, with which the blurry consistency and photometric consistency are exploited to enable self-supervision of the deblurring network on real-world data. Furthermore, a piece-wise linear motion model is proposed to take motion non-linearities into account, leading to an accurate model of the physical formation of motion blur in real-world scenarios. Extensive evaluation on both synthetic and real motion blur datasets demonstrates that the proposed algorithm bridges the gap between simulated and real-world motion blurs and shows remarkable performance for event-based motion deblurring in real-world scenarios.
A Temporal Pre-Filter For Video Coding Based On Bilateral Filtering
Jack Enhorn, Rickard Sjöberg, Per Wennersten
This paper presents a motion compensated temporal bilateral denoising pre-filter for video coding. The filtering process is applied before encoding and uses the two closest preceding pictures, the two closest succeeding pictures, the position of the current picture in the Group of Pictures (GOP) hierarchy and the value of the Quantization Parameter (QP) to filter a current picture. The filter is designed for both random-access and low-delay configurations, and succeeding pictures are not used in low-delay configurations. The filter is reported to achieve an average luma BD-rate of –3.9% relative to the VTM-7.0 video encoder in random-access configuration. The subjective quality compared to VTM-7.0 is reported as similar or better at lower encoded bit rate.
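The core weighting of such a temporal bilateral filter can be sketched for a single pixel. The samples are assumed to be already motion-compensated and co-located across the neighboring pictures; the exact dependence of the filter strength on QP and GOP position in the paper differs from the simple scaling assumed here.

```python
import math

def temporal_bilateral(samples, center_index, qp, base_sigma=2.0):
    """Bilateral weighting over temporally co-located pixel values (assumed
    already motion-compensated): samples photometrically close to the
    current value get weight near 1, while outliers (e.g. from occlusions
    or compensation failures) are suppressed.  The strength sigma grows
    with QP, so heavier quantization permits stronger pre-filtering."""
    sigma = base_sigma * (1.0 + qp / 51.0)
    center = samples[center_index]
    wsum = vsum = 0.0
    for s in samples:
        w = math.exp(-((s - center) ** 2) / (2.0 * sigma ** 2))
        wsum += w
        vsum += w * s
    return vsum / wsum
```

Because outliers are down-weighted rather than averaged in, the pre-filter denoises without smearing content across mismatched frames, which is what preserves subjective quality at a lower bit rate.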
Computer Science
Vampire Slaying in Buffy the Vampire Slayer May Result from Disrupted Ion Signaling
Julian Freedland
On the Territory of Mystery in Cinematography and Painting
Pau Pascual Galbis
A study of the arcane term "mystery", as present in the cinema of David Lynch and Alfred Hitchcock, as well as in the painting of René Magritte and Giorgio de Chirico.
Accordingly, this article explores this faculty in an interdisciplinary and poetic manner, relating the works of artists closely tied to the plastic arts and to the moving image. Parallel journeys through surrealism, the unconscious, and the uncanny. Strange objects, luxurious mansions, disturbed characters, unsolved murders and, above all, many streets with long shadows. Sublime plots and landscapes that all converge on the ritual of the unknown. An antechamber to future adversities.
Visual arts, Motion pictures
Motion Planning through Demonstration to Deal with Complex Motions in Assembly Process
Yan Wang, Kensuke Harada, Weiwei Wan
Complex and skillful motions in an actual assembly process are challenging for a robot to generate with existing motion planning approaches, because some key poses in human assembly are too skillful for the robot to realize automatically. To deal with this problem, this paper develops a motion planning method that uses skillful motions from demonstration and can be applied to a complete robotic assembly process, including its complex and skillful motions. To make demonstration convenient without additional third-party devices, we attach augmented reality (AR) markers to the manipulated object to track and capture its poses during the human assembly process; these poses are then used as key poses by the motion planner. The derivative of each key pose serves as a criterion for prioritizing key poses in order to accelerate motion planning. The effectiveness of the presented method is verified through numerical examples and real-robot experiments.
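The derivative-based prioritization of key poses can be sketched as follows. The central-difference derivative over neighboring captured poses and the descending-magnitude ordering are assumptions for illustration; the abstract only states that a pose derivative serves as the priority criterion.

```python
def prioritize_key_poses(poses):
    """Order the interior key poses by the magnitude of their discrete
    (central-difference) derivative, most dynamic first, so the planner
    spends its budget on the skillful parts of the trajectory first.
    Each pose is a configuration vector captured from the AR markers."""
    def deriv_mag(i):
        prev_p, next_p = poses[i - 1], poses[i + 1]
        return sum((n - p) ** 2 for p, n in zip(prev_p, next_p)) ** 0.5 / 2.0
    interior = range(1, len(poses) - 1)
    return sorted(interior, key=deriv_mag, reverse=True)
```

Poses in nearly static stretches rank low and can be interpolated cheaply, while poses around rapid configuration changes are handed to the planner first.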
3D analysis of Osteosyntheses material using semi-automated CT segmentation: a case series of a 4 corner fusion plate
Rebecca Woehl, Johannes Maier, Sebastian Gehmert
et al.
Background: Scaphoidectomy and midcarpal fusion can be performed using traditional fixation methods such as K-wires, staples, screws, or various dorsal (non)locking arthrodesis systems. The aim of this study is to test the Aptus four-corner locking plate and to compare the clinical findings with the data revealed by CT scans and semi-automated segmentation. Methods: This is a retrospective review of eleven patients suffering from scapholunate advanced collapse (SLAC) or scaphoid non-union advanced collapse (SNAC) wrist, who received a four-corner fusion between August 2011 and July 2014. The clinical evaluation consisted of measuring the range of motion (ROM), strength, and pain on a visual analogue scale (VAS). Additionally, the Disabilities of the Arm, Shoulder and Hand (QuickDASH) and the Mayo Wrist Score were assessed. A computed tomography (CT) scan of the wrist was obtained six weeks postoperatively. After semi-automated segmentation of the CT scans, the models were post-processed and surveyed. Results: At the six-month follow-up, the mean range of motion (ROM) of the operated wrist was 60°, consisting of 30° extension and 30° flexion. While pain levels decreased significantly, 54% of grip strength and 89% of pinch strength were preserved compared to the contralateral healthy wrist. Union was detected in all CT scans of the wrist. While X-ray images obtained postoperatively revealed no pathology, two user-related technical complications were found through the 3D analysis, which correlated with the clinical outcome. Conclusion: Semi-automated segmentation and 3D analysis showed that the plate design lives up to the manufacturer's promises. Overall, this case series confirmed that the plate can compete with coexisting techniques in terms of clinical outcome, union, and complication rate.
Diseases of the musculoskeletal system
The gauge-invariant Lagrangian, the Power-Zienau-Woolley picture, and the choices of field momenta in nonrelativistic quantum electrodynamics
A. Vukics, G. Kónya, P. Domokos
We show that the Power-Zienau-Woolley picture of the electrodynamics of nonrelativistic neutral particles (atoms) can be derived from a gauge-invariant Lagrangian without making reference to any gauge whatsoever in the process. This equivalence is independent of choices of canonical field momentum or quantization strategies. In the process, we emphasize that in nonrelativistic (quantum) electrodynamics, the all-time appropriate generalized coordinate for the field is the transverse part of the vector potential, which is itself gauge invariant, and the use of which we recommend regardless of the choice of gauge, since in this way it is possible to sidestep most issues of constraints. Furthermore, we point out a freedom of choice for the conjugate momenta in the respective pictures, the conventional choices being good ones in the sense that they drastically reduce the set of system constraints.