Results for "Motion pictures"

Showing 20 of ~2,223,790 results · from DOAJ, arXiv, Semantic Scholar, CrossRef

arXiv Open Access 2026
Training-free Motion Factorization for Compositional Video Generation

Zixuan Wang, Ziqin Zhou, Feng Chen et al.

Compositional video generation aims to synthesize multiple instances with diverse appearance and motion. However, current approaches mainly focus on binding semantics, neglecting to understand the diverse motion categories specified in prompts. In this paper, we propose a motion factorization framework that decomposes complex motion into three primary categories: motionlessness, rigid motion, and non-rigid motion. Specifically, our framework follows a planning-before-generation paradigm. (1) During planning, we reason about motion laws on the motion graph to obtain frame-wise changes in the shape and position of each instance. This alleviates semantic ambiguities in the user prompt by organizing it into a structured representation of instances and their interactions. (2) During generation, we modulate the synthesis of distinct motion categories in a disentangled manner. Conditioned on the motion cues, guidance branches stabilize appearance in motionless regions, preserve rigid-body geometry, and regularize local non-rigid deformations. Crucially, our two modules are model-agnostic and can be seamlessly incorporated into various diffusion model architectures. Extensive experiments demonstrate that our framework achieves impressive performance in motion synthesis on real-world benchmarks. Code is available at https://github.com/ZixuanWang0525/MF-CVG.

en cs.CV
DOAJ Open Access 2025
Entre silêncio e a manipulação: Bergman e a censura em Portugal, anos 60

António Rebelo

We start from the idea of territory, the central concept and theme of this Encounter. This notion has a complex but illuminating etymological origin: its principal root, from the Latin territorium, is formed from the elements terra, meaning "soil", "region", "country", and -torium, a suffix indicating a place associated with an action. Thus, territorium originally expressed "the extent of land belonging to a city or State". It is precisely within this concrete domain, the Estado Novo, with its political, social, ethical, and moral implications, that we discuss Bergman and the films screened in Portugal. This study, "Entre o Silêncio e a Manipulação: Bergman e a Censura em Portugal, Anos 60" ("Between Silence and Manipulation: Bergman and Censorship in Portugal in the 1960s"), begins by contextualizing the situation in Portugal, a reference that is not geographical but concerns the jurisdiction of a community, and frames that discussion within the reality of the 1960s. Within this specific territory, a singular context of censorship and cinematic reception takes shape, under the responsibility of the Estado Novo, distinct from what occurred in other European countries. [...].

Visual arts, Motion pictures
DOAJ Open Access 2025
Designing a Remote Monitoring and Security System for Broadcast Transmitters

Mansur Beştaş, Ruhi Taş

Remote monitoring systems are vital in many domains. In the broadcasting industry in particular, it is critical to keep broadcasts running uninterrupted 24/7. A further challenge is detecting interruptions in terrestrial broadcasts across the country's geography and intervening as soon as possible. For this reason, this work aims to monitor TRT (Turkish Radio Television Co.) FM and TV transmitters and to gather information about their malfunctions and subsequent station security warnings. The developed system facilitates follow-up by the responsible staff and shortens response times, and with the help of motion-sensitive cameras, the security level was increased by capturing pictures in the OMC (Operation Monitoring Center). In addition, it is recommended that teams keep appropriate spare materials on hand by ensuring they have information about the type of failure.

Technology, Engineering (General). Civil engineering (General)
arXiv Open Access 2025
VMBench: A Benchmark for Perception-Aligned Video Motion Generation

Xinran Ling, Chen Zhu, Meiqi Wu et al.

Video generation has advanced rapidly, driving improvements in evaluation methods, yet assessing the motion in videos remains a major challenge. Specifically, there are two key issues: 1) current motion metrics do not fully align with human perception; 2) the existing motion prompts are limited. Based on these findings, we introduce VMBench, a comprehensive Video Motion Benchmark that has perception-aligned motion metrics and features the most diverse types of motion. VMBench has several appealing properties: 1) Perception-Driven Motion Evaluation Metrics: we identify five dimensions based on human perception in motion video assessment and develop fine-grained evaluation metrics, providing deeper insights into models' strengths and weaknesses in motion quality. 2) Meta-Guided Motion Prompt Generation: a structured method that extracts meta-information, generates diverse motion prompts with LLMs, and refines them through human-AI validation, resulting in a multi-level prompt library covering six key dynamic scene dimensions. 3) Human-Aligned Validation Mechanism: we provide human preference annotations to validate our benchmarks, with our metrics achieving an average 35.3% improvement in Spearman's correlation over baseline methods. This is the first time that the quality of motion in videos has been evaluated from the perspective of human perception alignment. Additionally, we will soon release VMBench at https://github.com/GD-AIGC/VMBench, setting a new standard for evaluating and advancing motion generation models.

en cs.CV
arXiv Open Access 2025
Absolute Coordinates Make Motion Generation Easy

Zichong Meng, Zeyu Han, Xiaogang Peng et al.

State-of-the-art text-to-motion generation models rely on the kinematic-aware, local-relative motion representation popularized by HumanML3D, which encodes motion relative to the pelvis and to the previous frame with built-in redundancy. While this design simplifies training for earlier generation models, it introduces critical limitations for diffusion models and hinders applicability to downstream tasks. In this work, we revisit the motion representation and propose a radically simplified and long-abandoned alternative for text-to-motion generation: absolute joint coordinates in global space. Through systematic analysis of design choices, we show that this formulation achieves significantly higher motion fidelity, improved text alignment, and strong scalability, even with a simple Transformer backbone and no auxiliary kinematic-aware losses. Moreover, our formulation naturally supports downstream tasks such as text-driven motion control and temporal/spatial editing without additional task-specific reengineering and costly classifier guidance generation from control signals. Finally, we demonstrate promising generalization to directly generate SMPL-H mesh vertices in motion from text, laying a strong foundation for future research and motion-related applications.

en cs.CV
DOAJ Open Access 2024
Dasein and the Question of the Heterogenous Film Viewer: A Commentary on Loht’s Heideggerian Phenomenology of Film

Annie Sandrussi

In response to Shawn Loht’s 2017 project delineating a Heideggerian phenomenology of film, Phenomenology of Film: A Heideggerian Account of the Film Experience, I examine how productive Loht’s Dasein-centric account of the film viewer might be for considering diverse film-viewer experiences. Starting from Loht’s premise that the film–viewer relation is the constitutive ground of filmic disclosure, I raise two concerns regarding Heidegger’s account of Dasein that might obscure an account of the diversity of film viewers and associated heterogeneity of filmic disclosure: Dasein’s lack of concreteness with respect to lived experience, and Heidegger’s neglect of embodiment and bodily difference. I firstly argue that Dasein’s lack of concreteness is productive for theorising the diversity of film viewers by discussing the concept of Dasein through formal indication, a Heideggerian notion that Loht draws upon in order to account for filmic disclosure. I then examine a key objection, that Heidegger’s neglect of embodiment limits the productivity of a Dasein-centric account of film viewers for theorising diversity. I refer to Katharina Lindner’s interpretation of Heidegger’s notion of being-in-the-world in a queer phenomenology of film, set out in “Questions of Embodied Difference: Film and Queer Phenomenology” (2012), to suggest a pathway for how Dasein’s being-in-the-world might offer an account of film viewing that takes into consideration the embodiment of diverse viewers.

Motion pictures, Philosophy (General)
arXiv Open Access 2024
LEAD: Latent Realignment for Human Motion Diffusion

Nefeli Andreou, Xi Wang, Victoria Fernández Abrevaya et al.

Our goal is to generate realistic human motion from natural language. Modern methods often face a trade-off between model expressiveness and text-to-motion alignment. Some align text and motion latent spaces but sacrifice expressiveness; others rely on diffusion models producing impressive motions, but lacking semantic meaning in their latent space. This may compromise realism, diversity, and applicability. Here, we address this by combining latent diffusion with a realignment mechanism, producing a novel, semantically structured space that encodes the semantics of language. Leveraging this capability, we introduce the task of textual motion inversion to capture novel motion concepts from a few examples. For motion synthesis, we evaluate LEAD on HumanML3D and KIT-ML and show comparable performance to the state-of-the-art in terms of realism, diversity, and text-motion consistency. Our qualitative analysis and user study reveal that our synthesized motions are sharper, more human-like and comply better with the text compared to modern methods. For motion textual inversion, our method demonstrates improved capacity in capturing out-of-distribution characteristics in comparison to traditional VAEs.

en cs.CV, cs.AI
arXiv Open Access 2024
Video Motion Transfer with Diffusion Transformers

Alexander Pondaven, Aliaksandr Siarohin, Sergey Tulyakov et al.

We propose DiTFlow, a method for transferring the motion of a reference video to a newly synthesized one, designed specifically for Diffusion Transformers (DiT). We first process the reference video with a pre-trained DiT to analyze cross-frame attention maps and extract a patch-wise motion signal called the Attention Motion Flow (AMF). We guide the latent denoising process in an optimization-based, training-free manner by optimizing latents with our AMF loss to generate videos reproducing the motion of the reference one. We also apply our optimization strategy to transformer positional embeddings, granting us a boost in zero-shot motion transfer capabilities. We evaluate DiTFlow against recently published methods, outperforming all across multiple metrics and human evaluation.

en cs.CV, cs.AI
arXiv Open Access 2024
Run-and-tumble motion of ellipsoidal microswimmers

Gordei Anchutkin, Viktor Holubec, Frank Cichos

A hallmark of bacteria is their so-called "run-and-tumble" motion, consisting of a sequence of linear directed "runs" and random rotations that constantly alternate due to biochemical feedback. It plays a crucial role in the ability of bacteria to move through chemical gradients and inspired a fundamental active particle model. Nevertheless, synthetic active particles generally do not exhibit run-and-tumble motion but rather active Brownian motion. We show in experiments that ellipsoidal thermophoretic Janus particles, propelling along their short axis, can yield run-and-tumble-like motion even without feedback. Their hydrodynamic wall interactions under strong confinement give rise to an effective double-well potential for the declination of the short axis. The geometry-induced timescale separation of the in-plane rotational dynamics and noise-induced transitions in the potential then yields run-and-tumble-like motion.

en cond-mat.soft
DOAJ Open Access 2023
Sediment Transport of Coastal Region Using Time-Series Unmanned Aerial Vehicle Spatial Data

Sulki Kim, Sungyeol Chang, Sungwon Shin et al.

Continuous monitoring of the varying topographical characteristics of shorelines is important for effective coastal management. Closed-circuit television (CCTV) cameras are installed to accumulate photographic data on coastal topographical changes. The overall change in the coastal waters can be intuitively understood from the images; however, the amount of three-dimensional (3D) change that can be grasped is limited. To address this, studies have employed aerial photogrammetry, which is the use of unmanned aerial vehicles (UAVs) to capture aerial pictures, construct 3D models of target areas, and perform analysis through scale-invariant feature transform and structure-from-motion technologies. Although highly efficient, this technique requires several ground-control points (GCPs), which could corrupt the overall imagery. This study designs a real-time kinematic global navigation satellite system (RTK-GNSS) UAV, which requires few GCPs. To evaluate the positional accuracy of the captured UAV orthographic images and digital surface models (DSMs) used for precise coastal terrain measurements, a virtual reference service survey was performed to determine the vertical errors. The R-squared was 0.985, which is close to 1.0. Short-term and one-year topographic changes before and after a storm were investigated using time-series UAV image data after a coastal maintenance project. Analysis of the coefficient of variation in the beach volume over one year revealed that the submerged breakwater reduced erosion under high wave conditions. The submerged breakwater located in the center exhibited variability similar to the opening. Hence, this method is more suitable for periodically monitoring coastal areas.

Naval architecture. Shipbuilding. Marine engineering, Oceanography
DOAJ Open Access 2023
Death as Film-Philosophy’s Muse: Deleuzian Observations on Moving Images and the Nature of Time

Susana Viegas

This article explores the affinities between film and philosophy by returning to a shared meditation on death and the nature of time. Death has been considered the muse of philosophy and can also be considered the muse of film-philosophy. But what does it mean to say that to film-philosophise is to learn to die, or a kind of training for dying? Film is an artistic object that reminds us of death’s inevitability; it is a meditation on the transient and finite nature of time. Films as diverse as Mizoguchi’s Tales of Ugetsu, Resnais’s Hiroshima mon Amour, and Guzmán’s Nostalgia for the Light take an uncanny approach to the subject, expressing the paradoxical coexistence of life and death and of different temporal dimensions. This article explores the philosophical concept of the death-image and time through a Deleuzian approach to cinema, meditating on the flashback, the coexistence of the present and the past, and the emergence of a new type of Lazarean character – one who returns from the dead. The article aims to clarify not simply death’s unquestionable omnipresence in film but also cinema’s role as a contemporary version of the trope of memento mori.

Motion pictures, Philosophy (General)
arXiv Open Access 2023
Exploration and Comparison of Deep Learning Architectures to Predict Brain Response to Realistic Pictures

Riccardo Chimisso, Sathya Buršić, Paolo Marocco et al.

We present an exploration of machine learning architectures for predicting brain responses to realistic images on the occasion of the Algonauts Challenge 2023. Our research involved extensive experimentation with various pretrained models. Initially, we employed simpler models to predict brain activity but gradually introduced more complex architectures utilizing available data and embeddings generated by large-scale pre-trained models. We encountered typical difficulties related to machine learning problems, e.g. regularization and overfitting, as well as issues specific to the challenge, such as difficulty in combining multiple input encodings, and the high dimensionality, unclear structure, and noisy nature of the output. To overcome these issues we tested single edge 3D position-based, multi-region of interest (ROI) and hemisphere predictor models, but we found that employing multiple simple models, each dedicated to a ROI in each hemisphere of the brain of each subject, yielded the best results: a single fully connected linear layer with image embeddings generated by CLIP as input. While we surpassed the challenge baseline, our results fell short of establishing a robust association with the data.

en q-bio.NC, cs.AI
arXiv Open Access 2023
NewMove: Customizing text-to-video models with novel motions

Joanna Materzynska, Josef Sivic, Eli Shechtman et al.

We introduce an approach for augmenting text-to-video generation models with customized motions, extending their capabilities beyond the motions depicted in the original training data. By leveraging a few video samples demonstrating specific movements as input, our method learns and generalizes the input motion patterns for diverse, text-specified scenarios. Our contributions are threefold. First, to achieve our results, we finetune an existing text-to-video model to learn a novel mapping between the depicted motion in the input examples to a new unique token. To avoid overfitting to the new custom motion, we introduce an approach for regularization over videos. Second, by leveraging the motion priors in a pretrained model, our method can produce novel videos featuring multiple people doing the custom motion, and can invoke the motion in combination with other motions. Furthermore, our approach extends to the multimodal customization of motion and appearance of individualized subjects, enabling the generation of videos featuring unique characters and distinct motions. Third, to validate our method, we introduce an approach for quantitatively evaluating the learned custom motion and perform a systematic ablation study. We show that our method significantly outperforms prior appearance-based customization approaches when extended to the motion customization task.

en cs.CV
arXiv Open Access 2023
Truth in Motion: The Unprecedented Risks and Opportunities of Extended Reality Motion Data

Vivek Nair, Louis Rosenberg, James F. O'Brien et al.

Motion tracking "telemetry" data lies at the core of nearly all modern extended reality (XR) and metaverse experiences. While generally presumed innocuous, recent studies have demonstrated that motion data actually has the potential to profile and deanonymize XR users, posing a significant threat to security and privacy in the metaverse.

en cs.HC, cs.CR
arXiv Open Access 2023
MotionZero: Exploiting Motion Priors for Zero-shot Text-to-Video Generation

Sitong Su, Litao Guo, Lianli Gao et al.

Zero-shot text-to-video synthesis generates videos from prompts without any video data. Without motion information from videos, the motion priors implied in prompts are vital guidance. For example, the prompt "airplane landing on the runway" indicates motion priors that the "airplane" moves downwards while the "runway" stays static. These motion priors are not fully exploited in previous approaches, leading to two nontrivial issues: 1) the motion variation pattern remains unaltered and prompt-agnostic because motion priors are disregarded; 2) the motion control of different objects is inaccurate and entangled because the independent motion priors of different objects are not considered. To tackle these two issues, we propose a prompt-adaptive and disentangled motion control strategy termed MotionZero, which derives motion priors for different objects from prompts via Large Language Models and accordingly applies motion control of different objects to corresponding regions in a disentangled manner. Furthermore, to facilitate videos with varying degrees of motion amplitude, we propose a Motion-Aware Attention scheme which adjusts attention among frames according to motion amplitude. Extensive experiments demonstrate that our strategy can correctly control the motion of different objects and supports versatile applications including zero-shot video editing.

en cs.CV
arXiv Open Access 2022
An Identity-Preserved Framework for Human Motion Transfer

Jingzhe Ma, Xiaoqing Zhang, Shiqi Yu

Human motion transfer (HMT) aims to generate a video clip for the target subject by imitating the source subject's motion. Although previous methods have achieved good results in synthesizing good-quality videos, they lose sight of individualized motion information from the source and target motions, which is significant for the realism of the motion in the generated video. To address this problem, we propose a novel identity-preserved HMT network, termed IDPres. This network is a skeleton-based approach that uniquely incorporates the target's individualized motion and skeleton information to augment identity representations. This integration significantly enhances the realism of movements in the generated videos. Our method focuses on the fine-grained disentanglement and synthesis of motion. To improve the representation learning capability in latent space and facilitate the training of IDPres, we introduce three training schemes. These schemes enable IDPres to concurrently disentangle different representations and accurately control them, ensuring the synthesis of ideal motions. To evaluate the proportion of individualized motion information in the generated video, we are the first to introduce a new quantitative metric called Identity Score (ID-Score), motivated by the success of gait recognition methods in capturing identity information. Moreover, we collect an identity-motion paired dataset, Dancer101, consisting of solo-dance videos of 101 subjects from the public domain, providing a benchmark to prompt the development of HMT methods. Extensive experiments demonstrate that the proposed IDPres method surpasses existing state-of-the-art techniques in terms of reconstruction accuracy, realistic motion, and identity preservation.

arXiv Open Access 2022
From Low to High Order Motion Planners: Safe Robot Navigation using Motion Prediction and Reference Governor

Aykut İşleyen, Nathan van de Wouw, Ömür Arslan

Safe navigation around obstacles is a fundamental challenge for highly dynamic robots. The state-of-the-art approach for adapting simple reference path planners to complex robot dynamics using trajectory optimization and tracking control is brittle and requires significant replanning cycles. In this paper, we introduce a novel feedback motion planning framework that extends the applicability of low-order (e.g. position-/velocity-controlled) reference motion planners to high-order (e.g., acceleration-/jerk-controlled) robot models using motion prediction and reference governors. We use predicted robot motion range for safety assessment and establish a bidirectional interface between high-level planning and low-level control via a reference governor. We describe the generic fundamental building blocks of our feedback motion planning framework and give specific example constructions for motion control, prediction, and reference planning. We prove the correctness of our planning framework and demonstrate its performance in numerical simulations. We conclude that accurate motion prediction is crucial for closing the gap between high-level planning and low-level control.

en cs.RO, eess.SY
arXiv Open Access 2022
Efficient Motion Modelling with Variable-sized Blocks from Hierarchical Cuboidal Partitioning

Priyabrata Karmakar, Manzur Murshed, Manoranjan Paul et al.

Motion modelling with block-based architectures has been widely used in video coding, where a frame is divided into fixed-sized blocks that are motion compensated independently. This often leads to coding inefficiency, as fixed-sized blocks hardly align with object boundaries. Although hierarchical block partitioning has been introduced to address this, the increased number of motion vectors limits the benefit. Recently, approximate segmentation of images with cuboidal partitioning has gained popularity. Not only are the variable-sized rectangular segments (cuboids) readily amenable to block-based image/video coding techniques, but they are also capable of aligning well with object boundaries. This is because cuboidal partitioning is based on a homogeneity constraint, minimising the sum of squared errors (SSE). In this paper, we have investigated the potential of cuboids in motion modelling against the fixed-sized blocks used in scalable video coding. Specifically, we have constructed the motion-compensated current frame using the cuboidal partitioning information of the anchor frame in a group of pictures (GOP). The predicted current frame has then been used as the base layer while encoding the current frame as an enhancement layer using the scalable HEVC encoder. Experimental results confirm 6.71%-10.90% bitrate savings on 4K video sequences.

en cs.CV, cs.MM

Page 47 of 111,190