Masked autoencoders (MAE) have become a dominant paradigm in 3D representation learning, setting new performance benchmarks across various downstream tasks. Existing methods with a fixed mask ratio neglect multi-level representational correlations and intrinsic geometric structures, while relying on point-wise reconstruction assumptions that conflict with the diversity of point clouds. To address these issues, we propose a 3D representation learning method, termed Point-SRA, which aligns representations through self-distillation and probabilistic modeling. Specifically, we assign different masking ratios to the MAE to capture complementary geometric and semantic information, while the MeanFlow Transformer (MFT) leverages cross-modal conditional embeddings to enable diverse probabilistic reconstruction. Our analysis further reveals that representations at different time steps in the MFT are also complementary. We therefore propose a Dual Self-Representation Alignment mechanism operating at both the MAE and MFT levels. Finally, we design a Flow-Conditioned Fine-Tuning Architecture to fully exploit the point cloud distribution learned via MeanFlow. Point-SRA outperforms Point-MAE by 5.37% on ScanObjectNN. On intracranial aneurysm segmentation, it reaches 96.07% mean IoU for arteries and 86.87% for aneurysms. For 3D object detection, Point-SRA achieves 47.3% AP@50, surpassing MaskPoint by 5.12%.
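As a rough illustration of the multi-ratio masking idea, one can split the same point cloud at two different ratios so that each branch sees a different subset of the geometry. The helper and the two ratio values below are illustrative only, not the paper's actual configuration.

```python
import torch

def random_mask(points, ratio, generator=None):
    """Split a point cloud of shape (N, 3) into visible and masked
    subsets by randomly dropping `ratio` of the points."""
    n = points.shape[0]
    n_mask = int(n * ratio)
    perm = torch.randperm(n, generator=generator)
    return points[perm[n_mask:]], points[perm[:n_mask]]

pts = torch.randn(1024, 3)
# two branches with different masking ratios (values are illustrative):
# a low-ratio branch keeps more local geometry, a high-ratio branch
# forces reconstruction from sparse, more semantic cues
vis_low, msk_low = random_mask(pts, 0.6)
vis_high, msk_high = random_mask(pts, 0.9)
```

Each branch then feeds its visible subset to the encoder; the complementarity comes from the two views exposing different levels of structure.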
Dengyang Jiang, Mengmeng Wang, Liuzhuozheng Li, et al.
Recent studies have demonstrated that learning a meaningful internal representation can accelerate generative training. However, existing approaches must either introduce an off-the-shelf external representation task or rely on a large-scale, pre-trained external representation encoder to provide representation guidance during training. In this study, we posit that the unique discriminative process inherent to diffusion transformers enables them to offer such guidance without requiring external representation components. We propose Self-Representation Alignment (SRA), a simple yet effective method that obtains representation guidance from the internal representations of the diffusion transformer being trained. SRA aligns the latent representation from an earlier layer, conditioned on higher noise, to that from a later layer, conditioned on lower noise, progressively enhancing overall representation learning during training only. Experimental results indicate that applying SRA to DiTs and SiTs yields consistent performance improvements and largely outperforms approaches relying on auxiliary representation tasks. Our approach achieves performance comparable to methods that depend on an external pre-trained representation encoder, demonstrating the feasibility of acceleration through representation alignment within diffusion transformers themselves.
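The core alignment step described above, matching an earlier-layer latent under higher noise to a detached later-layer latent under lower noise, can be sketched with a toy stand-in for the transformer. `ToyDiT`, its `features` hook, and the scalar noise conditioning are all hypothetical simplifications; the real method operates on DiT/SiT blocks.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyDiT(nn.Module):
    """Minimal stand-in for a diffusion transformer: a stack of
    linear blocks conditioned on a scalar noise level t."""
    def __init__(self, dim=16, depth=4):
        super().__init__()
        self.blocks = nn.ModuleList(nn.Linear(dim + 1, dim) for _ in range(depth))

    def features(self, x, t, layer):
        # run blocks 0..layer, concatenating the noise level as conditioning
        h = x
        for blk in self.blocks[: layer + 1]:
            t_emb = torch.full_like(h[..., :1], t)
            h = torch.tanh(blk(torch.cat([h, t_emb], dim=-1)))
        return h

def sra_loss(model, x, t_high=0.8, t_low=0.2, early=1, late=3):
    """SRA-style loss: align the early-layer latent under higher noise
    to the (detached) late-layer latent under lower noise."""
    z_student = model.features(x, t_high, early)
    with torch.no_grad():                       # target branch gives no gradient
        z_target = model.features(x, t_low, late)
    return F.mse_loss(F.normalize(z_student, dim=-1),
                      F.normalize(z_target, dim=-1))

model = ToyDiT()
loss = sra_loss(model, torch.randn(8, 16))
loss.backward()  # gradients flow only through the student (early-layer) path
```

The stop-gradient on the target is the self-distillation ingredient: the later, less-noised representation serves as a fixed teacher for the earlier, noisier one.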
The present article discusses Ravel’s Gaspard de la nuit (1908) and L’Enfant et les sortilèges (1925) by allowing a section of each of these works – namely ‘Le gibet’ and ‘Préambule féerique’ – to enter into resonance with the notion of risk-taking and the dictates of the Surrealist movement. While Ravel carries out experiments of an unusual and perilous character with consummate art, the result is far removed from the Neoclassical perspective from which his works are usually considered.
The topic of this article is the depiction of the fall of the family in Luchino Visconti’s film The Damned and Thomas Mann’s novel Buddenbrooks, which served as a major inspiration for the former. The comparison of both works centers on two pairs of opposite attributes: the fall of the Buddenbrooks is characterised as gradual and internal, whereas the von Essenbeck family from The Damned declines in a rapid and external manner. These differences are studied in connection with such issues as the role of historical events in both works, the similarities and dissimilarities between certain characters, as well as the impact of Shakespeare’s Macbeth on the narrative structure of The Damned. The nature of Visconti’s inspiration by Mann’s work and the connection between the two artists are also discussed.
Diffusion models are popular generative modeling methods in various vision tasks, attracting significant attention. They can be considered a unique instance of self-supervised learning methods due to their independence from label annotation. This survey explores the interplay between diffusion models and representation learning. It provides an overview of diffusion models' essential aspects, including mathematical foundations, popular denoising network architectures, and guidance methods. Various approaches related to diffusion models and representation learning are detailed, including frameworks that leverage representations learned from pre-trained diffusion models for subsequent recognition tasks and methods that utilize advances in representation and self-supervised learning to enhance diffusion models. This survey aims to offer a comprehensive taxonomy of the relationship between diffusion models and representation learning, identifying key open challenges and directions for exploration. GitHub link: https://github.com/dongzhuoyao/Diffusion-Representation-Learning-Survey-Taxonomy
Supersolvable hyperplane arrangements and matroids are known to give rise to certain Koszul algebras, namely their Orlik-Solomon algebras and graded Varchenko-Gel'fand algebras. We explore how this interacts with group actions, particularly for the braid arrangement and the action of the symmetric group, where the Hilbert functions of the algebras and their Koszul duals are given by Stirling numbers of the first and second kinds, respectively. The corresponding symmetric group representations exhibit branching rules that interpret Stirling number recurrences, which are shown to apply to all supersolvable arrangements. They also enjoy representation stability properties that follow from Koszul duality.
Skander Moalla, Andrea Miele, Daniil Pyatko, et al.
Reinforcement learning (RL) is inherently rife with non-stationarity since the states and rewards the agent observes during training depend on its changing policy. Therefore, networks in deep RL must be capable of adapting to new observations and fitting new targets. However, previous works have observed that networks trained under non-stationarity exhibit an inability to continue learning, termed loss of plasticity, and eventually a collapse in performance. For off-policy deep value-based RL methods, this phenomenon has been correlated with a decrease in representation rank and the ability to fit random targets, termed capacity loss. Although this correlation has generally been attributed to neural network learning under non-stationarity, the connection to representation dynamics has not been carefully studied in on-policy policy optimization methods. In this work, we empirically study representation dynamics in Proximal Policy Optimization (PPO) on the Atari and MuJoCo environments, revealing that PPO agents are also affected by feature rank deterioration and capacity loss. We show that this is aggravated by stronger non-stationarity, ultimately driving the actor's performance to collapse, regardless of the performance of the critic. We ask why the trust region, specific to methods like PPO, cannot alleviate or prevent the collapse, and find a connection between representation collapse and the degradation of the trust region, each exacerbating the other. Finally, we present Proximal Feature Optimization (PFO), a novel auxiliary loss that regularizes the representation dynamics and, together with other interventions, demonstrates that doing so mitigates the performance collapse of PPO agents.
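The feature rank deterioration described above can be monitored with a simple numerical-rank estimate over a batch of features. The 1% singular-value threshold below is an illustrative choice, not necessarily the criterion used in the cited works.

```python
import torch

def feature_rank(features, threshold=0.01):
    """Numerical rank of a (batch, dim) feature matrix: the number of
    singular values above `threshold` times the largest singular value."""
    s = torch.linalg.svdvals(features)  # sorted in descending order
    return int((s > threshold * s[0]).sum())

torch.manual_seed(0)
# healthy representations: random features span the full 64 dimensions
full = torch.randn(256, 64)
# collapsed representations: every feature is a scaled copy of one vector
collapsed = torch.randn(256, 1) @ torch.randn(1, 64)
```

Tracking this quantity over training is one way to detect the onset of representation collapse before the return curve itself degrades.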
Unsupervised representation learning presents new opportunities for advancing Quantum Architecture Search (QAS) on Noisy Intermediate-Scale Quantum (NISQ) devices. QAS is designed to optimize quantum circuits for Variational Quantum Algorithms (VQAs). Most QAS algorithms tightly couple the search space and search algorithm, typically requiring the evaluation of numerous quantum circuits, resulting in high computational costs and limiting scalability to larger quantum circuits. Predictor-based QAS algorithms mitigate this issue by estimating circuit performance based on structure or embedding. However, these methods often demand time-intensive labeling to optimize gate parameters across many circuits, which is crucial for training accurate predictors. Inspired by the classical neural architecture search algorithm Arch2vec, we investigate the potential of unsupervised representation learning for QAS without relying on predictors. Our framework decouples unsupervised architecture representation learning from the search process, enabling the learned representations to be applied across various downstream tasks. Additionally, it integrates an improved quantum circuit graph encoding scheme, addressing the limitations of existing representations and enhancing search efficiency. This predictor-free approach removes the need for large labeled datasets. During the search, we employ REINFORCE and Bayesian Optimization to explore the latent representation space and compare their performance against baseline methods. We further validate our approach by executing the best-discovered MaxCut circuits on IBM's ibm_sherbrooke quantum processor, confirming that the architectures retain optimal performance even under real hardware noise. Our results demonstrate that the framework efficiently identifies high-performing quantum circuits with fewer search iterations.
Probabilistic representation spaces convey information about a dataset and are shaped by factors such as the training data, network architecture, and loss function. Comparing the information content of such spaces is crucial for understanding the learning process, yet most existing methods assume point-based representations, neglecting the distributional nature of probabilistic spaces. To address this gap, we propose two information-theoretic measures to compare general probabilistic representation spaces by extending classic methods to compare the information content of hard clustering assignments. Additionally, we introduce a lightweight method of estimation that is based on fingerprinting a representation space with a sample of the dataset, designed for scenarios where the communicated information is limited to a few bits. We demonstrate the utility of these measures in three case studies. First, in the context of unsupervised disentanglement, we identify recurring information fragments within individual latent dimensions of VAE and InfoGAN ensembles. Second, we compare the full latent spaces of models and reveal consistent information content across datasets and methods, despite variability during training. Finally, we leverage the differentiability of our measures to perform model fusion, synthesizing the information content of weak learners into a single, coherent representation. Across these applications, the direct comparison of information content offers a natural basis for characterizing the processing of information.
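As a point of reference, the classic hard-clustering comparison that the proposed measures extend is plain mutual information between two label assignments. A minimal sketch, with an illustrative helper and toy labels:

```python
import math
from collections import Counter

def mutual_information(a, b):
    """Mutual information (in bits) between two hard clusterings,
    given as equal-length label sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = 0.0
    for (x, y), c in pab.items():
        p_xy = c / n
        # p_xy * n * n / (count_x * count_y) equals p(x, y) / (p(x) p(y))
        mi += p_xy * math.log2(p_xy * n * n / (pa[x] * pb[y]))
    return mi

labels = [0, 0, 1, 1, 2, 2]
```

Identical clusterings share all their entropy (here log2(3) bits for three balanced clusters), while a constant clustering shares none; the paper's contribution is generalizing this kind of comparison from hard assignments to full probabilistic representation spaces.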
Thanh Sang Nguyen, Jooho Lee, Van Thuy Hoang, et al.
Graph representation learning models aim to encode the graph structure and its features as low-dimensional vectors in a latent space, which can benefit various downstream tasks, such as node classification and link prediction. Owing to their powerful graph data modelling capabilities, various graph embedding models and libraries have been proposed to learn embeddings and to make experimentation easier for researchers. In this paper, we introduce Connector, a novel graph representation framework covering various graph embedding models, ranging from shallow to state-of-the-art models. First, we consider graph generation by constructing various types of graphs with different structural relations, including homogeneous, signed, heterogeneous, and knowledge graphs. Second, we introduce various graph representation learning models, ranging from shallow to deep graph embedding models. Finally, we plan to build an efficient open-source framework that can provide deep graph embedding models to represent structural relations in graphs. The framework is available at https://github.com/NSLab-CUK/Connector.
Luciana Hartmann, Débora Cristina Sales da Cruz Vieira
This article starts from a narrative event that occurred among children aged 5 and 6 in an early childhood education (Educação Infantil) class at a public school in the Distrito Federal, in which characters such as God and Zé Pelinda (among others) were evoked, in order to foster a debate on how young children mobilize social markers of difference. Religious, ethnic-racial, and gender issues emerge from their narrative performances in an intersectional crossing of distinctions, which makes it possible to grasp how these markers, and the politics of fear that frequently impose them, operate in children's social lives from a very early age.
A considerable share of the literature on physics education and on education more broadly focuses on the principles which should guide the design of courses and of classroom activities. In this short article I wish to place more attention on the unplanned aspects of teaching: specifically, the spontaneous interactions that occur between instructors and students in settings like office hours, recitations, and when students ask questions during lecture. Because by their nature these interactions require thinking on one's feet, and depend upon the interplay between instructor and student, they share many characteristics with improvisational theater (improv). I document three foundational principles from improv literature (active listening, "yes-and," and the "button") and describe how they relate to established principles from physics education research. I provide examples of how each principle can be used to bring solid pedagogical practices to the unstructured space of instructor-student interactions.
Frances Babbage, Malaika Cunningham, Zelda Hannay, et al.
This co-authored article examines the ways in which The People’s Palace of Possibility, a live interactive installation by The Bare Project, was reinvented in the context of COVID-19 as a postal event engaging participants individually in their local areas. We show how the adapted dramaturgy sought to address the deep disruptions instilled by the pandemic, seeking to build a dynamic, reparative structure that could tend to a shattered and isolating present. Applying perspectives from adaptation, psychology, democratic theory and dramaturgy, we argue that home, neighbourhood and online environments afforded opportunities for individual and collective engagement with political ideas, generating multiple visions of utopia.
We provide the first nationally representative estimates of sexual minority representation in STEM fields by studying 142,641 men and women in same-sex couples from the 2009-2018 American Community Surveys. These data indicate that men in same-sex couples are 12 percentage points less likely to have completed a bachelor's degree in a STEM field compared to men in different-sex couples; there is no gap observed for women in same-sex couples compared to women in different-sex couples. The STEM gap between men in same-sex and different-sex couples is larger than the STEM gap between white and black men but is smaller than the gender STEM gap. We also document a gap in STEM occupations between men in same-sex and different-sex couples, and we replicate this finding using independently drawn data from the 2013-2018 National Health Interview Surveys. These differences persist after controlling for demographic characteristics, location, and fertility. Our findings further the call for interventions designed to increase the representation of sexual minorities in STEM.
We conduct the first study of its kind to generate and evaluate vector representations for chess pieces. In particular, we uncover the latent structure of chess pieces and moves, as well as predict chess moves from chess positions. We share preliminary results which anticipate our ongoing work on a neural network architecture that learns these embeddings directly from supervised feedback.
Roberto Alonge's contribution takes note of the fact that Pirandello's dramaturgy has by now become a fixture of Italian theatre programming, and examines three Pirandello productions: Il piacere dell'onestà, Così è (se vi pare), and Sei personaggi in cerca d'autore. The first is a solid production, built on the consummate skill of a distinguished leading actor, Geppy Gleijeses. The second, directed by Filippo Dini, has the ambition of opening new interpretive keys to a much-studied text. In certain respects Dini acknowledges his debt to the celebrated staging by Massimo Castri, which for the moment remains the most convincing testimony to an original reading of the text, founded on the hypothesis that incest between father and daughter is the secret hidden in the play's deepest layer. Finally, the Sei personaggi of Spiro Scimone and Francesco Sframeli, presented under the title Sei, is a very interesting example of dramaturgical adaptation. Pirandello's famous play is drastically pruned, reduced to a performance of little more than an hour and in many places rewritten, often in a personal way but never loosely or gratuitously, because the two Messina theatre-makers are artists of real worth.