mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer
Linting Xue, Noah Constant, Adam Roberts
et al.
The recent “Text-to-Text Transfer Transformer” (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, we introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. We detail the design and modified training of mT5 and demonstrate its state-of-the-art performance on many multilingual benchmarks. We also describe a simple technique to prevent “accidental translation” in the zero-shot setting, where a generative model chooses to (partially) translate its prediction into the wrong language. All of the code and model checkpoints used in this work are publicly available.
3093 citations
en
Computer Science
Tracking Objects as Points
Xingyi Zhou, V. Koltun, Philipp Krähenbühl
Tracking has traditionally been the art of following interest points through space and time. This changed with the rise of powerful deep networks. Nowadays, tracking is dominated by pipelines that perform object detection followed by temporal association, also known as tracking-by-detection. In this paper, we present a simultaneous detection and tracking algorithm that is simpler, faster, and more accurate than the state of the art. Our tracker, CenterTrack, applies a detection model to a pair of images and detections from the prior frame. Given this minimal input, CenterTrack localizes objects and predicts their associations with the previous frame. That's it. CenterTrack is simple, online (no peeking into the future), and real-time. It achieves 67.3% MOTA on the MOT17 challenge at 22 FPS and 89.4% MOTA on the KITTI tracking benchmark at 15 FPS, setting a new state of the art on both datasets. CenterTrack is easily extended to monocular 3D tracking by regressing additional 3D attributes. Using monocular video input, it achieves 28.3% AMOTA@0.2 on the newly released nuScenes 3D tracking benchmark, substantially outperforming the monocular baseline on this benchmark while running at 28 FPS.
1303 citations
en
Computer Science
Contrastive Multi-View Representation Learning on Graphs
Kaveh Hassani, Amir Hosein Khas Ahmadi
We introduce a self-supervised approach for learning node and graph level representations by contrasting structural views of graphs. We show that unlike visual representation learning, increasing the number of views to more than two or contrasting multi-scale encodings does not improve performance, and the best performance is achieved by contrasting encodings from first-order neighbors and a graph diffusion. We achieve new state-of-the-art results in self-supervised learning on 8 out of 8 node and graph classification benchmarks under the linear evaluation protocol. For example, on Cora (node) and Reddit-Binary (graph) classification benchmarks, we achieve 86.8% and 84.5% accuracy, which are 5.5% and 2.4% relative improvements over previous state-of-the-art. When compared to supervised baselines, our approach outperforms them in 4 out of 8 benchmarks. Source code has been released.
1640 citations
en
Computer Science, Mathematics
AraBERT: Transformer-based Model for Arabic Language Understanding
Wissam Antoun, Fady Baly, Hazem M. Hajj
The Arabic language is a morphologically rich language with relatively few resources and a less explored syntax compared to English. Given these limitations, Arabic Natural Language Processing (NLP) tasks like Sentiment Analysis (SA), Named Entity Recognition (NER), and Question Answering (QA), have proven to be very challenging to tackle. Recently, with the surge of transformers based models, language-specific BERT based models have proven to be very efficient at language understanding, provided they are pre-trained on a very large corpus. Such models were able to set new standards and achieve state-of-the-art results for most NLP tasks. In this paper, we pre-trained BERT specifically for the Arabic language in the pursuit of achieving the same success that BERT did for the English language. The performance of AraBERT is compared to multilingual BERT from Google and other state-of-the-art approaches. The results showed that the newly developed AraBERT achieved state-of-the-art performance on most tested Arabic NLP tasks. The pretrained araBERT models are publicly available on https://github.com/aub-mind/araBERT hoping to encourage research and applications for Arabic NLP.
1301 citations
en
Computer Science
Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences
Alexander Rives, Siddharth Goyal, Joshua Meier
et al.
Significance: Learning biological properties from sequence data is a logical step toward generative and predictive artificial intelligence for biology. Here, we propose scaling a deep contextual language model with unsupervised learning to sequences spanning evolutionary diversity. We find that without prior knowledge, information emerges in the learned representations on fundamental properties of proteins such as secondary structure, contacts, and biological activity. We show the learned representations are useful across benchmarks for remote homology detection, prediction of secondary structure, long-range residue–residue contacts, and mutational effect. Unsupervised representation learning enables state-of-the-art supervised prediction of mutational effect and secondary structure and improves state-of-the-art features for long-range contact prediction.
In the field of artificial intelligence, a combination of scale in data and model capacity enabled by unsupervised learning has led to major advances in representation learning and statistical generation. In the life sciences, the anticipated growth of sequencing promises unprecedented data on natural sequence diversity. Protein language modeling at the scale of evolution is a logical step toward predictive and generative artificial intelligence for biology. To this end, we use unsupervised learning to train a deep contextual language model on 86 billion amino acids across 250 million protein sequences spanning evolutionary diversity. The resulting model contains information about biological properties in its representations. The representations are learned from sequence data alone. The learned representation space has a multiscale organization reflecting structure from the level of biochemical properties of amino acids to remote homology of proteins. Information about secondary and tertiary structure is encoded in the representations and can be identified by linear projections. Representation learning produces features that generalize across a range of applications, enabling state-of-the-art supervised prediction of mutational effect and secondary structure and improving state-of-the-art features for long-range contact prediction.
2914 citations
en
Biology, Computer Science
ResUNet-a: a deep learning framework for semantic segmentation of remotely sensed data
F. Diakogiannis, F. Waldner, P. Caccetta
et al.
Scene understanding of high resolution aerial images is of great importance for the task of automated monitoring in various remote sensing applications. Due to the large within-class and small between-class variance in pixel values of objects of interest, this remains a challenging task. In recent years, deep convolutional neural networks have started being used in remote sensing applications and demonstrate state of the art performance for pixel level classification of objects. Here we present a novel deep learning architecture, ResUNet-a, that combines ideas from various state of the art modules used in computer vision for semantic segmentation tasks. We analyse the performance of several flavours of the Generalized Dice loss for semantic segmentation, and we introduce a novel variant loss function for semantic segmentation of objects that has better convergence properties and behaves well even under the presence of highly imbalanced classes. The performance of our modeling framework is evaluated on the ISPRS 2D Potsdam dataset. Results show state-of-the-art performance with an average F1 score of 92.9% over all classes for our best model.
1793 citations
en
Computer Science
Ensemble learning: A survey
Omer Sagi, L. Rokach
2886 citations
en
Computer Science
Resnet in Resnet: Generalizing Residual Architectures
S. Targ, Diogo Almeida, Kevin Lyman
Residual networks (ResNets) have recently achieved state-of-the-art on challenging computer vision tasks. We introduce Resnet in Resnet (RiR): a deep dual-stream architecture that generalizes ResNets and standard CNNs and is easily implemented with no computational overhead. RiR consistently improves performance over ResNets, outperforms architectures with similar amounts of augmentation on CIFAR-10, and establishes a new state-of-the-art on CIFAR-100.
1047 citations
en
Computer Science, Mathematics
Introduction to the recommendations from the National Institute on Aging‐Alzheimer's Association workgroups on diagnostic guidelines for Alzheimer's disease
Clifford R. Jack, M. Albert, D. Knopman
et al.
The FERET database and evaluation procedure for face-recognition algorithms
P. Phillips, H. Wechsler, Jeff Huang
et al.
2604 citations
en
Computer Science
A model of aesthetic appreciation and aesthetic judgments.
H. Leder, B. Belke, A. Oeberst
et al.
1734 citations
en
Psychology, Medicine
Truth and Method
H. Gadamer, J. Weinsheimer, D. Marshall
7486 citations
en
Philosophy
The Principles of Art
A. McMahon, R. Collingwood
867 citations
en
Psychology, Art
Toward Emotional Design Feature Evaluation in Craft Development: Understanding the Top-Selling Chinese and Japanese Ceramic Teapots
Xianghui Li, Takaya Yuizono, Van-Nam Huynh
et al.
Crafts are products with both functional and aesthetic properties. Through emotional design, we can enhance not only the functionality of craft products but also their aesthetic appeal. This study focuses on ceramic teapots as a case study, selecting 20 samples from top-selling Chinese and Japanese ceramic teapots on JD.com. Employing machine learning methods, this study explores the relationship between the design features of ceramic teapots and the emotional preferences of young consumers. A classification model was developed, and new teapots were designed for validation. The main findings are as follows: 1) The shape of the teapot lid, spout, handle, and body, as well as the color, decoration, and usability of the teapot, significantly influence young people’s emotional preferences. 2) Model evaluation revealed that the accuracies of the binary and ternary classification models established through random forest reached 91.0% and 81.3%, respectively. 3) The proportion of newly designed teapots that aligned with young people’s emotional preferences reached 91.1% (binary classification) and 64.5% (ternary classification), which are higher than those for the original 20 teapots. In conclusion, this study demonstrates that the relationship between the perceived image and design features of top-selling Chinese and Japanese ceramic teapots can effectively guide the design of ceramic teapot forms and successfully convey the intended image. The established model can be used to evaluate the emotional quality of newly designed teapots, providing data-driven support for designers to create ceramic teapots that resonate with market preferences.
Electrical engineering. Electronics. Nuclear engineering
How (not) to spread Communist propaganda behind the Iron Curtain: exhibitions of Polish folk art in Paris, Brussels, and Amsterdam in the late 1940s
Michał Wenderski
This paper explores the 1948–1949 exhibitions showcasing Polish folk art, initially held in Poland and then sent westwards: to Paris, Brussels, and Amsterdam. They were organised by Polish Communist authorities under the ‘Recovered Territories’ campaign that aimed to assert Poland’s historical and ethnographic ties to its new territories that before World War II had belonged to Germany. This propagandist effort sought to reinforce the post-war European status quo, with the exhibitions in question serving as an important element of the campaign. This case study constitutes therefore one of the earliest manifestations of the East-West Cultural Cold War. Analysing exhibition catalogues, press reviews, and archival documents, this study investigates the extent to which the propagandist objectives were achieved in the West. Additionally, it examines how French, Belgian, and Dutch representatives of the field of culture responded to Communist propaganda and mitigated its effects.
Fine Arts, Arts in general
A Pad-Focused PCB Routing Algorithm Using Polygon-Based Dynamic Partitioning
Youbiao He, Hebi Li, Ge Luo
et al.
Routing plays a pivotal role in the design of printed circuit boards (PCBs). Existing automated routers typically tackle the routing problem by dividing it into two separate phases: escape routing and area routing. However, this approach often leads to suboptimal solutions or even the absence of solutions when transitioning from escape routing to area routing. In this paper, we propose a novel pad-focused, net-by-net, two-stage PCB routing approach comprising a Monte Carlo tree search (MCTS)-based global routing stage, followed by an A*-based detailed routing stage. To bridge the gap between the global and detailed routing stages, we introduce a polygon-based dynamic routable region partitioning mechanism, ensuring that a detailed routing solution exists when a global routing solution is present. Experimental results demonstrate that our approach outperforms state-of-the-art routers in terms of the success rate and total wirelength on the test set.
Electrical engineering. Electronics. Nuclear engineering
Boundary-Aware Transformer for Optic Cup and Disc Segmentation in Fundus Images
Soohyun Wang, Byoungkug Kim, Doo-Seop Eom
Segmentation of the Optic Disc (OD) and Optic Cup (OC) boundaries in fundus images is a critical step for early glaucoma diagnosis, but accurate segmentation is challenging due to low boundary contrast and significant anatomical variability. To address these challenges, this study proposes a novel segmentation framework that integrates structure-preserving data augmentation, Boundary-aware Transformer Attention (BAT), and Geometry-aware Loss. We enhance data diversity while preserving vascular and tissue structures through truncated Gaussian-based sampling and colormap transformations. BAT strengthens boundary recognition by globally learning the inclusion relationship between the OD and OC within the skip connection paths of U-Net. Additionally, Geometry-aware Loss, which combines the normalized Hausdorff Distance with the Dice Loss, reduces fine-grained boundary errors and improves boundary precision. The proposed model outperforms existing state-of-the-art models across five public datasets—DRIONS-DB, Drishti-GS, REFUGE, G1020, and ORIGA—and achieves Dice scores of 0.9127 on Drishti-GS and 0.9014 on REFUGE for OC segmentation. For joint segmentation of the OD and OC, it attains high Dice scores of 0.9892 on REFUGE, 0.9782 on G1020, and 0.9879 on ORIGA. Ablation studies validate the independent contributions of each component and demonstrate their synergistic effect when combined. Furthermore, the proposed model more accurately captures the relative size and spatial alignment of the OD and OC and produces smooth and consistent boundary predictions in clinically significant regions such as the region of interest (ROI). These results support the clinical applicability of the proposed method in medical image analysis tasks requiring precise, boundary-focused segmentation.
Technology, Engineering (General). Civil engineering (General)
Oceanic states of consciousness—an existential-neuroscience perspective
Human-Friedrich Unterrainer
et al.
Oceanic states of consciousness—characterized by ego dissolution, unity, and timelessness—have long occupied a liminal space between psychopathology and transcendence. This paper explores these states through the interdisciplinary lens of existential neuroscience, integrating insights from psychoanalysis, existentialism, affective neuroscience, and psychedelic research. Starting with the psychoanalytic tension between Freud’s view of the oceanic feeling as a regressive illusion and Jung’s framing of it as a transformative encounter with the unconscious, this paper examines how creative and mystical experiences often arise from this dissolution of self-boundaries. Drawing on art theorist Anton Ehrenzweig and examples from figures like Vincent van Gogh and Antonin Artaud, I highlight how oceanic states may catalyze both visionary insight and psychological disintegration. Neuroscientific models, including the REBUS theory and studies of the Default Mode Network (DMN), suggest that ego dissolution reflects a flexible reorganization of brain function rather than dysfunction. The Peri-Aqueductal Gray (PAG), a midbrain structure associated with affect regulation and spiritual experience, emerges as a key neural hub linking primal affective states with mystical awareness. Existential thinkers such as Sartre, Heidegger, and Merleau-Ponty provide a philosophical framework for interpreting these phenomena as moments of existential rupture and potential authenticity. Oceanic states thus challenge conventional notions of the self as fixed and bounded. Rather than categorizing them as pathological or purely mystical, it is proposed here that these states represent affectively charged boundary experiences - ones that require contextual integration and offer deep insight into the nature of selfhood, meaning, and transformation.
Neurosciences. Biological psychiatry. Neuropsychiatry
Understanding Comics: The Invisible Art
A. D. Manning
612 citations
en
Computer Science
Machine eye for defects: Machine learning-based solution to identify and characterize topological defects in textured images of nematic materials
Haijie Ren, Weiqiang Wang, Wentao Tang
et al.
Topological defects play a key role in the structures and dynamics of liquid crystals and other ordered systems. There is a recent interest in studying defects in different biological systems with distinct textures. However, a robust method to directly recognize defects and extract their structural features from various traditional and nontraditional nematic systems remains challenging to date. Here we present a machine learning solution, termed machine eye for defects (MED), for automated defect analysis in images with diverse nematic textures. MED seamlessly integrates state-of-the-art object detection networks, segment anything model, and vision transformer algorithms with tailored computer vision techniques. We show that MED can accurately identify the positions, winding numbers, and orientations of ±1/2 defects across distinct cellular contours, sparse vector fields of nematic directors, actin filaments, microtubules, and simulation images of Gay-Berne particles. MED performs faster than conventional defect detection methods and can achieve over 90% accuracy on recognizing ±1/2 defects and their orientations from vector fields and experimental tissue images. We further demonstrate that MED can identify defect types that are not included in the training data, such as giant-core defects and defects with higher winding numbers. Remarkably, MED provides correct structural information about ±1 defects, i.e., the phase angle for +1 defects and the orientation angle for −1 defects. As such, MED stands poised to transform studies of diverse ordered systems by providing automated, rapid, accurate, and insightful defect analysis.