We propose a generalization of the transformer neural network architecture to arbitrary graphs. The original transformer was designed for Natural Language Processing (NLP) and operates on fully connected graphs representing all connections between the words in a sequence. Such an architecture does not leverage the graph-connectivity inductive bias and can perform poorly when the graph topology is important and has not been encoded into the node features. We introduce a graph transformer with four new properties compared to the standard model. First, the attention mechanism is a function of the neighborhood connectivity for each node in the graph. Second, the positional encoding is represented by the Laplacian eigenvectors, which naturally generalize the sinusoidal positional encodings often used in NLP. Third, the layer normalization is replaced by a batch normalization layer, which provides faster training and better generalization performance. Finally, the architecture is extended to edge feature representation, which can be critical to tasks such as chemistry (bond type) or link prediction (entity relationships in knowledge graphs). Numerical experiments on a graph benchmark demonstrate the performance of the proposed graph transformer architecture. This work closes the gap between the original transformer, which was designed for the limited case of line graphs, and graph neural networks, which can work with arbitrary graphs. As our architecture is simple and generic, we believe it can be used as a black box for future applications that wish to combine transformers and graphs.
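The Laplacian-eigenvector positional encoding mentioned above can be sketched in a few lines of numpy (a minimal illustration, not the paper's implementation; note that eigenvector signs are arbitrary, which the training procedure must account for):

```python
import numpy as np

def laplacian_pe(adj, k):
    """Return the k smallest non-trivial eigenvectors of the symmetric
    normalized graph Laplacian as node positional encodings.

    adj: (n, n) symmetric adjacency matrix; k: encoding dimension, k < n.
    """
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    # L = I - D^{-1/2} A D^{-1/2}
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(lap)   # eigenvalues in ascending order
    return eigvecs[:, 1:k + 1]               # drop the trivial constant eigenvector

# Example: a 4-node cycle graph
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
pe = laplacian_pe(A, 2)   # (4, 2) positional encoding, one row per node
```

These encodings would then be added to (or concatenated with) the input node features before the first transformer layer.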
Abstract SMART (Simple Modular Architecture Research Tool) is a web resource (https://smart.embl.de) for the identification and annotation of protein domains and the analysis of protein domain architectures. SMART version 9 contains manually curated models for more than 1300 protein domains, with a topical set of 68 new models added since our last update article (1). All the new models are for diverse recombinase families and subfamilies, and as a set they provide a comprehensive overview of mobile element recombinases, namely transposases, integrases, relaxases, resolvases, Cas1 casposases and Xer-like cellular recombinases. Further updates include the synchronization of the underlying protein databases with UniProt (2), Ensembl (3) and STRING (4), greatly increasing the total number of annotated domains and other protein features available in architecture analysis mode. Furthermore, SMART’s vector-based protein display engine has been extended and updated to use the latest web technologies, and the domain architecture analysis components have been optimized to handle the increased number of protein features available.
Abstract Historically, Industrial Automation and Control Systems (IACS) were largely isolated from conventional digital networks such as enterprise ICT environments. Where connectivity was required, a zoned architecture was adopted, with firewalls and/or demilitarized zones used to protect the core control system components. The adoption and deployment of ‘Internet of Things’ (IoT) technologies is leading to architectural changes to IACS, including greater connectivity to industrial systems. This paper reviews what is meant by Industrial IoT (IIoT) and relationships to concepts such as cyber-physical systems and Industry 4.0. The paper develops a definition of IIoT and analyses related partial IoT taxonomies. It develops an analysis framework for IIoT that can be used to enumerate and characterise IIoT devices when studying system architectures and analysing security threats and vulnerabilities. The paper concludes by identifying some gaps in the literature.
Abstract The Internet of Things is a platform where everyday devices become smarter, everyday processing becomes intelligent, and everyday communication becomes informative. While the Internet of Things is still taking shape, its effects have already begun to make incredible strides as a universal solution medium for connected scenarios. Architecture-specific study paves the way for understanding a field, and the current lack of overall architectural knowledge prevents researchers from grasping the full scope of Internet of Things centric approaches. This literature survey covers Internet of Things oriented architectures that can improve the understanding of related tools, technologies, and methodologies to facilitate developers’ requirements. Directly or indirectly, the presented architectures aim to solve real-life problems through the building and deployment of powerful Internet of Things notions. Further, research challenges are investigated to address the lacunae in current architectural trends and to motivate academia and industry to seek ways of tapping the full power of the Internet of Things. A main contribution of this survey is its systematic summary of the current state of the art of Internet of Things architectures across various domains.
SMART (Simple Modular Architecture Research Tool) is an online resource (http://smart.embl.de/) for the identification and annotation of protein domains and the analysis of protein domain architectures. SMART version 7 contains manually curated models for 1009 protein domains, 200 more than in the previous version. The current release introduces several novel features and a streamlined user interface, resulting in a faster and more comfortable workflow. The underlying protein databases were greatly expanded, resulting in a 2-fold increase in the number of annotated domains and features. The database of completely sequenced genomes now includes 1133 species, compared to 630 in the previous release. Domain architecture analysis results can now be exported and visualized through the iTOL phylogenetic tree viewer. ‘metaSMART’ was introduced as a novel subresource dedicated to the exploration and analysis of domain architectures in various metagenomics data sets. An advanced full text search engine was implemented, covering the complete annotations for SMART and Pfam domains, as well as the complete set of protein descriptions, allowing users to quickly find relevant information.
Self-attention in Transformers generates dynamic operands that force conventional Compute-in-Memory (CIM) accelerators into costly non-volatile memory (NVM) reprogramming cycles, degrading throughput and stressing device endurance. Existing solutions either reduce but retain NVM writes through matrix decomposition or sparsity, or move attention computation to digital CMOS at the expense of NVM density. We present TrilinearCIM, a Double-Gate FeFET (DG-FeFET)-based architecture that uses back-gate modulation to realize a three-operand multiply-accumulate primitive for in-memory attention computation without dynamic ferroelectric reprogramming. Evaluated on BERT-base (GLUE) and ViT-base (ImageNet and CIFAR), TrilinearCIM outperforms conventional CIM on seven of nine GLUE tasks while achieving up to 46.6% energy reduction and 20.4% latency improvement over conventional FeFET CIM at 37.3% area overhead. To our knowledge, this is the first architecture to perform complete Transformer attention computation exclusively in NVM cores without runtime reprogramming.
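Functionally, a three-operand multiply-accumulate primitive can be illustrated as follows (a hedged sketch of the arithmetic only, not the device-level behavior; the variable roles are assumptions based on the abstract):

```python
import numpy as np

def trilinear_mac(q, k, g):
    """Three-operand multiply-accumulate: sum_i q[i] * k[i] * g[i].

    Sketch of an in-memory trilinear primitive: g stands in for a static
    stored conductance, while q and k are the two dynamic operands (one of
    them applied via back-gate modulation in the DG-FeFET cell). Because both
    dynamic operands are applied electrically, an attention score q.k can be
    accumulated without reprogramming the stored array.
    """
    return float(np.sum(q * k * g))

q = np.array([1.0, 2.0, 3.0])
k = np.array([4.0, 5.0, 6.0])
g = np.ones(3)
# With g = 1 the primitive reduces to a plain dot product q @ k
assert trilinear_mac(q, k, g) == q @ k
```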
Abstract Despite the ubiquity of recurrent connections in the brain, their role in visual processing is less understood than that of feedforward connections. Occluded object recognition, an important cognitive capacity, is thought to rely on recurrent processing of visual information, but it remains unclear whether and how recurrent processing improves recognition of occluded objects. Using convolutional models of the visual system, we demonstrate how a distinct form of computation arises in recurrent, but not feedforward, networks that leverages information about the occluder to “explain away” the occlusion: recognition of the occluder provides an account for missing or altered features, potentially rescuing recognition of occluded objects. This occurs without any constraint placed on the computation and is observed both across a systematic architecture sweep of convolutional models and in a model explicitly constructed to approximate the primate visual system. In line with these results, we find evidence consistent with explaining away in a human psychophysics experiment. Finally, we developed an experimentally inspired recurrent model that recovers fine-grained features of occluded stimuli by explaining away. Recurrent connections’ capability to explain away may extend to more general cases where undoing context-dependent changes in representations benefits perception.
This systematic literature review investigates the factors influencing satisfaction with indoor environmental quality (SIEQ) in vernacular architecture, with particular attention to sustainability and sociocultural contexts. Drawing on 105 peer-reviewed studies published over the past two decades, the analysis employed thematic synthesis and cluster analysis to identify key design features, theoretical underpinnings, and variables affecting occupant satisfaction. Five major theories emerged, with Sustainability Theory, Bioclimatic Architecture Theory, and Ecological Systems Theory most frequently applied. Cluster analysis of 62 variables produced eight thematic categories, offering a structured basis for hypothesis development and integrative model formulation. Findings reveal that vernacular design features such as courtyards, shading devices, and materiality strongly contribute to SIEQ, while contemporary transitions risk diminishing comfort. The review also identifies critical research gaps, including limited empirical validation, methodological inconsistencies, evaluation voids, and the underuse of theory in explaining outcomes, and proposes integrative directions for architects and policymakers.
Flexibility and customization are key strengths of Field-Programmable Gate Arrays (FPGAs) when compared to other computing devices. For instance, FPGAs can efficiently implement arbitrary-precision arithmetic operations, and can perform aggressive synthesis optimizations to eliminate ineffectual operations. Motivated by sparsity and mixed-precision in deep neural networks (DNNs), we investigate how to optimize the current logic block architecture to increase its arithmetic density. We find that modern FPGA logic block architectures prevent the independent use of adder chains, and instead only allow adder chain inputs to be fed by look-up table (LUT) outputs. This allows only one of the two primitives -- either adders or LUTs -- to be used independently in one logic element and prevents their concurrent use, hampering area optimizations. In this work, we propose the Double Duty logic block architecture to enable the concurrent use of the adders and LUTs within a logic element. Without adding expensive logic cluster inputs, we use 4 of the existing inputs to bypass the LUTs and connect directly to the adder chain inputs. We accurately model our changes at both the circuit and CAD levels using open-source FPGA development tools. Our experimental evaluation on a Stratix-10-like architecture demonstrates area reductions of 21.6% on adder-intensive circuits from the Kratos benchmarks, and 9.3% and 8.2% on the more general Koios and VTR benchmarks respectively. These area improvements come without impacting critical path delay, demonstrating that higher density is feasible on modern FPGA architectures by adding more flexibility in how the adder chain is used. Averaged across all circuits from our three evaluated benchmark sets, our Double Duty FPGA architecture improves area-delay product by 9.7%.
Power efficiency is a critical design objective in modern CPU design. Architects need a fast yet accurate architecture-level power evaluation tool to perform early-stage power estimation. However, traditional analytical architecture-level power models are inaccurate, and the recently proposed machine learning (ML)-based architecture-level power models require sufficient training data from known configurations, which is often unrealistic in practice. In this work, we propose AutoPower, targeting fully automated architecture-level power modeling with limited known design configurations. We have two key observations: (1) the clock and SRAM dominate the power consumption of the processor, and (2) the clock and SRAM power correlate with structural information available at the architecture level. Based on these two observations, we propose power group decoupling in AutoPower. First, AutoPower decouples across power groups to build individual power models for each group. Second, AutoPower further decouples the model into multiple sub-models within each power group. In our experiments, AutoPower achieves a low mean absolute percentage error (MAPE) of 4.36% and a high $R^2$ of 0.96 even with only two known configurations for training. This is 5% lower in MAPE and 0.09 higher in $R^2$ compared with McPAT-Calib, the representative ML-based power model.
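The power-group decoupling idea, separate lightweight sub-models per group driven by architecture-level structural features, can be sketched as follows (a minimal least-squares illustration under assumed feature names; this is not AutoPower's actual model):

```python
import numpy as np

def fit_group_models(features, group_power):
    """Fit one linear least-squares sub-model per power group.

    features:    dict group -> (n_configs, n_feats) structural features
    group_power: dict group -> (n_configs,) measured power for that group
    Returns:     dict group -> coefficient vector (last entry is the bias)
    """
    models = {}
    for grp, X in features.items():
        Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
        coef, *_ = np.linalg.lstsq(Xb, group_power[grp], rcond=None)
        models[grp] = coef
    return models

def predict_total(models, features):
    """Total power estimate = sum of the per-group sub-model predictions."""
    total = 0.0
    for grp, coef in models.items():
        Xb = np.hstack([features[grp], np.ones((features[grp].shape[0], 1))])
        total = total + Xb @ coef
    return total

# Hypothetical example: a "clock" group driven by two structural features
# and an "sram" group driven by one (e.g., total SRAM capacity).
feats = {"clock": np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 5.0]]),
         "sram":  np.array([[4.0], [5.0], [7.0]])}
power = {"clock": np.array([3.0, 5.0, 8.0]),
         "sram":  np.array([8.0, 10.0, 14.0])}
models = fit_group_models(feats, power)
pred = predict_total(models, feats)
```

Decoupling matters because each small sub-model has few parameters, so it can be fit from only a handful of known configurations, consistent with the two-configuration setting described above.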
Thermoelastic damping (TED) is a primary source of energy dissipation in oscillating structures. The classical TED model is inadequate for micro- and nanoscale structures due to small-scale effects. In this paper, TED in transversely isotropic piezoelectric micro- and nanobeams is investigated using the Lord-Shulman heat conduction model and the Mindlin Form II strain gradient theory of elasticity with the micro-inertia effect. To account for small-scale effects, the Helmholtz free energy density is defined as a function of the strain tensor, strain gradient tensor, electric field vector, and temperature field, considering these parameters as independent state variables. The kinetic energy depends on both the velocity vector and the velocity gradient tensor. As an illustrative example, the free vibration of simply supported piezothermoelastic Euler-Bernoulli beams with isothermal boundary conditions is analyzed. The governing equations and boundary conditions are derived using a variational formulation. The results are compared with those obtained from the classical TED model. The influence of beam thickness, thermal relaxation time, vibration mode number, initial temperature, and micro-stiffness and micro-inertia length-scale coefficients on the TED in piezoelectric microbeam resonators is discussed. This study is of practical significance for the design of high-efficiency thermoelastic devices and systems.
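The state-variable assumptions described above can be summarized symbolically (a sketch using assumed notation; the paper's exact symbols and functional forms may differ):

```latex
% Helmholtz free energy density as a function of the independent state
% variables: strain, strain gradient, electric field, and temperature
\psi = \psi\!\left(\varepsilon_{ij},\; \eta_{ijk} = \partial_k \varepsilon_{ij},\; E_i,\; T\right)

% Kinetic energy density including the micro-inertia (velocity gradient)
% term, with l_m an assumed micro-inertia length-scale coefficient
K = \tfrac{1}{2}\,\rho\,\dot{u}_i \dot{u}_i
  + \tfrac{1}{2}\,\rho\, l_m^{2}\, \partial_j \dot{u}_i\, \partial_j \dot{u}_i
```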