We present ControlNet, a neural network architecture to add spatial conditioning controls to large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large diffusion models, and reuses their deep and robust encoding layers pretrained with billions of images as a strong backbone to learn a diverse set of conditional controls. The neural architecture is connected with "zero convolutions" (zero-initialized convolution layers) that progressively grow the parameters from zero and ensure that no harmful noise could affect the finetuning. We test various conditioning controls, e.g., edges, depth, segmentation, human pose, etc., with Stable Diffusion, using single or multiple conditions, with or without prompts. We show that the training of ControlNets is robust with small (<50k) and large (>1m) datasets. Extensive results show that ControlNet may facilitate wider applications to control image diffusion models.
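The role of the zero-initialized convolutions can be illustrated with a small sketch. It assumes the commonly described formulation in which a frozen block's output is summed with a trainable copy whose input and output each pass through a zero convolution; the helper names (`conv1x1`, `locked_block`, `trainable_copy`, `controlled_block`) and the toy functions standing in for network blocks are hypothetical, not taken from the authors' code. At initialization the added branch contributes exactly zero, so the combined output matches the locked model.

```python
# Minimal sketch of a "zero convolution": a 1x1 convolution whose weights and
# bias start at zero. All names and toy blocks are illustrative assumptions.

def conv1x1(x, weight, bias):
    """Apply a 1x1 convolution to x, a list of per-pixel channel vectors."""
    return [
        [sum(w * v for w, v in zip(row, pixel)) + b for row, b in zip(weight, bias)]
        for pixel in x
    ]

def zeros(rows, cols):
    """Return a rows x cols matrix of zeros."""
    return [[0.0] * cols for _ in range(rows)]

def add(a, b):
    """Element-wise sum of two feature maps with the same shape."""
    return [[u + v for u, v in zip(p, q)] for p, q in zip(a, b)]

def locked_block(x):
    """Stand-in for a frozen pretrained block (hypothetical fixed function)."""
    return [[2.0 * v + 1.0 for v in pixel] for pixel in x]

def trainable_copy(x):
    """Stand-in for the trainable copy of that block (hypothetical)."""
    return [[3.0 * v - 0.5 for v in pixel] for pixel in x]

def controlled_block(x, cond, w_in, b_in, w_out, b_out):
    """y = locked(x) + conv_out(copy(x + conv_in(cond)))."""
    injected = add(x, conv1x1(cond, w_in, b_in))
    residual = conv1x1(trainable_copy(injected), w_out, b_out)
    return add(locked_block(x), residual)

channels = 2
x = [[0.5, -1.0], [2.0, 0.25]]    # two "pixels", two channels each
cond = [[1.0, 1.0], [0.0, 3.0]]   # conditioning features, same shape

# At initialization both zero convolutions have all-zero parameters.
w_zero, b_zero = zeros(channels, channels), [0.0] * channels
y_init = controlled_block(x, cond, w_zero, b_zero, w_zero, b_zero)

# After some (pretend) training the output convolution is no longer zero.
w_trained = [[0.1, 0.0], [0.0, 0.1]]
y_trained = controlled_block(x, cond, w_zero, b_zero, w_trained, b_zero)

print(y_init == locked_block(x))     # the control branch is silent at init
print(y_trained == locked_block(x))  # and contributes once weights grow
```

Because the branch starts as an exact no-op, finetuning begins from the pretrained model's behavior, and the control signal only takes effect as the zero-convolution parameters move away from zero.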
Foundation models, now powering most of the exciting applications in deep learning, are almost universally based on the Transformer architecture and its core attention module. Many subquadratic-time architectures such as linear attention, gated convolution and recurrent models, and structured state space models (SSMs) have been developed to address Transformers' computational inefficiency on long sequences, but they have not performed as well as attention on important modalities such as language. We identify that a key weakness of such models is their inability to perform content-based reasoning, and make several improvements. First, simply letting the SSM parameters be functions of the input addresses their weakness with discrete modalities, allowing the model to selectively propagate or forget information along the sequence length dimension depending on the current token. Second, even though this change prevents the use of efficient convolutions, we design a hardware-aware parallel algorithm in recurrent mode. We integrate these selective SSMs into a simplified end-to-end neural network architecture without attention or even MLP blocks (Mamba). Mamba enjoys fast inference (5$\times$ higher throughput than Transformers) and linear scaling in sequence length, and its performance improves on real data up to million-length sequences. As a general sequence model backbone, Mamba achieves state-of-the-art performance across several modalities such as language, audio, and genomics. On language modeling, our Mamba-3B model outperforms Transformers of the same size and matches Transformers twice its size, both in pretraining and downstream evaluation.
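The selection mechanism described above can be sketched as a sequential recurrence. This is a simplified single-channel illustration, not the hardware-aware parallel algorithm: it assumes a diagonal state matrix `a`, a softplus-parameterized step size, and linear projections of the current token for the B and C parameters. The names `selective_scan`, `w_b`, `w_c`, `w_delta`, and `b_delta` are hypothetical.

```python
import math

def softplus(z):
    """Smooth positive function used here to keep the step size > 0."""
    return math.log1p(math.exp(z))

def selective_scan(xs, a, w_b, w_c, w_delta, b_delta):
    """Run a toy selective SSM over a scalar sequence xs.

    a        : diagonal state matrix entries (negative for stability)
    w_b, w_c : per-state weights making B and C functions of the input
    w_delta, b_delta : parameters of the input-dependent step size
    """
    h = [0.0] * len(a)                      # hidden state
    ys = []
    for x in xs:
        delta = softplus(w_delta * x + b_delta)   # step size depends on token
        b_t = [w * x for w in w_b]                # input-dependent B
        c_t = [w * x for w in w_c]                # input-dependent C
        # Discretized update: decay the old state, then write the new input.
        h = [
            math.exp(delta * a_i) * h_i + delta * b_i * x
            for a_i, h_i, b_i in zip(a, h, b_t)
        ]
        ys.append(sum(c_i * h_i for c_i, h_i in zip(c_t, h)))
    return ys

a = [-1.0, -0.5]          # two state dimensions
w_b = [1.0, 0.5]
w_c = [1.0, 1.0]
xs = [1.0, 0.0, 2.0]      # the zero token neither writes to nor reads the state
ys = selective_scan(xs, a, w_b, w_c, w_delta=1.0, b_delta=0.0)
print(ys)
```

The loop touches each token once, so the cost grows linearly with sequence length. Because the step size, B, and C all depend on the current token, the model can decide per token how much of the past state to keep and how much of the new input to store.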
Natanz, one of the oldest desert settlements in Iran, was a city whose spatial organization followed the garden-city model. It has lost its prosperity for several reasons, including being bypassed by the new freeway east of Isfahan, drought, climate change, and placeless industrial development driven solely by economic profit. Using library-based document study, supplemented by field visits and open interviews with experts, citizens, and city managers, this research examines why development has been incompatible with its host context. The findings show that sustainable and balanced industrial development is possible only if special attention is paid, as one of the main pillars of place-based development, to matching the scale of industrial development to the scale of the potentials and capacities of its context, including both the physical-material capacities of the context and the capacity of the local community's mindset to accept development; otherwise, development will inflict irreparable damage on its context. This is what led, in Natanz, to the destruction of the gardens and, consequently, of the spatial organization based on its garden-city structure.
Economic growth, development, planning, Ethnology. Social and cultural anthropology
Deep-learning generative AI promises to transform architectural design, yet its potential employment and ready-to-use capacity for professional workflows are unclear. This study presents a systematic review conducted in accordance with PRISMA 2020 guidelines, synthesizing peer-reviewed work from 2015 to 2025 to assess how GenAI methods align with architectural practice. A total of 1566 records were initially retrieved across databases, of which 42 studies met eligibility criteria after structured screening and selection. Each was evaluated using five indicators with a three-tier rubric: Output Representation Type, Pipeline Integration, Workflow Standardization, Tool Readiness, and Technical Skillset. Results show that most outputs are raster images or non-editable objects, with only a minority producing CAD/BIM-ready geometry. Workflow pipelines are often fragmented with manual hand-offs and most GenAI methods map only onto the early conceptual design stage. Prototypes frequently require bespoke coding and advanced expertise. These findings indicate a persistent gap between experimentation with ideation-oriented GenAI and the pragmatism of CAD/BIM-centered delivery. By framing the proposed rubric as a workflow maturity model, this review contributes a replicable benchmark for assessing practice readiness and identifying pathways toward mainstream adoption. For GenAI to move from prototypes to mainstream architectural design practice, it is essential to address not only technical barriers, but also cultural issues such as professional skepticism and reliability concerns, as well as ecosystem challenges of data sharing, authorship, and liability.
This article discusses the role of images in archaeological disciplines and the contribution that graphic sciences can make to research in this subject area. In archaeology, as in other fields, ‘visualization’ differs significantly from the more commonly used noun ‘representation.’ In this sense, archaeological visualization is a practice of reconstructing and understanding the past rather than documenting and representing only the material remains that have come down to us. From archaeological drawing to virtual reality, numerous techniques and tools from the graphic sciences are applied in archaeology. Some of these can now be counted among the discipline's own tools, while others fall outside the specific skills of the archaeologist and require interaction with the disciplines devoted to visualization and, thus, with the graphic sciences. To better understand the difference between visualization and representation in archaeology, the article uses the pre-Nuragic altar of Monte d’Accoddi as a case study, focusing on the creation of different graphic-visual products from the same model to demonstrate the role of different graphic artefacts.
The recognition of oracle bone script is of significant importance for understanding the evolution of Chinese characters, their morphological features, and semantic changes. However, traditional methods and some deep learning models have limited ability to capture the complex forms and fine details of oracle bone script, which makes it difficult to fully detect subtle differences between characters. Additionally, models trained on such data tend to struggle with recognizing rare or unseen characters, often leading to recognition errors. Therefore, improving the robustness of these models is essential. This paper presents a novel recognition algorithm based on YOLOv5, incorporating BiFPN-SDI, C3-DAttention, and Detect_Efficient to significantly enhance detection performance. BiFPN-SDI enables more precise feature fusion and attention mechanisms, improving the detection of small targets. C3-DAttention combines channel and spatial attention mechanisms to enhance feature extraction in deep convolutional neural networks. Detect_Efficient further improves the model’s detection and recognition capabilities. Experimental results show that the proposed improvements lead to a 0.7% increase in precision, a 1.1% increase in recall, and a 0.3% improvement in mAP@50. Furthermore, the model’s parameter count is reduced to 1,009,668, and its processing speed is increased to 90 fps, significantly improving the ability to extract and recognize features in oracle bone script.
Daylighting design is now an essential component of architectural design, in relation to environmental, qualitative, and perceptual goals. However, despite growing interest in natural light, significant gaps remain in education and in design practice, which is often tied to performance-driven or standardized approaches. The SOLARIA project (Sistema Operativo di supporto alle scelte progettuali per LA progettazione ARchitettonica basata sull’Intelligenza ArtificIAle, i.e., an operating system supporting design choices for architectural design based on artificial intelligence) was created to foster a more articulated design awareness through a decision support system (DSS) based on non-generative artificial intelligence. The system returns qualitative and contextual information, organized into two repertoires (strategies and devices), and guides the designer through the preliminary phase by means of a query-based interface. The goal is to restore the centrality of light as a design material, encouraging a critical and creative use of information and contributing to a design culture that is more sensitive, informed, and sustainable.
Mohammad Aldossary, Ibrahim Alzamil, Jaber Almutairi
Due to Internet of Drones (IoD) technology, drone networks have proliferated, transforming surveillance, logistics, and disaster management. Distributed Denial of Service (DDoS) attacks, malware infections, and communication abnormalities increase cybersecurity dangers to these networks, threatening operational safety and efficiency. Current Intrusion Detection Systems (IDSs) fail to handle the dynamic, high-dimensional nature of drone transmission data, resulting in inadequate real-time anomaly identification and mitigation. This study presents the Cross-Layer Convolutional Attention Network (CLCAN), a new IDS architecture for IoD networks. CLCAN accurately detects complex cyber threats using multi-scale convolutional processing, hierarchical contextual attention, and dynamic feature fusion. Preprocessing methods like weighted differential scaling and gradient-based adaptive resampling improve data quality and reduce class imbalances. Contextual attribute transformation captures the nuanced network behaviors needed for anomaly identification. Evaluations on a real-world drone communication dataset show that the proposed technique is both necessary and effective. CLCAN outperforms CNN, LSTM, and XGBoost with 98.4% accuracy, 98.7% recall, and 98.1% F1-score. The model achieves an AUC of 0.991. CLCAN can handle datasets of over 118,000 balanced data records in 85 s, compared to 180 s for comparable frameworks. This study pioneers a unified security solution for Drone-to-Drone (D2D) and Drone-to-Base Station (D2BS) communications, filling a crucial IoD security gap. It protects mission-critical drone operations from emerging cyber threats with a strong, efficient, and scalable IDS.
Huinan Kang, Yunsen Hu, Sakdirat Kaewunruen et al.
Geometric and mechanical analyses were performed on 82 selenium-rich eggs, which underwent hydrostatic testing as 2 raw eggs, 60 steamed eggs, and 20 emptied eggshells. By analyzing the geometric and mechanical properties of the egg, we can draw inspiration from its structural design to create a pressure shell capable of effectively withstanding the immense water pressure in deep-sea environments. The major axis, minor axis, egg-shape coefficient, weight, thickness, volume, surface area, and ultimate compressive strength were measured, and their correlations were analyzed. The thickness, egg-shape coefficient, and ultimate compressive strength were normally distributed, and many parameters were strongly correlated. Moreover, finite element analysis was conducted to evaluate the compressive resistance of egg-like pressure shells made from different materials, including metal, ceramic, resin, and selenium-enriched eggshell materials. The performance ratio of the ceramic shells was 2.6 times higher than that of eggshells, and eggshells outperformed metal and resin shells by factors of 2.14 and 4.49, respectively. The eggshells had excellent compression resistance. These findings offer novel insights into the design and optimization of egg-like pressure shells.
We investigated the determinants of awareness, utilization, and satisfaction regarding financial aid programs for single-person households in South Korea and proposed policy enhancements. Our analysis employed logistic regression on microdata from the “2020 Housing Survey” by Statistics Korea, covering the nation and all age groups. We categorized single-person household traits affecting program awareness, utilization, and satisfaction into demographic, socio-economic, housing, and housing perception factors. The dependent variables included awareness, utilization status, and satisfaction levels of government-sponsored financial support programs, which were measured on a four-point Likert scale. The independent variables encompassed demographic, socio-economic, and housing characteristics, which were analyzed comprehensively. We identified factors that influenced awareness, utilization, and satisfaction and recommended tailored policy measures. The findings revealed lower awareness among elderly individuals, women, rural residents, and rental households. Moreover, older age, lower income, rental, and one-room dwelling households exhibited lower utilization rates, with decreased housing and residential environment satisfaction correlating with diminished program satisfaction. Due to the diverse characteristics of single-person households, strategic interventions are crucial. Measures to bridge information gaps, establish comprehensive long-term support systems, and develop differentiated policies tailored to single-person household traits are imperative for improving financial aid programs for this demographic.