Contrastive Multiview Coding
Yonglong Tian, Dilip Krishnan, Phillip Isola
Humans view the world through many sensory channels, e.g., the long-wavelength light channel, viewed by the left eye, or the high-frequency vibrations channel, heard by the right ear. Each view is noisy and incomplete, but important factors, such as physics, geometry, and semantics, tend to be shared between all views (e.g., a "dog" can be seen, heard, and felt). We investigate the classic hypothesis that a powerful representation is one that models view-invariant factors. We study this hypothesis under the framework of multiview contrastive learning, where we learn a representation that aims to maximize mutual information between different views of the same scene but is otherwise compact. Our approach scales to any number of views, and is view-agnostic. We analyze key properties of the approach that make it work, finding that the contrastive loss outperforms a popular alternative based on cross-view prediction, and that the more views we learn from, the better the resulting representation captures underlying scene semantics. Our approach achieves state-of-the-art results on image and video unsupervised learning benchmarks. Code is released at: this http URL.
2683 citations
en
Computer Science
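For the Contrastive Multiview Coding entry above, the idea of pulling together two views of the same scene while pushing apart views of different scenes can be illustrated with a small InfoNCE-style loss. This is only a minimal sketch: the function name `info_nce_loss`, the `temperature` value, and the toy 2-D embeddings are assumptions made for illustration, and the loss shown is one common contrastive objective, not necessarily the exact formulation used by the authors.

```python
import math

def _normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce_loss(view_a, view_b, temperature=0.1):
    # view_a[i] and view_b[i] are embeddings of two views of the same scene i.
    # Every other view_b[j] (j != i) acts as a negative for view_a[i].
    a = [_normalize(v) for v in view_a]
    b = [_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [sum(x * y for x, y in zip(anchor, other)) / temperature
                  for other in b]
        m = max(logits)  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]  # negative log-softmax of the positive pair
    return total / len(a)

# Toy embeddings (hypothetical): three scenes, two dimensions.
view_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = info_nce_loss(view_a, view_a)                    # matching views
shuffled = info_nce_loss(view_a, view_a[1:] + view_a[:1])  # mismatched views
```

In this toy setup the loss is small when the two views of each scene agree and large when the pairing is shuffled, which is the behaviour a contrastive objective is meant to reward.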
Using NLG for speech synthesis of mathematical sentences
Alessandro Mazzei, Michele Monticone, Cristian Bernareggi
People with sight impairments can access a mathematical expression by using its LaTeX source. However, this mechanism has several drawbacks: (1) it assumes knowledge of LaTeX, (2) it is slow, since LaTeX is verbose, and (3) it is error-prone, since LaTeX is a typographical language. In this paper we study the design of a natural language generation system for producing a mathematical sentence, i.e. a natural language sentence expressing the semantics of a mathematical expression. Moreover, we describe the main results of a first human-based evaluation experiment of the system for the Italian language.
2132 citations
en
Computer Science
The Semantic Web
G. Goos, J. Hartmanis, J. Leeuwen
et al.
The Berkeley FrameNet Project
Collin F. Baker, C. Fillmore, John B. Lowe
FrameNet is a three-year NSF-supported project in corpus-based computational lexicography, now in its second year (NSF IRI-9618838, "Tools for Lexicon Building"). The project's key features are (a) a commitment to corpus evidence for semantic and syntactic generalizations, and (b) the representation of the valences of its target words (mostly nouns, adjectives, and verbs) in which the semantic portion makes use of frame semantics. The resulting database will contain (a) descriptions of the semantic frames underlying the meanings of the words described, and (b) the valence representation (semantic and syntactic) of several thousand words and phrases, each accompanied by (c) a representative collection of annotated corpus attestations, which jointly exemplify the observed linkings between "frame elements" and their syntactic realizations (e.g. grammatical function, phrase type, and other syntactic traits). This report will present the project's goals and workflow, and information about the computational tools that have been adapted or created in-house for this work.
3307 citations
en
Computer Science
OpenMP: an industry standard API for shared-memory programming
L. Dagum, R. Menon
3767 citations
en
Computer Science
Temporal Logic of Programs
F. Kröger
5726 citations
en
Mathematics, Computer Science
Book Reviews: Head-driven Phrase Structure Grammar and German in Head-driven Phrase-structure Grammar
Jean-Pierre Koenig
4004 citations
en
Computer Science
The Generative Lexicon
J. Pustejovsky
4108 citations
en
Computer Science
The syntax and semantics of complex nominals
J. Levi
777 citations
en
Computer Science
On the Trap Space Semantics of Normal Logic Programs
Van-Giang Trinh, Sylvain Soliman, François Fages
et al.
The logical semantics of normal logic programs has traditionally been based on the notions of Clark's completion and two-valued or three-valued canonical models, including supported, stable, regular, and well-founded models. Two-valued interpretations can also be seen as states evolving under a program's update operator, producing a transition graph whose fixed points and cycles capture stable and oscillatory behaviors, respectively. We refer to this view as dynamical semantics since it characterizes the program's meaning in terms of state-space trajectories, as first introduced in the stable (supported) class semantics. Recently, we have established a formal connection between Datalog^\neg programs (i.e., normal logic programs without function symbols) and Boolean networks, leading to the introduction of the trap space concept for Datalog^\neg programs. In this paper, we generalize the trap space concept to arbitrary normal logic programs, introducing trap space semantics as a new approach to their interpretation. This new semantics admits both model-theoretic and dynamical characterizations, providing a comprehensive approach to understanding program behavior. We establish the foundational properties of the trap space semantics and systematically relate it to the established model-theoretic semantics, including the stable (supported), stable (supported) partial, regular, and L-stable model semantics, as well as to the dynamical stable (supported) class semantics. Our results demonstrate that the trap space semantics offers a unified and precise framework for proving the existence of supported classes, strict stable (supported) classes, and regular models, in addition to uncovering and formalizing deeper relationships among the existing semantics of normal logic programs.
Dual-Region Encryption Model Based on a 3D-MNFC Chaotic System and Logistic Map
Jingyan Li, Yan Niu, Dan Yu
et al.
Facial information carries key personal privacy, and it is crucial to ensure its security through encryption. Traditional encryption for portrait images typically processes the entire image, despite the fact that most regions lack sensitive facial information. This approach is notably inefficient and imposes unnecessary computational burdens. To address this inefficiency while maintaining security, we propose a novel dual-region encryption model for portrait images. Firstly, a Multi-task Cascaded Convolutional Network (MTCNN) was adopted to efficiently segment facial images into two regions: facial and non-facial. Subsequently, given the high sensitivity of facial regions, a robust encryption scheme was designed by integrating a CNN-based key generator, the proposed three-dimensional Multi-module Nonlinear Feedback-coupled Chaotic System (3D-MNFC), DNA encoding, and bit reversal. The 3D-MNFC, which incorporates time-varying parameters, nonlinear terms, state feedback terms, and coupling mechanisms, has been proven to exhibit excellent chaotic performance. As for non-facial regions, the Logistic map combined with XOR operations is used to balance efficiency and basic security. Finally, the encrypted image is obtained by restoring the two ciphertext images to their original positions. Comprehensive security analyses confirm the exceptional performance of the regional model: a large key space ($2^{536}$), near-ideal information entropy (7.9995), and NPCR and UACI values of 99.6055% and 33.4599%, respectively. It is worth noting that the model has been verified to improve efficiency by at least 37.82%.
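The lightweight path for non-facial regions in the abstract above (a Logistic map combined with XOR) can be sketched as a simple keystream cipher. This is a toy illustration under stated assumptions, not the authors' scheme: the function names, the initial value `x0`, the control parameter `r`, the burn-in length, and the byte quantization are all hypothetical choices, and the result should not be treated as secure.

```python
def logistic_keystream(x0, r, n, burn_in=100):
    # Iterate the logistic map x -> r * x * (1 - x) and quantize each state
    # to one byte. x0 and r play the role of the secret key (assumed values).
    x = x0
    for _ in range(burn_in):  # discard the transient before emitting bytes
        x = r * x * (1.0 - x)
    stream = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        stream.append(int(x * 256) % 256)
    return bytes(stream)

def xor_cipher(data, x0=0.3141, r=3.99):
    # XOR is its own inverse, so the same call encrypts and decrypts.
    keystream = logistic_keystream(x0, r, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))

plain = bytes(range(16))        # stand-in for non-facial pixel values
cipher = xor_cipher(plain)
restored = xor_cipher(cipher)   # same key parameters recover the input
```

Because the keystream depends only on the key parameters and the data length, applying `xor_cipher` twice with the same parameters returns the original bytes.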
Extension-ranking Semantics for Abstract Argumentation Preprint
Kenneth Skiba, Tjitze Rienstra, Matthias Thimm
et al.
In this paper, we present a general framework for ranking sets of arguments in abstract argumentation based on their plausibility of acceptance. We present a generalisation of Dung's extension semantics as extension-ranking semantics, which induce a preorder over the power set of all arguments, allowing us to state that one set is "closer" to being acceptable than another. To evaluate the extension-ranking semantics, we introduce a number of principles that a well-behaved extension-ranking semantics should satisfy. We consider several simple base relations, each of which models a single central aspect of argumentative reasoning. The combination of these base relations provides us with a family of extension-ranking semantics. We also adapt a number of approaches from the literature for ranking extensions to be usable in the context of extension-ranking semantics, and evaluate their behaviour.
Mixture of Semantics Transmission for Generative AI-Enabled Semantic Communication Systems
Junjie Ni, Tong Wu, Zhiyong Chen
et al.
In this paper, we propose a mixture of semantics (MoS) transmission strategy for wireless semantic communication systems based on generative artificial intelligence (AI). At the transmitter, we divide an image into regions of interest (ROI) and regions of non-interest (RONI) to extract their semantic information respectively. Semantic information of ROI can be allocated more bandwidth, while RONI can be represented in a compact form for transmission. At the receiver, a diffusion model reconstructs the full image using the received semantic information of ROI and RONI. Compared to existing generative AI-based methods, MoS enables more efficient use of channel resources by balancing visual fidelity and semantic relevance. Experimental results demonstrate that appropriate ROI-RONI allocation is critical. The MoS achieves notable performance gains in peak signal-to-noise ratio (PSNR) of ROI and CLIP score of RONI.
General Runge–Kutta–Nyström Methods for Linear Inhomogeneous Second-Order Initial Value Problems
Nadiyah Hussain Alharthi, Rubayyi T. Alqahtani, Theodore E. Simos
et al.
In this paper, general Runge–Kutta–Nyström (GRKN) methods are developed and analyzed, tailored for second-order initial value problems of the form $y'' = L y' + M y + g(t)$, where $L, M \in \mathbb{R}^{n \times n}$ are constant matrices with $n \geq 1$. The construction of embedded pairs of orders $6(4)$ and $7(5)$, suitable for adaptive integration strategies, is emphasized. By utilizing rooted tree theory and recent simplifications for linear inhomogeneous systems, symbolic order conditions are derived, and efficient schemes are designed through algebraic and evolutionary techniques. Numerical tests verify the superiority of the newly derived pairs. In particular, this work introduces novel embedded GRKN pairs with reduced-order conditions that exploit the linearity and structure of the underlying system, enabling the construction of low-stage, high-accuracy integrators.
The methods incorporate FSAL (First Same As Last) formulations, making them computationally efficient. They are tested on representative physical systems in one, two, and three dimensions, demonstrating notable improvements in efficiency and accuracy over existing high-order RKN methods.
Validation of Replicable Pipeline 3D Surface Reconstruction for Patient-Specific Abdominal Aortic Lumen Diagnostics
Edoardo Ugolini, Giorgio La Civita, Moad Al Aidroos
et al.
Background: Accurate prognoses are challenging in high-risk vascular conditions, such as abdominal aortic aneurysms, and limited diagnostic standards, decision-making criteria, and data semantics often hinder clinical reliability and impede diagnostics’ digital transition. This study aims to evaluate the performance, robustness, and usability of an automatic, replicable pipeline for aortic lumen surface reconstruction for pathological vessels. The goal is to provide a solid tool for geometric reconstruction to a more complex enhanced diagnostic framework. Methods: A U-Net convolutional neural network is trained using preoperative CTA scans, with 101 for model training and 14 for model testing, covering a wide anatomical and aortoiliac pathology spectrum. Validation included segmentation metric, robustness, reliability, and usability assessments. Performances are investigated by means of the test set’s prediction metrics for several instances of the model’s input. Clinical reliability is evaluated based on manual measurements performed by a vascular surgeon on the obtained 3D aortic lumen surfaces. Results: The test set is selected to cover a wide portion of aortoiliac pathologies. The algorithm demonstrated robustness with an average F1-score of 0.850 ± 0.120 and an intersection over union score of 0.760 ± 0.150 in the test set. Clinical reliability is assessed using the mean absolute errors for diameter and length measurements, respectively, of 1.73 mm and 2.27 mm. The 3D surface reconstruction demonstrated reliability, low processing times, and clinically valid reconstructions. Conclusions: The proposed algorithm can correctly reconstruct pathological vessels. Secondary aortoiliac pathologies are detected properly for challenging anatomies. To conclude, the proposed 3D reconstruction application to a digital, patient-specific diagnostic tool is, therefore, possible.
Automatic replicable pipelines ensured the usability of the model’s outputs.
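The two segmentation metrics reported in the abstract above, the F1-score and intersection over union (IoU), can be computed from a predicted and a reference binary mask as in the following minimal sketch. The function name `f1_and_iou`, the flattened-list mask representation, and the toy masks are assumptions for illustration; the study's own evaluation code is not shown in the source.

```python
def f1_and_iou(pred, target):
    # pred and target are flattened binary masks (1 = lumen, 0 = background).
    tp = sum(1 for p, t in zip(pred, target) if p and t)        # true positives
    fp = sum(1 for p, t in zip(pred, target) if p and not t)    # false positives
    fn = sum(1 for p, t in zip(pred, target) if not p and t)    # false negatives
    union = tp + fp + fn
    if union == 0:
        # Both masks are empty: treat as perfect agreement (a convention).
        return 1.0, 1.0
    f1 = 2 * tp / (2 * tp + fp + fn)   # equals the Dice coefficient for binary masks
    iou = tp / union
    return f1, iou

# Toy masks (hypothetical): 2 overlapping voxels, 1 false positive, 1 false negative.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 1, 1, 0]
f1, iou = f1_and_iou(pred, target)
```

For a single mask pair the two scores are related by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU.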
Synthesizing Formal Semantics from Executable Interpreters
Jiangyi Liu, Charlie Murphy, Anvay Grover
et al.
Program verification and synthesis frameworks that allow one to customize the language in which one is interested typically require the user to provide a formally defined semantics for the language. Because writing a formal semantics can be a daunting and error-prone task, this requirement stands in the way of such frameworks being adopted by non-expert users. We present an algorithm that can automatically synthesize inductively defined syntax-directed semantics when given (i) a grammar describing the syntax of a language and (ii) an executable (closed-box) interpreter for computing the semantics of programs in the language of the grammar. Our algorithm synthesizes the semantics in the form of Constrained-Horn Clauses (CHCs), a natural, extensible, and formal logical framework for specifying inductively defined relations that has recently received widespread adoption in program verification and synthesis. The key innovation of our synthesis algorithm is a Counterexample-Guided Synthesis (CEGIS) approach that breaks the hard problem of synthesizing a set of constrained Horn clauses into small, tractable expression-synthesis problems that can be dispatched to existing SyGuS synthesizers. Our tool Synantic synthesized inductively-defined formal semantics from 14 interpreters for languages used in program-synthesis applications. When synthesizing formal semantics for one of our benchmarks, Synantic unveiled an inconsistency in the semantics computed by the interpreter for a language of regular expressions; fixing the inconsistency resulted in a more efficient semantics and, for some cases, in a 1.2x speedup for a synthesizer solving synthesis problems over such a language.
Shape Sensing and Kinematic Control of a Cable-Driven Continuum Robot Based on Stretchable Capacitive Sensors
Wenjun Shen, Jianhui He, Guilin Yang
et al.
A Cable-Driven Continuum Robot (CDCR) that consists of a set of identical Cable-Driven Continuum Joint Modules (CDCJMs) is proposed in this paper. The CDCJMs merely produce 2-DOF bending motions by controlling driving cable lengths. In each CDCJM, a pattern-based flexible backbone is employed as a passive compliant joint to generate 2-DOF bending deflections, which can be characterized by two joint variables, i.e., the bending direction angle and the bending angle. However, as the bending deflection is determined by not only the lengths of the driving cables but also the gravity and payload, it will be inaccurate to compute the two joint variables with its kinematic model. In this work, two stretchable capacitive sensors are employed to measure the bending shape of the flexible backbone so as to accurately determine the two joint variables. Compared with FBG-based and vision-based shape-sensing methods, the proposed method with stretchable capacitive sensors has the advantages of high sensitivity to the bending deflection of the backbone, ease of implementation, and cost effectiveness. The initial location of a stretchable sensor is generally defined by its two endpoint positions on the surface of the backbone without bending. A generic shape-sensing model, i.e., the relationship between the sensor reading and the two joint variables, is formulated based on the 2-DOF bending deflection of the backbone. To further improve the accuracy of the shape-sensing model, a calibration method is proposed to compensate for the location errors of stretchable sensors. Based on the calibrated shape-sensing model, a sliding-mode-based closed-loop control method is implemented for the CDCR. 
In order to verify the effectiveness of the proposed closed-loop control method, trajectory tracking accuracy experiments are conducted on the CDCR using a circular trajectory with a radius of 55 mm. The average tracking errors of the CDCR measured by the Qualisys motion capture system under open-loop and closed-loop control are 49.23 mm and 8.40 mm, respectively, a reduction of 82.94%.
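The reported reduction can be checked directly from the two average tracking errors quoted in the abstract above; the symbols $e_{\text{open}}$ and $e_{\text{closed}}$ are introduced here only for this check.

```latex
\[
\frac{e_{\text{open}} - e_{\text{closed}}}{e_{\text{open}}}
= \frac{49.23 - 8.40}{49.23}
= \frac{40.83}{49.23}
\approx 0.8294
\quad\Rightarrow\quad 82.94\%
\]
```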
A Semi-Supervised Active Learning Method for Structured Data Enhancement with Small Samples
Fangling Leng, Fan Li, Wei Lv
et al.
In order to solve the problems of the small capacity of structured data and uneven distribution among classes in machine learning tasks, a supervised generation method for structured data called WAGAN and a cyclic sampling method named SACS (Semi-supervised and Active-learning Cyclic Sampling), based on semi-supervised active learning, are proposed. The loss function and neural network structure are optimized, and the quantity and quality of the small sample set are enhanced. To enhance the reliability of generating pseudo-labels, a Semi-supervised Active learning Framework (SAF) is designed. This framework redistributes class labels to samples, which not only enhances the reliability of generated samples but also reduces the influence of noise and uncertainty on the generation of false labels. To mine the diversity information of generated samples, an uncertain sampling strategy based on spatial overlap is designed. This strategy incorporates the idea of spatial overlap and uses global and local sampling methods to calculate the information content of generated samples. Experimental results show that the proposed method performs better than other data enhancement methods on three different datasets. Compared to the original data, the average $F1_{macro}$ value of the classification model is improved by 11.5%, 16.1%, and 19.6% relative to compared methods.
An extensible and unifying approach to retrospective clinical data modeling: the BrainTeaser Ontology
Guglielmo Faggioli, Laura Menotti, Stefano Marchesin
et al.
Automatic disease progression prediction models require large amounts of training data, which are seldom available, especially when it comes to rare diseases. A possible solution is to integrate data from different medical centres. Nevertheless, various centres often follow diverse data collection procedures and assign different semantics to collected data. Ontologies, used as schemas for interoperable knowledge bases, represent a state-of-the-art solution to homologate the semantics and foster data integration from various sources. This work presents the BrainTeaser Ontology (BTO), an ontology that models the clinical data associated with two brain-related rare diseases (ALS and MS) in a comprehensive and modular manner. BTO assists in organizing and standardizing the data collected during patient follow-up. It was created by harmonizing schemas currently used by multiple medical centers into a common ontology, following a bottom-up approach. As a result, BTO effectively addresses the practical data collection needs of various real-world situations and promotes data portability and interoperability. BTO captures various clinical occurrences, such as disease onset, symptoms, diagnostic and therapeutic procedures, and relapses, using an event-based approach. Developed in collaboration with medical partners and domain experts, BTO offers a holistic view of ALS and MS for supporting the representation of retrospective and prospective data. Furthermore, BTO adheres to Open Science and FAIR (Findable, Accessible, Interoperable, and Reusable) principles, making it a reliable framework for developing predictive tools to aid in medical decision-making and patient care. Although BTO is designed for ALS and MS, its modular structure makes it easily extendable to other brain-related diseases, showcasing its potential for broader applicability. Database URL: https://zenodo.org/records/7886998
Computer applications to medicine. Medical informatics
CFNet: Cross-scale fusion network for medical image segmentation
Amina Benabid, Jing Yuan, Mohammed A.M. Elhassan
et al.
Learning multi-scale feature representations is essential for medical image segmentation. Most existing frameworks are based on a U-shape architecture in which the high-resolution representation is recovered progressively by connecting different levels of the decoder with the low-resolution representation from the encoder. However, intrinsic defects in complementary feature fusion inhibit the U-shape from aggregating efficient global and discriminative features along object boundaries. While Transformers can help model global features, their computational complexity limits their application in real-time medical scenarios. To address these issues, we propose a Cross-scale Fusion Network (CFNet), combining a cross-scale attention module and pyramidal module to fuse multi-stage/global context information. Specifically, we first utilize large kernel convolution to design the basic building block capable of extracting global and local information. Then, we propose a Bidirectional Atrous Spatial Pyramid Pooling (BiASPP), which employs atrous convolution in the bidirectional paths to capture various shapes and sizes of brain tumors. Furthermore, we introduce a cross-stage attention mechanism to reduce redundant information when merging features from two stages with different semantics. Extensive evaluation was performed on five medical image segmentation datasets: a 3D volumetric dataset, namely Brats benchmarks. CFNet-L achieves 85.74% and 90.98% dice score for Enhanced Tumor and Whole Tumor on Brats2018, respectively. Furthermore, our largest model CFNet-L outperformed other methods on 2D medical images. It achieved 71.95%, 82.79%, and 80.79% SE for STARE, DRIVE, and CHASEDB1, respectively. The code will be available at https://github.com/aminabenabid/CFNet
Electronic computers. Computer science