3D/2D Registration of Angiograms using Silhouette-based Differentiable Rendering
Taewoong Lee, Sarah Frisken, Nazim Haouchine
We present a method for 3D/2D registration of Digital Subtraction Angiography (DSA) images to provide valuable insight into brain hemodynamics and angioarchitecture. Our approach formulates the registration as a pose estimation problem, leveraging both anteroposterior and lateral DSA views and employing differentiable rendering. Preliminary experiments on real and synthetic datasets demonstrate the effectiveness of our method, with both qualitative and quantitative evaluations highlighting its potential for clinical applications. The code is available at https://github.com/taewoonglee17/TwoViewsDSAReg.
AnimateDiff-Lightning: Cross-Model Diffusion Distillation
Shanchuan Lin, Xiao Yang
We present AnimateDiff-Lightning for lightning-fast video generation. Our model uses progressive adversarial diffusion distillation to achieve a new state of the art in few-step video generation. We discuss our modifications to adapt it for the video modality. Furthermore, we propose to simultaneously distill the probability flow of multiple base diffusion models, resulting in a single distilled motion module with broader style compatibility. We are pleased to release our distilled AnimateDiff-Lightning model for the community's use.
DataViz3D: An Novel Method Leveraging Online Holographic Modeling for Extensive Dataset Preprocessing and Visualization
Jinli Duan
DataViz3D is an innovative online software tool that transforms complex datasets into interactive 3D spatial models using holographic technology. It enables users to generate scatter plots within a 3D space, accurately mapped to the XYZ coordinates of the dataset, providing a vivid and intuitive understanding of the spatial relationships inherent in the data. DataViz3D's user-friendly interface makes advanced 3D modeling and holographic visualization accessible to a wide range of users, fostering new opportunities for collaborative research and education across various disciplines.
Enhancing Multimodal Understanding with CLIP-Based Image-to-Text Transformation
Chang Che, Qunwei Lin, Xinyu Zhao, et al.
The process of transforming input images into corresponding textual explanations stands as a crucial and complex endeavor within the domains of computer vision and natural language processing. In this paper, we propose an innovative ensemble approach that harnesses the capabilities of Contrastive Language-Image Pretraining models.
DeMansia: Mamba Never Forgets Any Tokens
Ricky Fang
This paper examines the mathematical foundations of transformer architectures, highlighting their limitations particularly in handling long sequences. We explore prerequisite models such as Mamba, Vision Mamba (ViM), and LV-ViT that pave the way for our proposed architecture, DeMansia. DeMansia integrates state space models with token labeling techniques to enhance performance in image classification tasks, efficiently addressing the computational challenges posed by traditional transformers. The architecture, benchmark, and comparisons with contemporary models demonstrate DeMansia's effectiveness. The implementation of this paper is available on GitHub at https://github.com/catalpaaa/DeMansia
MitoSeg: Mitochondria Segmentation Tool
Faris Serdar Taşel, Efe Çiftci
Recent studies suggest a potential link between the physical structure of mitochondria and neurodegenerative diseases. With advances in Electron Microscopy techniques, it has become possible to visualize the boundary and internal membrane structures of mitochondria in detail. It is crucial to automatically segment mitochondria from these images to investigate the relationship between mitochondria and diseases. In this paper, we present a software solution for mitochondrial segmentation, highlighting mitochondria boundaries in electron microscopy tomography images and generating corresponding 3D meshes.
A Fast HOG Descriptor Using Lookup Table and Integral Image
Chunde Huang, Jiaxiang Huang
The histogram of oriented gradients (HOG) is a widely used feature descriptor in computer vision for object detection. This paper describes a modified HOG descriptor that uses a lookup table and the integral image method to speed up detection by a factor of 5 to 10. By exploiting the special hardware features of a given platform (e.g., a digital signal processor), the HOG descriptor can be improved further to achieve real-time object detection and tracking.
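As a rough illustration of how a lookup table and integral images can combine in a HOG-style descriptor, the following sketch precomputes the orientation bin and magnitude for every integer gradient pair, then builds one summed-area table per bin so that any cell histogram costs four lookups per bin. This is not the authors' implementation: the function names, the 9 unsigned bins, the central-difference gradients, and the role assigned to the lookup table are all assumptions made for the example.

```python
import math


def build_orientation_lut(max_grad, n_bins):
    """Map every integer gradient pair (gx, gy) to (bin index, magnitude).

    Assumption: the lookup table replaces per-pixel atan2/sqrt calls.
    """
    lut = {}
    for gy in range(-max_grad, max_grad + 1):
        for gx in range(-max_grad, max_grad + 1):
            angle = math.atan2(gy, gx) % math.pi  # unsigned orientation in [0, pi)
            b = min(int(angle / math.pi * n_bins), n_bins - 1)
            lut[(gx, gy)] = (b, math.hypot(gx, gy))
    return lut


def integral_histogram(img, lut, n_bins):
    """Build one (H+1) x (W+1) summed-area table per orientation bin."""
    h, w = len(img), len(img[0])
    tables = [[[0.0] * (w + 1) for _ in range(h + 1)] for _ in range(n_bins)]
    for y in range(h):
        row_sums = [0.0] * n_bins
        for x in range(w):
            # Central differences, with coordinates clamped at the borders.
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            b, mag = lut[(gx, gy)]
            row_sums[b] += mag
            for k in range(n_bins):
                tables[k][y + 1][x + 1] = tables[k][y][x + 1] + row_sums[k]
    return tables


def cell_histogram(tables, top, left, bottom, right):
    """Histogram of rows top..bottom-1 and columns left..right-1, four lookups per bin."""
    return [
        t[bottom][right] - t[top][right] - t[bottom][left] + t[top][left]
        for t in tables
    ]


# Usage: a horizontal ramp has purely horizontal gradients, so only bin 0 is filled.
ramp = [[x for x in range(6)] for _ in range(4)]
lut = build_orientation_lut(max_grad=5, n_bins=9)
tables = integral_histogram(ramp, lut, n_bins=9)
hist = cell_histogram(tables, 0, 1, 4, 5)
```

Once the tables are built, the cost of a cell histogram no longer depends on the cell size, which is the usual reason integral images help with sliding-window detection.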
A filter based approach for inbetweening
Yuichi Yagi
We present a filter-based approach for inbetweening. We train a convolutional neural network to generate intermediate frames. This network aims to generate smooth animation of line drawings. Our method can process scanned images directly and does not need to compute line correspondences or topological changes explicitly. We evaluate our method on real animation production data. The results show that our method can partially generate intermediate frames.
Research on Bi-mode Biometrics Based on Deep Learning
Hao Jiang
Because biological characteristics are highly distinctive between individuals, biometric identification technology is relevant to almost every area that involves telling people apart. Fingerprints, iris, face, voiceprint, and other biological features have been widely used in public security detection, mobile device unlocking, target tracking, and other fields. As electronic devices are used ever more widely and frequently, only biometric identification technology with an excellent recognition rate can guarantee the long-term development of these fields.
Collaborative Low-Rank Subspace Clustering
Stephen Tierney, Yi Guo, Junbin Gao
In this paper we present Collaborative Low-Rank Subspace Clustering. Given multiple observations of a phenomenon, we learn a unified representation matrix. This unified matrix incorporates the features from all the observations, thus increasing the discriminative power compared with learning the representation matrix on each observation separately. Experimental evaluation shows that our method outperforms subspace clustering on separate observations and the state-of-the-art collaborative learning algorithm.
DimensionApp : android app to estimate object dimensions
Suriya Singh, Vijay Kumar
In this project, we develop an Android app that uses computer vision techniques to estimate the dimensions of an object in the field of view. The app, while compact in size, is accurate to within +/- 5 mm and robust to touch inputs. We use single-view metrology to compute accurate measurements. Unlike previous approaches, our technique does not rely on line detection and can easily be generalized to any object shape.
A Bayesian approach to type-specific conic fitting
Matthew Collett
A perturbative approach is used to quantify the effect of noise in data points on fitted parameters in a general homogeneous linear model, and the results applied to the case of conic sections. There is an optimal choice of normalisation that minimises bias, and iteration with the correct reweighting significantly improves statistical reliability. By conditioning on an appropriate prior, an unbiased type-specific fit can be obtained. Error estimates for the conic coefficients may also be used to obtain both bias corrections and confidence intervals for other curve parameters.
Motion trails from time-lapse video
Camille Goudeseune
From an image sequence captured by a stationary camera, background subtraction can detect moving foreground objects in the scene. Distinguishing foreground from background is further improved by various heuristics. Then each object's motion can be emphasized by duplicating its positions as a motion trail. These trails clarify the objects' spatial relationships. Also, adding motion trails to a video before previewing it at high speed reduces the risk of overlooking transient events.
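The pipeline described above can be sketched in a few lines. The example below is only a minimal illustration under stated assumptions: frames are grayscale images stored as nested lists, the background is estimated as a per-pixel temporal median, and foreground is detected with a fixed absolute-difference threshold. The function names and the threshold value are hypothetical, and none of the additional heuristics mentioned in the abstract are reproduced.

```python
from statistics import median


def estimate_background(frames):
    """Per-pixel temporal median (an assumed, common background model)."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[median(f[y][x] for f in frames) for x in range(w)] for y in range(h)]


def foreground_mask(frame, background, threshold):
    """True where the frame differs from the background by more than threshold."""
    return [
        [abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def motion_trail(frames, threshold=30):
    """Paste every frame's foreground pixels onto the background, in time order."""
    background = estimate_background(frames)
    out = [row[:] for row in background]
    for frame in frames:
        mask = foreground_mask(frame, background, threshold)
        for y, mask_row in enumerate(mask):
            for x, is_foreground in enumerate(mask_row):
                if is_foreground:
                    out[y][x] = frame[y][x]
    return out


# Usage: a bright dot moving across a dark 1 x 5 strip leaves a copy at each position.
frames = [
    [[255, 0, 0, 0, 0]],
    [[0, 0, 255, 0, 0]],
    [[0, 0, 0, 0, 255]],
]
trail = motion_trail(frames)
```

Later frames overwrite earlier ones where trails overlap, so the most recent position of an object stays on top.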
Efficient Topology-Controlled Sampling of Implicit Shapes
Jason Chang, John W. Fisher
Sampling from distributions of implicitly defined shapes enables analysis of various energy functionals used for image segmentation. Recent work describes a computationally efficient Metropolis-Hastings method for accomplishing this task. Here, we extend that framework so that samples are accepted at every iteration of the sampler, achieving an order of magnitude speed up in convergence. Additionally, we show how to incorporate topological constraints.
A Fuzzy View on k-Means Based Signal Quantization with Application in Iris Segmentation
Nicolaie Popescu-Bodorin
This paper shows that the k-means quantization of a signal can be interpreted both as a crisp indicator function and as a fuzzy membership assignment describing fuzzy clusters and fuzzy boundaries. Combined crisp and fuzzy indicator functions are defined here as natural generalizations of the ordinary crisp and fuzzy indicator functions, respectively. An application to iris segmentation is presented together with a demo program.
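To make the two readings of a k-means quantization concrete, the sketch below quantizes a 1D signal with Lloyd's iterations, then describes a sample either by a crisp indicator (1 for the nearest centroid, 0 elsewhere) or by a fuzzy membership vector. The fuzzy memberships here use the standard fuzzy c-means formula as a stand-in; the abstract does not specify the paper's own definitions, so the formula, the fuzzifier `m`, and all function names are assumptions.

```python
def kmeans_1d(values, centroids, iterations=20):
    """Lloyd's algorithm on scalar samples, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            groups[nearest].append(v)
        # Keep the old centroid when a cluster receives no samples.
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def crisp_indicator(v, centroids):
    """1 for the nearest centroid, 0 for all others."""
    nearest = min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
    return [1 if k == nearest else 0 for k in range(len(centroids))]


def fuzzy_membership(v, centroids, m=2.0):
    """Fuzzy c-means style memberships (assumed form), summing to 1."""
    distances = [abs(v - c) for c in centroids]
    if any(d == 0 for d in distances):
        # A sample lying exactly on a centroid belongs fully to that cluster.
        return [float(x) for x in crisp_indicator(v, centroids)]
    exponent = 2.0 / (m - 1.0)
    inverse = [d ** -exponent for d in distances]
    total = sum(inverse)
    return [w / total for w in inverse]


# Usage: two well-separated groups of samples.
signal = [0, 1, 2, 10, 11, 12]
centroids = kmeans_1d(signal, [0.0, 12.0])
crisp = crisp_indicator(3, centroids)
fuzzy = fuzzy_membership(3, centroids)
```

A sample midway between two centroids gets equal fuzzy memberships, while the crisp indicator still has to pick one side, which is where a fuzzy boundary becomes visible.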
Iterative exact global histogram specification and SSIM gradient ascent: a proof of convergence, step size and parameter selection
Alireza Avanaki
The SSIM-optimized exact global histogram specification (EGHS) is shown to converge in the sense that the first order approximation of the result's quality (i.e., its structural similarity with input) does not decrease in an iteration, when the step size is small. Each iteration is composed of SSIM gradient ascent and basic EGHS with the specified target histogram. Selection of step size and other parameters is also discussed.
Real-time Texture Error Detection
Dan Laurentiu Lacrama, Florin Alexa, Adriana Balta
This paper advocates an improved solution for real-time detection of texture errors that occur in the production process in the textile industry. The research focuses on mono-color products with a 3D texture model (Jacquard fabrics). This is a more difficult task than, for example, 2D multicolor textures.
Color Dipole Moments for Edge Detection
Amelia Sparavigna
Dipole and higher moments are physical quantities used to describe a charge distribution. In analogy with electromagnetism, it is possible to define the dipole moments for a gray-scale image, according to the single aspect of a gray-tone map. In this paper we define the color dipole moments for color images. For color maps in fact, we have three aspects, the three primary colors, to consider. Associating three color charges to each pixel, color dipole moments can be easily defined and used for edge detection.
A dyadic solution of relative pose problems
Patrick Erik Bradley
A hierarchical interval subdivision is shown to lead to a $p$-adic encoding of image data. In the case of the relative pose problem in computer vision and photogrammetry, this makes it possible to derive equations having 2-adic numbers as coefficients and to use Hensel's lifting method for their solution. This method is applied to the linear and non-linear equations coming from eight, seven or five point correspondences. An inherent property of the method is its robustness.
Parametrical Neural Networks and Some Other Similar Architectures
Leonid B. Litinskii
A review is given of work on associative neural networks carried out during the last four years at the Institute of Optical Neural Technologies RAS. The presentation is based on a description of parametrical neural networks (PNN). At present, PNN have record recognition characteristics (storage capacity, noise immunity, and speed of operation). Emphasis is placed on the presentation of basic ideas and principles.