Results for "cs.CV"

Showing 20 of ~116,413 results · from arXiv, DOAJ, CrossRef

arXiv Open Access 2025
3D/2D Registration of Angiograms using Silhouette-based Differentiable Rendering

Taewoong Lee, Sarah Frisken, Nazim Haouchine

We present a method for 3D/2D registration of Digital Subtraction Angiography (DSA) images to provide valuable insight into brain hemodynamics and angioarchitecture. Our approach formulates the registration as a pose estimation problem, leveraging both anteroposterior and lateral DSA views and employing differentiable rendering. Preliminary experiments on real and synthetic datasets demonstrate the effectiveness of our method, with both qualitative and quantitative evaluations highlighting its potential for clinical applications. The code is available at https://github.com/taewoonglee17/TwoViewsDSAReg.

en cs.CV
arXiv Open Access 2024
AnimateDiff-Lightning: Cross-Model Diffusion Distillation

Shanchuan Lin, Xiao Yang

We present AnimateDiff-Lightning for lightning-fast video generation. Our model uses progressive adversarial diffusion distillation to achieve a new state of the art in few-step video generation. We discuss our modifications to adapt it for the video modality. Furthermore, we propose to simultaneously distill the probability flow of multiple base diffusion models, resulting in a single distilled motion module with broader style compatibility. We are pleased to release our distilled AnimateDiff-Lightning model for the community's use.

en cs.CV, cs.AI
arXiv Open Access 2024
DataViz3D: An Novel Method Leveraging Online Holographic Modeling for Extensive Dataset Preprocessing and Visualization

Jinli Duan

DataViz3D is an innovative online software tool that transforms complex datasets into interactive 3D spatial models using holographic technology. It enables users to generate scatter plots within a 3D space, accurately mapped to the XYZ coordinates of the dataset, providing a vivid and intuitive understanding of the spatial relationships inherent in the data. DataViz3D's user-friendly interface makes advanced 3D modeling and holographic visualization accessible to a wide range of users, fostering new opportunities for collaborative research and education across various disciplines.

en cs.CV
arXiv Open Access 2024
Enhancing Multimodal Understanding with CLIP-Based Image-to-Text Transformation

Chang Che, Qunwei Lin, Xinyu Zhao et al.

The process of transforming input images into corresponding textual explanations stands as a crucial and complex endeavor within the domains of computer vision and natural language processing. In this paper, we propose an innovative ensemble approach that harnesses the capabilities of Contrastive Language-Image Pretraining models.

en cs.CV, cs.AI
arXiv Open Access 2024
DeMansia: Mamba Never Forgets Any Tokens

Ricky Fang

This paper examines the mathematical foundations of transformer architectures, highlighting their limitations particularly in handling long sequences. We explore prerequisite models such as Mamba, Vision Mamba (ViM), and LV-ViT that pave the way for our proposed architecture, DeMansia. DeMansia integrates state space models with token labeling techniques to enhance performance in image classification tasks, efficiently addressing the computational challenges posed by traditional transformers. The architecture, benchmark, and comparisons with contemporary models demonstrate DeMansia's effectiveness. The implementation of this paper is available on GitHub at https://github.com/catalpaaa/DeMansia

en cs.CV, cs.AI
arXiv Open Access 2024
MitoSeg: Mitochondria Segmentation Tool

Faris Serdar Taşel, Efe Çiftci

Recent studies suggest a potential link between the physical structure of mitochondria and neurodegenerative diseases. With advances in Electron Microscopy techniques, it has become possible to visualize the boundary and internal membrane structures of mitochondria in detail. It is crucial to automatically segment mitochondria from these images to investigate the relationship between mitochondria and diseases. In this paper, we present a software solution for mitochondrial segmentation, highlighting mitochondria boundaries in electron microscopy tomography images and generating corresponding 3D meshes.

arXiv Open Access 2017
A Fast HOG Descriptor Using Lookup Table and Integral Image

Chunde Huang, Jiaxiang Huang

The histogram of oriented gradients (HOG) is a widely used feature descriptor in computer vision for object detection. This paper describes a modified HOG descriptor that uses a lookup table and the integral image method to speed up detection by a factor of 5 to 10. By exploiting the special hardware features of a given platform (e.g., a digital signal processor), the HOG descriptor can be improved further to achieve real-time object detection and tracking.

en cs.CV
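The integral image mentioned in the abstract above can be illustrated with a minimal sketch. This is only the generic summed-area-table primitive, not the authors' modified HOG descriptor; the function names `integral_image` and `box_sum` are hypothetical, and applying the table to per-orientation gradient channels is an assumption about typical usage rather than a detail taken from the abstract.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros.

    ii[y][x] holds the sum of img over rows 0..y-1 and columns 0..x-1.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of the original image over rows top..bottom-1 and
    columns left..right-1, using four table lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


if __name__ == "__main__":
    img = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
    ii = integral_image(img)
    print(box_sum(ii, 1, 1, 3, 3))  # sum of the lower-right 2x2 block
```

Once the table is built, the sum over any rectangular cell costs four lookups regardless of the cell size, which is why the technique suits repeated cell or block histogram accumulation.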
arXiv Open Access 2017
A filter based approach for inbetweening

Yuichi Yagi

We present a filter-based approach for inbetweening. We train a convolutional neural network to generate intermediate frames. This network aims to generate smooth animation of line drawings. Our method can process scanned images directly and does not need to compute line correspondences or topological changes explicitly. We test our method on real animation production data. The results show that our method can generate intermediate frames partially.

en cs.CV, cs.GR
arXiv Open Access 2017
Research on Bi-mode Biometrics Based on Deep Learning

Hao Jiang

Because biological characteristics are highly distinctive from one individual to another, biometric identification technology touches almost every area in which people must be told apart. Fingerprints, iris, face, voiceprint, and other biological features are widely used by public security departments for detection, and in mobile device unlocking, target tracking, and other fields. As electronic devices are used more widely and more frequently, only biometric identification technology with an excellent recognition rate can guarantee the long-term development of these fields.

en cs.CV
arXiv Open Access 2017
Collaborative Low-Rank Subspace Clustering

Stephen Tierney, Yi Guo, Junbin Gao

In this paper we present Collaborative Low-Rank Subspace Clustering. Given multiple observations of a phenomenon, we learn a unified representation matrix. This unified matrix incorporates the features from all the observations, thus increasing the discriminative power compared with learning the representation matrix on each observation separately. Experimental evaluation shows that our method outperforms subspace clustering on separate observations and the state-of-the-art collaborative learning algorithm.

en cs.CV
arXiv Open Access 2016
DimensionApp : android app to estimate object dimensions

Suriya Singh, Vijay Kumar

In this project, we develop an Android app that uses computer vision techniques to estimate the dimensions of an object in the field of view. The app, while compact in size, is accurate up to +/- 5 mm and robust to touch inputs. We use single-view metrology to compute accurate measurements. Unlike previous approaches, our technique does not rely on line detection and can be generalized easily to any object shape.

en cs.CV
arXiv Open Access 2016
A Bayesian approach to type-specific conic fitting

Matthew Collett

A perturbative approach is used to quantify the effect of noise in data points on fitted parameters in a general homogeneous linear model, and the results applied to the case of conic sections. There is an optimal choice of normalisation that minimises bias, and iteration with the correct reweighting significantly improves statistical reliability. By conditioning on an appropriate prior, an unbiased type-specific fit can be obtained. Error estimates for the conic coefficients may also be used to obtain both bias corrections and confidence intervals for other curve parameters.

en cs.CV
arXiv Open Access 2015
Motion trails from time-lapse video

Camille Goudeseune

From an image sequence captured by a stationary camera, background subtraction can detect moving foreground objects in the scene. Distinguishing foreground from background is further improved by various heuristics. Then each object's motion can be emphasized by duplicating its positions as a motion trail. These trails clarify the objects' spatial relationships. Also, adding motion trails to a video before previewing it at high speed reduces the risk of overlooking transient events.

en cs.CV
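The pipeline described in the abstract above (background subtraction, then duplicating each object's positions as a trail) can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the per-pixel median background, the fixed `threshold`, the grayscale list-of-rows frame format, and all function names are assumptions, and the further heuristics the abstract mentions are omitted.

```python
from statistics import median


def estimate_background(frames):
    """Per-pixel median over time (assumes a stationary camera and
    objects that cover any given pixel in fewer than half the frames)."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[median(f[y][x] for f in frames) for x in range(w)]
            for y in range(h)]


def foreground_mask(frame, background, threshold):
    """True where a pixel differs from the background by more than threshold."""
    return [[abs(p - b) > threshold for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]


def motion_trail(frames, threshold=30):
    """Composite the foreground pixels of every frame onto the background,
    so each moving object appears at all of its past positions."""
    background = estimate_background(frames)
    trail = [row[:] for row in background]
    for frame in frames:  # later frames overwrite earlier ones
        mask = foreground_mask(frame, background, threshold)
        for y, row in enumerate(mask):
            for x, is_foreground in enumerate(row):
                if is_foreground:
                    trail[y][x] = frame[y][x]
    return trail


if __name__ == "__main__":
    # A bright object (200) moving right across a dark 1x5 background (10).
    frames = [[[200, 10, 10, 10, 10]],
              [[10, 200, 10, 10, 10]],
              [[10, 10, 200, 10, 10]]]
    print(motion_trail(frames))
```

A median background is only one possible choice; it is used here because it needs no parameters beyond the frame set itself.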
arXiv Open Access 2012
Efficient Topology-Controlled Sampling of Implicit Shapes

Jason Chang, John W. Fisher

Sampling from distributions of implicitly defined shapes enables analysis of various energy functionals used for image segmentation. Recent work describes a computationally efficient Metropolis-Hastings method for accomplishing this task. Here, we extend that framework so that samples are accepted at every iteration of the sampler, achieving an order-of-magnitude speedup in convergence. Additionally, we show how to incorporate topological constraints.

en cs.CV
arXiv Open Access 2011
A Fuzzy View on k-Means Based Signal Quantization with Application in Iris Segmentation

Nicolaie Popescu-Bodorin

This paper shows that the k-means quantization of a signal can be interpreted both as a crisp indicator function and as a fuzzy membership assignment describing fuzzy clusters and fuzzy boundaries. Combined crisp and fuzzy indicator functions are defined here as natural generalizations of the ordinary crisp and fuzzy indicator functions, respectively. An application to iris segmentation is presented together with a demo program.

en cs.CV
arXiv Open Access 2010
Iterative exact global histogram specification and SSIM gradient ascent: a proof of convergence, step size and parameter selection

Alireza Avanaki

The SSIM-optimized exact global histogram specification (EGHS) is shown to converge in the sense that the first order approximation of the result's quality (i.e., its structural similarity with input) does not decrease in an iteration, when the step size is small. Each iteration is composed of SSIM gradient ascent and basic EGHS with the specified target histogram. Selection of step size and other parameters is also discussed.

en cs.CV, cs.MM
arXiv Open Access 2009
Real-time Texture Error Detection

Dan Laurentiu Lacrama, Florin Alexa, Adriana Balta

This paper advocates an improved solution for real-time detection of texture errors that occur during production in the textile industry. The research focuses on mono-color products with a 3D texture model (Jacquard fabrics). This is a more difficult task than, for example, 2D multicolor textures.

en cs.CV
arXiv Open Access 2009
Color Dipole Moments for Edge Detection

Amelia Sparavigna

Dipole and higher moments are physical quantities used to describe a charge distribution. In analogy with electromagnetism, it is possible to define the dipole moments for a gray-scale image, according to the single aspect of a gray-tone map. In this paper we define the color dipole moments for color images. For color maps in fact, we have three aspects, the three primary colors, to consider. Associating three color charges to each pixel, color dipole moments can be easily defined and used for edge detection.

en cs.CV
arXiv Open Access 2009
A dyadic solution of relative pose problems

Patrick Erik Bradley

A hierarchical interval subdivision is shown to lead to a $p$-adic encoding of image data. For the relative pose problem in computer vision and photogrammetry, this makes it possible to derive equations with 2-adic numbers as coefficients and to solve them using Hensel's lifting method. This method is applied to the linear and non-linear equations arising from eight, seven, or five point correspondences. An inherent property of the method is its robustness.

en cs.CV
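Hensel's lifting method, named in the abstract above, can be illustrated for a single polynomial in one variable. This is a generic textbook-style sketch, not the paper's relative pose solver: the helper names, the coefficient-list representation (highest degree first), and the example polynomial x^2 + x + 2 are all assumptions chosen for illustration.

```python
def poly_eval(coeffs, x, mod):
    """Evaluate a polynomial (coefficients highest degree first) modulo mod."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % mod
    return acc


def poly_deriv(coeffs):
    """Coefficients of the formal derivative, highest degree first."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def hensel_lift(coeffs, root, p, k):
    """Lift a simple root of f modulo p to a root modulo p**k.

    Requires f(root) = 0 (mod p) and f'(root) != 0 (mod p).
    Requires Python 3.8+ for pow(a, -1, m).
    """
    deriv = poly_deriv(coeffs)
    if poly_eval(coeffs, root, p) != 0:
        raise ValueError("root is not a root modulo p")
    if poly_eval(deriv, root, p) == 0:
        raise ValueError("derivative vanishes modulo p; root is not simple")
    x, mod = root % p, p
    for _ in range(k - 1):
        mod *= p
        fx = poly_eval(coeffs, x, mod)
        # f'(x) is a unit modulo mod because it is nonzero modulo p.
        inv = pow(poly_eval(deriv, x, mod), -1, mod)
        x = (x - fx * inv) % mod  # Newton step, one more p-adic digit
    return x


if __name__ == "__main__":
    # f(x) = x^2 + x + 2 has the simple root 0 modulo 2.
    r = hensel_lift([1, 1, 2], 0, 2, 16)
    print(r, (r * r + r + 2) % 2**16)
```

Each pass refines the root by at least one more digit in base p, so k - 1 passes are enough to reach precision p**k.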
arXiv Open Access 2006
Parametrical Neural Networks and Some Other Similar Architectures

Leonid B. Litinskii

This paper reviews work on associative neural networks carried out over the last four years at the Institute of Optical Neural Technologies RAS. The presentation is based on a description of parametrical neural networks (PNN). At present, PNNs have record recognition characteristics (storage capacity, noise immunity, and speed of operation). The emphasis is on presenting the basic ideas and principles.

en cs.CV, cs.NE

Page 11 of 5821