Results for "cs.CV"

Showing 20 of ~116,476 results · from CrossRef, DOAJ, arXiv

arXiv Open Access 2023
A Feature-based Approach for the Recognition of Image Quality Degradation in Automotive Applications

Florian Bauer

Cameras play a crucial role in modern driver assistance systems and are an essential part of the sensor technology for automated driving. The quality of images captured by in-vehicle cameras highly influences the performance of visual perception systems. This paper presents a feature-based algorithm to detect certain effects that can degrade image quality in automotive applications. The algorithm is based on an intelligent selection of significant features. Due to the small number of features, the algorithm performs well even with small data sets. Experiments with different data sets show that the algorithm can detect soiling adhering to camera lenses and classify different types of image degradation.

en cs.CV
arXiv Open Access 2023
Dual PatchNorm

Manoj Kumar, Mostafa Dehghani, Neil Houlsby

We propose Dual PatchNorm: two Layer Normalization layers (LayerNorms), placed before and after the patch embedding layer in Vision Transformers. We demonstrate that Dual PatchNorm outperforms the result of an exhaustive search over alternative LayerNorm placement strategies within the Transformer block itself. In our experiments, this trivial modification often improves accuracy over well-tuned Vision Transformers and never hurts.

en cs.CV, cs.LG
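The Dual PatchNorm abstract describes a one-line architectural change: a LayerNorm immediately before and immediately after the patch embedding projection. A minimal NumPy sketch of that idea (toy random projection weights, no learned scale/shift in the LayerNorms, illustrative sizes — all assumptions, not the authors' code):

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """Normalize over the last axis (per-token LayerNorm, no learned scale/shift)."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def patch_embed_dual_norm(img, patch=4, dim=8, rng=np.random.default_rng(0)):
    """Dual PatchNorm: LayerNorm on the flattened patches *before*
    the linear projection and on the embeddings *after* it."""
    h, w, c = img.shape
    # split into non-overlapping patches and flatten each one
    patches = img.reshape(h // patch, patch, w // patch, patch, c)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)
    w_proj = rng.standard_normal((patch * patch * c, dim)) * 0.02  # toy projection
    x = layer_norm(patches)          # first LayerNorm: before the embedding
    x = x @ w_proj                   # patch embedding (linear projection)
    return layer_norm(x)             # second LayerNorm: after the embedding

tokens = patch_embed_dual_norm(np.random.default_rng(1).standard_normal((8, 8, 3)))
print(tokens.shape)  # (4, 8): a 2x2 grid of patches, 8-dim embeddings
```

In a real Vision Transformer the two LayerNorms would carry learned affine parameters; the placement relative to the projection is the point being illustrated.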
arXiv Open Access 2023
SSSegmenation: An Open Source Supervised Semantic Segmentation Toolbox Based on PyTorch

Zhenchao Jin

This paper presents SSSegmenation, an open source supervised semantic image segmentation toolbox based on PyTorch. The design of this toolbox is motivated by MMSegmentation, while it is easier to use because of fewer dependencies and achieves superior segmentation performance under a comparable training and testing setup. Moreover, the toolbox provides plenty of trained weights for popular and contemporary semantic segmentation methods, including Deeplab, PSPNet, OCRNet, MaskFormer, etc. We expect that this toolbox can contribute to the future development of semantic segmentation. Code and model zoos are available at https://github.com/SegmentationBLWX/sssegmentation/.

en cs.CV
arXiv Open Access 2023
Instruct-Video2Avatar: Video-to-Avatar Generation with Instructions

Shaoxu Li

We propose a method for synthesizing edited photo-realistic digital avatars with text instructions. Given a short monocular RGB video and text instructions, our method uses an image-conditioned diffusion model to edit one head image and uses the video stylization method to accomplish the editing of other head images. Through iterative training and update (three times or more), our method synthesizes edited photo-realistic animatable 3D neural head avatars with a deformable neural radiance field head synthesis method. In quantitative and qualitative studies on various subjects, our method outperforms state-of-the-art methods.

en cs.CV
arXiv Open Access 2023
A review of UAV Visual Detection and Tracking Methods

Raed Abu Zitar, Mohammad Al-Betar, Mohamad Ryalat et al.

This paper presents a review of techniques used for the detection and tracking of UAVs or drones. There are different techniques that depend on collecting measurements of the position, velocity, and image of the UAV and then using them in detection and tracking. Hybrid detection techniques are also presented. The paper is a quick reference for a wide spectrum of methods that are used in the drone detection process.

en cs.CV, eess.SP
arXiv Open Access 2023
When ChatGPT for Computer Vision Will Come? From 2D to 3D

Chenghao Li, Chaoning Zhang

ChatGPT and its improved variant GPT-4 have revolutionized the NLP field, with a single model solving almost all text-related tasks. However, no such model exists for computer vision, especially for 3D vision. This article first provides a brief overview of the progress of deep learning in the text, image, and 3D fields from the model perspective. It then discusses how AIGC evolves from the data perspective, and on that basis presents an outlook on the development of AIGC in 3D.

en cs.CV
arXiv Open Access 2022
Nuclei instance segmentation and classification in histopathology images with StarDist

Martin Weigert, Uwe Schmidt

Instance segmentation and classification of nuclei is an important task in computational pathology. We show that StarDist, a deep learning nuclei segmentation method originally developed for fluorescence microscopy, can be extended and successfully applied to histopathology images. This is substantiated by conducting experiments on the Lizard dataset, and through entering the Colon Nuclei Identification and Counting (CoNIC) challenge 2022, where our approach achieved the first spot on the leaderboard for the segmentation and classification task for both the preliminary and final test phase.

CrossRef Open Access 2020
Get Better: CV Clinic

Stephen Royle

Part of a series on the development of Early Career Researchers in the lab. The idea for the CV clinic came from the lab members themselves. We had previously had a session on creating a research profile, and a large part of that session was spent looking at CVs.

arXiv Open Access 2019
Contrastive Learning for Lifted Networks

Christopher Zach, Virginia Estellers

In this work we address supervised learning of neural networks via lifted network formulations. Lifted networks are interesting because they allow training on massively parallel hardware and assign energy models to discriminatively trained neural networks. We demonstrate that the training methods for lifted networks proposed in the literature have significant limitations and show how to use a contrastive loss to address those limitations. We demonstrate that this contrastive training approximates back-propagation in theory and in practice and that it is superior to the training objective regularly used for lifted networks.

en cs.CV
arXiv Open Access 2018
Dense Scene Flow from Stereo Disparity and Optical Flow

René Schuster, Oliver Wasenmüller, Didier Stricker

Scene flow describes 3D motion in a 3D scene. It can either be modeled as a single task, or reconstructed from the auxiliary tasks of stereo depth and optical flow estimation. While the second approach can achieve real-time performance by using real-time auxiliary methods, it typically produces non-dense results. In this presentation of a basic combination approach for scene flow estimation, we tackle the problem of non-density by interpolation.

en cs.CV
arXiv Open Access 2016
On Recognizing Transparent Objects in Domestic Environments Using Fusion of Multiple Sensor Modalities

Alexander Hagg, Frederik Hegger, Paul Plöger

Current object recognition methods fail on object sets that include diffuse, reflective, and transparent materials, although such objects are very common in domestic scenarios. We show that combining cues from multiple sensor modalities, including specular reflectance and unavailable depth information, allows us to capture a larger subset of household objects by extending a state-of-the-art object recognition method. This leads to a significant increase in recognition robustness over a larger set of commonly used objects.

en cs.CV
arXiv Open Access 2015
A Deep Siamese Network for Scene Detection in Broadcast Videos

Lorenzo Baraldi, Costantino Grana, Rita Cucchiara

We present a model that automatically divides broadcast videos into coherent scenes by learning a distance measure between shots. Experiments are performed to demonstrate the effectiveness of our approach by comparing our algorithm against recent proposals for automatic scene segmentation. We also propose an improved performance measure that aims to reduce the gap between numerical evaluation and expected results, and propose and release a new benchmark dataset.

en cs.CV, cs.MM
arXiv Open Access 2015
Where To Look: Focus Regions for Visual Question Answering

Kevin J. Shih, Saurabh Singh, Derek Hoiem

We present a method that learns to answer visual questions by selecting image regions relevant to the text-based query. Our method exhibits significant improvements in answering questions such as "what color," where it is necessary to evaluate a specific location, and "what room," where it selectively identifies informative image regions. Our model is tested on the VQA dataset which is the largest human-annotated visual question answering dataset to our knowledge.

en cs.CV
arXiv Open Access 2015
Effects of GIMP Retinex Filtering Evaluated by the Image Entropy

A. C. Sparavigna, R. Marazzato

GIMP Retinex filtering can be used to enhance images, with good results on foggy images, as recently discussed. Since this filter has several parameters that can be adjusted to optimize the output image, the choice of settings depends on the desired result. Here, as a criterion for optimizing the filtering parameters, we consider the maximization of the image entropy. Besides the Shannon entropy, we also use a generalized entropy.

en cs.CV
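The Retinex abstract above uses image entropy as the score to maximize when tuning filter parameters. A minimal NumPy sketch of the Shannon entropy of a grey-level histogram (8-bit range and bin count are illustrative assumptions, not the paper's exact setup):

```python
import numpy as np

def shannon_entropy(img, bins=256):
    """Shannon entropy of an image's grey-level histogram, in bits:
    H = -sum(p * log2(p)) over the non-empty histogram bins."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                      # drop empty bins: 0*log(0) := 0
    return float(-(p * np.log2(p)).sum())

flat = np.full((64, 64), 128)                              # constant image
noisy = np.random.default_rng(0).integers(0, 256, (64, 64))  # uniform noise
print(shannon_entropy(flat), shannon_entropy(noisy))
# 0.0 for the flat image; near 8 bits for the noisy one
```

A parameter sweep would then evaluate this score on each filtered output and keep the setting with the highest entropy.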
arXiv Open Access 2015
On the Relation between two Rotation Metrics

Thomas Ruland

In their work "Global Optimization through Rotation Space Search", Richard Hartley and Fredrik Kahl introduce a global optimization strategy for problems in geometric computer vision, based on rotation space search using a branch-and-bound algorithm. At its core, Lemma 2 of their publication is the foundation for a class of global optimization algorithms, which has been adopted over a wide range of problems in subsequent publications. This lemma relates a metric on rotations represented by rotation matrices to a metric on rotations in axis-angle representation. This work focuses on a proof of this relationship, based on Rodrigues' rotation theorem for the composition of rotations in axis-angle representation.

en cs.CV
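The abstract above relates a rotation-matrix metric to an axis-angle metric. As a hedged sketch of the quantities involved (paraphrasing the standard statements, not quoting the paper):

```latex
% Angular distance between rotations $R_1, R_2 \in SO(3)$
% (the rotation-matrix metric):
d_\angle(R_1, R_2) = \arccos\!\left(\frac{\operatorname{tr}\!\left(R_1^\top R_2\right) - 1}{2}\right)

% Lemma 2 of Hartley and Kahl (paraphrased): for axis-angle vectors
% $v_1, v_2$ with $R_i = \exp([v_i]_\times)$, the angular distance is
% bounded by the Euclidean distance of the axis-angle representations:
d_\angle(R_1, R_2) \le \lVert v_1 - v_2 \rVert_2
```

This bound is what makes branch-and-bound over a ball of axis-angle vectors sound: shrinking a box in axis-angle space controls the angular distance of every rotation it contains.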
arXiv Open Access 2014
Gabor Filter and Rough Clustering Based Edge Detection

Chandranath Adak

This paper introduces an efficient edge detection method based on the Gabor filter and rough clustering. The input image is smoothed by a Gabor function, and the concept of rough clustering is used to focus the edge detection with a soft computing approach. Hysteresis thresholding is used to obtain the final output, i.e. the edges of the input image. To show its effectiveness, the proposed technique is compared with several other edge detection methods.

en cs.CV, cs.AI
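The abstract above smooths the input with a Gabor function. A minimal NumPy sketch of a real-valued 2-D Gabor kernel (kernel size and parameter values are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def gabor_kernel(size=9, sigma=2.0, theta=0.0, lam=4.0, psi=0.0, gamma=0.5):
    """Real part of a 2-D Gabor filter: a Gaussian envelope times a
    cosine carrier oriented at angle theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)     # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr) ** 2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * xr / lam + psi)

k = gabor_kernel()
print(k.shape)  # (9, 9); the kernel peaks at its center
```

Convolving the image with a bank of such kernels at several orientations would give the smoothed responses on which the clustering and hysteresis thresholding then operate.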
arXiv Open Access 2014
Fuzzy and entropy facial recognition

Jaejun Lee, Taeseon Yun

This paper suggests an effective method for facial recognition using fuzzy theory and Shannon entropy. Combining fuzzy theory with Shannon entropy avoids the complications of other methods: Shannon entropy calculates the ratio of an element between faces, and fuzzy theory calculates the membership of that entropy relative to 1. More details are given in Section 3. The learning performance is better than that of other methods because it is very simple and needs only two data samples per learning step. By using factors that do not usually change during a person's life, the method will have high accuracy.

en cs.CV
arXiv Open Access 2014
Multidimensional Digital Smoothing Filters for Target Detection

Hugh L. Kennedy

Recursive, causal and non-causal, multidimensional digital filters, with infinite impulse responses and maximally flat magnitude and delay responses in the low-frequency region, are designed to negate correlated clutter and interference in the background and to accumulate power due to dim targets in the foreground of a surveillance sensor. Expressions relating mean impulse-response duration, frequency selectivity and group delay, to low-order linear-difference-equation coefficients are derived using discrete Laguerre polynomials and discounted least-squares regression, then verified through simulation.

arXiv Open Access 2012
Resolution Enhancement of Range Images via Color-Image Segmentation

Arnav Bhavsar

We report a method for super-resolution of range images. Our approach leverages the interpretation of the low-resolution (LR) image as sparse samples on the high-resolution (HR) grid. Based on this interpretation, we demonstrate that our recently reported approach, which reconstructs dense range images from sparse range data by exploiting a registered colour image, can be applied to the task of resolution enhancement of range images. Our method uses only a single colour image in addition to the range observation in the super-resolution process. Using the proposed approach, we demonstrate super-resolution results for large factors (e.g. 4) with good localization accuracy.

en cs.CV

Page 16 of 5824