Hasil untuk "cs.CV"

Menampilkan 20 dari ~116474 hasil · dari CrossRef, arXiv, DOAJ

JSON API
arXiv Open Access 2025
ViewPCL: a point cloud based active learning method for multi-view segmentation

Christian Hilaire, Sima Didari

We propose a novel active learning framework for multi-view semantic segmentation. This framework relies on a new score that measures the discrepancy between point cloud distributions generated from the extra geometrical information derived from the model's prediction across different views. Our approach results in a data efficient and explainable active learning method. The source code is available at https://github.com/chilai235/viewpclAL.

en cs.CV
arXiv Open Access 2023
Improving Knowledge Distillation via Transferring Learning Ability

Long Liu, Tong Li, Hui Cheng

Existing knowledge distillation methods generally use a teacher-student approach, where the student network solely learns from a well-trained teacher. However, this approach overlooks the inherent differences in learning abilities between the teacher and student networks, thus causing the capacity-gap problem. To address this limitation, we propose a novel method called SLKD.

en cs.CV
arXiv Open Access 2020
Supporting large-scale image recognition with out-of-domain samples

Christof Henkel, Philipp Singer

This article presents an efficient end-to-end method to perform instance-level recognition employed to the task of labeling and ranking landmark images. In a first step, we embed images in a high dimensional feature space using convolutional neural networks trained with an additive angular margin loss and classify images using visual similarity. We then efficiently re-rank predictions and filter noise utilizing similarity to out-of-domain images. Using this approach we achieved the 1st place in the 2020 edition of the Google Landmark Recognition challenge.

en cs.CV, cs.LG
arXiv Open Access 2020
Deep Convolutional Neural Network Based Facial Expression Recognition in the Wild

Hafiq Anas, Bacha Rehman, Wee Hong Ong

This paper describes the proposed methodology, data used and the results of our participation in the ChallengeTrack 2 (Expr Challenge Track) of the Affective Behavior Analysis in-the-wild (ABAW) Competition 2020. In this competition, we have used a proposed deep convolutional neural network (CNN) model to perform automatic facial expression recognition (AFER) on the given dataset. Our proposed model has achieved an accuracy of 50.77% and an F1 score of 29.16% on the validation set.

en cs.CV
arXiv Open Access 2020
TACO: Trash Annotations in Context for Litter Detection

Pedro F Proença, Pedro Simões

TACO is an open image dataset for litter detection and segmentation, which is growing through crowdsourcing. Firstly, this paper describes this dataset and the tools developed to support it. Secondly, we report instance segmentation performance using Mask R-CNN on the current version of TACO. Despite its small size (1500 images and 4784 annotations), our results are promising on this challenging problem. However, to achieve satisfactory trash detection in the wild for deployment, TACO still needs much more manual annotations. These can be contributed using: http://tacodataset.org/

en cs.CV
arXiv Open Access 2020
Background Matting

Hossein Javidnia, François Pitié

The current state of the art alpha matting methods mainly rely on the trimap as the secondary and only guidance to estimate alpha. This paper investigates the effects of utilising the background information as well as trimap in the process of alpha calculation. To achieve this goal, a state of the art method, AlphaGan is adopted and modified to process the background information as an extra input channel. Extensive experiments are performed to analyse the effect of the background information in image and video matting such as training with mildly and heavily distorted backgrounds. Based on the quantitative evaluations performed on Adobe Composition-1k dataset, the proposed pipeline significantly outperforms the state of the art methods using AlphaMatting benchmark metrics.

en cs.CV
arXiv Open Access 2020
Aggregation and Finetuning for Clothes Landmark Detection

Tzu-Heng Lin

Landmark detection for clothes is a fundamental problem for many applications. In this paper, a new training scheme for clothes landmark detection: $\textit{Aggregation and Finetuning}$, is proposed. We investigate the homogeneity among landmarks of different categories of clothes, and utilize it to design the procedure of training. Extensive experiments show that our method outperforms current state-of-the-art methods by a large margin. Our method also won the 1st place in the DeepFashion2 Challenge 2020 - Clothes Landmark Estimation Track with an AP of 0.590 on the test set, and 0.615 on the validation set. Code will be publicly available at https://github.com/lzhbrian/deepfashion2-kps-agg-finetune .

en cs.CV
arXiv Open Access 2020
Leveraging Temporal Information for 3D Detection and Domain Adaptation

Cunjun Yu, Zhongang Cai, Daxuan Ren et al.

Ever since the prevalent use of the LiDARs in autonomous driving, tremendous improvements have been made to the learning on the point clouds. However, recent progress largely focuses on detecting objects in a single 360-degree sweep, without extensively exploring the temporal information. In this report, we describe a simple way to pass such information in the learning pipeline by adding timestamps to the point clouds, which shows consistent improvements across all three classes.

en cs.CV
arXiv Open Access 2020
A Comprehensive Review of Skeleton-based Movement Assessment Methods

Tal Hakim

The raising availability of 3D cameras and dramatic improvement of computer vision algorithms in the recent decade, accelerated the research of automatic movement assessment solutions. Such solutions can be implemented at home, using affordable equipment and dedicated software. In this paper, we divide the movement assessment task into secondary tasks and explain why they are needed and how they can be addressed. We review the recent solutions for automatic movement assessment from skeleton videos, comparing them by their objectives, features, movement domains and algorithmic approaches. In addition, we discuss the status of the research on this topic in a high level.

arXiv Open Access 2019
Decomposing multispectral face images into diffuse and specular shading and biophysical parameters

Sarah Alotaibi, William A. P. Smith

We propose a novel biophysical and dichromatic reflectance model that efficiently characterises spectral skin reflectance. We show how to fit the model to multispectral face images enabling high quality estimation of diffuse and specular shading as well as biophysical parameter maps (melanin and haemoglobin). Our method works from a single image without requiring complex controlled lighting setups yet provides quantitatively accurate reconstructions and qualitatively convincing decomposition and editing.

en cs.CV
arXiv Open Access 2018
Viewpoint Estimation-Insights & Model

Gilad Divon, Ayellet Tal

This paper addresses the problem of viewpoint estimation of an object in a given image. It presents five key insights that should be taken into consideration when designing a CNN that solves the problem. Based on these insights, the paper proposes a network in which (i) The architecture jointly solves detection, classification, and viewpoint estimation. (ii) New types of data are added and trained on. (iii) A novel loss function, which takes into account both the geometry of the problem and the new types of data, is propose. Our network improves the state-of-the-art results for this problem by 9.8%.

en cs.CV
arXiv Open Access 2018
Advanced Image Processing for Astronomical Images

Diganta Misra, Sparsha Mishra, Bhargav Appasani

Image Processing in Astronomy is a major field of research and involves a lot of techniques pertaining to improve analyzing the properties of the celestial objects or obtaining preliminary inference from the image data. In this paper, we provide a comprehensive case study of advanced image processing techniques applied to Astronomical Galaxy Images for improved analysis, accurate inferences and faster analysis.

en cs.CV, cs.IR
arXiv Open Access 2013
Heat kernel coupling for multiple graph analysis

Michael M. Bronstein, Klaus Glashoff

In this paper, we introduce heat kernel coupling (HKC) as a method of constructing multimodal spectral geometry on weighted graphs of different size without vertex-wise bijective correspondence. We show that Laplacian averaging can be derived as a limit case of HKC, and demonstrate its applications on several problems from the manifold learning and pattern recognition domain.

en cs.CV
arXiv Open Access 2012
The watershed concept and its use in segmentation : a brief history

Fernand Meyer

The watershed is one of the most used tools in image segmentation. We present how its concept is born and developed over time. Its implementation as an algorithm or a hardwired device evolved together with the technology which allowed it. We present also how it is used in practice, first together with markers, and later introduced in a multiscale framework, in order to produce not a unique partition but a complete hierarchy.

en cs.CV
arXiv Open Access 2011
Multimodal diff-hash

Michael M. Bronstein

Many applications require comparing multimodal data with different structure and dimensionality that cannot be compared directly. Recently, there has been increasing interest in methods for learning and efficiently representing such multimodal similarity. In this paper, we present a simple algorithm for multimodal similarity-preserving hashing, trying to map multimodal data into the Hamming space while preserving the intra- and inter-modal similarities. We show that our method significantly outperforms the state-of-the-art method in the field.

en cs.CV

Halaman 14 dari 5824