Results for "eess.IV" - JURNALIN

arXiv Open Access 2025

Photometric Stereo using Gaussian Splatting and inverse rendering

Matéo Ducastel, David Tschumperlé, Yvain Quéau

Recent state-of-the-art algorithms in photometric stereo rely on neural networks and operate either through prior learning or inverse rendering optimization. Here, we revisit the problem of calibrated photometric stereo by leveraging recent advances in 3D inverse rendering using the Gaussian Splatting formalism. This allows us to parameterize the 3D scene to be reconstructed and optimize it in a more interpretable manner. Our approach incorporates a simplified model for light representation and demonstrates the potential of the Gaussian Splatting rendering engine for the photometric stereo problem.

en eess.IV, cs.AI
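
For orientation, calibrated photometric stereo in its classical Lambertian form can be sketched in a few lines. This is the textbook three-light baseline, not the Gaussian Splatting method the abstract above describes. The light directions, intensities, and function names are illustrative assumptions.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve_pixel(lights, intensities):
    """Recover albedo and unit normal for one pixel from three known lights.

    Lambertian model (an assumption): intensity_k = albedo * dot(light_k, normal).
    Solves lights @ g = intensities with Cramer's rule, where g = albedo * normal.
    """
    d = det3(lights)
    if abs(d) < 1e-12:
        raise ValueError("light directions must be linearly independent")
    g = []
    for col in range(3):
        m = [row[:] for row in lights]
        for r in range(3):
            m[r][col] = intensities[r]
        g.append(det3(m) / d)
    albedo = math.sqrt(sum(c * c for c in g))
    if albedo == 0.0:
        raise ValueError("pixel is black under all lights; normal is undefined")
    normal = [c / albedo for c in g]
    return albedo, normal


# Hypothetical calibrated unit light directions (one per image).
lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
# Intensities a surface with albedo 0.5 facing +z would produce under them.
intensities = [0.5, 0.4, 0.4]

albedo, normal = solve_pixel(lights, intensities)
print(albedo, normal)
```

With more than three lights the same system is usually solved per pixel in the least-squares sense; shadows and specular highlights violate the Lambertian assumption made here.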

arXiv Open Access 2025

Encoding of Demographic and Anatomical Information in Chest X-Ray-based Severe Left Ventricular Hypertrophy Classifiers

Basudha Pal, Rama Chellappa, Muhammad Umair

While echocardiography and MRI are clinical standards for evaluating cardiac structure, their use is limited by cost and accessibility. We introduce a direct classification framework that predicts severe left ventricular hypertrophy from chest X-rays, without relying on anatomical measurements or demographic inputs. Our approach achieves high AUROC and AUPRC, and employs Mutual Information Neural Estimation to quantify feature expressivity. This reveals clinically meaningful attribute encoding and supports transparent model interpretation.

en eess.IV, cs.AI

arXiv Open Access 2025

Image-based Facial Rig Inversion

Tianxiang Yang, Marco Volino, Armin Mustafa et al.

We present an image-based rig inversion framework that leverages two modalities: RGB appearance and RGB-encoded normal maps. Each modality is processed by an independent Hiera transformer backbone, and the extracted features are fused to regress 102 rig parameters derived from the Facial Action Coding System (FACS). Experiments on synthetic and scanned datasets demonstrate that the method generalizes to scanned data, producing faithful reconstructions.

en eess.IV

arXiv Open Access 2024

GAN with Skip Patch Discriminator for Biological Electron Microscopy Image Generation

Nishith Ranjon Roy, Nailah Rawnaq, Tulin Kaman

Generating realistic electron microscopy (EM) images has been a challenging problem due to their complex global and local structures. Isola et al. proposed pix2pix, a conditional Generative Adversarial Network (GAN), for general-purpose image-to-image translation, but it fails to generate realistic EM images. We propose a new discriminator architecture for the GAN that provides access to multiple patch sizes using skip patches and generates realistic EM images.

en eess.IV, cs.CV

arXiv Open Access 2023

Modified watershed approach for segmentation of complex optical coherence tomographic images

Maryam Viqar, Violeta Madjarova, Elena Stoykova

The watershed segmentation method has been used in various applications. However, because it tends to over-segment, it often underperforms in tasks where noise is dominant. In this study, Optical Coherence Tomography images were acquired and segmented to analyse the different regions of fluid-filled sacs in a lemon. A modified watershed algorithm is proposed that gives promising results for segmenting internal lemon structures.

en eess.IV, cs.CV

arXiv Open Access 2023

Aorta Segmentation from 3D CT in MICCAI SEG.A. 2023 Challenge

Andriy Myronenko, Dong Yang, Yufan He et al.

The aorta provides the main blood supply of the body. Screening of the aorta with imaging helps with early detection and monitoring of aortic disease. In this work, we describe our solution to the Segmentation of the Aorta (SEG.A. 2023) from 3D CT challenge. We use the automated segmentation method Auto3DSeg, available in MONAI. Our solution achieves an average Dice score of 0.920 and a 95th-percentile Hausdorff Distance (HD95) of 6.013, which ranks first and wins the SEG.A. 2023 challenge.

en eess.IV, cs.CV
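
The Dice score reported in the entry above measures the overlap between a predicted mask and a reference mask. A minimal sketch of the usual definition, 2|A ∩ B| / (|A| + |B|), on toy flattened masks; the masks and the function name are illustrative and are not challenge data or the authors' evaluation code.

```python
def dice_score(pred, target):
    """Dice coefficient between two equal-length binary masks (flat 0/1 sequences)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy flattened masks (hypothetical).
prediction = [0, 1, 1, 1, 0, 0]
reference = [0, 0, 1, 1, 1, 0]
print(dice_score(prediction, reference))
```

A 3D volume would be flattened to one sequence before calling this function. HD95 is a boundary-distance metric and is not covered by this sketch.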

arXiv Open Access 2023

Optimal operating MR contrast for brain ventricle parcellation

Savannah P. Hays, Lianrui Zuo, Yuli Wang et al.

Development of MR harmonization has enabled different contrast MRIs to be synthesized while preserving the underlying anatomy. In this paper, we use image harmonization to explore the impact of different T1-w MR contrasts on a state-of-the-art ventricle parcellation algorithm, VParNet. We identify an optimal operating contrast (OOC) for ventricle parcellation by showing that the performance of a pretrained VParNet can be boosted by adjusting contrast to the OOC.

en eess.IV

arXiv Open Access 2022

Grapes disease detection using transfer learning

Bhavya Jain, Sasikumar Periyasamy

Early and precise diagnosis of diseases in plants can help to develop an early treatment technique. Plant diseases degrade both the quantity and quality of crops, thus posing a threat to food security and resulting in huge economic losses. Traditionally, identification is performed manually, which is inaccurate, time-consuming, and expensive. This paper presents a simple and efficient model to detect grape leaf diseases using transfer learning. A pre-trained deep convolutional neural network is used as a feature extractor and a random forest as a classifier. The performance of the model is reported in terms of accuracy, precision, recall, and F1 score. A total of 1003 images from four different classes are used, and 91.66% accuracy is obtained.

en eess.IV
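
The four metrics named in the entry above can be computed from true and predicted labels as sketched below. The label names and toy data are hypothetical, and the abstract does not say how per-class scores were averaged, so this sketch simply reports them per class.

```python
def classification_metrics(y_true, y_pred):
    """Return overall accuracy and per-class (precision, recall, F1)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty label lists of equal length")
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    per_class = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[label] = (precision, recall, f1)
    return accuracy, per_class


# Toy labels (hypothetical class names, not the paper's dataset).
y_true = ["class_a", "class_a", "class_b", "class_b"]
y_pred = ["class_a", "class_b", "class_b", "class_b"]
accuracy, per_class = classification_metrics(y_true, y_pred)
print(accuracy, per_class)
```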

arXiv Open Access 2022

Segmentation of the Carotid Lumen and Vessel Wall using Deep Learning and Location Priors

Florian Thamm, Felix Denzinger, Leonhard Rist et al.

In this report we want to present our method and results for the Carotid Artery Vessel Wall Segmentation Challenge. We propose an image-based pipeline utilizing the U-Net architecture and location priors to solve the segmentation problem at hand.

en eess.IV, cs.CV

arXiv Open Access 2022

Reusing the H.264/AVC deblocking filter for efficient spatio-temporal prediction in video coding

Jürgen Seiler, André Kaup

The prediction step is a very important part of hybrid video codecs for effectively compressing video sequences. While existing video codecs predict either in the temporal or in the spatial direction only, the compression efficiency can be increased by a combined spatio-temporal prediction. In this paper we propose an algorithm for reusing the H.264/AVC deblocking filter for spatio-temporal prediction. Reusing this highly optimized filter allows for a very low computational complexity of this prediction mode, and an average rate reduction of up to 7.2% can be achieved.

en eess.IV, cs.MM

arXiv Open Access 2022

Track Before Detect of Low SNR Objects in a Sequence of Image Frames Using Particle Filter

Reza Rezaie

A multiple model track-before-detect (TBD) particle filter-based approach for detection and tracking of low signal-to-noise ratio (SNR) objects based on a sequence of image frames in the presence of noise and clutter is briefly studied in this letter. At each time instant, after an image frame is received, preprocessing is first applied to the image. Then, it is sent to the multiple model TBD particle filter for detection and tracking of an object. Performance of the approach is evaluated for detection and tracking of an object in different scenarios including noise and clutter.

en eess.IV, cs.AI

arXiv Open Access 2022

Efficient Feature Compression for Edge-Cloud Systems

Zhihao Duan, Fengqing Zhu

Optimizing computation in an edge-cloud system is an important yet challenging problem. In this paper, we consider a three-way trade-off between bit rate, classification accuracy, and encoding complexity in an edge-cloud image classification system. Our method includes a new training strategy and an efficient encoder architecture to improve the rate-accuracy performance. Our design can also be easily scaled according to different computation resources on the edge device, taking a step towards achieving a rate-accuracy-complexity (RAC) trade-off. Under various settings, our feature coding system consistently outperforms previous methods in terms of the RAC performance.

en eess.IV

arXiv Open Access 2021

Attention! Stay Focus!

Tu Vo

We develop a deep convolutional neural network (CNN) to deal with the blurry artifacts caused by camera defocus, using dual-pixel images. Specifically, we develop a double attention network which consists of attentional encoders, triple locals and global local modules to effectively extract and select the useful information from each image in the dual-pixel pair and synthesize the final output image. We demonstrate the effectiveness of the proposed deblurring algorithm in terms of both qualitative and quantitative aspects by evaluating on the test set in the NTIRE 2021 Defocus Deblurring using Dual-pixel Images Challenge. The code and trained models are available at https://github.com/tuvovan/ATTSF.

en eess.IV, cs.CV

arXiv Open Access 2021

Perineural Invasion Detection in Multiple Organ Cancer Based on Deep Convolutional Neural Network

Ramin Nateghi, Fattaneh Pourakpour

Perineural invasion (PNI) by malignant tumor cells has been reported as an independent indicator of poor prognosis in various cancers. Assessment of PNI in small nerves on glass slides is a labor-intensive task. In this study, we propose an algorithm to detect the perineural invasions in colon, prostate, and pancreas cancers based on a convolutional neural network (CNN).

en eess.IV, cs.CV

arXiv Open Access 2021

Related Work on Image Quality Assessment

Dongxu Wang

Due to the existence of quality degradations introduced in various stages of visual signal acquisition, compression, transmission and display, image quality assessment (IQA) plays a vital role in image-based applications. According to whether the reference image is complete and available, image quality evaluation can be divided into three categories: Full-Reference (FR), Reduced-Reference (RR), and Non-Reference (NR). This article will review the state-of-the-art image quality assessment algorithms.

en eess.IV, cs.CV

arXiv Open Access 2019

Single-pixel imaging with origami pattern construction

Wen-Kai Yu, Yi-Ming Liu

Single-pixel compressive imaging can recover images from a small number of measurements, offering many benefits, especially for scenes where array detection is unavailable. However, the widely used random patterns fail to explore internal relations between the patterns and the image reconstruction. Here we propose a single-pixel imaging method based on origami pattern construction with better imaging quality but less uncertainty in the pattern sequence. It can decrease the sampling ratio to as low as 0.5%, realizing super sub-Nyquist sampling. The experimental realization of this approach is a big step toward real-time compressive video applications.

en eess.IV, physics.optics
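
The sampling ratio quoted in the entry above is the number of single-pixel measurements divided by the number of image pixels. A small sketch of that arithmetic, assuming a hypothetical 128 x 128 image (the abstract does not state a resolution):

```python
import math


def measurements_needed(width, height, sampling_ratio):
    """Number of single-pixel measurements for a given sampling ratio.

    sampling_ratio is a fraction of the pixel count, e.g. 0.005 for 0.5%.
    The result is rounded up, since a measurement cannot be fractional.
    """
    if not 0.0 < sampling_ratio <= 1.0:
        raise ValueError("sampling_ratio must be in (0, 1]")
    return math.ceil(width * height * sampling_ratio)


# Hypothetical image size; 0.5% is the ratio quoted in the abstract.
print(measurements_needed(128, 128, 0.005))
```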

arXiv Open Access 2019

Adversarial Test on Learnable Image Encryption

MaungMaung AprilPyone, Warit Sirichotedumrong, Hitoshi Kiya

Data for deep learning should be protected to preserve privacy. Researchers have come up with the notion of learnable image encryption to satisfy this requirement. However, existing privacy-preserving approaches have never considered the threat of adversarial attacks. In this paper, we ran an adversarial test on learnable image encryption in five different scenarios. The results show different behaviors of the network in the variable key scenarios and suggest that learnable image encryption provides a certain level of adversarial robustness.

en eess.IV, cs.CV

arXiv Open Access 2019

Automated Mammogram Analysis with a Deep Learning Pipeline

Azam Hamidinekoo, Erika Denton, Reyer Zwiggelaar

Current deep learning based detection models tackle detection and segmentation tasks by casting them to pixel or patch-wise classification. To automate the initial mass lesion detection and segmentation on the whole mammographic images and avoid the computational redundancy of patch-based and sliding window approaches, the conditional generative adversarial network (cGAN) was used in this study. Subsequently, feeding the detected regions to the trained densely connected network (DenseNet), the binary classification of benign versus malignant was predicted. We used a combination of publicly available mammographic data repositories to train the pipeline, while evaluating the model's robustness toward our clinically collected repository, which was unseen to the pipeline.

en eess.IV

arXiv Open Access 2019

Performance Evaluation of Two-layer lossless HDR Coding using Histogram Packing Technique under Various Tone-mapping Operators

Hiroyuki Kobayashi, Hitoshi Kiya

We proposed a lossless two-layer HDR coding method using a histogram packing technique. The proposed method was demonstrated to outperform the normative JPEG XT encoder, under the use of the default tone-mapping operator. However, the performance under various tone-mapping operators has not been discussed. In this paper, we aim to compare the performance of the proposed method with that of the JPEG XT encoder under the use of various tone-mapping operators to clearly show the characteristic difference between them.

en eess.IV, cs.CV

arXiv Open Access 2018

Effect of High Frame Rates on 3D Video Quality of Experience

Amin Banitalebi-Dehkordi, Mahsa T. Pourazad, Panos Nasiopoulos

In this paper, we study the effect of 3D videos with increased frame rates on the viewers' quality of experience. We performed a series of subjective tests to seek the subjects' preferences among videos of the same scene at four different frame rates: 24, 30, 48, and 60 frames per second (fps). Results revealed that subjects clearly prefer higher frame rates. In particular, Mean Opinion Score (MOS) values associated with the 60 fps 3D videos were 55% greater than MOS values of the 24 fps 3D videos.

en eess.IV
