Accurate Detection of Mediastinal Lesions with nnDetection
Michael Baumgartner, Peter M. Full, Klaus H. Maier-Hein
The accurate detection of mediastinal lesions is one of the rarely explored medical object detection problems. In this work, we applied a modified version of the self-configuring method nnDetection to the Mediastinal Lesion Analysis (MELA) Challenge 2022. By incorporating automatically generated pseudo masks, training high-capacity models with large patch sizes in a multi-GPU setup, and adapting the augmentation scheme to reduce localization errors caused by rotations, our method achieved an excellent FROC score of 0.9922 at IoU 0.10 and 0.9880 at IoU 0.3 in our cross-validation experiments. The submitted ensemble ranked third in the competition with a FROC score of 0.9897 on the MELA challenge leaderboard.
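The IoU thresholds quoted above decide whether a predicted box counts as a detection of a ground-truth lesion. The following is a minimal sketch of that matching criterion for axis-aligned 3D boxes. The box layout (z1, y1, x1, z2, y2, x2) and the function names are assumptions made for illustration and are not taken from nnDetection or the challenge evaluation code.

```python
def box_iou_3d(a, b):
    """IoU of two axis-aligned 3D boxes given as (z1, y1, x1, z2, y2, x2)."""
    intersection = 1.0
    for lo_a, hi_a, lo_b, hi_b in zip(a[:3], a[3:], b[:3], b[3:]):
        overlap = min(hi_a, hi_b) - max(lo_a, lo_b)
        if overlap <= 0:
            return 0.0  # boxes are disjoint along this axis
        intersection *= overlap

    def volume(box):
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

    return intersection / (volume(a) + volume(b) - intersection)


def is_hit(pred, gt, iou_threshold=0.1):
    """A prediction counts as a hit if its IoU with the ground truth reaches the threshold."""
    return box_iou_3d(pred, gt) >= iou_threshold
```

A low threshold such as 0.1 tolerates coarse localization, while 0.3 demands tighter boxes, which is consistent with the slightly lower score reported at the stricter threshold.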
GPU-Net: Lightweight U-Net with more diverse features
Heng Yu, Di Fan, Weihu Song
Image segmentation is an important task in the medical image field and many methods based on convolutional neural networks (CNNs) have been proposed, among which U-Net and its variants show promising performance. In this paper, we propose the GP-module and GPU-Net based on U-Net, which can learn more diverse features by introducing the Ghost module and atrous spatial pyramid pooling (ASPP). Our method achieves better performance with more than 4 times fewer parameters and 2 times fewer FLOPs, which provides a new potential direction for future research. Our plug-and-play module can also be applied to existing segmentation methods to further improve their performance.
Field Distortion Model Based on Fredholm Integral
Yunqi Sun, Jianfeng Zhou
Field distortion is widespread in imaging systems. If it cannot be measured and corrected well, it will affect the accuracy of photogrammetry. To this end, we proposed a general field distortion model based on Fredholm integration, which uses a reconstructed high-resolution reference point spread function (PSF) and two sets of 4-variable polynomials to describe an imaging system. The model includes the point-to-point positional distortion from the object space to the image space and the deformation of the PSF so that we can measure an actual field distortion with arbitrary accuracy. We also derived the formula required for correcting the sampling effect of the image sensor. Through numerical simulation, we verify the effectiveness of the model and reconstruction algorithm. This model will have potential application in high-precision image calibration, photogrammetry and astrometry.
Sparse InSAR Data 3D Inpainting for Ground Deformation Detection Along the Rail Corridor
Odysseas Pappas, Juliet Biggs, David Bull
et al.
Monitoring of ground movement close to the rail corridor, such as that associated with landslips caused by ground subsidence and/or uplift, is of great interest for the detection and prevention of possible railway faults. Interferometric synthetic-aperture radar (InSAR) data can be used to measure ground deformation, but its use poses distinct challenges, as the data is highly sparse and can be particularly noisy. Here we present a scheme for processing and interpolating noisy, sparse InSAR data into a dense spatio-temporal stack, helping suppress noise and opening up the possibility for treatment with deep learning and other image processing methods.
High-resolution Coastline Extraction in SAR Images via MISP-GGD Superpixel Segmentation
Odysseas Pappas, Nantheera Anantrasirichai, Byron Adams
et al.
High accuracy coastline/shoreline extraction from SAR imagery is a crucial step in a number of maritime and coastal monitoring applications. We present a method based on image segmentation using the Generalised Gamma Mixture Model superpixel algorithm (MISP-GGD). MISP-GGD produces superpixels adhering with great accuracy to object edges in the image, such as the coastline. Unsupervised clustering of the generated superpixels according to textural and radiometric features allows for generation of a land/water mask from which a highly accurate coastline can be extracted. We present results of our proposed method on a number of SAR images of varying characteristics.
A direct geometry processing cartilage generation method using segmented bone models from datasets with poor cartilage visibility
Faezeh Moshfeghifar, Max Kragballe Nielsen, José D. Tascón-Vidarte
et al.
We present a method to generate subject-specific cartilage for the hip joint. Given bone geometry, our approach is agnostic to image modality, creates conforming interfaces, and is well suited for finite element analysis. We demonstrate our method on ten hip joints showing anatomical shape consistency and well-behaved stress patterns. Our method is fast and may assist in large-scale biomechanical population studies of the hip joint when manual segmentation or training data is not feasible.
Physics-Inspired Unsupervised Classification for Region of Interest in X-Ray Ptychography
Dergan Lin, Yi Jiang, Junjing Deng
et al.
X-ray ptychography allows for large fields to be imaged at high resolution at the cost of additional computational expense due to the large volume of data. Given limited information regarding the object, the acquired data often has an excessive amount of information that is outside the region of interest (RoI). In this work we propose a physics-inspired unsupervised learning algorithm to identify the RoI of an object using only diffraction patterns from a ptychography dataset before committing computational resources to reconstruction. Obtained diffraction patterns that are automatically identified as not within the RoI are filtered out, allowing efficient reconstruction by focusing only on important data within the RoI while preserving image quality.
Adaptive joint spatio-temporal error concealment for video communication
Jürgen Seiler, André Kaup
In recent years, video communication has found its application in an increasing number of environments. Unfortunately, some of them are error-prone and the risk of block losses caused by transmission errors is ubiquitous. To reduce the effects of these block losses, a new spatio-temporal error concealment algorithm is presented. The algorithm uses spatial as well as temporal information for extrapolating the signal into the lost areas. The extrapolation is carried out in two steps: first, a preliminary temporal extrapolation is performed, which is then used to generate a model of the original signal using the spatial neighborhood of the lost block. By applying the spatial refinement, a significantly higher concealment quality can be achieved, resulting in a gain of up to 5.2 dB in PSNR compared to the unrefined underlying pure temporal extrapolation.
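The gain above is stated in PSNR, so a short sketch of how such a gain could be computed may help. It assumes 8-bit images stored as nested lists and uses the standard PSNR definition. The function names `psnr` and `psnr_gain` are illustrative and are not taken from the paper.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    ref_pixels = [p for row in reference for p in row]
    test_pixels = [p for row in test for p in row]
    mse = sum((r - t) ** 2 for r, t in zip(ref_pixels, test_pixels)) / len(ref_pixels)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def psnr_gain(reference, temporal_only, refined):
    """PSNR improvement (dB) of the refined concealment over the temporal-only one."""
    return psnr(reference, refined) - psnr(reference, temporal_only)
```

In practice the comparison would be restricted to the concealed blocks rather than the whole frame; that choice is left open here.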
A Neural-Network-Based Convex Regularizer for Inverse Problems
Alexis Goujon, Sebastian Neumayer, Pakshal Bohra
et al.
The emergence of deep-learning-based methods to solve image-reconstruction problems has enabled a significant increase in reconstruction quality. Unfortunately, these new methods often lack reliability and explainability, and there is a growing interest in addressing these shortcomings while retaining the boost in performance. In this work, we tackle this issue by revisiting regularizers that are the sum of convex-ridge functions. The gradient of such regularizers is parameterized by a neural network that has a single hidden layer with increasing and learnable activation functions. This neural network is trained within a few minutes as a multistep Gaussian denoiser. The numerical experiments for denoising, CT, and MRI reconstruction show improvements over methods that offer similar reliability guarantees.
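The structure described above can be written out explicitly. The notation below (weights $w_i$ stacked into a matrix $W$, profile functions $\psi_i$, activations $\sigma_i$) is assumed for illustration and may differ from the paper's own symbols.

```latex
% Regularizer as a sum of convex ridge functions (notation assumed):
R(x) = \sum_{i} \psi_i\!\left(w_i^{\top} x\right),
\qquad \psi_i \colon \mathbb{R} \to \mathbb{R} \ \text{convex}.

% Its gradient is a single-hidden-layer network with activations \sigma_i = \psi_i':
\nabla R(x) = W^{\top} \sigma(W x),
\qquad \sigma(z) = \bigl(\sigma_i(z_i)\bigr)_i .

% Each \psi_i is convex exactly when \sigma_i = \psi_i' is nondecreasing,
% which is why increasing activation functions yield a convex regularizer.
```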
Spatially Exclusive Pasting: A General Data Augmentation for the Polyp Segmentation
Lei Zhou
Automated polyp segmentation technology plays an important role in diagnosing intestinal diseases, such as tumors and precancerous lesions. Previous works have typically trained convolution-based U-Net or Transformer-based neural network architectures with labeled data. However, the available public polyp segmentation datasets are too small to train the network sufficiently, suppressing each network's potential performance. To alleviate this issue, we propose a universal data augmentation technique to synthesize more data from the existing datasets. Specifically, we paste the polyp area into the same image's background in a spatially exclusive manner to obtain a combinatorial number of new images. Extensive experiments on various networks and datasets show that the proposed method enhances the data efficiency and achieves consistent improvements over baselines. Finally, we achieve a new state of the art in this task. We will release the code soon.
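The pasting step can be illustrated with a small sketch. It assumes that "spatially exclusive" means the pasted copy must not overlap the original polyp region and must stay inside the image. That reading, the rejection-sampling strategy, and the function name `paste_exclusive` are assumptions made for illustration, not the author's released implementation.

```python
import random


def paste_exclusive(image, mask, rng=None, max_tries=100):
    """Copy the masked region to a random offset that does not overlap the original.

    image and mask are 2D lists of equal size; mask holds 0/1 values.
    Returns (new_image, new_mask), or None if no valid offset was found.
    """
    rng = rng or random.Random()
    height, width = len(mask), len(mask[0])
    pixels = [(y, x) for y in range(height) for x in range(width) if mask[y][x]]
    if not pixels:
        return None  # nothing to paste
    occupied = set(pixels)
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    for _ in range(max_tries):
        # Offsets are limited so the shifted region stays inside the image.
        dy = rng.randint(-min(ys), height - 1 - max(ys))
        dx = rng.randint(-min(xs), width - 1 - max(xs))
        shifted = [(y + dy, x + dx) for y, x in pixels]
        if occupied.isdisjoint(shifted):  # exclusivity: no overlap with the original
            new_image = [row[:] for row in image]
            new_mask = [row[:] for row in mask]
            for (y, x), (ny, nx) in zip(pixels, shifted):
                new_image[ny][nx] = image[y][x]
                new_mask[ny][nx] = 1
            return new_image, new_mask
    return None
```

Because every valid offset yields a distinct image-mask pair, repeated calls with different random seeds produce many augmented samples from a single labeled image.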
NTIRE 2021 Challenge on Quality Enhancement of Compressed Video: Dataset and Study
Ren Yang, Radu Timofte
This paper introduces a novel dataset for video enhancement and studies the state-of-the-art methods of the NTIRE 2021 challenge on quality enhancement of compressed video. The challenge is the first NTIRE challenge in this direction, with three competitions, hundreds of participants and tens of proposed solutions. Our newly collected Large-scale Diverse Video (LDV) dataset is employed in the challenge. In our study, we analyze the proposed methods of the challenge and several methods in previous works on the proposed LDV dataset. We find that the NTIRE 2021 challenge advances the state-of-the-art of quality enhancement on compressed video. The proposed LDV dataset is publicly available at the homepage of the challenge: https://github.com/RenYang-home/NTIRE21_VEnh
Patch-Based Cervical Cancer Segmentation using Distance from Boundary of Tissue
Kengo Araki, Mariyo Rokutan-Kurata, Kazuhiro Terada
et al.
Pathological diagnosis is used for examining cancer in detail, and its automation is in demand. To automatically segment each cancer area, a patch-based approach is usually used since a Whole Slide Image (WSI) is huge. However, this approach loses the global information needed to distinguish between classes. In this paper, we utilized the Distance from the Boundary of tissue (DfB), which is global information that can be extracted from the original image. We experimentally applied our method to the three-class classification of cervical cancer, and found that it improved the total performance compared with the conventional method.
Self-Supervised Transformers for fMRI representation
Itzik Malkiel, Gony Rosenman, Lior Wolf
et al.
We present TFF, which is a Transformer framework for the analysis of functional Magnetic Resonance Imaging (fMRI) data. TFF employs a two-phase training approach. First, self-supervised training is applied to a collection of fMRI scans, where the model is trained to reconstruct 3D volume data. Second, the pre-trained model is fine-tuned on specific tasks, utilizing ground truth labels. Our results show state-of-the-art performance on a variety of fMRI tasks, including age and gender prediction, as well as schizophrenia recognition. Our code for the training, network architecture, and results is attached as supplementary material.
A Case for the Score: Identifying Image Anomalies using Variational Autoencoder Gradients
David Zimmerer, Jens Petersen, Simon A. A. Kohl
et al.
Through training on unlabeled data, anomaly detection has the potential to impact computer-aided diagnosis by outlining suspicious regions. Previous work on deep-learning-based anomaly detection has primarily focused on the reconstruction error. We argue instead that pixel-wise anomaly ratings derived from a Variational Autoencoder-based score approximation yield a theoretically better-grounded and more faithful estimate. In our experiments, Variational Autoencoder gradient-based rating outperforms other approaches on unsupervised pixel-wise tumor detection on the BraTS-2017 dataset with a ROC-AUC of 0.94.
Long Short-Term Memory Spatial Transformer Network
Shiyang Feng, Tianyue Chen, Hao Sun
Spatial transformer networks have been used in a layered form in conjunction with convolutional networks to enable the model to transform data spatially. In this paper, we propose combining a spatial transformer network (STN) with a Long Short-Term Memory network (LSTM) to classify digits in sequences formed by MNIST elements. This LSTM-STN model has a top-down attention mechanism that benefits from the LSTM layer, so that the STN layer can spatially transform the individual elements of a sequence independently, thus avoiding the distortion that may be caused when the entire sequence is spatially transformed. It also avoids the influence of this distortion on the subsequent classification by convolutional neural networks and achieves a single-digit error of 1.6% compared with 2.2% for a convolutional neural network with an STN layer.
Tensor-based subspace learning for tracking salt-dome boundaries
Zhen Wang, Zhiling Long, Ghassan AlRegib
The exploration of petroleum reservoirs has a close relationship with the identification of salt domes. To efficiently interpret salt-dome structures, in this paper, we propose a method that tracks salt-dome boundaries through seismic volumes using a tensor-based subspace learning algorithm. We build texture tensors by classifying image patches acquired along the boundary regions of seismic sections and contrast maps. With features extracted from the subspaces of texture tensors, we can identify tracked points in neighboring sections and label salt-dome boundaries by optimally connecting these points. Experimental results show that the proposed method outperforms the state-of-the-art salt-dome detection method by employing texture information and tensor-based analysis.
Extreme Image Coding via Multiscale Autoencoders With Generative Adversarial Optimization
Chao Huang, Haojie Liu, Tong Chen
et al.
We propose a MultiScale AutoEncoder (MSAE) based extreme image compression framework to offer visually pleasing reconstruction at a very low bitrate. Our method leverages the "priors" at different resolution scales to improve the compression efficiency, and also employs a generative adversarial network (GAN) with multiscale discriminators to perform the end-to-end trainable rate-distortion optimization. We compare the perceptual quality of our reconstructions with traditional compression algorithms using High-Efficiency Video Coding (HEVC) based Intra Profile and JPEG2000 on the public Cityscapes and ADE20K datasets, demonstrating a significant subjective quality improvement.
Regularized Inverse Holographic Volume Reconstruction for 3D Particle Tracking
Kevin Mallery, Jiarong Hong
The key limitations of digital inline holography (DIH) for particle tracking applications are poor longitudinal resolution, particle concentration limits, and case-specific processing. We utilize an inverse problem method with fused lasso regularization to perform full volumetric reconstructions of particle fields. By exploiting data sparsity in the solution and utilizing GPU processing, we dramatically reduce the computational cost usually associated with inverse reconstruction approaches. We demonstrate the accuracy of the proposed method using synthetic and experimental holograms. Finally, we present two practical applications (high concentration microorganism swimming and microfiber rotation) to extend the capabilities of DIH beyond what was possible using prior methods.
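For orientation, a fused-lasso-regularized inverse problem is commonly written in the form below. The symbols (forward operator $H$, recorded hologram $b$, finite-difference operator $D$, weights $\lambda_1, \lambda_2$) are assumed here for illustration, and the exact formulation used in the paper may differ.

```latex
% Generic fused lasso reconstruction (symbols assumed, not taken from the paper):
\hat{x} = \arg\min_{x} \;
    \tfrac{1}{2}\,\lVert H x - b \rVert_2^2
    + \lambda_1 \lVert x \rVert_1
    + \lambda_2 \lVert D x \rVert_1

% The \ell_1 term promotes a sparse particle field, while the term on the
% differences D x promotes piecewise-constant (smooth) structure in the volume.
```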
Single-pixel imaging with sampling distributed over simplex vertices
Krzysztof M. Czajkowski, Anna Pastuszczak, Rafal Kotynski
We propose a method for reducing experimental noise in single-pixel imaging by expressing the subsets of sampling patterns as linear combinations of vertices of a multidimensional regular simplex. This method may also be directly extended to complementary sampling. The modified sampling consists only of non-negative patterns. The measurement becomes theoretically independent of the ambient illumination, and in practice becomes more robust to the varying conditions of the experiment. We show how the optimal dimensionality of the simplex depends on the level of measurement noise. We present experimental results of single-pixel imaging using binarized sampling and a real-time reconstruction with the Fourier domain regularized inversion method.
Non-local Operational Anisotropic Diffusion Filter
Fábio A. M. Cappabianco, Petrus P. C. E. da Silva
High-frequency noise is present in several modalities of medical images. It originates from the acquisition process and may be related to the scanner configurations, the scanned body, or to other external factors. This way, prospective filters are an important tool to improve the image quality. In this paper, we propose a non-local weighted operational anisotropic diffusion filter and evaluate its effect on magnetic resonance images and on kV/CBCT radiotherapy images. We also provide a detailed analysis of non-local parameter settings. Results show that the new filter enhances previous local implementations and has potential application in radiotherapy treatments.