Results for "Photography"

arXiv Open Access 2026

Coarse-to-Fine Non-rigid Multi-modal Image Registration for Historical Panel Paintings based on Crack Structures

Aline Sindel, Andreas Maier, Vincent Christlein

Art technological investigations of historical panel paintings rely on acquiring multi-modal image data, including visual light photography, infrared reflectography, ultraviolet fluorescence photography, x-radiography, and macro photography. For a comprehensive analysis, the multi-modal images require pixel-wise alignment, which is still often performed manually. Multi-modal image registration can reduce this laborious manual work, is substantially faster, and enables higher precision. Due to varying image resolutions, huge image sizes, non-rigid distortions, and modality-dependent image content, registration is challenging. Therefore, we propose a coarse-to-fine non-rigid multi-modal registration method efficiently relying on sparse keypoints and thin-plate-splines. Historical paintings exhibit a fine crack pattern, called craquelure, on the paint layer, which is captured by all image systems and is well-suited as a feature for registration. In our one-stage non-rigid registration approach, we employ a convolutional neural network for joint keypoint detection and description based on the craquelure and a graph neural network for descriptor matching in a patch-based manner, and filter matches based on homography reprojection errors in local areas. For coarse-to-fine registration, we introduce a novel multi-level keypoint refinement approach to register mixed-resolution images up to the highest resolution. We created a multi-modal dataset of panel paintings with a high number of keypoint annotations, and a large test set comprising five multi-modal domains and varying image resolutions. The ablation study demonstrates the effectiveness of all modules of our refinement method. Our proposed approaches achieve the best registration results compared to competing keypoint and dense matching methods and refinement methods.
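The match-filtering step mentioned in the abstract above (discarding matches by homography reprojection error) can be illustrated with a minimal sketch. This is not the authors' implementation: the homography is assumed to be already estimated for a local area, and the names `project`, `filter_matches`, and the 3-pixel threshold are hypothetical choices made for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Match = Tuple[Point, Point]
Matrix3 = List[List[float]]


def project(h: Matrix3, p: Point) -> Point:
    """Apply a 3x3 homography to a 2D point, including the homogeneous divide."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


def filter_matches(h: Matrix3, matches: List[Match], max_error_px: float = 3.0) -> List[Match]:
    """Keep matches whose reprojection error under h is within the threshold.

    The threshold value is an assumption, not a figure from the paper.
    """
    kept = []
    for src, dst in matches:
        px, py = project(h, src)
        if math.hypot(px - dst[0], py - dst[1]) <= max_error_px:
            kept.append((src, dst))
    return kept


# Toy homography: a pure translation by (10, 5).
H_SHIFT = [[1.0, 0.0, 10.0], [0.0, 1.0, 5.0], [0.0, 0.0, 1.0]]
MATCHES = [
    ((0.0, 0.0), (10.0, 5.0)),   # exact match
    ((2.0, 3.0), (12.5, 8.0)),   # half a pixel off
    ((4.0, 4.0), (30.0, 30.0)),  # gross outlier
]
KEPT = filter_matches(H_SHIFT, MATCHES)
```

In a patch-based setting one such filter would presumably run per local area with its own homography, so that a global non-rigid distortion is approximated piecewise.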

en cs.CV

arXiv Open Access 2025

Fundus Image Quality Assessment and Enhancement: a Systematic Review

Heng Li, Haojin Li, Mingyang Ou et al.

As an affordable and convenient eye scan, fundus photography holds the potential for preventing vision impairment, especially in resource-limited regions. However, fundus image degradation is common under intricate imaging environments, affecting subsequent diagnosis and treatment. Consequently, image quality assessment (IQA) and enhancement (IQE) are essential for ensuring the clinical value and reliability of fundus images. While existing reviews offer some overview of this field, a comprehensive analysis of the interplay between IQA and IQE, along with their clinical deployment challenges, is lacking. This paper addresses this gap by providing a thorough review of fundus IQA and IQE algorithms, research advancements, and practical applications. We outline the fundamentals of the fundus photography imaging system and the associated interferences, and then systematically summarize the paradigms in fundus IQA and IQE. Furthermore, we discuss the practical challenges and solutions in deploying IQA and IQE, as well as offer insights into potential future research directions.

en eess.IV, cs.CV

arXiv Open Access 2025

Time-Aware Auto White Balance in Mobile Photography

Mahmoud Afifi, Luxi Zhao, Abhijith Punnappurath et al.

Cameras rely on auto white balance (AWB) to correct undesirable color casts caused by scene illumination and the camera's spectral sensitivity. This is typically achieved using an illuminant estimator that determines the global color cast solely from the color information in the camera's raw sensor image. Mobile devices provide valuable additional metadata, such as capture timestamp and geolocation, that offers strong contextual clues to help narrow down the possible illumination solutions. This paper proposes a lightweight illuminant estimation method that incorporates such contextual metadata, along with additional capture information and image colors, into a compact model (~5K parameters), achieving promising results, matching or surpassing larger models. To validate our method, we introduce a dataset of 3,224 smartphone images with contextual metadata collected at various times of day and under diverse lighting conditions. The dataset includes ground-truth illuminant colors, determined using a color chart, and user-preferred illuminants validated through a user study, providing a comprehensive benchmark for AWB evaluation.
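To make concrete how an estimated illuminant is typically used once it is available, here is a minimal sketch of a diagonal (per-channel gain) white-balance correction. It does not reproduce the paper's estimator; the function names and the sample values are assumptions chosen for illustration.

```python
from typing import Tuple

RGB = Tuple[float, float, float]


def wb_gains(illuminant: RGB) -> RGB:
    """Per-channel gains that map the illuminant color to neutral, green-normalized."""
    r, g, b = illuminant
    return (g / r, 1.0, g / b)


def apply_wb(pixel: RGB, illuminant: RGB) -> RGB:
    """Scale each raw channel by its gain (a diagonal correction)."""
    gain_r, gain_g, gain_b = wb_gains(illuminant)
    return (pixel[0] * gain_r, pixel[1] * gain_g, pixel[2] * gain_b)


# Hypothetical raw values: a gray patch lit by the estimated illuminant
# has the same channel ratios as the illuminant itself.
ILLUMINANT = (0.5, 1.0, 0.25)
GRAY_PATCH = (0.25, 0.5, 0.125)
BALANCED = apply_wb(GRAY_PATCH, ILLUMINANT)
```

After correction the gray patch has equal channel values, which is the behavior a good illuminant estimate should produce on achromatic surfaces.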

en cs.CV

DOAJ Open Access 2025

Enhancing U-Net Segmentation Accuracy Through Comprehensive Data Preprocessing

Talshyn Sarsembayeva, Madina Mansurova, Assel Abdildayeva et al.

The accurate segmentation of lung regions in computed tomography (CT) scans is critical for the automated analysis of lung diseases such as chronic obstructive pulmonary disease (COPD) and COVID-19. This paper focuses on enhancing the accuracy of U-Net segmentation models through a robust preprocessing pipeline. The pipeline includes CT image normalization, binarization to extract lung regions, and morphological operations to remove artifacts. Additionally, the proposed method applies region-of-interest (ROI) filtering to isolate lung areas effectively. The dataset preprocessing significantly improves segmentation quality by providing clean and consistent input data for the U-Net model. Experimental results demonstrate that the Intersection over Union (IoU) and Dice coefficient exceeded 0.95 on training datasets. This work highlights the importance of preprocessing as a standalone step for optimizing deep learning-based medical image analysis.
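The two metrics reported above, Intersection over Union (IoU) and the Dice coefficient, can be computed from binary masks as in the following sketch. The flattened-list mask representation and the convention of returning 1.0 for two empty masks are assumptions made here, not details from the paper.

```python
from typing import Sequence, Tuple


def iou_and_dice(pred: Sequence[int], target: Sequence[int]) -> Tuple[float, float]:
    """Return (IoU, Dice) for two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0
    iou = intersection / union
    dice = 2.0 * intersection / (pred_area + target_area)
    return iou, dice


# Toy masks: 2 overlapping pixels, 3 predicted, 3 in the ground truth.
PRED = [1, 1, 1, 0, 0, 0]
TARGET = [1, 1, 0, 1, 0, 0]
IOU, DICE = iou_and_dice(PRED, TARGET)
```

The two scores are monotonically related (Dice = 2·IoU / (1 + IoU)), so a Dice value above 0.95 implies an IoU above roughly 0.905, while the converse bound is looser.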

Photography, Computer applications to medicine. Medical informatics

DOAJ Open Access 2025

Semantic-Enhanced and Temporally Refined Bidirectional BEV Fusion for LiDAR–Camera 3D Object Detection

Xiangjun Qu, Kai Qin, Yaping Li et al.

In domains such as autonomous driving, 3D object detection is a key technology for environmental perception. By integrating multimodal information from sensors such as LiDAR and cameras, the detection accuracy can be significantly improved. However, the current multimodal fusion perception framework still suffers from two problems: first, due to the inherent physical limitations of LiDAR detection, the number of point clouds of distant objects is sparse, resulting in small target objects being easily overwhelmed by the background; second, the cross-modal information interaction is insufficient, and the complementarity and correlation between the LiDAR point cloud and the camera image are not fully exploited and utilized. Therefore, we propose a new multimodal detection strategy, Semantic-Enhanced and Temporally Refined Bidirectional BEV Fusion (SETR-Fusion). This method integrates three key components: the Discriminative Semantic Saliency Activation (DSSA) module, the Temporally Consistent Semantic Point Fusion (TCSP) module, and the Bilateral Cross-Attention Fusion (BCAF) module. The DSSA module fully utilizes image semantic features to capture more discriminative foreground and background cues; the TCSP module generates semantic LiDAR points and, after noise filtering, produces a more accurate semantic LiDAR point cloud; and the BCAF module’s cross-attention to camera and LiDAR BEV features in both directions enables strong interaction between the two types of modal information. SETR-Fusion achieves 71.2% mAP and 73.3% NDS values on the nuScenes test set, outperforming several state-of-the-art methods.

Photography, Computer applications to medicine. Medical informatics

DOAJ Open Access 2025

Envisioning Multispecies Tropical Futurity: Image-Making in North Maluku’s Frontier Zone

Danishwara Nathaniel

In recent years, the name Alfred Russel Wallace, the 19th-century British naturalist who co-conceptualized the theory of natural selection and authored the book documenting species diversity throughout Indonesia, titled The Malay Archipelago (1869), has regained significance in the place where he did his research: Ternate, North Maluku (the Moluccas), Eastern Indonesia. His legacy and icon are being reclaimed by local communities, inserting themselves as authors of the region’s future, one that is centered on multispecies stewardship. Based on visual anthropology ethnographic fieldwork spanning over 15 months since the beginning of 2021, the materials presented in this article explore the perspectives of local cultural activists/practitioners in making visible their concerns, advocating for the rich multispecies existence on their island acknowledged globally since Wallace. Working with a team of university students, photography clubs, journalists, and heritage and environmental activists based in Ternate, I engage with everyday socio-cultural and visual media practices that treat images as modes of address/redress mobilizing affective engagement and political effects (Spyer & Steedly, 2013), contesting possible tropical futurities. Discussing three sites of image-making (a mural, wildlife photography, and drone-afforded reportage), I argue that these practices play a crucial role in intervening in and shaping how this tropical region is imagined at various scales, globally and nationally. Oscillating between utopian and dystopian scenarios, the images produced make a demand for a more just and livable future across species.

Social Sciences

DOAJ Open Access 2025

Comparison of toric implantable collamer lens alignment accuracy: VERION image-guided system versus manual marking

Xiao-Ying He, Jun Wang, Min-Jie Yuan et al.

AIM: To compare the accuracy of manual marking versus an image-guided system for toric implantable collamer lens (TICL) implantation and to evaluate the short-term postoperative rotational stability of the TICL and the corneal surgically induced astigmatism (SIA) vector. METHODS: Retrospective analysis was conducted on eyes with TICL alignment achieved through manual marking (n=75) or VERION image-guided system-assisted marking (n=83). Each group was further classified into horizontal and vertical subgroups based on implant orientation. Additionally, patients were categorized into superior and temporal incision subgroups according to the position of the main corneal incision. The misalignment and rotational stability of the TICL were analyzed using slit-lamp anterior segment photography. Surgical predictability, efficacy, safety, and corneal SIA were also evaluated. RESULTS: Overall, TICL implantation with both the manual and the digital image-guided method achieved robust predictability, efficacy, and safety. The misalignment of the TICL was comparable between the manual and VERION groups (0.16°±3.97° vs 0.52°±5.59°, P=0.633), while a significant difference was observed in the absolute misalignment of the TICL between the two groups (3.02°±2.55° vs 4.28°±3.61°, P=0.043). There were no significant differences in the distribution of TICL misalignment between the manual and VERION groups or between horizontal and vertical implant orientation groups (P>0.05). Furthermore, different orientations of TICL placement did not show statistically significant differences in rotational stability (P=0.46). Statistically significant differences were found in anterior corneal SIA between the manual and VERION groups (0.46±0.27 vs 0.33±0.21 D, P=0.001), especially for the superior incision position (0.60±0.27 vs 0.35±0.23 D, P<0.0001). The anterior SIA exhibited a significant difference between superior and temporal incisions in the manual group (0.60±0.27 vs 0.35±0.20 D, P<0.0001).
CONCLUSION: This study indicates that, compared with the conventional manual marking method, the VERION digital image-guided system is safe and effective in TICL implantation. The digital system offers the advantage of minimizing corneal SIA compared to the manual method.
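The results above report both signed and absolute misalignment, and the two can diverge because clockwise and counterclockwise deviations cancel in a signed mean but not in an absolute one. The sketch below shows that arithmetic; the deviation values are invented for illustration and are not study data.

```python
from statistics import mean
from typing import List, Tuple


def signed_and_absolute_mean(deviations_deg: List[float]) -> Tuple[float, float]:
    """Return (mean signed deviation, mean absolute deviation) in degrees."""
    signed = mean(deviations_deg)
    absolute = mean([abs(d) for d in deviations_deg])
    return signed, absolute


# Hypothetical axis deviations in degrees: positive and negative errors
# of equal size cancel in the signed mean.
DEVIATIONS = [-4.0, 4.0, -3.0, 3.0]
SIGNED_MEAN, ABSOLUTE_MEAN = signed_and_absolute_mean(DEVIATIONS)
```

A signed mean near zero therefore indicates no systematic rotational bias, while the absolute mean reflects the typical size of the alignment error.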

Ophthalmology

S2 Open Access 1994

Improvements on Littmann's method of determining the size of retinal features by fundus photography

A. G. Bennett, A. Rudnicka, D. Edgar

628 citations en Mathematics, Medicine

S2 Open Access 1997

Real-time pickup method for a three-dimensional image based on integral photography.

F. Okano, H. Hoshino, J. Arai et al.

603 citations en Computer Science, Medicine

arXiv Open Access 2024

Plenoptic microscopy and photography from intensity correlations

Francesco V. Pepe, Francesco Di Lena, Augusto Garuccio et al.

We present novel methods to perform plenoptic imaging at the diffraction limit by measuring intensity correlations of light. The first method is oriented towards plenoptic microscopy, a promising technique which allows refocusing and depth-of-field enhancement, in post-processing, as well as scanning free 3D imaging. To overcome the limitations of standard plenoptic microscopes, we propose an adaptation of Correlation Plenoptic Imaging (CPI) to the working conditions of microscopy. We consider and compare different architectures of CPI microscopes, and discuss the improved robustness with respect to previous protocols against turbulence around the sample. The second method is based on measuring correlations between the images of two reference planes, arbitrarily chosen within the tridimensional scene of interest, providing an unprecedented combination of image resolution and depth of field. The results lead the way towards the realization of compact designs for CPI devices.

en physics.optics

arXiv Open Access 2024

Cross-Fundus Transformer for Multi-modal Diabetic Retinopathy Grading with Cataract

Fan Xiao, Junlin Hou, Ruiwei Zhao et al.

Diabetic retinopathy (DR) is a leading cause of blindness worldwide and a common complication of diabetes. As two different imaging tools for DR grading, color fundus photography (CFP) and infrared fundus photography (IFP) are highly-correlated and complementary in clinical applications. To the best of our knowledge, this is the first study that explores a novel multi-modal deep learning framework to fuse the information from CFP and IFP towards more accurate DR grading. Specifically, we construct a dual-stream architecture Cross-Fundus Transformer (CFT) to fuse the ViT-based features of two fundus image modalities. In particular, a meticulously engineered Cross-Fundus Attention (CFA) module is introduced to capture the correspondence between CFP and IFP images. Moreover, we adopt both the single-modality and multi-modality supervisions to maximize the overall performance for DR grading. Extensive experiments on a clinical dataset consisting of 1,713 pairs of multi-modal fundus images demonstrate the superiority of our proposed method. Our code will be released for public access.

en eess.IV, cs.AI

arXiv Open Access 2024

Deep Learning Ensemble for Predicting Diabetic Macular Edema Onset Using Ultra-Wide Field Color Fundus Image

Pengyao Qin, Arun J. Thirunavukarasu, Theodoros Arvanitis et al.

Diabetic macular edema (DME) is a severe complication of diabetes, characterized by thickening of the central portion of the retina due to accumulation of fluid. DME is a significant and common cause of visual impairment in diabetic patients. Center-involved DME (ci-DME) is the highest-risk form of the disease because fluid extends close to the fovea, which is responsible for sharp central vision. Earlier diagnosis or prediction of ci-DME may improve treatment outcomes. Here, we propose an ensemble method to predict ci-DME onset within a year, developed using synthetic ultra-wide field color fundus photography (UWF-CFP) images provided by the DIAMOND Challenge. We adopted a variety of baseline state-of-the-art classification networks including ResNet, DenseNet, EfficientNet, and VGG with the aim of enhancing model robustness. The best performing models were DenseNet-121, ResNet-152 and EfficientNet-b7, and these were assembled into a definitive predictive model. The final ensemble model demonstrates a strong performance with an Area Under Curve (AUC) of 0.7017, an F1 score of 0.6512, and an Expected Calibration Error (ECE) of 0.2057 when deployed on the synthetic test dataset. Results from our ensemble model were superior or comparable to previously recorded results in highly curated settings using conventional fundus photography or ultra-wide field fundus photography. Optimal sensitivity in previous studies (using humans or computers to diagnose) ranges from 67.3% to 98%, and specificity from 47.8% to 80%. Therefore, our method can be used safely and effectively in a range of settings and may facilitate earlier diagnosis, better treatment decisions, and improved prognostication in ci-DME.
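Of the three metrics quoted above, Expected Calibration Error (ECE) is the least familiar, so a minimal sketch of one common binary formulation follows. The exact definition used by the challenge is not stated in the abstract; the equal-width binning, the 10-bin default, and the use of max(p, 1 - p) as confidence are assumptions made here.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Binned ECE for binary predictions.

    probs are predicted probabilities of the positive class. Each sample's
    confidence is max(p, 1 - p); ECE is the sample-weighted mean gap between
    accuracy and average confidence per bin.
    """
    n = len(probs)
    bins: List[List[Tuple[float, float]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        confidence = max(p, 1.0 - p)
        correct = 1.0 if (p >= 0.5) == bool(y) else 0.0
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, correct))
    ece = 0.0
    for members in bins:
        if members:
            avg_conf = sum(c for c, _ in members) / len(members)
            accuracy = sum(k for _, k in members) / len(members)
            ece += len(members) / n * abs(accuracy - avg_conf)
    return ece


# Toy example: four predictions made with 90% confidence, three of them correct.
ECE_TOY = expected_calibration_error([0.9, 0.9, 0.1, 0.1], [1, 0, 0, 0])
```

Lower is better: a value of 0 means stated confidence matches observed accuracy in every bin, so the toy value of about 0.15 reflects a model that is 90% confident but only 75% accurate.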

en eess.IV, cs.AI

DOAJ Open Access 2024

Sonidegib reduced tumor burden in patients with advanced basal cell carcinoma in the BOLT trial: Long-term analysis results

Michael R. Migden, Aaron S. Farberg, James Spencer et al.

Background: The Hedgehog inhibitor sonidegib had durable efficacy and a manageable safety profile in patients with locally advanced basal cell carcinoma (laBCC) through 42 months of the BOLT trial. In this analysis, we characterize the effects of 200-mg and 800-mg sonidegib on tumors in patients in the BOLT trial. Methods: Tumors were assessed using color photography and magnetic resonance imaging (MRI) by central and investigator review at baseline; Weeks 5, 9, and 17 after the start of treatment; then every 8 weeks during the first year; and every 12 weeks thereafter. Results: In patients with laBCC receiving sonidegib, the decrease in best percentage change from baseline in target lesions was 92.3 % per central review and 96.7 % per investigator review in the 200-mg treatment arm, and 90.1 % per central review and 95.2 % per investigator in the 800-mg treatment arm. The kinetics of response to treatment appeared to influence tumor reduction, with patients responding within the first 3 months of treatment experiencing a greater decrease in tumor size over time than later responders. Additionally, patients whose best overall response to treatment was complete response or partial response had a generally longer duration of response compared with patients who had stable disease as best overall response. Tumor reduction and duration of response were greater when assessed by investigator review compared with central review. Conclusion: Treatment with sonidegib for up to 42 months substantially reduced tumors in patients with laBCC.

Neoplasms. Tumors. Oncology. Including cancer and carcinogens

DOAJ Open Access 2023

TTN‐FCN: A Tangut character classification framework by tree tensor network and fully connected neural network

Ziping Ma, Jinlin Ma

The classification of Tangut characters plays a significant role in Western Xia research, and yet it is still a great challenge to recognize Tangut characters accurately due to the lower inter-class similarity and smaller intra-class variation of Tangut character images. The main reason is that Tangut characters possess an extremely intricate construct, despite the emergence of some other character recognition methods. For this reason, the authors propose a novel framework for Tangut character classification, named tree tensor network-fully connected neural network (TTN-FCN), in which a TTN is embedded in a fully connected neural network. Firstly, Tangut images are encoded into quantum product states without entanglement in pre-processing. Then the TTN is adopted to contract the quantum product states to intermediate low-dimensional quantum states. Finally, the low-dimensional quantum states are input to the FCN to perform classification tasks. The model is evaluated on a Tangut character dataset that is constructed by scanning Tangut character-related documents and consists of 30,293 Tangut character images in 6077 categories. Experimental results show that TTN-FCN has a faster convergence speed and achieves a classification precision (AC) of 99.98% and a loss of 0.688% with the max batch size 2042, which outperforms 30 compared networks. Moreover, the proposed model can also be generalized to other character recognition tasks, which enhances its potential for cultural relic research and development.
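The first step described above, encoding an image as an unentangled quantum product state, can be sketched as follows. The abstract does not give the feature map, so the cosine/sine map used here is an assumption borrowed from common practice in tensor-network machine learning, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

LocalState = Tuple[float, float]


def encode_pixel(x: float) -> LocalState:
    """Map a pixel intensity in [0, 1] to a normalized two-component vector.

    Assumed feature map: (cos(pi*x/2), sin(pi*x/2)).
    """
    angle = math.pi * x / 2.0
    return (math.cos(angle), math.sin(angle))


def encode_image(pixels: Sequence[float]) -> List[LocalState]:
    """A product state is stored as one local vector per pixel, with no entanglement."""
    return [encode_pixel(x) for x in pixels]


def amplitude(state: Sequence[LocalState], bits: Sequence[int]) -> float:
    """Amplitude of one basis configuration: the product of the local components."""
    result = 1.0
    for local, bit in zip(state, bits):
        result *= local[bit]
    return result


# Three hypothetical pixels: black, white, and mid-gray.
STATE = encode_image([0.0, 1.0, 0.5])
```

Storing only the per-pixel vectors keeps the memory cost linear in the number of pixels, whereas the full state vector would have 2^N amplitudes for N pixels.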

Photography, Computer software

DOAJ Open Access 2023

Geographic atrophy: pathophysiology and current therapeutic strategies

Kalpana Rajanala, Farokh Dotiwala, Arun Upadhyay

Geographic atrophy (GA) is an advanced stage of age-related macular degeneration (AMD) that leads to gradual and permanent vision loss. GA is characterized by the loss of photoreceptor cells and retinal pigment epithelium (RPE), leading to distinct atrophic patches in the macula, which tends to increase with time. Patients with geographic atrophy often experience a gradual and painless loss of central vision, resulting in difficulty reading, recognizing faces, or performing activities that require detailed vision. The primary risk factor for the development of geographic atrophy is advanced age; however, other risk factors, such as family history, smoking, and certain genetic variations, are also associated with AMD. Diagnosis is usually based on a comprehensive eye examination, including imaging tests such as fundus photography, optical coherence tomography (OCT), and fluorescein angiography. Numerous clinical trials are underway, targeting identified molecular pathways associated with GA that are promising. Recent approvals of Syfovre and Izervay by the FDA for the treatment of GA provide hope to affected patients. Administration of these drugs resulted in slowing the rate of progression of the disease. Though these products provide treatment benefits to the patients, they do not offer a cure for geographic atrophy and are limited in efficacy. Considering these safety concerns and limited treatment benefits, there is still a significant need for therapeutics with improved efficacy, safety profiles, and better patient compliance. This comprehensive review discusses pathophysiology, currently approved products, their limitations, and potential future treatment strategies for GA.

Medicine

DOAJ Open Access 2023

Notes on camera obscura: three contemporary artistic perspectives on the path of photography

Filippo De Tomasi

In 1971, Rockne Krebs presented an immersive artwork composed of the optical phenomenon of the camera obscura, among other elements. Although this artist appears to be the first interested in this phenomenon, it was only from the 1990s onwards that the artistic practice of camera obscura as room installation became widespread. In the decades since, several authors have included it in their production—developing projects of a photographic nature through different approaches, mainly focusing on projected image or spectator participation. Through the method of media archaeology, it is possible to find three lines of contemporary artistic research on the phenomenon: camera obscura related to meta-photography in Abelardo Morell’s works; immersive installation pieces as proposed by Zoe Leonard; and projections of other worlds in the work of the artistic duo João Maria Gusmão and Pedro Paiva. This paper aims at a comparative study that intends to analyze these three paths, underlining the characteristics of some contemporary artworks and contextualizing their elements to the history and use of camera obscura in the photographic context.

Fine Arts, Visual arts

arXiv Open Access 2022

Cross-Field Transformer for Diabetic Retinopathy Grading on Two-field Fundus Images

Junlin Hou, Jilan Xu, Fan Xiao et al.

Automatic diabetic retinopathy (DR) grading based on fundus photography has been widely explored to benefit routine screening and early treatment. Existing research generally focuses on single-field fundus images, which have a limited field of view for precise eye examinations. In clinical applications, ophthalmologists adopt two-field fundus photography as the dominant tool, where the information from each field (i.e., macula-centric and optic disc-centric) is highly correlated and complementary, and benefits comprehensive decisions. However, automatic DR grading based on two-field fundus photography remains a challenging task due to the lack of publicly available datasets and effective fusion strategies. In this work, we first construct a new benchmark dataset (DRTiD) for DR grading, consisting of 3,100 two-field fundus images. To the best of our knowledge, it is the largest public DR dataset with diverse and high-quality two-field images. Then, we propose a novel DR grading approach, namely Cross-Field Transformer (CrossFiT), to capture the correspondence between two fields as well as the long-range spatial correlations within each field. Considering the inherent two-field geometric constraints, we particularly define aligned position embeddings to preserve relative consistent position in the fundus. Besides, we perform masked cross-field attention during interaction to filter the noisy relations between fields. Extensive experiments on our DRTiD dataset and a public DeepDRiD dataset demonstrate the effectiveness of our CrossFiT network. The new dataset and the source code of CrossFiT will be publicly available at https://github.com/FDU-VTS/DRTiD.

en cs.CV

DOAJ Open Access 2022

Investigarea criminalistica a accidentelor de trafic rutier // Forensic investigation of road traffic accidents

Petruț Ciobanu

The investigation of traffic accidents is carried out by the same bodies that have the task of gathering evidence regarding the existence of the crime, identifying the perpetrator and establishing his responsibility, in order to ascertain whether or not it is necessary to order the prosecution. The investigation of traffic accidents involves establishing elements or clarifying aspects that can serve to outline the legal nature of the event, to determine the criminal and civil liability of the person responsible for the accident, and to prevent future events of the same nature. The criminal investigation activity is carried out under the supervision of the prosecutor, who, after the investigation is completed, will proceed to verify the criminal investigation works, in order to rule on the legality and validity of obtaining and administering evidence and means of proof.

Law in general. Comparative and uniform law. Jurisprudence

DOAJ Open Access 2022

Spatial and long–short temporal attention correlation filters for visual tracking

Jianwei Zhao, Fuyuan Wei, NingNing Chen et al.

The discriminative correlation filter is one of the quick and effective approaches to visual tracking. However, discriminative correlation filter-based methods still suffer from many challenging problems caused by environmental interferences, such as the spatial boundary effect, temporal filter degradation, and tracking drift. A novel appearance optimisation model, named the spatial and long–short temporal attention model, has been proposed based on a new spatial regularisation term and a long–short temporal regularisation term for learning the correlation filter to localise the target. On the one hand, our proposed method can improve the classical spatial regularisation term with a new weight matrix to alleviate the spatial boundary effect. On the other hand, two new temporal regularisation terms are designed: a short temporal regularisation term and a long temporal regularisation term. The short temporal regularisation term can enlarge the inner connections of the current frame and all foregoing frames to improve the tracking performance, and the long temporal regularisation term can address the influence of occlusion by using the similarity between the initial filter and the current one. Extensive experiments on various benchmarks illustrate that our proposed tracker performs favourably against several related popular trackers.

Photography, Computer software

DOAJ Open Access 2022

SUBURBANIZATION OF BARNAUL CITY: A RETROSPECTIVE REVIEW AND DEVELOPMENT FORECAST

Ovcharova Diana A., Zhukovsky Roman S.

For the city of Barnaul, a study was carried out on the development of suburbanization, that is, the growth of a habitable suburban area with low-density building. Archival maps, up-to-date aerial photography, and materials of the master plan helped establish the periods of Barnaul suburbanization from the foundation of the city to the present time. Based on the research literature devoted to the current problems of suburbanization and selected scenarios of suburbanization according to the world urban planning experience, a forecast is made regarding further suburbanization of Barnaul over the next decade. It is assumed that in this large regional city with a compact planning structure, suburbanization will follow a hybrid “post-Soviet” scenario, in which low-intensity development of suburbanization will proceed along with “peripheral urbanization” partially replacing suburbanization through the construction of high-rise neighborhoods along the Zmeinogorsky and Pavlovsky tracts, mainly within the administrative boundaries of the city. An assumption is made about the limitations of the suburbanization processes in the city of Barnaul due to the exhaustion of migration and demographic resources compared to the situation in the 20th century.

Architecture
