Enhanced Forest Inventories for Habitat Mapping: A Case Study in the Sierra Nevada Mountains of California
Maxime Turgeon, Michael Kieser, Dwight Wolfe
et al.
Traditional forest inventory systems, originally designed to quantify merchantable timber volume, often lack the spatial resolution and structural detail required for modern multi-resource ecosystem management. In this manuscript, we present an Enhanced Forest Inventory (EFI) and demonstrate its utility for high-resolution wildlife habitat mapping. The project area covers 270,000 acres of the Eldorado National Forest in California's Sierra Nevada. By integrating 118 ground-truth Forest Inventory and Analysis (FIA) plots with multi-modal remote sensing data (LiDAR, aerial photography, and Sentinel-2 satellite imagery), we developed predictive models for key forest attributes. Our methodology employed a two-tier segmentation approach, partitioning the landscape into approximately 575,000 reporting units with an average size of 0.5 acre to capture forest heterogeneity. We utilized an Elastic-Net Regression framework and automated feature selection to relate remote sensing metrics to ground-measured variables such as basal area, stems per acre, and canopy cover. These physical metrics were translated into functional habitat attributes to evaluate suitability for two focal species: the California Spotted Owl (Strix occidentalis occidentalis) and the Pacific Fisher (Pekania pennanti). Our analysis identified 25,630 acres of nesting and 26,622 acres of foraging habitat for the owl, and 25,636 acres of likely habitat for the fisher based on structural requirements like large-diameter trees and high canopy closure. The results demonstrate that EFIs provide a critical bridge between forestry and conservation ecology, offering forest managers a spatially explicit tool to monitor ecosystem health and manage vulnerable species in complex environments.
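As a rough illustration of the elastic-net idea mentioned in this abstract, the sketch below evaluates the standard elastic-net objective (squared-error fit plus a mixed L1/L2 penalty) in plain Python. The function name, the toy data, and the parameter names `lam` and `alpha` are assumptions made for the example; the abstract does not describe the authors' software or settings.

```python
def elastic_net_objective(X, y, beta, lam, alpha):
    """Standard elastic-net loss: half mean squared error plus a mixed L1/L2 penalty.

    X     : list of rows, one per plot (hypothetical remote sensing metrics)
    y     : ground-measured values, one per plot (e.g. basal area)
    beta  : one coefficient per metric
    lam   : overall penalty strength
    alpha : mixing weight; 1.0 is a pure lasso penalty, 0.0 a pure ridge penalty
    """
    n = len(y)
    residuals = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    fit = sum(r * r for r in residuals) / (2 * n)
    l1 = sum(abs(b) for b in beta)   # encourages exact zeros (feature selection)
    l2 = sum(b * b for b in beta)    # shrinks correlated coefficients together
    return fit + lam * (alpha * l1 + 0.5 * (1 - alpha) * l2)


# Toy data (hypothetical): two plots, two metrics, a perfect fit.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 2.0]
print(elastic_net_objective(X, y, beta=[1.0, 2.0], lam=0.1, alpha=0.5))
```

The L1 term is what lets a fit of this kind drop uninformative remote sensing metrics entirely, which is one common reason to pair it with automated feature selection.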
TubeMLLM: A Foundation Model for Topology Knowledge Exploration in Vessel-like Anatomy
Yaoyu Liu, Minghui Zhang, Xin You
et al.
Modeling medical vessel-like anatomy is challenging due to its intricate topology and sensitivity to dataset shifts. Consequently, task-specific models often suffer from topological inconsistencies, including artificial disconnections and spurious merges. Motivated by the promise of multimodal large language models (MLLMs) for zero-shot generalization, we propose TubeMLLM, a unified foundation model that couples structured understanding with controllable generation for medical vessel-like anatomy. By integrating topological priors through explicit natural language prompting and aligning them with visual representations in a shared-attention architecture, TubeMLLM significantly enhances topology-aware perception. Furthermore, we construct TubeMData, a pioneering multimodal benchmark comprising comprehensive topology-centric tasks, and introduce an adaptive loss weighting strategy to emphasize topology-critical regions during training. Extensive experiments on fifteen diverse datasets demonstrate the superiority of our approach. Quantitatively, TubeMLLM achieves state-of-the-art out-of-distribution performance, substantially reducing global topological discrepancies on color fundus photography (decreasing the $β_{0}$ number error from 37.42 to 8.58 compared to baselines). Notably, TubeMLLM exhibits exceptional zero-shot cross-modality transfer ability on unseen X-ray angiography, achieving a Dice score of 67.50% while significantly reducing the $β_{0}$ error to 1.21. TubeMLLM also maintains robustness against degradations such as blur, noise, and low resolution. Furthermore, in topology-aware understanding tasks, the model achieves 97.38% accuracy in evaluating mask topological quality, significantly outperforming standard vision-language baselines.
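To make the $β_{0}$ error concrete: $β_{0}$ is the number of connected components of a mask, so a broken vessel raises it and a spurious merge lowers it. The sketch below counts 8-connected components in a small binary grid and takes the absolute difference between prediction and ground truth. This is one simple reading of the metric; the abstract does not specify the connectivity or whether the error is computed per patch, so both are assumptions.

```python
from collections import deque


def count_components(mask):
    """Count 8-connected foreground components (beta_0) in a binary grid."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # New component found: flood-fill it with a breadth-first search.
                count += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and mask[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
    return count


def beta0_error(pred, truth):
    """Absolute difference in component counts between two masks."""
    return abs(count_components(pred) - count_components(truth))


# Toy example: the prediction breaks one vessel into two pieces.
truth = [[1, 1, 1, 1, 1],
         [0, 0, 0, 0, 0]]
pred = [[1, 1, 0, 1, 1],
        [0, 0, 0, 0, 0]]
print(beta0_error(pred, truth))  # prints 1
```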
Dismantling the master’s house with his own tools: the Wrinkles Jewelry Collection’s defiance of visual agism
Shlomit Aharoni Lir, Liat Ayalon
Abstract The prevailing antiaging beauty industry equates beauty and attractiveness with youth, especially in the case of women. Efforts to artistically challenge common conceptions of women’s representations are often based on undermining the mindset of the Western socio-economic system, with its emphasis on consumption and commodification. Reflecting Audre Lorde’s thought that “the master’s tools will never dismantle the master’s house,” various feminist artists have challenged oppressive beauty conventions through works that confront and undermine common representations of femininity and beauty in art, popular culture, and advertisements. This study explores this topic by examining how Noa Zilberman’s The Wrinkles Jewelry Collection, which is composed of photographs and video art of gold-plated brass objects that symbolize wrinkles, confronts the prevailing youth beauty ideal. The research employs a formal visual analysis of Wrinkles. We analyzed visual data, including still photography and video art, to identify patterns and themes that explore how the artist utilizes tools reflective of the commercialized beauty industry to defy the youth beauty ideal. The analysis allowed us to theorize a model of artistic ability to contest common oppressive hegemonic axioms that associate beauty with youth, while finding beauty in a place that has been marked out for shame – that is, in signs of aging.
History of scholarship and learning. The humanities, Social Sciences
Vision-threatening complications of injection sclerotherapy: case report, literature review, and FAERS database analysis
Dehai Liu, Xiaona Wang, Hongliang Dou
Abstract
Background:
Injection sclerotherapy using sclerosants such as polidocanol has been widely employed for managing vascular disorders including chronic venous diseases and hemangiomas. Although sclerotherapy is considered a minimally invasive and generally safe procedure, rare but vision-threatening ocular complications have been reported. We present a unique scenario of progressive ophthalmic artery occlusion (OAO) associated with foamed polidocanol injection leading to irreversible blindness in a pediatric patient.
Case presentation:
A 13-year-old male with a history of facial hemangioma underwent intralesional injection of foamed 1% polidocanol under ultrasound guidance at a dental clinic. He experienced transient monocular blindness in his left eye immediately after injection lasting for approximately 30 min, while best-corrected visual acuity at presentation recovered to 20/20. Despite systemic corticosteroids and anticoagulation, his vision deteriorated to no light perception within 1 month, accompanied by blepharoptosis and cutaneous necrosis. Multimodal imaging including fundus photography, optical coherence tomography, and fluorescein angiography demonstrated progressive retinal vascular occlusion, and Doppler ultrasonography at 1 month identified absent flow in the ophthalmic and central retinal arteries confirming OAO. Furthermore, a global pharmacovigilance analysis revealed that ocular complications represented only 3.9% of reported adverse events for sclerosing agents but were disproportionately severe, with 45.7% classified as death, life-threatening, disabling, or requiring hospitalization; permanent blindness occurred in 6.2% of the total cases.
Conclusions:
This case underscores the potential for catastrophic ocular complications after polidocanol sclerotherapy. Given the limited therapeutic efficacy once iatrogenic OAO occurs, we emphasize caution when performing sclerosant injections particularly in the risky regions.
DiffCamera: Arbitrary Refocusing on Images
Yiyang Wang, Xi Chen, Xiaogang Xu
et al.
The depth-of-field (DoF) effect, which introduces aesthetically pleasing blur, enhances photographic quality but is fixed and difficult to modify once the image has been created. This becomes problematic when the applied blur is undesirable (e.g., the subject is out of focus). To address this, we propose DiffCamera, a model that enables flexible refocusing of a created image conditioned on an arbitrary new focus point and a blur level. Specifically, we design a diffusion transformer framework for refocusing learning. However, the training requires pairs of data with different focus planes and bokeh levels in the same scene, which are hard to acquire. To overcome this limitation, we develop a simulation-based pipeline to generate large-scale image pairs with varying focus planes and bokeh levels. With the simulated data, we find that training with only a vanilla diffusion objective often leads to incorrect DoF behaviors due to the complexity of the task. This requires a stronger constraint during training. Inspired by the photographic principle that photos of different focus planes can be linearly blended into a multi-focus image, we propose a stacking constraint during training to enforce precise DoF manipulation. This constraint enhances model training by imposing physically grounded refocusing behavior that the focusing results should be faithfully aligned with the scene structure and the camera conditions so that they can be combined into the correct multi-focus image. We also construct a benchmark to evaluate the effectiveness of our refocusing model. Extensive experiments demonstrate that DiffCamera supports stable refocusing across a wide range of scenes, providing unprecedented control over DoF adjustments for photography and generative AI applications.
UOPSL: Unpaired OCT Predilection Sites Learning for Fundus Image Diagnosis Augmentation
Zhihao Zhao, Yinzheng Zhao, Junjie Yang
et al.
Significant advancements in AI-driven multimodal medical image diagnosis have led to substantial improvements in ophthalmic disease identification in recent years. However, acquiring paired multimodal ophthalmic images remains prohibitively expensive. While fundus photography is simple and cost-effective, the limited availability of OCT data and inherent modality imbalance hinder further progress. Conventional approaches that rely solely on fundus or textual features often fail to capture fine-grained spatial information, as each imaging modality provides distinct cues about lesion predilection sites. In this study, we propose a novel unpaired multimodal framework, UOPSL, that utilizes extensive OCT-derived spatial priors to dynamically identify predilection sites, enhancing fundus image-based disease recognition. Our approach bridges unpaired fundus and OCTs via extended disease text descriptions. Initially, we employ contrastive learning on a large corpus of unpaired OCT and fundus images while simultaneously learning the predilection sites matrix in the OCT latent space. Through extensive optimization, this matrix captures lesion localization patterns within the OCT feature space. During the fine-tuning or inference phase of the downstream classification task based solely on fundus images, where paired OCT data is unavailable, we eliminate OCT input and utilize the predilection sites matrix to assist in fundus image classification learning. Extensive experiments conducted on 9 diverse datasets across 28 critical categories demonstrate that our framework outperforms existing benchmarks.
Impact of Radio Frequency Power on Columnar and Filamentary Modes in Atmospheric Pressure Very Low Frequency Plasma within Pores
Haozhe Wang, Yu Zhang, Jie Cui
et al.
The impact of radio frequency (RF) power on columnar and filamentary modes of very low frequency (VLF) plasma within pores is investigated in this work. The 12.5 kHz VLF discharge under various RF powers (13.56 MHz) was analyzed using optical photography and current-voltage measurements. Two-dimensional electron densities were derived using optical emission spectroscopy combined with collisional-radiative modeling. It is found that RF power and very low frequency voltage (VVLF) significantly influence the plasma and its discharge modes within the 200 μm pore. Under low VVLF conditions, the plasma is more intense within the pore, and the discharge mode is columnar discharge. With increasing RF power, the reciprocal motion of electrons counteracts the local enhancement effect of columnar discharge, the discharge transforms into RF discharge, the pore is completely wrapped by the sheath, and the plasma inside is gradually quenched. Under high VVLF conditions, the electron density within the pore is low and the discharge mode is filamentary discharge. Introducing RF power first reduces plasma intensity within the pores. As RF power increases, more ion trapping in the pore increases the field strength distortion and enhances the plasma intensity inside the pore; this enhancement effect becomes more pronounced with increasing RF power. In addition, the above effects were observed for all pore widths from 100 μm to 1000 μm. These findings provide key insights for controlling plasma in pores and offer new methodologies for plasma technology applications.
en
physics.plasm-ph, physics.app-ph
F2IDiff: Real-world Image Super-resolution using Feature to Image Diffusion Foundation Model
Devendra K. Jangid, Ripon K. Saha, Dilshan Godaliyadda
et al.
With the advent of Generative AI, Single Image Super-Resolution (SISR) quality has seen substantial improvement, as the strong priors learned by Text-2-Image Diffusion (T2IDiff) Foundation Models (FM) can bridge the gap between High-Resolution (HR) and Low-Resolution (LR) images. However, flagship smartphone cameras have been slow to adopt generative models because strong generation can lead to undesirable hallucinations. For substantially degraded LR images, as seen in academia, strong generation is required and hallucinations are more tolerable because of the wide gap between LR and HR images. In contrast, in consumer photography, the LR image has substantially higher fidelity, requiring only minimal hallucination-free generation. We hypothesize that generation in SISR is controlled by the stringency and richness of the FM's conditioning feature. First, text features are high-level features, which often cannot describe subtle textures in an image. Additionally, smartphone LR images are at least $12MP$, whereas SISR networks built on T2IDiff FM are designed to perform inference on much smaller images ($<1MP$). As a result, SISR inference has to be performed on small patches, which often cannot be accurately described by text features. To address these shortcomings, we introduce an SISR network built on an FM with lower-level feature conditioning, specifically DINOv2 features, which we call a Feature-to-Image Diffusion (F2IDiff) Foundation Model (FM). Lower-level features provide stricter conditioning while being rich descriptors of even small patches.
LMOD+: A Comprehensive Multimodal Dataset and Benchmark for Developing and Evaluating Multimodal Large Language Models in Ophthalmology
Zhenyue Qin, Yang Liu, Yu Yin
et al.
Vision-threatening eye diseases pose a major global health burden, with timely diagnosis limited by workforce shortages and restricted access to specialized care. While multimodal large language models (MLLMs) show promise for medical image interpretation, advancing MLLMs for ophthalmology is hindered by the lack of comprehensive benchmark datasets suitable for evaluating generative models. We present a large-scale multimodal ophthalmology benchmark comprising 32,633 instances with multi-granular annotations across 12 common ophthalmic conditions and 5 imaging modalities. The dataset integrates imaging, anatomical structures, demographics, and free-text annotations, supporting anatomical structure recognition, disease screening, disease staging, and demographic prediction for bias evaluation. This work extends our preliminary LMOD benchmark with three major enhancements: (1) nearly 50% dataset expansion with substantial enlargement of color fundus photography; (2) broadened task coverage including binary disease diagnosis, multi-class diagnosis, severity classification with international grading standards, and demographic prediction; and (3) systematic evaluation of 24 state-of-the-art MLLMs. Our evaluations reveal both promise and limitations. Top-performing models achieved ~58% accuracy in disease screening under zero-shot settings, and performance remained suboptimal for challenging tasks like disease staging. We will publicly release the dataset, curation pipeline, and leaderboard to potentially advance ophthalmic AI applications and reduce the global burden of vision-threatening diseases.
A survey of feature matching methods
Qian Huang, Xiaotong Guo, Yiming Wang
et al.
Abstract Feature matching plays a crucial role in computer vision, with applications in visual localization, simultaneous localization and mapping (SLAM), image stitching, and more. It establishes correspondences between sets of feature points from multiple images, enabling various tasks. Over the years, feature matching has witnessed significant development, with an increasing number of methods being applied. However, different methods exhibit different degrees of applicability in different scenarios and requirements due to their different rationales. To cope with these issues, a comprehensive analysis and comparison of matching methods are essential. Existing reviews often lack coverage of deep learning models and focus more on feature detection and description, neglecting the matching process. This survey investigates feature detection, description, and matching techniques within the feature‐based image‐matching pipeline. Representative methods, their mechanisms, and application scenarios are also briefly introduced. In addition, comprehensive evaluations of classical and state‐of‐the‐art methods are conducted through extensive experiments on representative datasets. Particularly, matching‐based applications are compared to fully demonstrate the advantages of the methods. Lastly, this survey highlights current problems and development directions in matching methods, serving as a reference for researchers in the field.
Photography, Computer software
Point Projection Mapping System for Tracking, Registering, Labeling, and Validating Optical Tissue Measurements
Lianne Feenstra, Stefan D. van der Stel, Marcos Da Silva Guimaraes
et al.
The validation of newly developed optical tissue-sensing techniques for tumor detection during cancer surgery requires an accurate correlation with the histological results. Additionally, such an accurate correlation facilitates precise data labeling for developing high-performance machine learning tissue-classification models. In this paper, a newly developed Point Projection Mapping system is introduced, which allows non-destructive tracking of the measurement locations on tissue specimens. Additionally, a framework for accurate registration, validation, and labeling with the histopathology results is proposed and validated on a case study. The proposed framework provides a more robust and accurate method for the tracking and validation of optical tissue-sensing techniques, which saves time and resources compared to the available conventional techniques.
Photography, Computer applications to medicine. Medical informatics
Multi‐stage image inpainting using improved partial convolutions
Cheng Li, Dan Xu, Hao Zhang
Abstract In recent years, deep learning models have dramatically influenced image inpainting. However, many existing studies still suffer from over‐smoothed or blurred textures when missing regions are large or contain rich visual details. To restore textures at a fine‐grained level, a multi‐stage inpainting approach is proposed, which applies a series of partial inpainting modules as well as a progressive inpainting module to inpaint missing areas successively from their boundaries to the centre. To reduce artifacts such as blurriness, the partial convolutions are improved so that a convolution kernel must contain more than a certain proportion of known pixels. Towards photorealistic inpainting results, the intermediate outputs from each stage are used to compute the loss. Finally, to facilitate the training process, a multi‐step training scheme is designed that progressively adds inpainting modules to optimize the model. Experiments show that this method outperforms current leading techniques on the publicly available datasets CelebA, Places2 and Paris StreetView.
Photography, Computer software
Image‐based crop row detection utilizing the Hough transform and DBSCAN clustering analysis
Richeng Zhao, Xianju Yuan, Zhanpeng Yang
et al.
Abstract More accurate methods for crop row detection benefit the intelligent operation of agricultural machinery, especially by avoiding mishandling or crushing crops. To this end, a traditional method combining the ExGR index, Otsu algorithm, Canny method, Hough transform and DBSCAN clustering analysis is proposed so that centerlines of crop rows can be detected effectively without manual intervention. Specifically, the ExGR index is first used to convert images of green plants to grayscale. The binarization threshold is then obtained by the Otsu algorithm. Next, the Canny edge detection algorithm is applied to determine the edges of the crops. Finally, the Hough transform and DBSCAN clustering analysis are combined to detect the crop rows. Numerical simulations using these methods are also compared with existing methods. For example, the Canny algorithm is relatively more accurate than the Suzuki algorithm, as well as their combinations with a geometric center extraction method, if the weed density is high. Compared with the K‐means clustering method, the DBSCAN algorithm is more suitable for characterizing crop rows in more complex conditions. Experiments validate that the combination of the Canny algorithm, Hough transform and DBSCAN clustering performs better than the other traditional methods mentioned.
Photography, Computer software
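The first two steps of the crop row pipeline above (greenness grayscale, then Otsu binarization) can be sketched in plain Python. The sketch assumes the commonly used definition of ExGR as excess green minus excess red on normalized chromatic coordinates, and it rescales the index to 0..255 before thresholding; neither detail is stated in the abstract, and the pixel values are made up. The Canny, Hough, and DBSCAN stages are not shown.

```python
def exgr(r, g, b):
    """Excess green minus excess red on normalized chromatic coordinates (assumed definition)."""
    total = r + g + b
    if total == 0:
        return 0.0
    rn, gn, bn = r / total, g / total, b / total
    exg = 2 * gn - rn - bn   # excess green
    exr = 1.4 * rn - gn      # excess red
    return exg - exr


def to_gray(value):
    """Map ExGR from its range [-2.4, 3.0] to an integer gray level in 0..255."""
    return round((value + 2.4) / 5.4 * 255)


def otsu_threshold(levels, bins=256):
    """Otsu's method: pick the level that maximizes between-class variance."""
    hist = [0] * bins
    for v in levels:
        hist[v] += 1
    total = len(levels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(bins):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # no pixels at or below t yet
        w1 = total - w0
        if w1 == 0:
            break             # every pixel is at or below t
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


# Hypothetical pixels: brownish soil and green plant, alternating.
pixels = [(120, 100, 80), (60, 160, 50), (120, 100, 80), (60, 160, 50)]
gray = [to_gray(exgr(*p)) for p in pixels]
t = otsu_threshold(gray)
plant_mask = [g > t for g in gray]
print(plant_mask)
```

In a full pipeline the resulting binary mask would be the input to edge detection, with line fitting and clustering applied afterwards.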
ALEN: A Dual-Approach for Uniform and Non-Uniform Low-Light Image Enhancement
Ezequiel Perez-Zarate, Oscar Ramos-Soto, Chunxiao Liu
et al.
Low-light image enhancement is an important task in computer vision, essential for improving the visibility and quality of images captured in non-optimal lighting conditions. Inadequate illumination can lead to significant information loss and poor image quality, impacting various applications such as surveillance, photography, or even autonomous driving. In this regard, automated methods have been developed to adjust illumination in the image for better visual perception. Current enhancement techniques often use specific datasets to enhance low-light images, but still present challenges when adapting to diverse real-world conditions, where illumination degradation may be localized to specific regions. To address this challenge, the Adaptive Light Enhancement Network (ALEN) is introduced, whose main approach is the use of a classification mechanism to determine whether local or global illumination enhancement is required. Subsequently, estimator networks adjust illumination based on this classification and simultaneously enhance color fidelity. ALEN integrates the Light Classification Network (LCNet) for illuminance categorization, complemented by the Single-Channel Network (SCNet) and Multi-Channel Network (MCNet) for precise estimation of illumination and color, respectively. Extensive experiments on publicly available datasets for low-light conditions were carried out to underscore ALEN's robust generalization capabilities, demonstrating superior performance in both quantitative metrics and qualitative assessments when compared to recent state-of-the-art methods. ALEN not only enhances image quality in terms of visual perception but also represents an advancement in high-level vision tasks, such as semantic segmentation, as presented in this work. The code of this method is available at https://github.com/xingyumex/ALEN
Robust Real-time Segmentation of Bio-Morphological Features in Human Cherenkov Imaging during Radiotherapy via Deep Learning
Shiru Wang, Yao Chen, Lesley A. Jarvis
et al.
Cherenkov imaging enables real-time visualization of megavoltage X-ray or electron beam delivery to the patient during Radiation Therapy (RT). Bio-morphological features, such as vasculature, seen in these images are patient-specific signatures that can be used for verification of positioning and motion management that are essential to precise RT treatment. However, until now, no concerted analysis of this biological feature-based tracking has been undertaken because of the slow speed and limited accuracy of conventional image processing for feature segmentation. This study demonstrated the first deep learning framework for such an application, achieving video frame rate processing. To address the challenge of limited annotation of these features in Cherenkov images, a transfer learning strategy was applied. A fundus photography dataset including 20,529 patch retina images with ground-truth vessel annotation was used to pre-train a ResNet segmentation framework. Subsequently, a small Cherenkov dataset (1,483 images from 212 treatment fractions of 19 breast cancer patients) with known annotated vasculature masks was used to fine-tune the model for accurate segmentation prediction. This deep learning framework achieved consistent and rapid segmentation of Cherenkov-imaged bio-morphological features on another 19 patients, including subcutaneous veins, scars, and pigmented skin. On average, segmentation by the model achieved a Dice score of 0.85 and required less than 0.7 milliseconds of processing time per instance. The model demonstrated outstanding consistency against input image variances and speed compared to conventional manual segmentation methods, laying the foundation for online segmentation in real-time monitoring in a prospective setting.
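For readers unfamiliar with the Dice score reported above, it measures the overlap between a predicted mask P and a reference mask T as 2|P ∩ T| / (|P| + |T|), ranging from 0 (no overlap) to 1 (identical). The sketch below is a minimal version for flattened binary masks; the function name and the toy masks are illustrative assumptions, not the study's evaluation code.

```python
def dice_score(pred, truth):
    """Dice overlap between two flat binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if size == 0 else 2 * intersection / size


# Toy masks (hypothetical): three predicted pixels, two of them correct.
pred = [1, 1, 1, 0]
truth = [1, 1, 0, 0]
print(dice_score(pred, truth))  # 2 * 2 / (3 + 2) = 0.8
```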
Missing the Present: Nostalgia and the Archival Impulse in Gentrification Photography
Zeena Price
If gentrification is a violent form of “un-homing” (Elliot-Cooper et al., p. 494), then it is no surprise to witness an intensification of photographic practice in gentrifying areas; photography is, after all, fundamentally a place-making practice. Taking “home” to include the wider neighborhood and urban environment (Blunt and Sheringham 2019), this paper argues that the concept of anticipatory nostalgia is a useful way of understanding the recent wave of black and white photography in gentrifying areas. As well as signifying a sense of loss, anticipatory nostalgia, defined as missing the present before it has gone (Batcho and Shikh 2016), can also be seen as an aesthetic strategy of documenting places before they are lost to gentrification. Using the works of Colby Deal (<i>Beautiful, Still</i>), Jules Renault (<i>Suspended in Time</i>), and Lorenzo Grifantini (<i>W10</i>) as case studies, this paper argues that this type of photography, which explicitly utilizes an archival aesthetic, invites spectators to interrogate the intimate ties between home, memory, and identity. While melancholic, these images serve as a call to action and a form of speculation about the future—rejecting the shiny, computer-generated aesthetics of gentrification for a humanized, often gritty, and authentic version of home.
Hazard zoning and assessment of rockfalls based on AHP-3DEC
Run SHI, Jiayu LI, Minghao CHEN
et al.
Restricted by the surface smoothness and topography, it is impossible for some high-speed railways in mountainous regions to avoid sections where rockfalls develop, posing great safety challenges to construction projects and railway operations. In light of that, this paper focuses on the rockfall situated by the entrance of Xinghuayu Tunnel along the proposed Jinan-Zaozhuang Railway project, and uses drone-captured 3D aerial photography to perform digital geological survey and mapping so as to accurately identify the development characteristics, scale, and modes of deformation and failure of said rockfall. Next, by cross-referencing the results of 3DEC numerical simulation with those from the analytic hierarchy process (AHP), color-coded maps highlighting the scope of influence and danger levels of potential rockfalls were obtained. Using said maps, zone-by-zone hazard and risk assessments were then performed, based on which corresponding prevention and control measures were put forward. The findings show that among the 12 dangerous rock belts identified from the drone-captured 3D aerial photography model, only Belt No. 5 would threaten the safety of the tunnel entrance and bridge abutments, for which the combination of anti-rockfall passive protective netting and an open-cut tunnel structure was recommended as a comprehensive solution. By virtue of the solution’s effectiveness, this study can offer reliable references for not only zone-by-zone hazard and risk assessment for rockfalls, but also railway route selection and disaster prevention and mitigation.
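As background on the AHP step, the method turns pairwise importance judgments between criteria into weights and checks the judgments for consistency. The sketch below uses the row geometric-mean approximation for the weights and Saaty's random index table for the consistency ratio. The three criteria and the comparison values are hypothetical; the abstract does not list the factors or judgments used in the study.

```python
import math

# Saaty's random consistency index for matrices of size 1..9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Approximate priority weights as normalized row geometric means."""
    n = len(matrix)
    geo = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo)
    return [g / total for g in geo]


def consistency_ratio(matrix, weights):
    """CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    lam = sum(
        sum(a * w for a, w in zip(row, weights)) / wi
        for row, wi in zip(matrix, weights)
    ) / n
    ci = (lam - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    return 0.0 if ri == 0 else ci / ri


# Hypothetical criteria: block volume, slope angle, fracture spacing.
# Entry [i][j] states how much more important criterion i is than criterion j.
pairwise = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
weights = ahp_weights(pairwise)
print([round(w, 3) for w in weights])                  # [0.6, 0.3, 0.1]
print(round(consistency_ratio(pairwise, weights), 3))
```

A consistency ratio below about 0.1 is conventionally taken to mean the judgments are acceptably consistent; the toy matrix here is perfectly consistent, so its ratio is essentially zero.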
Community perceptions on the factors in the social food environment that influence dietary behaviour in cities of Kenya and Ghana: a Photovoice study
Milkah N Wanjohi, Rebecca Pradeilles, Gershim Asiki
et al.
Abstract
Objective:
To explore communities’ perspectives on the factors in the social food environment that influence dietary behaviours in African cities.
Design:
A qualitative study using participatory photography (Photovoice). Participants took and discussed photographs representing factors in the social food environment that influence their dietary behaviours. Follow-up in-depth interviews allowed participants to tell the ‘stories’ of their photographs. Thematic analysis was conducted, using data-driven and theory-driven (based on the socio-ecological model) approaches.
Setting:
Three low-income areas of Nairobi (n 48) in Kenya and Accra (n 62) and Ho (n 32) in Ghana.
Participants:
Adolescents and adults, male and female aged ≥13 years.
Results:
The ‘people’ who were most commonly reported as influencers of dietary behaviours within the social food environment included family members, friends, health workers and food vendors. They mainly influenced food purchase, preparation and consumption, through (1) considerations for family members’ food preferences, (2) considerations for family members’ health and nutrition needs, (3) social support by family and friends, (4) provision of nutritional advice and modelling food behaviour by parents and health professionals, (5) food vendors’ services and social qualities.
Conclusions:
The family presents an opportunity for promoting healthy dietary behaviours among family members. Peer groups could be harnessed to promote healthy dietary behaviours among adolescents and youth. Empowering food vendors to provide healthier and safer food options could enhance healthier food sourcing, purchasing and consumption in African low-income urban communities.
Public aspects of medicine, Nutritional diseases. Deficiency diseases
Evaluation of the manager work outcomes in a transport company in the context of digitalization
Gagarinskaya Galina Pavlovna, Gagarinsky Aleksandr Vladimirovich, Kremnev Arkadiy Alexandrovich
et al.
The authors study the process of digitalization in an organization and its impact on the competitiveness of personnel and leaders of organizations. The purpose of the study is to increase the competitiveness of managers who acquire competencies and new knowledge in the digitalization process. The paper highlights the elements of change that digitalization brings to the organization. Research methods: systemic and structural-functional analysis, questioning, interviewing, expert assessments, statistical analysis, production experiment, timing and photography of working time. The information base of the study is public reports of leading transport companies; branch scientific and technical literature; materials of international, branch, territorial conferences, symposiums and seminars; results of application of the authors’ developments. Result: digitalization provides unhindered access to knowledge, cost reduction and greater interdisciplinarity, which the external conditions of societal development now require.
Diagnostic Quality Assessment of Fundus Photographs: Hierarchical Deep Learning with Clinically Significant Explanations
Shanmukh Reddy Manne, Jose-Alain Sahel, Jay Chhablani
et al.
Fundus photography (FP) remains the primary imaging modality in screening various retinal diseases including age-related macular degeneration, diabetic retinopathy and glaucoma. FP allows the clinician to examine the ocular fundus structures such as the macula, the optic disc (OD) and retinal vessels, whose visibility and clarity in an FP image remain central to ensuring diagnostic accuracy, and hence determine the diagnostic quality (DQ). Images with low DQ, resulting from eye movement, improper illumination and other possible causes, should be recaptured. However, the technician, often unfamiliar with DQ criteria, initiates recapture only based on expert feedback. The process potentially engages the imaging device multiple times for a single subject, and wastes the time and effort of the ophthalmologist, the technician and the subject. The burden could be prohibitive in the case of teleophthalmology, where obtaining feedback from the remote expert entails additional communication cost and delay. Accordingly, a strong need for automated diagnostic quality assessment (DQA) has been felt, where an image is immediately assigned a DQ category. In response, motivated by the notional continuum of DQ, we propose a hierarchical deep learning (DL) architecture to distinguish between good, usable and unusable categories. On the public EyeQ dataset, we achieve an accuracy of 89.44%, improving upon existing methods. In addition, using gradient-based class activation mapping (Grad-CAM), we generate a visual explanation which agrees with expert intuition. Future FP cameras equipped with the proposed DQA algorithm will potentially improve the efficacy of teleophthalmology as well as the traditional system.