Ciphertext-Only Attack on Grayscale-Based EtC Image Encryption via Component Separation and Regularized Single-Channel Compatibility
Ruifeng Li, Masaaki Fujiyoshi
Grayscale-based Encryption-then-Compression (EtC) systems transform RGB images into the YCbCr color space, concatenate the components into a single grayscale image, and apply block permutation, block rotation/flipping, and block-wise negative–positive inversion. Because this pipeline separates color components and disrupts inter-channel statistics, existing extended jigsaw puzzle solvers (JPSs) have been regarded as ineffective, and grayscale-based EtC systems have been considered resistant to ciphertext-only visual reconstruction. In this paper, we present a practical ciphertext-only attack against grayscale-based EtC. The proposed attack introduces three key components: (i) Texture-Based Component Classification (TBCC) to distinguish luminance (Y) and chrominance (Cb/Cr) blocks and focus reconstruction on structure-rich regions; (ii) Regularized Single-Channel Edge Compatibility (R-SCEC), which applies Tikhonov regularization to a single-channel variant of the Mahalanobis Gradient Compatibility (MGC) measure to alleviate covariance rank-deficiency while maintaining robustness under inversion and geometric transforms; and (iii) Adaptive Pruning based on the TBCC-reduced search space that skips redundant boundary matching computations to further improve reconstruction efficiency. Experiments show that, in settings where existing extended JPS solvers fail, our method can still recover visually recognizable semantic content, revealing a potential vulnerability in grayscale-based EtC and calling for a re-evaluation of its security.
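To make the R-SCEC idea concrete, the sketch below shows a single-channel, Tikhonov-regularized, Mahalanobis-style boundary score for two candidate neighboring blocks. It is a deliberately simplified, hypothetical rendering of the measure named in the abstract (the actual MGC statistics are richer); the function name and the 1-D gradient model are ours.

```python
import numpy as np

def r_scec(block_a, block_b, lam=1e-3):
    """Regularized single-channel MGC-style compatibility (illustrative).

    Scores how well block_b's left edge continues block_a's right edge.
    `lam` is the Tikhonov term guarding against a degenerate (here scalar)
    covariance, mirroring the rank-deficiency fix described above.
    """
    # Gradient samples across block_a's rightmost columns (one per row).
    grad_a = block_a[:, -1].astype(float) - block_a[:, -2].astype(float)
    mu = grad_a.mean()
    var = grad_a.var() + lam              # Tikhonov-regularized variance
    # Gradients across the seam between the two candidate neighbors.
    grad_ab = block_b[:, 0].astype(float) - block_a[:, -1].astype(float)
    # Mahalanobis-style distance of seam gradients under block_a's model.
    return float(np.sum((grad_ab - mu) ** 2 / var))

# Lower scores suggest more compatible (likely adjacent) blocks.
a = np.random.randint(0, 256, (16, 16))
b = np.random.randint(0, 256, (16, 16))
print(r_scec(a, b))
```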
Photography, Computer applications to medicine. Medical informatics
Under-Canopy Terrain Reconstruction in Dense Forests Using RGB Imaging and Neural 3D Reconstruction
Refael Sheffer, Chen Pinchover, Haim Zisman
et al.
Mapping the terrain and understory hidden beneath dense forest canopies is of great interest for numerous applications such as search and rescue, trail mapping, and forest inventory tasks. Existing solutions rely on specialized sensors: either heavy, costly airborne LiDAR, or Airborne Optical Sectioning (AOS), which uses thermal synthetic-aperture photography and is tailored for person detection. We introduce a novel approach for reconstructing canopy-free, photorealistic ground views using only conventional RGB images. Our solution is based on the celebrated Neural Radiance Fields (NeRF), a recent 3D reconstruction method. Additionally, we include specific image-capture considerations, which dictate the illumination needed to successfully expose the scene beneath the canopy. To better cope with the poorly lit understory, we employ a low-light loss. Finally, we propose two complementary approaches for removing occluding canopy elements by controlling the per-ray integration procedure. To validate the value of our approach, we present two possible downstream tasks. For search and rescue (SAR), we demonstrate that our method enables person detection with promising results compared to thermal AOS, using only RGB images. Additionally, we show the potential of our approach for forest inventory tasks such as tree counting. These results position our approach as a cost-effective, high-resolution alternative to specialized sensors for SAR, trail mapping, and forest inventory tasks.
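As one concrete (hypothetical) reading of "controlling the per-ray integration procedure", the sketch below masks out volume-rendering samples above an assumed canopy height before compositing; the names and the height-threshold strategy are illustrative, not the authors' exact formulation.

```python
import numpy as np

def composite_below_canopy(densities, colors, z_world, canopy_z, deltas):
    """Volume rendering that ignores samples above a canopy height.

    densities: (N,) per-sample density along one ray
    colors:    (N, 3) per-sample RGB
    z_world:   (N,) world-space heights of the samples
    canopy_z:  height below which the understory lies (assumed known)
    deltas:    (N,) distances between consecutive samples
    """
    sigma = np.where(z_world < canopy_z, densities, 0.0)   # drop canopy samples
    alpha = 1.0 - np.exp(-sigma * deltas)                  # per-sample opacity
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alpha[:-1])))  # transmittance
    weights = alpha * trans
    return (weights[:, None] * colors).sum(axis=0)         # composited RGB
```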
Fundus Image-based Glaucoma Screening via Retinal Knowledge-Oriented Dynamic Multi-Level Feature Integration
Yuzhuo Zhou, Chi Liu, Sheng Shen
et al.
Automated diagnosis based on color fundus photography is essential for large-scale glaucoma screening. However, existing deep learning models are typically data-driven and lack explicit integration of retinal anatomical knowledge, which limits their robustness across heterogeneous clinical datasets. Moreover, pathological cues in fundus images may appear beyond predefined anatomical regions, making fixed-region feature extraction insufficient for reliable diagnosis. To address these challenges, we propose a retinal knowledge-oriented glaucoma screening framework that integrates dynamic multi-scale feature learning with domain-specific retinal priors. The framework adopts a tri-branch structure to capture complementary retinal representations, including global retinal context, structural features of the optic disc/cup, and dynamically localized pathological regions. A Dynamic Window Mechanism is devised to adaptively identify diagnostically informative regions, while a Knowledge-Enhanced Convolutional Attention Module incorporates retinal priors extracted from a pre-trained foundation model to guide attention learning. Extensive experiments on the large-scale AIROGS dataset demonstrate that the proposed method outperforms diverse baselines, achieving an AUC of 98.5% and an accuracy of 94.6%. Additional evaluations on multiple datasets from the SMDG-19 benchmark further confirm its strong cross-domain generalization capability, indicating that knowledge-guided attention combined with adaptive lesion localization can significantly improve the robustness of automated glaucoma screening systems.
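As a minimal stand-in for the Dynamic Window Mechanism (the abstract does not specify its exact design), one could score fixed-size windows on a backbone's activation map and crop the highest-energy one; everything below is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def dynamic_window(feat_map, win=7):
    """Return (row, col) of the most informative win x win window.

    feat_map: (C, H, W) backbone activations for one fundus image.
    A hypothetical sketch, not the paper's mechanism.
    """
    energy = feat_map.pow(2).sum(dim=0, keepdim=True)          # (1, H, W) saliency
    scores = F.avg_pool2d(energy.unsqueeze(0), win, stride=1)  # window scores
    idx = int(scores.flatten().argmax())
    return divmod(idx, scores.shape[-1])  # top-left corner in window-grid coords
```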
$\mathbf{M^3A}$ Policy: Mutable Material Manipulation Augmentation Policy through Photometric Re-rendering
Jiayi Li, Yuxuan Hu, Haoran Geng
et al.
Material generalization is essential for real-world robotic manipulation, where robots must interact with objects exhibiting diverse visual and physical properties. This challenge is particularly pronounced for objects made of glass, metal, or other materials whose transparent or reflective surfaces introduce severe out-of-distribution variations. Existing approaches either rely on simulated materials and perform sim-to-real transfer, which is hindered by substantial visual domain gaps, or depend on collecting extensive real-world demonstrations, which is costly, time-consuming, and still insufficient to cover various materials. To overcome these limitations, we resort to computational photography and introduce Mutable Material Manipulation Augmentation (M$^3$A), a unified framework that leverages the physical characteristics of materials as captured by light transport for photometric re-rendering. The core idea is simple yet powerful: given a single real-world demonstration, we photometrically re-render the scene to generate a diverse set of highly realistic demonstrations with different material properties. This augmentation effectively decouples task-specific manipulation skills from surface appearance, enabling policies to generalize across materials without additional data collection. To systematically evaluate this capability, we construct the first comprehensive multi-material manipulation benchmark spanning both simulation and real-world environments. Extensive experiments show that the M$^3$A policy significantly enhances cross-material generalization, improving the average success rate across three real-world tasks by 58.03%, and demonstrating robust performance on previously unseen materials.
Selective Diabetic Retinopathy Screening with Accuracy-Weighted Deep Ensembles and Entropy-Guided Abstention
Jophy Lin
Diabetic retinopathy (DR), a microvascular complication of diabetes and a leading cause of preventable blindness, is projected to affect more than 130 million individuals worldwide by 2030. Early identification is essential to reduce irreversible vision loss, yet current diagnostic workflows rely on methods such as fundus photography and expert review, which remain costly and resource-intensive. Combined with DR's asymptomatic nature, this contributes to an underdiagnosis rate of approximately 25 percent. Although convolutional neural networks (CNNs) have demonstrated strong performance in medical imaging tasks, limited interpretability and the absence of uncertainty quantification restrict clinical reliability. Therefore, in this study, a deep ensemble learning framework integrated with uncertainty estimation is introduced to improve robustness, transparency, and scalability in DR detection. The ensemble incorporates seven CNN architectures (ResNet-50, DenseNet-121, MobileNetV3 Small and Large, and EfficientNet B0, B2, and B3), whose outputs are fused through an accuracy-weighted majority voting strategy. A probability-weighted entropy metric quantifies prediction uncertainty, enabling low-confidence samples to be excluded or flagged for additional review. Training and validation on 35,000 EyePACS retinal fundus images produced an unfiltered accuracy of 93.70 percent (F1 = 0.9376). Uncertainty filtering was then applied to remove low-confidence samples, yielding a maximum accuracy of 99.44 percent (F1 = 0.9932). The framework shows that uncertainty-aware, accuracy-weighted ensembling improves reliability without hindering performance. With confidence-calibrated outputs and a tunable accuracy-coverage trade-off, it offers a generalizable paradigm for deploying trustworthy AI diagnostics in high-risk care.
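A minimal sketch of the described fusion and abstention logic, assuming soft (probability-level) voting; the entropy threshold and function name are illustrative, not the study's exact values.

```python
import numpy as np

def ensemble_predict(probs, val_accs, entropy_thresh=0.35):
    """Accuracy-weighted voting with entropy-guided abstention.

    probs:    (M, C) class probabilities from the M ensemble members
    val_accs: (M,) validation accuracies used as fusion weights
    Returns (predicted class, fused probs), or (None, fused probs) to abstain.
    """
    w = np.asarray(val_accs, dtype=float)
    w /= w.sum()                                   # normalize accuracy weights
    fused = (w[:, None] * np.asarray(probs)).sum(axis=0)
    fused /= fused.sum()
    # Normalized entropy in [0, 1]; high entropy = low confidence.
    entropy = -(fused * np.log(fused + 1e-12)).sum() / np.log(len(fused))
    if entropy > entropy_thresh:
        return None, fused                         # flag for expert review
    return int(fused.argmax()), fused
```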
DiffCamera: Arbitrary Refocusing on Images
Yiyang Wang, Xi Chen, Xiaogang Xu
et al.
The depth-of-field (DoF) effect, which introduces aesthetically pleasing blur, enhances photographic quality but is fixed and difficult to modify once the image has been created. This becomes problematic when the applied blur is undesirable (e.g., the subject is out of focus). To address this, we propose DiffCamera, a model that enables flexible refocusing of a created image conditioned on an arbitrary new focus point and a blur level. Specifically, we design a diffusion transformer framework for refocusing learning. However, training requires pairs of images of the same scene with different focus planes and bokeh levels, which are hard to acquire. To overcome this limitation, we develop a simulation-based pipeline to generate large-scale image pairs with varying focus planes and bokeh levels. With the simulated data, we find that training with only a vanilla diffusion objective often leads to incorrect DoF behaviors due to the complexity of the task, which calls for a stronger constraint during training. Inspired by the photographic principle that photos of different focus planes can be linearly blended into a multi-focus image, we propose a stacking constraint during training to enforce precise DoF manipulation. This constraint enforces physically grounded refocusing behavior: the refocused results must align faithfully with the scene structure and the camera conditions so that they can be combined into the correct multi-focus image. We also construct a benchmark to evaluate the effectiveness of our refocusing model. Extensive experiments demonstrate that DiffCamera supports stable refocusing across a wide range of scenes, providing unprecedented control over DoF adjustments for photography and generative AI applications.
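One plausible formalization of the stacking constraint (notation ours, not taken from the paper): if $\hat{I}_{f_k}$ denotes the model's output refocused to focus plane $f_k$ and $I_{\mathrm{mf}}$ the simulated multi-focus image of the same scene, the refocused outputs are penalized when their blend deviates from the multi-focus image:

```latex
\mathcal{L}_{\mathrm{stack}}
  = \Bigl\| \tfrac{1}{K} \sum_{k=1}^{K} \hat{I}_{f_k} \;-\; I_{\mathrm{mf}} \Bigr\|_1
```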
La photographie comme processus performatif
Richard Shusterman
In this article, originally published in 2012, R. Shusterman argues that photography is not limited to the photograph, which is merely its final two-dimensional product, developed, printed, or digitally displayed. Focusing on the example of portrait photography, he explores what happens before the photograph in order to highlight the performative dimension of an art form often conceived as strictly visual. Indeed, both the photographer, in taking the photo, and the subject, in posing for it, engage in performative, somatic, and sometimes intensely dramatic processes. The photographer must orient his camera, which presupposes a certain somatic control of his position, posture, and balance. He can choose or create an atmosphere, a staging, or a sometimes complex situation. Additionally, he must communicate with the subject being photographed to put him at ease or in the appropriate state of mind. As for the photographed subject, he takes the pose, which can be likened to an actor's performance: depending on what he wants to reveal about himself, he stylizes himself, (re)presents and shapes his image. Ultimately, photography initiates a true dramatic process, in the sense that it focuses our attention on the events or people being photographed, placing them in a privileged frame that sets them apart and intensifies their presence without disconnecting them from the flow of ordinary life. R. Shusterman cites examples of this dramatization in photographs of certain ritual events, such as weddings or graduation ceremonies.
Philosophy (General), Sociology (General)
A novel recurrent self-evolving fuzzy neural network for consensus decision-making of unmanned aerial vehicles
ZY Chen, Yahui Meng, Ruei-Yuan Wang
et al.
For years, unmanned aerial vehicles (UAVs) have been widely applied across a broad range of domains. With advances in computational photography and artificial intelligence, they can automatically discriminate environmental objects and detect events occurring in a real scene. Collaborative UAVs offer diverse interpretations that support a multi-perspective view of the scene; because these interpretations usually deviate from one another, the UAVs require a consensus interpretation of the scenario. To this end, this study presents an original consensus-based method for piloting multi-UAV systems so that they achieve consensus on their observations and construct a group, situation-based depiction of the scenario. A fuzzy neural network generalized predictive control system, known as a recurrent self-evolving fuzzy neural network, is used to ensure stability through a gradient-descent online learning rule; the design also draws on ideas from evolutionary biology. UAVs can be modeled as system experts for solving group problems that require the definition of the conditions that best describe the scene. The method first allows each UAV to set high-level conditions for detected events by aggregating events based on fuzzy information. These aggregated events are modeled by a fuzzy-system ontology, which allows each UAV to report its preferences over conditions. The interpretation of each drone is then fused into a collective interpretation of the state. Final consent and affinity polls confirmed the reliability ratings of the resulting group decision, and the consensus rating indicates how well the collective interpretation of the scene matches each drone's point of view.
Electronics, Electronic computers. Computer science
Non-Neovascular Age-Related Macular Degeneration Assessment: Focus on Optical Coherence Tomography Biomarkers
Daniela Adriana Iliescu, Ana Cristina Ghita, Larisa Adriana Ilie
et al.
Imaging evaluation of non-neovascular age-related macular degeneration (AMD) is crucial for diagnosing the disease, monitoring its progression, and guiding its management. Dry AMD, characterized primarily by the presence of drusen and retinal pigment epithelium atrophy, requires detailed visualization of the retinal structure to assess its severity and progression. Several imaging modalities are pivotal in the evaluation of non-neovascular AMD, including optical coherence tomography, fundus autofluorescence, and color fundus photography. In the context of emerging therapies for geographic atrophy, like pegcetacoplan, it is critical to establish the baseline status of the disease, monitor the development and expansion of geographic atrophy, and evaluate the retina's response to potential treatments in clinical trials. The present review, while initially providing a comprehensive description of the pathophysiology involved in AMD, aims to offer an overview of the imaging modalities employed in the evaluation of non-neovascular AMD. Special emphasis is placed on the assessment of progression biomarkers as discerned through optical coherence tomography. As the landscape of AMD treatment continues to evolve, advanced imaging techniques will remain at the forefront, enabling clinicians to offer the most effective and tailored treatments to their patients.
CUNSB-RFIE: Context-aware Unpaired Neural Schrödinger Bridge in Retinal Fundus Image Enhancement
Xuanzhao Dong, Vamsi Krishna Vasa, Wenhui Zhu
et al.
Retinal fundus photography is significant in diagnosing and monitoring retinal diseases. However, systemic imperfections and operator- or patient-related factors can hinder the acquisition of high-quality retinal images. Previous efforts in retinal image enhancement primarily relied on GANs, which are limited by the trade-off between training stability and output diversity. In contrast, the Schrödinger Bridge (SB) offers a more stable solution by utilizing Optimal Transport (OT) theory to model a stochastic differential equation (SDE) between two arbitrary distributions. This allows SB to effectively transform low-quality retinal images into their high-quality counterparts. In this work, we leverage the SB framework to propose an image-to-image translation pipeline for retinal image enhancement. Additionally, previous methods often fail to capture fine structural details, such as blood vessels. To address this, we enhance our pipeline by introducing Dynamic Snake Convolution, whose tortuous receptive field can better preserve tubular structures. We name the resulting retinal fundus image enhancement framework the Context-aware Unpaired Neural Schrödinger Bridge (CUNSB-RFIE). To the best of our knowledge, this is the first endeavor to use the SB approach for retinal image enhancement. Experimental results on a large-scale dataset demonstrate the advantage of the proposed method compared to several state-of-the-art supervised and unsupervised methods in terms of image quality and performance on downstream tasks. The code is available at https://github.com/Retinal-Research/CUNSB-RFIE .
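For context, the standard Schrödinger Bridge problem (not specific to this paper) seeks the path measure closest to a reference diffusion while matching the two image distributions at its endpoints:

```latex
\min_{P}\; \mathrm{KL}(P \,\|\, Q)
\quad \text{s.t.} \quad P_0 = p_{\mathrm{low}},\;\; P_1 = p_{\mathrm{high}},
\qquad Q:\; \mathrm{d}X_t = f(X_t, t)\,\mathrm{d}t + g(t)\,\mathrm{d}W_t ,
```

where $p_{\mathrm{low}}$ and $p_{\mathrm{high}}$ stand for the low- and high-quality retinal image distributions and $Q$ is the reference SDE.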
FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models
Tong Wu, Yinghao Xu, Ryan Po
et al.
Recent advances in text-to-image generation have enabled the creation of high-quality images with diverse applications. However, accurately describing desired visual attributes can be challenging, especially for non-experts in art and photography. An intuitive solution is to adopt favorable attributes from source images. Current methods attempt to distill identity and style from source images. However, "style" is a broad concept that includes texture, color, and artistic elements, but does not cover other important attributes such as lighting and dynamics. Additionally, a simplified "style" adaptation prevents combining multiple attributes from different sources into one generated image. In this work, we formulate a more effective approach that decomposes the aesthetics of a picture into specific visual attributes, allowing users to apply characteristics such as lighting, texture, and dynamics from different images. To achieve this goal, we constructed, to the best of our knowledge, the first fine-grained visual attribute dataset (FiVA). The FiVA dataset features a well-organized taxonomy of visual attributes and includes around 1M high-quality generated images with visual attribute annotations. Leveraging this dataset, we propose a fine-grained visual attribute adaptation framework (FiVA-Adapter), which decouples and adapts visual attributes from one or more source images into a generated one. This approach enhances user-friendly customization, allowing users to selectively apply desired attributes to create images that meet their unique preferences and specific content requirements.
End-to-End Hybrid Refractive-Diffractive Lens Design with Differentiable Ray-Wave Model
Xinge Yang, Matheus Souza, Kunyi Wang
et al.
Hybrid refractive-diffractive lenses combine the light efficiency of refractive lenses with the information-encoding power of diffractive optical elements (DOEs), showing great potential as the next generation of imaging systems. However, accurately simulating such hybrid designs is generally difficult, and in particular, there are no existing differentiable image formation models for hybrid lenses with sufficient accuracy. In this work, we propose a new hybrid ray-tracing and wave-propagation (ray-wave) model for accurate simulation of both optical aberrations and diffractive phase modulation, where the DOE is placed between the last refractive surface and the image sensor, i.e., away from the Fourier plane that is often used as the DOE position. The proposed ray-wave model is fully differentiable, enabling gradient back-propagation for end-to-end co-design of refractive-diffractive lens optimization and the image reconstruction network. We validate the accuracy of the proposed model by comparing simulated point spread functions (PSFs) with theoretical results, and through simulation experiments showing that our model is more accurate than solutions implemented in commercial software packages such as Zemax. We demonstrate the effectiveness of the proposed model through real-world experiments and show significant improvements in both aberration correction and extended depth-of-field (EDoF) imaging. We believe the proposed model will motivate further investigation into a wide range of applications in computational imaging, computational photography, and advanced optical design. Code will be released upon publication.
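A sketch of the wave-propagation half of such a ray-wave model, assuming the complex field at the DOE plane has already been assembled from traced rays; the function and its arguments are illustrative, not the authors' implementation.

```python
import numpy as np

def doe_to_sensor(field, doe_phase, wavelength, pitch, distance):
    """Apply the DOE phase, then angular-spectrum propagate to the sensor.

    field:     (N, N) complex field at the DOE plane (from ray tracing)
    doe_phase: (N, N) phase delay of the diffractive element, in radians
    pitch:     sample spacing [m]; distance: DOE-to-sensor gap [m]
    Returns the PSF intensity on the sensor.
    """
    u = field * np.exp(1j * doe_phase)             # diffractive phase modulation
    n = u.shape[0]
    fx = np.fft.fftfreq(n, d=pitch)                # spatial frequencies [1/m]
    fxx, fyy = np.meshgrid(fx, fx)
    k = 2.0 * np.pi / wavelength
    kz_sq = k**2 - (2*np.pi*fxx)**2 - (2*np.pi*fyy)**2
    # Propagating waves only; evanescent components are discarded.
    H = np.where(kz_sq > 0,
                 np.exp(1j * np.sqrt(np.maximum(kz_sq, 0.0)) * distance), 0.0)
    u_sensor = np.fft.ifft2(np.fft.fft2(u) * H)    # free-space propagation
    return np.abs(u_sensor) ** 2
```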
MambaSCI: Efficient Mamba-UNet for Quad-Bayer Patterned Video Snapshot Compressive Imaging
Zhenghao Pan, Haijin Zeng, Jiezhang Cao
et al.
Color video snapshot compressive imaging (SCI) employs computational imaging techniques to capture multiple sequential video frames in a single Bayer-patterned measurement. With the increasing popularity of the quad-Bayer pattern in mainstream smartphone cameras for capturing high-resolution videos, mobile photography has become accessible to a wider audience. However, existing color video SCI reconstruction algorithms are designed for the traditional Bayer pattern; when applied to videos captured by quad-Bayer cameras, they often produce color distortion and ineffective demosaicing, rendering them impractical for mainstream devices. To address this challenge, we propose MambaSCI, which leverages the Mamba and UNet architectures for efficient reconstruction of quad-Bayer patterned color video SCI. To the best of our knowledge, our work presents the first algorithm for quad-Bayer patterned SCI reconstruction, and also the first application of the Mamba model to this task. Specifically, we customize Residual-Mamba-Blocks, which residually connect a Spatial-Temporal Mamba (STMamba) module, an Edge-Detail-Reconstruction (EDR) module, and a Channel Attention (CA) module. STMamba models long-range spatial-temporal dependencies with linear complexity, EDR improves edge-detail reconstruction, and CA compensates for the channel-information interaction missing from the Mamba model. Experiments demonstrate that MambaSCI surpasses state-of-the-art methods with lower computational and memory costs. PyTorch-style pseudo-code for the core modules is provided in the supplementary materials.
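For reference, the standard video-SCI measurement model underlying this task (the quad-Bayer pattern changes the color-filter layout of the masks, not the form of the equation) is

```latex
Y \;=\; \sum_{t=1}^{T} C_t \odot X_t \;+\; N ,
```

where $Y$ is the single 2-D snapshot, $X_t$ the $t$-th video frame, $C_t$ the per-frame modulation mask, $\odot$ the element-wise product, and $N$ measurement noise.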
Information-driven design of imaging systems
Henry Pinkard, Leyla Kabuli, Eric Markley
et al.
Imaging systems have traditionally been designed to mimic the human eye and produce visually interpretable measurements. Modern imaging systems, however, process raw measurements computationally before or instead of human viewing. As a result, the information content of raw measurements matters more than their visual interpretability. Despite the importance of measurement information content, current approaches for evaluating imaging system performance do not quantify it: they instead either use alternative metrics that assess specific aspects of measurement quality or assess measurements indirectly with performance on secondary tasks. We developed the theoretical foundations and a practical method to directly quantify mutual information between noisy measurements and unknown objects. By fitting probabilistic models to measurements and their noise characteristics, our method estimates information by upper bounding its true value. By applying gradient-based optimization to these estimates, we also developed a technique for designing imaging systems called Information-Driven Encoder Analysis Learning (IDEAL). Our information estimates accurately captured system performance differences across four imaging domains (color photography, radio astronomy, lensless imaging, and microscopy). Systems designed with IDEAL matched the performance of those designed with end-to-end optimization, the prevailing approach that jointly optimizes hardware and image processing algorithms. These results establish mutual information as a universal performance metric for imaging systems that enables both computationally efficient design optimization and evaluation in real-world conditions. A video summarizing this work can be found at: https://waller-lab.github.io/EncodingInformationWebsite/
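A sketch of the bounding argument (our notation): writing the mutual information between object $O$ and measurement $M$ as $I(O;M) = H(M) - H(M \mid O)$, the conditional term follows from the known noise model, while the marginal entropy is upper-bounded by the cross-entropy of any probabilistic model $q$ fitted to the measurements:

```latex
I(O; M) \;=\; H(M) - H(M \mid O)
\;\le\; \mathbb{E}_{m \sim p(m)}\!\bigl[-\log q(m)\bigr] \;-\; H(M \mid O).
```

Improving the fit of $q$ by maximum likelihood therefore tightens the information estimate.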
Game4Loc: A UAV Geo-Localization Benchmark from Game Data
Yuxiang Ji, Boyong He, Zhuoyue Tan
et al.
Vision-based geo-localization technology for UAVs, serving as a secondary source of localization information in addition to global navigation satellite systems (GNSS), can still operate independently in GPS-denied environments. Recent deep learning-based methods cast this as an image matching and retrieval task: by retrieving drone-view images from a geo-tagged satellite image database, approximate localization information can be obtained. However, due to high costs and privacy concerns, it is usually difficult to obtain large quantities of drone-view images from a continuous area. Existing drone-view datasets are mostly composed of small-scale aerial photography with the strong assumption that a perfect one-to-one aligned reference image exists for any query, leaving a significant gap from practical localization scenarios. In this work, we construct a large-range contiguous-area UAV geo-localization dataset named GTA-UAV, featuring multiple flight altitudes, attitudes, scenes, and targets, using modern computer games. Based on this dataset, we introduce a more practical UAV geo-localization task that includes partial matches of cross-view paired data and expands image-level retrieval to actual localization in terms of distance (meters). For the construction of drone-view and satellite-view pairs, we adopt a weight-based contrastive learning approach, which allows for effective learning while avoiding additional post-processing matching steps. Experiments demonstrate the effectiveness of our data and training method for UAV geo-localization, as well as its generalization capability to real-world scenarios.
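A minimal sketch of one form weight-based contrastive learning could take for partially matched pairs: an InfoNCE-style loss whose positive terms are weighted by the spatial overlap of the paired views. The weighting scheme and names here are illustrative assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def weighted_infonce(drone_emb, sat_emb, overlap, tau=0.07):
    """Overlap-weighted contrastive loss for partially matched view pairs.

    drone_emb, sat_emb: (B, D) embeddings of paired drone/satellite views
    overlap: (B,) in [0, 1], e.g. the spatial intersection of the two
             footprints, used to down-weight weak positives.
    """
    d = F.normalize(drone_emb, dim=-1)
    s = F.normalize(sat_emb, dim=-1)
    logits = d @ s.t() / tau                       # (B, B) similarity matrix
    targets = torch.arange(len(d), device=d.device)
    per_pair = F.cross_entropy(logits, targets, reduction="none")
    return (overlap * per_pair).mean()             # weight each positive pair
```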
Event fields: Capturing light fields at high speed, resolution, and dynamic range
Ziyuan Qu, Zihao Zou, Vivek Boominathan
et al.
Event cameras, whose pixels independently respond to changes in brightness, are becoming increasingly popular in high-speed applications due to their lower latency, reduced bandwidth requirements, and enhanced dynamic range compared to traditional frame-based cameras. Numerous imaging and vision techniques have leveraged event cameras for high-speed scene understanding by capturing high-framerate, high-dynamic-range videos, primarily exploiting the temporal advantages inherent to event cameras. Imaging and vision techniques have also utilized the light field, a complementary dimension to temporal information, for enhanced scene understanding. In this work, we propose "Event Fields", a new approach that utilizes innovative optical designs for event cameras to capture light fields at high speed. We develop the underlying mathematical framework for Event Fields and introduce two foundational frameworks to capture them practically: spatial multiplexing to capture temporal derivatives and temporal multiplexing to capture angular derivatives. To realize these, we design two complementary optical setups: one using a kaleidoscope for spatial multiplexing and another using a galvanometer for temporal multiplexing. We evaluate the performance of both designs using a custom-built simulator and real hardware prototypes, showcasing their distinct benefits. Our event fields unlock the full advantages of typical light fields, such as post-capture refocusing and depth estimation, now supercharged for high-speed and high-dynamic-range scenes. This novel light-sensing paradigm opens doors to new applications in photography, robotics, and AR/VR, and presents fresh challenges in rendering and machine learning.
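For background, the standard event-generation model on which such designs build: a pixel at location $x$ fires an event of polarity $p \in \{+1, -1\}$ at time $t$ when its log-intensity change since the last event reaches the contrast threshold $C$:

```latex
p \cdot \bigl( \log I(x, t) - \log I(x, t_{\mathrm{last}}) \bigr) \;\ge\; C .
```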
Inhumanity and sexbots: on incestuous relations with sexbots
Tomáš Kobes
The British multimedia artist K. Davis has joined the campaign against sexbots initiated in 2015 by K. Richardson and E. Billing in the project Logging on to Love. Using photography, video, and sound design, she draws attention to how sexbots rearticulate the widespread treatment of humans as objects and underlines the commodification of sex. For Davis, sexbots in this sense are not simply human products but anti-humanist tools. On the other hand, sexbot creators and their proponents argue that sexbots can help people cope with occasional loneliness, reduce the sex trade, or serve as an effective therapeutic tool. Sexbots are thus a controversy that creates boundaries between humanity and inhumanity. By examining these differences, I argue in this paper that being human or inhuman in relation to sexbots can only be fully understood with regard to incest, which can contribute to understanding sexbots in a more symmetrical sense than the one offered by their critics and defenders.
The inside-out surgical anatomy of the paraglottic space: a video-guided endoscopic dissection
Hazem Mohamed Aly Saleh, Nadja Seidel, Thomas Jöns
et al.
Objectives: The paraglottic space is an essential anatomic compartment of the larynx. It is central to the spread of laryngeal cancer and to the choice of conservative laryngeal surgery and many phonosurgical procedures. Since its description 60 years ago, the surgical anatomy of the paraglottic space has been only sparsely revisited. Amid the era of endoscopic and transoral microscopic functional surgery of the larynx, we provide here a long-awaited description of the inside-out anatomy of the paraglottic space. Methodology: Using an endoscope equipped with a 3D camera, we dissected 10 hemilarynges from 5 fresh-frozen cadavers from the inside out. Before dissection, we labeled the vessels by injecting them with colored latex. We explored the paraglottic space, emphasizing its shape, boundaries, and contents, and documented our findings through endoscopic photography and video recordings. Results: The paraglottic space is a spacious tetrahedral space located parallel not only to the glottic but also to the subglottic and supraglottic compartments of the laryngeal lumen. It has musculo-cartilaginous, musculo-fibrous, and mucosal boundaries and is separated from the pyriform sinus only by mucosa. A cushion of fat surrounds its vascular and, to a lesser extent, its neural contents. The intrinsic laryngeal muscles harbored within the space, namely the thyroarytenoid and the lateral and posterior cricoarytenoid muscles, are endoscopically identifiable. Conclusion: This endoscopic description of the paraglottic space partly fills the knowledge gap on laryngeal anatomy from the inside out. It opens the door to novel diagnostic methods and to ultraconservative functional laryngeal interventions under endoscopic control. Level of Evidence: N/A.
Otorhinolaryngology, Surgery
Using computer vision to monitor ice conditions in water supply infrastructure: a study of salient image features
Junjie Chen, Donghai Liu
Ice condition monitoring (ICM) is critical for the operation and maintenance of water supply infrastructure in cold regions. Existing approaches depend on either ground-level sensors or satellite photography for ICM, which suffer from high maintenance costs or inadequate precision. Computer vision (CV) has the potential to overcome these limitations by providing a precise and scalable solution based on near-shore cameras and increasingly affordable drones. To explore the potential of CV for ICM, this paper presents a systematic study of salient image features for differentiating typical ice evolution phases throughout the freeze–thaw cycle. First, ice conditions during the freeze–thaw cycle are studied to provide a categorization of typical ice stages. Second, multiple image feature descriptors are proposed to characterize the distinctions between different ice conditions. Finally, with the proposed descriptors as input, two support vector machines (SVMs) are trained to classify the ice condition for automatic ICM. Experiments were conducted to identify salient features for ice characterization; the SVMs achieved 71.9% and 77.3% accuracy for predicting ice stage and ice flow strength, respectively. Future research should develop these findings into practical solutions for webcam- or drone-based automatic ICM.
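A minimal sketch of the two-SVM setup described above, assuming scikit-learn; the feature extractor is a hypothetical placeholder for the paper's handcrafted descriptors, and the labels are illustrative.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def extract_descriptors(images):
    # Placeholder for the paper's handcrafted image feature descriptors.
    return np.stack([np.asarray(img, dtype=float).ravel()[:64] for img in images])

def train_icm_classifiers(images, stage_labels, flow_labels):
    """Train one SVM for ice stage and one for ice-flow strength."""
    X = extract_descriptors(images)
    stage_svm = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, stage_labels)
    flow_svm = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, flow_labels)
    return stage_svm, flow_svm
```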
HIGHLIGHTS
Computer vision is used to monitor ice conditions in water supply infrastructure.
Image features are handcrafted to characterize different ice conditions.
Effectiveness of the features is quantified and evaluated by correlation analysis.
Information technology, Environmental technology. Sanitary engineering
La méthode des itinéraires photographiques : une ethnographie visuelle des mondes ouvriers de la logistique
Cecile Cuny, Hortense Soichet, Nathalie Mohadjer
This article presents the approach developed within a collective research programme on the "social worlds of blue-collar logistics workers", carried out at four French and German sites between 2016 and 2019. One of its aims was to turn fieldwork into a source of both knowledge and creation. Based on the photographic itinerary method, as formalised by the sociologist Jean-Yves Petiteau over the course of several research projects, a framework for collaboration between researchers, photographers, and respondents was constructed in order to document and analyse a social world while also representing it to a wider audience.