NeRF-Det has achieved impressive performance in indoor multi-view 3D detection by innovatively utilizing NeRF to enhance representation learning. Despite its notable performance, we uncover three decisive shortcomings in its current design: semantic ambiguity, inappropriate sampling, and insufficient utilization of depth supervision. To combat these problems, we present three corresponding solutions. 1) Semantic Enhancement. We project the freely available 3D segmentation annotations onto the 2D plane and leverage the corresponding 2D semantic maps as a supervision signal, significantly enhancing the semantic awareness of multi-view detectors. 2) Perspective-Aware Sampling. Instead of employing a uniform sampling strategy, we put forward a perspective-aware sampling policy that samples densely near the camera and sparsely in the distance, collecting valuable geometric clues more effectively. 3) Ordinal Residual Depth Supervision. Instead of directly regressing depth values, which are difficult to optimize, we divide the depth range of each scene into a fixed number of ordinal bins and reformulate depth prediction as the classification of depth bins combined with the regression of residual depth values, thereby easing the depth learning process. The resulting algorithm, NeRF-Det++, exhibits appealing performance on the ScanNetV2 and ARKITScenes datasets. Notably, on ScanNetV2, NeRF-Det++ outperforms the competitive NeRF-Det by +1.9% in mAP@0.25 and +3.5% in mAP@0.50. The code will be publicly available at https://github.com/mrsempress/NeRF-Detplusplus
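The ordinal-residual reformulation in 3) can be sketched as follows. The depth range, bin count, and function names here are illustrative assumptions for this sketch, not the configuration used by NeRF-Det++; the idea is simply that a depth value becomes a bin-classification target plus a normalized-residual regression target, and the two recombine losslessly.

```python
import numpy as np

# Hypothetical configuration: the true per-scene depth range and bin count
# come from the paper's training setup, not from this sketch.
D_MIN, D_MAX, NUM_BINS = 0.0, 8.0, 64
BIN_WIDTH = (D_MAX - D_MIN) / NUM_BINS

def encode_depth(depth):
    """Split a metric depth into an ordinal bin index (classification target)
    and a normalized residual in [0, 1) (regression target)."""
    idx = np.clip(((depth - D_MIN) // BIN_WIDTH).astype(int), 0, NUM_BINS - 1)
    residual = (depth - D_MIN - idx * BIN_WIDTH) / BIN_WIDTH
    return idx, residual

def decode_depth(idx, residual):
    """Recombine the classified bin and the regressed residual into a depth."""
    return D_MIN + (idx + residual) * BIN_WIDTH
```

The classification head then only has to pick one of `NUM_BINS` coarse hypotheses, while the regression head handles a bounded residual, which is the stated motivation for the split.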
Visual inspection remains essential for assessing infrastructure surfaces. While milestones have been reached in developing intelligent inspection systems, most existing solutions are limited to small-scale infrastructures and components, making them challenging to scale up for real-world applications. Leveraging deep learning and unmanned aerial vehicles (UAVs), this article proposes Det-Recon-Reg, an intelligent framework designed for large-scale infrastructure inspection, decomposing the task into three complementary stages: detect for defect detection, reconstruct for infrastructure reconstruction, and register for defect registration. In the detect stage, we introduce the first high-resolution dataset designed for defect detection on large-scale infrastructure surfaces. State-of-the-art real-time object detectors are evaluated on this dataset, and CUBIT-Net is proposed to strike a better balance between accuracy and efficiency. In the reconstruct stage, we present a scalable multi-view stereo (MVS) network to reconstruct a dense point cloud representation of the infrastructure from multi-view images. Extensive experiments on benchmark datasets, including DTU, Tanks and Temples (TNT), and BlendedMVS, demonstrate the superior performance of our method over existing approaches. In the register stage, we propose a novel defect registration method that leverages the geographic information system (GIS) to accurately map the detected defects onto the infrastructure model while preserving their geometric and visual properties, thereby enabling global defect localization and more informed maintenance decision-making. The proposed framework can serve as a reference for effective and efficient infrastructure maintenance, as consolidated in real-world experiments. Codes, datasets, and pretrained models for each stage will be released at https://github.com/YANG-SOBER/Det-Recon-Reg. The supplementary video is available at: https://youtu.be/MVMp7k9qB84
Voxels are a common structural representation of 3D point clouds. Due to the sparsity of point clouds generated by light detection and ranging (LiDAR), there is an extreme imbalance between foreground and background voxels, which decreases the accuracy of 3D object detection and negatively affects intelligent-driving safety. To overcome this problem, we present SP-Det, a saliency-prediction-based 3D object detector, in this article. Although foreground voxels carry sufficient object features, it is difficult to localize the foreground region within a voxel space dominated by a much larger background region. We design an auxiliary learning task, saliency prediction (SP), which helps the 3D detector identify the foreground region. The SP task uses label diffusion to alleviate label imbalance, reducing the difficulty of learning saliency in the voxel and bird's-eye-view (BEV) spaces. To strengthen feature interaction within the sparse foreground region, we then design a saliency fusion (SF) module that fuses the results of the SP task; it utilizes the voxel and BEV saliency maps as progressive attention to suppress redundant features from the background region. To aggregate more foreground features inside the 3D and BEV regions of interest (RoIs), we design a hybrid-grid-map-based RoI pooling (Hybrid-RoI pooling). Experiments are conducted on the STF dataset, where adverse weather increases the sparsity of LiDAR point clouds and thus the difficulty of object detection. SP-Det identifies and leverages the foreground region and outperforms current methods. We therefore believe that SP-Det benefits LiDAR-based 3D scene understanding in adverse weather.
Object detection in unmanned aerial vehicle (UAV) images poses significant challenges due to complex scale variations and class imbalance among objects. Existing methods often address these challenges separately, overlooking the intricate nature of UAV images and the potential synergy between them. In response, this paper proposes AD-Det, a novel framework employing a coherent coarse-to-fine strategy that seamlessly integrates two pivotal components: adaptive small object enhancement (ASOE) and dynamic class-balanced copy–paste (DCC). ASOE utilizes a high-resolution feature map to identify and cluster regions containing small objects. These regions are subsequently enlarged and processed by a fine-grained detector. On the other hand, DCC conducts object-level resampling by dynamically pasting tail classes around the cluster centers obtained by ASOE, maintaining a dynamic memory bank for each tail class. This approach enables AD-Det to not only extract regions with small objects for precise detection but also dynamically perform reasonable resampling for tail-class objects. Consequently, AD-Det enhances the overall detection performance by addressing the challenges of scale variations and class imbalance in UAV images through a synergistic and adaptive framework. We extensively evaluate our approach on two public datasets, i.e., VisDrone and UAVDT, and demonstrate that AD-Det significantly outperforms existing competitive alternatives. Notably, AD-Det achieves a 37.5% average precision (AP) on the VisDrone dataset, surpassing its counterparts by at least 3.1%.
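The dynamic memory-bank idea behind DCC can be illustrated with a toy sketch: keep a bounded buffer of recent crops per tail class, and sample one to paste at a jittered position around a cluster centre. The class name, bank size, and jitter radius below are hypothetical illustration values, not AD-Det's actual parameters.

```python
import random
from collections import defaultdict, deque

class TailClassBank:
    """Toy per-class memory bank for dynamic copy-paste: store recent crops
    of each tail class and sample one to paste near a detected cluster
    centre. A bounded deque keeps only the most recent crops."""

    def __init__(self, maxlen=100, jitter=10, seed=0):
        self.banks = defaultdict(lambda: deque(maxlen=maxlen))
        self.jitter = jitter
        self.rng = random.Random(seed)

    def push(self, cls, crop):
        # Oldest crops are evicted automatically once maxlen is reached.
        self.banks[cls].append(crop)

    def sample_paste(self, cls, center):
        # Pick a stored crop and a paste position jittered around the centre.
        crop = self.rng.choice(list(self.banks[cls]))
        cx, cy = center
        dx = self.rng.randint(-self.jitter, self.jitter)
        dy = self.rng.randint(-self.jitter, self.jitter)
        return crop, (cx + dx, cy + dy)
```

Pasting near ASOE's cluster centres (rather than at random positions) keeps the synthesized tail-class objects in plausible, small-object-dense context, which is the stated synergy between the two components.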
Visible-infrared (VIS-IR) object detection is a challenging task that combines visible and infrared data to report the category and location of objects in a scene. The core of this task is therefore to combine complementary information from the visible and infrared modalities to produce more accurate detection results. Existing methods mainly suffer from an insufficient ability to perceive and combine visible-infrared modal information and have difficulty balancing the optimization directions of the fusion and detection tasks. To solve these problems, we propose MMI-Det, a multi-modal fusion method for visible and infrared object detection. The method combines complementary information in the visible-infrared modalities well and outputs accurate and robust object information. Specifically, to improve the model's ability to perceive the environment at the visible-infrared image level, we design the Contour Enhancement Module. Furthermore, to extract complementary information from the VIS and IR modalities, we design the Fusion Focus Module, which extracts different frequency-spectrum features of the visible and infrared modalities and focuses on key object information at different spatial locations. Moreover, we design the Contrast Bridge Module to improve the extraction of modality-invariant features in visible-infrared scenes. Finally, to ensure that our model balances the optimization directions of image fusion and object detection, we design the Info Guided Module to improve the effectiveness of training optimization. We conduct extensive experiments on the public FLIR, M3FD, LLVIP, TNO, and MSRS datasets; compared with previous methods, our method achieves better performance with powerful multi-modal information perception capabilities.
The XAFS-DET work package of the European LEAPS-INNOV project is developing high-purity germanium detectors for synchrotron applications requiring a spectroscopic-grade response. The detectors integrate three key features: (1) newly designed monolithic germanium sensors optimised to mitigate charge-sharing events, (2) an improved cooling and mechanical design supported by thermal simulations, and (3) a complete electronic chain featuring a low-noise CMOS preamplifier, enabling high X-ray count-rate capability over a broad energy range (5-100 keV). This paper discusses the first integration and characterization of one of the two multi-element Ge detectors at the European Synchrotron Radiation Facility (ESRF). The integration phase included validating the high-throughput front-end electronics, integrating them with the Ge sensor, and operating them at liquid-nitrogen temperature, in addition to the experimental characterization, which consists of an electronic-noise study and a spectroscopic performance evaluation.
Cauliflower cultivation plays a pivotal role in the Indian Subcontinent's winter cropping landscape, contributing significantly to agricultural output, the economy, and public health. However, the susceptibility of cauliflower crops to various diseases poses a threat to productivity and quality. This paper presents a novel machine-vision approach employing a modified YOLOv8 model, called Cauli-Det, for automatic classification and localization of cauliflower diseases. The proposed system utilizes images captured with smartphones and hand-held devices, employing a fine-tuned pre-trained YOLOv8 architecture to detect disease-affected regions and extract spatial features for disease localization and classification. Three common cauliflower diseases, namely 'Bacterial Soft Rot', 'Downy Mildew', and 'Black Rot', are identified in a dataset of 656 images. Evaluation of different modification and training methods reveals that the proposed custom YOLOv8 model achieves a precision, recall, and mean average precision (mAP) of 93.2%, 82.6%, and 91.1% on the test dataset, respectively, showcasing the potential of this technology to empower cauliflower farmers with a timely and efficient tool for disease management, thereby enhancing overall agricultural productivity and sustainability.
Locality-sensitive hashing (LSH) is a well-known solution for approximate nearest neighbor (ANN) search in high-dimensional spaces due to its robust theoretical guarantee on query accuracy. Traditional LSH-based methods mainly focus on improving the efficiency and accuracy of the query phase by designing different query strategies, but pay little attention to improving the efficiency of the indexing phase. They typically fine-tune existing data-oriented partitioning trees to index data points and support their query strategies. However, their strategy to directly partition the multi-dimensional space is time-consuming, and performance degrades as the space dimensionality increases. In this paper, we design an encoding-based tree called Dynamic Encoding Tree (DE-Tree) to improve the indexing efficiency and support efficient range queries based on Euclidean distance. Based on DE-Tree, we propose a novel LSH scheme called DET-LSH. DET-LSH adopts a novel query strategy, which performs range queries in multiple independent index DE-Trees to reduce the probability of missing exact NN points, thereby improving the query accuracy. Our theoretical studies show that DET-LSH enjoys probabilistic guarantees on query accuracy. Extensive experiments on real-world datasets demonstrate the superiority of DET-LSH over the state-of-the-art LSH-based methods on both efficiency and accuracy. While achieving better query accuracy than competitors, DET-LSH achieves up to 6x speedup in indexing time and 2x speedup in query time over the state-of-the-art LSH-based methods.
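The multi-index query strategy can be illustrated with a toy stand-in: project points onto a random line, keep the projections sorted (a crude one-dimensional encoding index), and union the range-query results from several independent indexes so that a point missed by one projection is likely caught by another. This is only a sketch of the principle under simplifying assumptions; DET-LSH's actual DE-Tree encoding and query strategy are more involved.

```python
import bisect
import random

class ToyProjIndex:
    """One random projection kept in sorted order; binary-search range
    queries stand in for range queries on a DE-Tree (a deliberate
    simplification for illustration)."""

    def __init__(self, points, dim, seed):
        rng = random.Random(seed)
        self.w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        self.entries = sorted((self.project(p), i) for i, p in enumerate(points))
        self.keys = [k for k, _ in self.entries]

    def project(self, p):
        return sum(x * w for x, w in zip(p, self.w))

    def range_query(self, q, radius):
        # All points whose projection lies within `radius` of the query's.
        c = self.project(q)
        lo = bisect.bisect_left(self.keys, c - radius)
        hi = bisect.bisect_right(self.keys, c + radius)
        return {i for _, i in self.entries[lo:hi]}

def candidates(indexes, q, radius):
    """Union results from several independent indexes to lower the
    probability of missing the exact NN."""
    out = set()
    for idx in indexes:
        out |= idx.range_query(q, radius)
    return out
```

Each extra independent index multiplies the miss probability: if one index misses the true NN with probability p, the union of L indexes misses it with probability roughly p^L.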
To address the low detection rate and high missed-detection rate for military aircraft in complex remote sensing data, and to meet the requirements of real-time detection and easy model deployment, this article introduces DET-YOLO (you only look once), an innovative detection model. First, to tackle the reduced accuracy in identifying small targets amidst intricate backgrounds, a novel feature extraction component, C2f_DEF, was devised; this module replaces all existing C2f components within YOLOv8n, significantly enhancing the model's ability to cope with complicated environmental contexts. Second, to ease deployment, some deep structures were simplified to make the model more lightweight. Afterward, to further improve the model's handling of complex backgrounds and dense environments in remote sensing images, and its detection accuracy for military aircraft, the DAT module was embedded in the model. Finally, this article also optimizes the loss function and reg_max to further reduce computational costs while improving detection accuracy. To verify the effectiveness and strong universality of DET-YOLO, extensive experiments were conducted on three publicly available datasets, namely MAR20, NWPU VHR-10, and NEU-DET. On the MAR20 dataset, compared with other advanced models, DET-YOLO achieved the highest mAP@0.5 (94.7%) with only 80 training epochs while meeting lightweight and real-time requirements; on the other two datasets, DET-YOLO also achieved the best detection performance.
Multi-modal fusion can take advantage of both LiDAR and camera to boost the robustness and performance of 3D object detection. However, it remains challenging to comprehensively exploit image information and perform accurate interaction and fusion of diverse features. In this paper, we propose a novel multi-modal framework, Point-Pixel Fusion for Multi-Modal 3D Object Detection (PPF-Det). PPF-Det consists of three submodules, Multi Pixel Perception (MPP), Shared Combined Point Feature Encoder (SCPFE), and Point-Voxel-Wise Triple Attention Fusion (PVW-TAF), which address the above problems. First, MPP makes full use of image semantic information to mitigate the resolution mismatch between point cloud and image. In addition, we propose SCPFE to extract point-cloud features and point-pixel features simultaneously, reducing time-consuming operations in 3D space. Lastly, we propose a fine-alignment fusion strategy, PVW-TAF, to generate multi-level voxel-fused features based on an attention mechanism. Extensive experiments on the KITTI benchmark, submitted on September 24, 2023, demonstrate that our method shows excellent performance.
With the advent of the Internet of Things, self-powered wearable sensors have become increasingly prevalent in our daily lives. The utilization of piezoelectric composites to harvest and sense surrounding mechanical vibrations has been extensively investigated over the past decades. However, poor interface compatibility between ceramic nanofillers and polymer matrices, as well as low piezoelectric performance, remains a critical challenge. In this work, we employed di(dioctylpyrophosphato) ethylene titanate (DET) as a coupling agent for modifying barium titanate (BTO) nanofillers. Compared to the BTO/PVDF counterpart, the DET-BTO/PVDF nanofibers exhibit an augmented content of the piezoelectric β phase (~85.7%) and significantly enhanced stress-transfer capability. The piezoelectric coefficient (d33) is up to ~40 pC/N, the highest value among reported BTO/PVDF composites. The piezoelectric energy harvesters (PEHs) exhibit benign durability and attain a high instantaneous power density of 276.7 nW/cm² at a matched load of 120 MΩ. Furthermore, the PEHs can sense various human activities, with a sensitivity as high as 0.817 V/N in the range of 0.05-0.1 N. This work proposes a new strategy for boosting the piezoelectric performance of PVDF-based composites via DET-modified ceramic nanoparticles, which in turn significantly improves energy harvesting and sensing capability.
Many scientific and astronomical instruments need precise, high-resolution time measurement between two or more events, which has been very challenging for decades. Here, a fast-response, high-resolution (17.1 ps) Digital Event Timer (DET) has been implemented using an FPGA. The floor plan of the FPGA is purposely arranged to obtain higher timing resolution. Compared with various event timers (time-to-digital converters), the present design has better accuracy and higher resolution. The present DET uses a GHz clock, capable of measuring time in fractions of a microsecond, together with a unique asynchronous vernier approach to measure the fraction of a clock cycle, which increases its resolution compared to conventional digital event timers. Its fast response and higher resolution make it a better choice for multi-kHz satellite laser ranging and other LIDAR applications.
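The vernier principle for measuring a fraction of a clock cycle can be illustrated with simple arithmetic: if the start oscillator has period T1 and the stop oscillator a slightly shorter period T2, their edges drift together by T1 - T2 each cycle, so counting cycles until the edges coincide quantises the interval in steps of T1 - T2. The periods below are made-up numbers chosen only so that the step equals 17.1 ps; they are not the actual design values of this DET.

```python
def vernier_measure(t_ps, T1_ps=1000.0, T2_ps=982.9):
    """Measure the sub-clock-cycle interval t_ps by counting how many cycles
    the faster stop oscillator needs to catch up with the start oscillator."""
    step = T1_ps - T2_ps                 # vernier step, ~17.1 ps here
    n = 0
    start_edge, stop_edge = 0.0, t_ps    # stop oscillator starts t_ps later
    while stop_edge > start_edge:        # advance both until edges coincide
        n += 1
        start_edge += T1_ps
        stop_edge += T2_ps
    return n * step                      # result quantised to one vernier step
```

The achievable resolution is therefore set by the period difference of the two oscillators rather than by the clock period itself, which is how a GHz-range clock can yield picosecond-scale timing.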
I re-examine a recent work by G. Landi and G. E. Landi [arXiv:1808.06708 [physics.ins-det]], in which the authors claim that the resolution of a tracker can vary linearly with the number of detection layers, $N$, that is, faster than the commonly known $\sqrt{N}$ variation, for a tracker of fixed length, when the precision of the position measurement is allowed to vary from layer to layer (heteroscedasticity) and an appropriate analysis method, a weighted least-squares fit, is used.
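The two scalings at issue can be stated side by side. For a fixed-length tracker with $N$ layers of identical (homoscedastic) resolution $\sigma$, an ordinary least-squares track fit gives an uncertainty scaling as $1/\sqrt{N}$; the contested claim is a $1/N$ scaling when the per-layer resolutions $\sigma_i$ differ and the fit weights each layer accordingly (the weight form below is the standard weighted-least-squares choice, stated here as an assumption):

```latex
\sigma_{\mathrm{fit}}^{\mathrm{homo}} \;\propto\; \frac{\sigma}{\sqrt{N}}
\qquad \text{vs.} \qquad
\sigma_{\mathrm{fit}}^{\mathrm{hetero}} \;\propto\; \frac{1}{N}
\quad \text{(claimed, with weights } w_i = 1/\sigma_i^2\text{)}
```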
Tetsuichi Kishishita, Yutaro Sato, Yoichi Fujita, et al.
A new silicon-strip readout chip named "SliT" has been developed for the measurement of the muon anomalous magnetic moment and electric dipole moment at J-PARC. The SliT is designed in the Silterra 180 nm CMOS technology with mixed-signal integrated circuits. The analog part incorporates a conventional charge-sensitive amplifier, shaping amplifiers, and two distinct discriminators for each of the 128 identical channels. The digital part includes storage memories, an event-building block, a serializer, and LVDS drivers. A distinct feature of the SliT is its zero-cross architecture, which consists of a CR-RC filter followed by a CR circuit acting as a voltage differentiator. This architecture makes it possible to generate hit signals with subnanosecond, amplitude-independent time walk, which is the primary requirement for the experiment. The test results show a time walk of $0.38 \pm 0.16$ ns between 0.5 and 3 MIP signals. The equivalent noise charge is $1547 \pm 75$ $e^{-}$ (rms) at a strip-sensor capacitance of $C_{\rm det} = 33$ pF. Other functionalities required for a strip-sensor readout chip have also been proven in the tests. The SliT128C satisfies all requirements of the J-PARC muon $g-2$/EDM experiment.
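The amplitude-independent timing of the zero-cross architecture can be sketched with an idealized pulse model: CR-RC shaping yields a unipolar pulse proportional to $t\,e^{-t/\tau}$, and the subsequent CR differentiator turns it into a bipolar pulse proportional to $(1 - t/\tau)\,e^{-t/\tau}$, whose zero crossing sits at $t = \tau$ regardless of amplitude. The shaping time and scan step below are arbitrary illustration values, not the SliT's actual parameters.

```python
import math

def bipolar_pulse(t, amp, tau=50.0):
    """Idealized output of CR-RC shaping followed by CR differentiation:
    d/dt [amp * t * exp(-t/tau)] = amp * (1 - t/tau) * exp(-t/tau).
    The waveform crosses zero at t = tau for any positive amplitude."""
    return amp * (1.0 - t / tau) * math.exp(-t / tau)

def zero_cross_time(amp, tau=50.0, dt=0.01):
    """Scan forward until the pulse goes non-positive, i.e. the zero crossing."""
    t = 0.0
    while bipolar_pulse(t, amp, tau) > 0.0:
        t += dt
    return t
```

Because the crossing time does not depend on the pulse height, timestamping on the zero crossing rather than on a fixed threshold removes the amplitude-dependent part of the time walk, which is the property the chip exploits.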
Miki Nakazawa, Tetsuichi Kishishita, Masayoshi Shoji, et al.
We report on the recent development of a versatile analog front-end compatible with a negative-ion $\mu$-TPC for a directional dark matter search as well as a dual-phase, next-generation $\mathcal{O}$(10~kt) liquid argon TPC to study neutrino oscillations, nucleon decay, and astrophysical neutrinos. Although the operating conditions for negative-ion and liquid argon TPCs are quite different (room temperature \textit{vs.} $\sim$88~K operation, respectively), the readout electronics requirements are similar. Both require a wide dynamic range up to 1600 fC and less than 2000--5000 e$^-$ noise for a typical signal of 80 fC with a detector capacitance of $C_{\rm det} \approx 300$~pF. To fulfill these challenging requirements, a prototype ASIC was newly designed in 180-nm CMOS technology. Here, we report on the performance of this ASIC, including measurements of shaping time, dynamic range, and equivalent noise charge (ENC). We also demonstrate the first operation of this ASIC on a low-pressure negative-ion $\mu$-TPC.
This primer is a brief introduction to the technologies used in particle detectors designed for high-energy particle physics experiments. The intended readers are students, especially undergraduates, starting laboratory work.