C. Gollier, J. Pratt
Results for "Instruments and machines"
Showing 20 of ~633042 results · from DOAJ, CrossRef, Semantic Scholar, arXiv
Rishikesh Bhyri, Brian R Quaranto, Philip J Seger et al.
Accurate counting of surgical instruments in Operating Rooms (OR) is a critical prerequisite for ensuring patient safety during surgery. Despite recent progress of large visual-language models and agentic AI, accurately counting such instruments remains highly challenging, particularly in dense scenarios where instruments are tightly clustered. To address this problem, we introduce Chain-of-Look, a novel visual reasoning framework that mimics the sequential human counting process by enforcing a structured visual chain, rather than relying on classic object detection which is unordered. This visual chain guides the model to count along a coherent spatial trajectory, improving accuracy in complex scenes. To further enforce the physical plausibility of the visual chain, we introduce the neighboring loss function, which explicitly models the spatial constraints inherent to densely packed surgical instruments. We also present SurgCount-HD, a new dataset comprising 1,464 high-density surgical instrument images. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches for counting (e.g., CountGD, REC) as well as Multimodality Large Language Models (e.g., Qwen, ChatGPT) in the challenging task of dense surgical instrument counting.
Qingyu Zhang, M. Vonderembse, Jeen-Su Lim
Pu Jiao, Limin Ran
With the rapid development of science and technology, augmented reality technology provides intelligent and application services. The research is based on imaging techniques using augmented reality technology and camera image capture. Then, it uses screen error algorithms and scale-invariant feature transformation operators to test the quality of scene spatial models. The experimental results demonstrated that the camera significantly improved the frame rate of scene model rendering and could steadily enhance rendering efficiency. For image quality and its influencing factors, binary robust invariant scalable keypoints and scale-invariant feature transformation algorithms in viewpoint changes had the highest recall of 92%. The map drawing module, Hessian matrix, and scale-invariant feature transformation algorithm in the image blurring test achieved the highest recall rate of 98%. This demonstrates the advantage of using a scale-invariant feature transformation operator to capture scene space influence and provide a more accurate spatial model reference for augmented reality technology. This enhances the functional design of the guide system.
Lin Zheng, Jinlong Li, Zhanbo Zhu et al.
In recent years, with the popularization of online education, real-time monitoring of learning engagement has become a key challenge for scholars. Existing studies mainly rely on questionnaires and physiological signal detection, which have limitations such as high subjectivity, poor real-time performance, and expensive equipment. Previous research has shown that head pose is closely related to cognitive state. However, current estimation models require substantial computational resources, making real-time deployment on mobile devices challenging. In this study, we validate the significant correlation between head pose and learning engagement based on the DAiSEE dataset (8,925 video clips) and propose a lightweight head pose estimation method. The LightNet proposed in this paper uses an improved feature extraction module (MG-Net) and an Attention-based multi-scale fusion model (AMF). Experiments conducted on the 300W-LP and BIWI benchmark datasets demonstrate that, compared with existing state-of-the-art methods, LightNet substantially reduces model complexity by decreasing the number of parameters to just $0.45 \times 10^6$, representing over 90% reduction in model size. Despite this significant compression, LightNet maintains a high level of accuracy, with the mean absolute error (MAE) increasing by only 0.15°, indicating a minimal loss in prediction precision. Moreover, the model achieves a notable improvement in processing speed, an increase exceeding 50% relative to baseline approaches. This combination of a lightweight architecture, competitive accuracy, and accelerated inference speed underscores LightNet’s effectiveness and its potential suitability for real-time applications. This study not only expands the application of head pose in education but also provides a feasible solution for real-time engagement monitoring on resource-constrained devices.
Muhammad Yusuf Halim, Ahmad Luthfi
Unmanned Aerial Vehicles (UAV) have become vital tools in industrial sectors such as coal mining for site inspections and operational monitoring. However, unauthorized UAV flights present security risks that necessitate forensic investigation. This study examines a forensic case involving a DJI Mini 3 UAV suspected of crossing company boundaries. Using the Conceptual Digital Forensics Model for the Drone Forensic Field, both static and dynamic forensic acquisition methods were applied. Static acquisition recovered 53 photographs, 11 videos, 11 audio files, 10 deleted photos, 4 deleted videos, and 3 unidentified log files. Dynamic acquisition yielded 64 media files including 63 photographs (.JPG and .jpg) with 10 deleted, 14 videos (.MP4, .MOV, .SWF) with 6 deleted, 11 audio files, 4 plain text files, 31 deleted files, 3 EXIF metadata records containing GPS coordinates, and 3 unidentified log files. The GPS data from EXIF metadata was visualized in Google Earth to map flight paths and confirm boundary violations. These findings demonstrate that dynamic acquisition retrieves a more comprehensive artifact set than static acquisition. This study highlights the importance of UAV digital forensics in supporting security investigations and ensuring compliance with industrial UAV policies.
Sen Yan, Etienne Goffinet, Fabrizio Gabellieri et al.
Nuclear Magnetic Resonance (NMR) spectroscopy is a crucial analytical technique used for molecular structure elucidation, with applications spanning chemistry, biology, materials science, and medicine. However, the frequency resolution of NMR spectra is limited by the "field strength" of the instrument. High-field NMR instruments provide high-resolution spectra but are prohibitively expensive, whereas lower-field instruments offer more accessible, but lower-resolution, results. This paper introduces an AI-driven approach that not only enhances the frequency resolution of NMR spectra through super-resolution techniques but also provides multi-scale functionality. By leveraging a diffusion model, our method can reconstruct high-field spectra from low-field NMR data, offering flexibility in generating spectra at varying magnetic field strengths. These reconstructions are comparable to those obtained from high-field instruments, enabling finer spectral details and improving molecular characterization. To date, our approach is one of the first to overcome the limitations of instrument field strength, achieving NMR super-resolution through AI. This cost-effective solution makes high-resolution analysis accessible to more researchers and industries, without the need for multimillion-dollar equipment.
Haonan Pan, Shuheng Chen, Elham Pishgar et al.
Coronary artery disease remains one of the leading causes of mortality globally. Despite advances in revascularization treatments like PCI and CABG, postoperative stroke is inevitable. This study aims to develop and evaluate a sophisticated machine learning prediction model to assess postoperative stroke risk in coronary revascularization patients. This research employed data from the MIMIC-IV database, consisting of a cohort of 7023 individuals. Study data included clinical, laboratory, and comorbidity variables. To reduce multicollinearity, variables with over 30% missing values and features with a correlation coefficient larger than 0.9 were deleted. The dataset was split into 70% training and 30% test sets. The Random Forest technique imputed the remaining missing values. Numerical values were normalized, whereas categorical variables were one-hot encoded. LASSO regularization selected features, and grid search found model hyperparameters. Finally, Logistic Regression, XGBoost, SVM, and CatBoost were employed for predictive modeling, and SHAP analysis assessed stroke risk for each variable. An AUC of 0.855 (0.829-0.878) showed that the SVM model outperformed the logistic regression and CatBoost models in prior research. SHAP analysis showed that the Charlson Comorbidity Index (CCI), diabetes, chronic kidney disease, and heart failure are significant prognostic factors for postoperative stroke. This study shows that improved machine learning reduces overfitting and improves model predictive accuracy. Models using the CCI alone cannot predict postoperative stroke risk as accurately as those using independent comorbidity variables. The suggested technique provides a more thorough and individualized risk assessment by encompassing a wider range of clinically relevant characteristics, making it a better reference for preoperative risk assessments and targeted intervention.
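The feature-screening step described in this abstract (dropping variables with more than 30% missing values, then dropping features correlated above 0.9) can be sketched as follows. This is only an illustration and not the authors' code: the function names, the toy columns, and the rule of keeping the first feature of each highly correlated pair are all assumptions.

```python
from math import sqrt

def missing_fraction(values):
    """Share of entries that are missing (represented here as None)."""
    return sum(v is None for v in values) / len(values)

def pearson(xs, ys):
    """Pearson correlation over rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x, _ in pairs)
    vy = sum((y - my) ** 2 for _, y in pairs)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def filter_features(columns, max_missing=0.30, max_corr=0.90):
    """Return the names of features that survive both screens.

    Assumption: when two features are highly correlated, the one that
    appears first is kept. The abstract does not say which one is dropped.
    """
    kept = []
    for name, values in columns.items():
        if missing_fraction(values) > max_missing:
            continue  # too many missing values
        if any(abs(pearson(values, columns[k])) > max_corr for k in kept):
            continue  # nearly redundant with a feature already kept
        kept.append(name)
    return kept

# Hypothetical toy data, not taken from MIMIC-IV.
toy = {
    "age": [60, 70, 55, 80, 65],
    "age_months": [720, 840, 660, 960, 780],
    "lactate": [None, None, 1.2, None, 2.0],
    "creatinine": [1.0, 0.8, 1.4, 0.9, 1.1],
}
selected = filter_features(toy)
```

In this toy input, `age_months` is perfectly correlated with `age` and `lactate` is 60% missing, so neither survives the screen.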
David J. Gunkel
Maria Frasca, Davide La Torre, Gabriella Pravettoni et al.
This review aims to explore the growing impact of machine learning and deep learning algorithms in the medical field, with a specific focus on the critical issues of explainability and interpretability associated with black-box algorithms. While machine learning algorithms are increasingly employed for medical analysis and diagnosis, their complexity underscores the importance of understanding how these algorithms explain and interpret data to make informed decisions. This review comprehensively analyzes challenges and solutions presented in the literature, offering an overview of the most recent techniques utilized in this field. It also provides precise definitions of interpretability and explainability, aiming to clarify the distinctions between these concepts and their implications for the decision-making process. Our analysis, based on 448 articles and addressing seven research questions, reveals an exponential growth in this field over the last decade. The psychological dimensions of public perception underscore the necessity for effective communication regarding the capabilities and limitations of artificial intelligence. Researchers are actively developing techniques to enhance interpretability, employing visualization methods and reducing model complexity. However, the persistent challenge lies in finding the delicate balance between achieving high performance and maintaining interpretability. Given the growing significance of artificial intelligence in aiding medical diagnosis and therapy, the creation of interpretable artificial intelligence models is considered essential. In this dynamic context, an unwavering commitment to transparency, ethical considerations, and interdisciplinary collaboration is imperative to ensure the responsible use of artificial intelligence. This collective commitment is vital for establishing enduring trust between clinicians and patients, addressing emerging challenges, and facilitating the informed adoption of these advanced technologies in medicine.
Xinjing Qi, Huan Wang, Yubo Ji et al.
As the economy continues to develop and technology advances, there is an increasing societal need for an environmentally friendly ecosystem. Consequently, natural gas, known for its minimal greenhouse gas emissions, has been widely adopted as a clean energy alternative. The accurate prediction of short-term natural gas demand poses a significant challenge within this context, as precise forecasts have important implications for gas dispatch and pipeline safety. The incorporation of intelligent algorithms into prediction methodologies has resulted in notable progress in recent times. Nevertheless, certain limitations persist, including the tendency to easily fall into local optima and inadequate search capability. To address the challenge of accurately predicting daily natural gas loads, we propose a novel methodology that integrates the adaptive particle swarm optimization algorithm, attention mechanism, and bidirectional long short-term memory (BiLSTM) neural networks. The initial step involves utilizing the BiLSTM network to conduct bidirectional data learning. Following this, the attention mechanism is employed to calculate the weights of the hidden layer in the BiLSTM, with a specific focus on weight distribution. Lastly, the adaptive particle swarm optimization algorithm is utilized to comprehensively optimize and design the network structure, initial learning rate, and learning rounds of the BiLSTM network model, thereby enhancing the accuracy of the model. The findings revealed that the combined model achieved a mean absolute percentage error (MAPE) of 0.90% and a coefficient of determination ($R^2$) of 0.99. These results surpassed those of the other comparative models, demonstrating superior prediction accuracy, as well as exhibiting favorable generalization and prediction stability.
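For readers unfamiliar with the two metrics reported in this abstract, the standard definitions of MAPE and the coefficient of determination can be sketched as below. The load values are made up for illustration and are not data from the paper; the function names are likewise assumptions.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical daily gas loads and forecasts (arbitrary units).
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]

error_pct = mape(actual, forecast)   # about 5.0 (percent)
fit = r_squared(actual, forecast)    # about 0.9957
```

A MAPE of 0.90%, as reported, would mean the forecasts deviate from the actual load by under one percent on average.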
Fiona A. M. Porter, A. Scaife
The volume of data from current and future observatories has motivated the increased development and application of automated machine learning methodologies for astronomy. However, less attention has been given to the production of standardized data sets for assessing the performance of different machine learning algorithms within astronomy and astrophysics. Here we describe in detail the MiraBest data set, a publicly available batched data set of 1256 radio-loud AGN from NVSS and FIRST, filtered to 0.03 < z < 0.1, manually labelled by Miraghaei and Best according to the Fanaroff–Riley morphological classification, created for machine learning applications and compatible for use with standard deep learning libraries. We outline the principles underlying the construction of the data set, the sample selection and pre-processing methodology, data set structure and composition, as well as a comparison of MiraBest to other data sets used in the literature. Existing applications that utilize the MiraBest data set are reviewed, and an extended data set of 2100 sources is created by cross-matching MiraBest with other catalogues of radio-loud AGN that have been used more widely in the literature for machine learning applications.
SHEN Xueli, HAN Qianwen
Click-Through Rate (CTR) prediction is one of the most important tools for ad placement. Predicting the CTR of an ad and making recommendations to users can increase ad revenue. Field-aware click-through rate prediction models are superior to other click-through rate prediction models because they consider the field information; however, they generate a large amount of redundant information during feature interaction, which results in a low prediction accuracy. A Field-aware Attention Embedding Neural Network (FAENN) model is herein proposed. This model uses a Self-Attentive Mechanism (SAM) to distribute weights to the input vectors of the embedding layer. This helps to clearly identify the importance of the field-aware embedded features, speeding up the training process. The lower-order feature interaction layer focuses on the explicit first-order information of the features and the second-order interaction feature information and outputs the effective features to the higher-order interaction layer. The higher-order feature interaction layer combines the learned interaction vectors with the deep neural network to capture higher-order feature interactions to improve prediction accuracy. The experimental results show that the FAENN model has a higher prediction accuracy than the FM, FFM, and AFM models.
Anton Nijholt
Minan Tang, Kai Liang, Jiandong Qiu
The proportion of insulators in aerial power patrol images is small and the background of overhead lines is complex, often leading to incomplete and inaccurate detection of insulators. Therefore, an algorithm for detecting insulator targets based on multi‐feature fusion is developed in this study. Firstly, a dynamic threshold Oriented FAST and Rotated BRIEF (ORB) algorithm is proposed, which uses the bag‐of‐words dictionary model to determine local shape features of the image, applies gradient weighting to the global texture feature vector extracted by the histogram of oriented gradients (HOG) algorithm and performs radial gradient transformations to obtain improved HOG features. Secondly, the feature vectors are fused serially, the learning machine is trained and the parameters of the support vector machine are optimized using the quantum particle swarm optimization algorithm. Finally, the target area is pre‐divided by the selective search algorithm, and the area is classified by the learning machine. The experimental results show that the proposed feature extraction method can describe the image details more accurately than the existing methods, and the average accuracy of the feature extraction classifier can reach 93.7%, which helps to overcome the incomplete detection problem of insulator detection at the aerial work site.
Antonio J. Marques Cardoso
Our journal Machines (https://www [...]
Zijian Zhou, Oluwatosin Alabi, Meng Wei et al.
In this paper, we propose a novel text promptable surgical instrument segmentation approach to overcome challenges associated with diversity and differentiation of surgical instruments in minimally invasive surgeries. We redefine the task as text promptable, thereby enabling a more nuanced comprehension of surgical instruments and adaptability to new instrument types. Inspired by recent advancements in vision-language models, we leverage pretrained image and text encoders as our model backbone and design a text promptable mask decoder consisting of attention- and convolution-based prompting schemes for surgical instrument segmentation prediction. Our model leverages multiple text prompts for each surgical instrument through a new mixture of prompts mechanism, resulting in enhanced segmentation performance. Additionally, we introduce a hard instrument area reinforcement module to improve image feature comprehension and segmentation precision. Extensive experiments on several surgical instrument segmentation datasets demonstrate our model's superior performance and promising generalization capability. To our knowledge, this is the first implementation of a promptable approach to surgical instrument segmentation, offering significant potential for practical application in the field of robotic-assisted surgery. Code is available at https://github.com/franciszzj/TP-SIS.
Hongxu Yang, Caifeng Shan, Alexander F. Kolen et al.
Medical instrument detection is essential for computer-assisted interventions, since it helps clinicians find instruments efficiently and interpret images better, thereby improving clinical outcomes. This article reviews image-based medical instrument detection methods for ultrasound-guided (US-guided) operations. Literature is selected based on an exhaustive search in different sources, including Google Scholar, PubMed, and Scopus. We first discuss the key clinical applications of medical instrument detection in US-guided procedures, including delivering regional anesthesia, biopsy taking, prostate brachytherapy, and catheterization. Then, we present a comprehensive review of instrument detection methodologies, including non-machine-learning and machine-learning methods. The conventional non-machine-learning methods were extensively studied before the era of machine learning methods. The principal issues and potential research directions for future studies are summarized for the computer-assisted intervention community. In conclusion, although promising results have been obtained by the current (non-) machine learning methods for different clinical applications, thorough clinical validations are still required.
S. Catalucci, A. Thompson, J. Eastwood et al.
Manufacturing has recently experienced increased adoption of optimised and fast solutions for checking product quality during fabrication, allowing for manufacturing times and costs to be significantly reduced. Due to the integration of machine learning algorithms, advanced sensors and faster processing systems, smart instruments can autonomously plan measurement pipelines, perform decisional tasks and trigger correctional actions as required. In this paper, we summarise the state of the art in smart optical metrology, covering the latest advances in integrated intelligent solutions in optical coordinate and surface metrology, respectively for the measurement of part geometry and surface texture. Within this field, we include the use of a priori knowledge and implementation of machine learning algorithms for measurement planning optimisation. We also cover the development of multi-sensor and multi-view instrument configurations to speed up the measurement process, as well as the design of novel feedback tools for measurement quality evaluation.
Senem Sezer, F. Kartal, Uğur Özveren