Results for "Vocational guidance. Career development"

Showing 20 of ~5288187 results · from DOAJ, arXiv, Semantic Scholar, CrossRef

arXiv Open Access 2026
MCLR: Improving Conditional Modeling via Inter-Class Likelihood-Ratio Maximization and Unifying Classifier-Free Guidance with Alignment Objectives

Xiang Li, Yixuan Jia, Xiao Li et al.

Diffusion models have achieved state-of-the-art performance in generative modeling, but their success often relies heavily on classifier-free guidance (CFG), an inference-time heuristic that modifies the sampling trajectory. From a theoretical perspective, diffusion models trained with standard denoising score matching (DSM) are expected to recover the target data distribution, raising the question of why inference-time guidance is necessary in practice. In this work, we ask whether the DSM training objective can be modified in a principled manner such that standard reverse-time sampling, without inference-time guidance, yields effects comparable to CFG. We identify insufficient inter-class separation as a key limitation of standard diffusion models. To address this, we propose MCLR, a principled alignment objective that explicitly maximizes inter-class likelihood-ratios during training. Models fine-tuned with MCLR exhibit CFG-like improvements under standard sampling, achieving comparable qualitative and quantitative gains without requiring inference-time guidance. Beyond empirical benefits, we provide a theoretical result showing that the CFG-guided score is exactly the optimal solution to a weighted MCLR objective. This establishes a formal equivalence between classifier-free guidance and alignment-based objectives, offering a mechanistic interpretation of CFG.

en cs.LG, cs.AI
arXiv Open Access 2026
VIGiA: Instructional Video Guidance via Dialogue Reasoning and Retrieval

Diogo Glória-Silva, David Semedo, João Magalhães

We introduce VIGiA, a novel multimodal dialogue model designed to understand and reason over complex, multi-step instructional video action plans. Unlike prior work, which focuses mainly on text-only guidance or treats vision and language in isolation, VIGiA supports grounded, plan-aware dialogue that requires reasoning over visual inputs, instructional plans, and interleaved user interactions. To this end, VIGiA incorporates two key capabilities: (1) multimodal plan reasoning, enabling the model to align uni- and multimodal queries with the current task plan and respond accurately; and (2) plan-based retrieval, allowing it to retrieve relevant plan steps in either textual or visual representations. Experiments were conducted on a novel dataset with rich Instructional Video Dialogues aligned with Cooking and DIY plans. Our evaluation shows that VIGiA outperforms existing state-of-the-art models on all tasks in a conversational plan guidance setting, reaching over 90% accuracy on plan-aware VQA.

en cs.CV, cs.CL
DOAJ Open Access 2025
A Digitized Concept for Tracking Schoolchildren Personal Achievements

Sergey S. Zaydullin, Svetlana V. Novikova

Introduction. The need to create a digital portfolio of schoolchildren’s achievements is emphasized by parents, school administrators, and representatives of relevant federal agencies. Civil society is capable of offering government authorities various options for digitizing the record of schoolchildren’s personal achievements, which will accelerate the implementation of the ministry’s plans and increase the system’s conformity to the real needs of the population. The research aim is to develop a conceptual framework for an information system that records personal achievements, based on a comprehensive analysis of the needs of modern society. Materials and Methods. The research was conducted based on information obtained through purposeful searching and extraction of relevant documents from heterogeneous sources of information. For further processing, text document analysis methods were applied: expert text evaluation, intent analysis, and content analysis. Identification of significant factors was carried out using factor analysis with empirical indicators and descriptive statistics. The results were generated using statistical methods of comparing means and statistical visualization. Results. The list of stakeholders of the system was defined, which included schoolchildren, their parents, employees of educational institutions, state educational structures, universities, secondary vocational schools, as well as organizers of thematic contests and Olympiads. An analysis of students’ achievements was conducted, and on that basis a set of data relevant for storage in the system was selected. A conclusion was made about the general decrease in the number of achievements with the increase in the age of students. A list of functional requirements for the information system for recording personal achievements of schoolchildren was formed. A concept for creating a full-fledged and fully functional information system, a version of the “digital portfolio of a student” that meets the requirements of all stakeholders, was developed. Discussion and Conclusion. The proposed information system for recording personal achievements of schoolchildren will make it possible to shape a student’s educational trajectory, automate the reporting system of general education schools on extracurricular achievements, and also assist universities and secondary vocational educational institutions in career guidance for potential applicants. The practical significance of the article lies in the development of recommendations for creating a system to record the achievements of schoolchildren, including its modules and implementation, which is helpful for teachers and civil servants.

DOAJ Open Access 2025
Exploring the influence of collaborative leadership on Grade 12 mathematics results in a low-resourced secondary school

Hendri J. Theron

Background: This case study examines the role of collaborative leadership in Grade 12 Mathematics results. The article challenges the notion that collaborative leadership is not a significant factor in improving Grade 12 Mathematics results, particularly in low-resourced secondary schools in South Africa. Objectives: This study seeks to offer an alternative perspective and highlight the positive influence of collaborative leadership on Grade 12 Mathematics results in a low-resourced secondary school. Methods: The research uses the Collegial Model as the theoretical framework. Utilising a qualitative method, data were gathered through semi-structured interviews with members of the School Management Team (SMT), the chair of the School Governing Body (SGB), and the Circuit Manager (CM) from the provincial Department of Education (DoE). Additional data were collected through on-site observations at the school. Results: The findings indicate that collaborative leadership via the SMT, SGB, and CM improves Grade 12 Mathematics results in a low-resourced secondary school. Conclusion: The article concludes that collaborative leadership relationships empower low-resourced secondary schools to elevate Grade 12 Mathematics results. Contribution: This research challenged the norm and contributes to a pathway of collaboration between the SMT, SGB, and CM and its positive effect.

Vocational guidance. Career development, Social Sciences
arXiv Open Access 2025
Dynamic Theater: Location-Based Immersive Dance Theater, Investigating User Guidance and Experience

You-Jin Kim, Joshua Lu, Tobias Höllerer

Dynamic Theater explores the use of augmented reality (AR) in immersive theater as a platform for digital dance performances. The project presents a locomotion-based experience that allows for full spatial exploration. A large indoor AR theater space was designed to allow users to freely explore the augmented environment. The curated wide-area experience employs various guidance mechanisms to direct users to the main content zones. Results from our 20-person user study show how users experience the performance piece while using a guidance system. The importance of stage layout, guidance system, and dancer placement in immersive theater experiences is highlighted, as they cater to user preferences while enhancing the overall reception of digital content in wide-area AR. Observations after working with dancers and choreographers, as well as their experience and feedback, are also discussed.

arXiv Open Access 2025
Deep Learning-Based Robust Optical Guidance for Hypersonic Platforms

Adrien Chan-Hon-Tong, Aurélien Plyer, Baptiste Cadalen et al.

Sensor-based guidance is required for long-range platforms. To bypass the structural limitation of the classical framework of registration on a reference image, we propose in this paper to encode a stack of images of the scene into a deep network. Relying on a stack is shown to be relevant for bimodal scenes (e.g. when the scene may or may not be snowy).

en cs.CV
arXiv Open Access 2025
GRAND: Guidance, Rebalancing, and Assignment for Networked Dispatch in Multi-Agent Path Finding

Johannes Gaber, Meshal Alharbi, Daniele Gammelli et al.

Large robot fleets are now common in warehouses and other logistics settings, where small control gains translate into large operational impacts. In this article, we address task scheduling for lifelong Multi-Agent Pickup-and-Delivery (MAPD) and propose a hybrid method that couples learning-based global guidance with lightweight optimization. A graph neural network policy trained via reinforcement learning outputs a desired distribution of free agents over an aggregated warehouse graph. This signal is converted into region-to-region rebalancing through a minimum-cost flow, and finalized by small, local assignment problems, preserving accuracy while keeping per-step latency within a 1 s compute budget. We call this approach GRAND: a hierarchical algorithm that relies on Guidance, Rebalancing, and Assignment to explicitly leverage the workspace Network structure and Dispatch agents to tasks. On congested warehouse benchmarks from the League of Robot Runners (LoRR) with up to 500 agents, our approach improves throughput by up to 10% over the 2024 winning scheduler while maintaining real-time execution. The results indicate that coupling graph-structured learned guidance with tractable solvers reduces congestion and yields a practical, scalable blueprint for high-throughput scheduling in large fleets.

en cs.RO, cs.LG
arXiv Open Access 2025
Implementing blind navigation through multi-modal sensing and gait guidance

Feifan Yan, Tianle Zeng, Meixi He

By the year 2023, the global population of individuals with impaired vision had surpassed 220 million. People with impaired vision have difficulty finding paths and avoiding obstacles, and must rely on auxiliary tools for help. Although traditional aids such as guide canes and guide dogs exist, they still have some shortcomings. In this paper, we present our wearable blind guiding device, which performs navigation guidance through our proposed Gait-based Guiding System. Our device innovatively integrates gait phase analysis for walking guidance, and for environmental perception, we use multimodal sensing to acquire diverse environment information. We conducted both indoor and outdoor experiments and compared our device with the standard guide cane. The results show superior performance of our device in blind guidance.

en cs.CV, eess.SY
arXiv Open Access 2025
Reducing students' misconceptions about video game development. A mixed-method study

Łukasz Sikorski, Jacek Matulewski

This study examines students' naïve mindset (misconceptions) about video game development, idealized and inaccurate beliefs that shape an unrealistic understanding of the field. The research evaluated the effectiveness of a fifteen-hour-long lecture series delivered by industry professionals, designed to challenge this mindset and expose students to the complexities and realities of game production. A mixed-methods approach was employed, combining qualitative analysis with a prototype quantitative tool developed to measure levels of misconception. Participants included students (n = 91) from diverse academic backgrounds interested in game creation and professionals (n = 94) working in the video game industry. Findings show that the intervention significantly reduced students' naïve beliefs while enhancing their motivation to pursue careers in the industry. Exposure to professional perspectives fostered a more realistic and informed mindset, taking into account the understanding of the technical, collaborative, and business aspects of game development. The results suggest that incorporating similar expert-led interventions early in game development education can improve learning outcomes, support informed career choices, and mitigate future professional disappointment.

en cs.HC, cs.CY
DOAJ Open Access 2024
A survey on improving core literacy among impoverished students in Chinese pharmaceutical universities under a development-oriented funding system

Juan Chen, Jieru Chen, Jiayu Li et al.

Background: Impoverished students constitute a group that cannot be overlooked in higher education. It is crucial for pharmaceutical universities worldwide to implement financial assistance programs that promote the development and success of students from poor families. Chinese universities have actively explored subsidized education and made important achievements, but there are also problems, which are worth learning from. This study proposed a strategy to improve the core literacy of impoverished pharmaceutical students under the development-oriented funding system. Methods: The study centers on clinical pharmacy students from a pharmaceutical college as the research subjects. A sequential explanatory mixed-methods approach was employed, with quantitative data collected through a survey questionnaire, supplemented by qualitative data collected through in-depth interviews. Data gathered from these two levels were integrated and subjected to statistical analysis. Results: The quantitative survey yielded a total of 397 valid samples, most of whom were female (73.8%), mainly at undergraduate level (89.67%), and from non-urban areas (73.81%). Five of them participated in further qualitative interviews. The combined data identified: (a) financial aid: most students were highly satisfied with the financial support; (b) psychological support: most of the students interviewed reported that the scholarship significantly improved their self-confidence and motivation; (c) academic guidance: funded students had clear expectations for career development and academic guidance, demonstrating a strong need for further professional study; and (d) employment assistance: most students wanted career guidance and career planning support. Conclusions: The financial assistance in pharmaceutical colleges and universities should be enriched to resolve the worries of impoverished clinical pharmacy students through economic assistance, improve their moral cultivation through psychological assistance, strengthen their cultural and scientific literacy, and improve their knowledge and practical ability through academic assistance. Through the integration of pharmacy education, the vocational competence of clinical pharmacy students can be improved by employment assistance, and the reform of higher pharmacy education can be further promoted to improve the training quality of pharmaceutical talents.

Special aspects of education, Medicine
DOAJ Open Access 2024
Explaining the role of entrepreneurial thinking in employees' career advancement: a case study of the Agricultural Jihad Management of Kerman County

مهدی حاج محمد حسنی, کورش رضائی مقدم

The presence of entrepreneurs, as a new strategy in organizations for organizational growth and survival, has special importance and standing in many fields, particularly agriculture. Governments try, with all available resources, to increase the number of people with entrepreneurial characteristics and to provide entrepreneurship training to society. The present study was conducted to examine the relationships between the entrepreneurial thinking of employees of the Agricultural Jihad Management of Kerman County and their job attachment and career advancement. Of the office's 141 employees, 101 were included as the sample using the Krejcie and Morgan (1970) table and simple random sampling. To examine the research model, the variables innovation, risk-taking, creativity, entrepreneurial motivation, entrepreneurial spirit, and independence-seeking (characteristics of entrepreneurial thinking) were examined as independent variables, with job attachment as a mediator of career advancement. The analysis was performed with SPSS, and the structural equation model was drawn with Amos. The results showed that creativity and entrepreneurial spirit have the greatest effect on the career advancement of Agricultural Jihad Management employees; moreover, through the effect of creativity and entrepreneurial spirit on job attachment, their level of career advancement increased further. To foster entrepreneurial thinking among the employees of the Agricultural Jihad Management of Kerman County, it is recommended that, in addition to in-service training, measures be taken to organize and involve employees in various programs based on entrepreneurial characteristics, so that, beyond practicing and reinforcing these characteristics, they are also strengthened and passed on to others.

Vocational guidance. Career development, Agriculture (General)
DOAJ Open Access 2024
Positive inclusive experiences of a same-sex desiring male Foundation Phase teacher

Obakeng Kagola

Background: Much work has been done to realise the discourse of diversity and gender equality in the perceived feminised Foundation Phase (FP) of teaching and learning. Part of the work was done through the inclusion of males in FP teaching to provide learners and teachers alike with diverse learning experiences. In South Africa, most research focuses on why FP teaching remains a feminised space and on the marginalisation of the few male FP teachers. However, less is known about male FP teachers’ positive inclusive experiences. Objectives: To contribute to the discussion by presenting Camagu’s case study, a same-sex desiring male FP teacher, and his positive inclusive experiences in a conservative context in the Eastern Cape province, South Africa. Methods: A qualitative single case study methodology employing photovoice to elicit Camagu’s positive inclusive experiences in FP teaching. Results: Findings revealed positive inclusive practices implemented by the school, discussed under two themes: Finding a sense of belonging through school structures and ‘It’s nice to have someone as different as you in the school …’ - Learners’ perspective. Conclusion: The study suggests a need for increased support for school leadership and teachers to promote inclusive policies and practices affirming diverse identities of all learners and teachers. Contribution: Camagu’s experiences offer a new research dimension by sharing best practices for fostering inclusive school environments. The study challenges deeply rooted gender norms in South African education, particularly in FP teaching. It advocates for inclusion and acceptance of male teachers, regardless of their diverse gender and sexual orientations, promoting gender diversity.

Vocational guidance. Career development, Social Sciences
arXiv Open Access 2024
METDrive: Multi-modal End-to-end Autonomous Driving with Temporal Guidance

Ziang Guo, Xinhao Lin, Zakhar Yagudin et al.

Multi-modal end-to-end autonomous driving has shown promising advancements in recent work. By embedding more modalities into end-to-end networks, the system's understanding of both static and dynamic aspects of the driving environment is enhanced, thereby improving the safety of autonomous driving. In this paper, we introduce METDrive, an end-to-end system that leverages temporal guidance from the embedded time series features of ego states, including rotation angles, steering, throttle signals, and waypoint vectors. The geometric features derived from perception sensor data and the time series features of ego state data jointly guide the waypoint prediction with the proposed temporal guidance loss function. We evaluated METDrive on the CARLA leaderboard benchmarks, achieving a driving score of 70%, a route completion score of 94%, and an infraction score of 0.78.

en cs.RO, cs.CV
arXiv Open Access 2024
Safe and Personalizable Logical Guidance for Trajectory Planning of Autonomous Driving

Yuejiao Xu, Ruolin Wang, Chengpeng Xu et al.

Autonomous vehicles necessitate a delicate balance between safety, efficiency, and user preferences in trajectory planning. Existing traditional or learning-based methods face challenges in adequately addressing all these aspects. In response, this paper proposes a novel component termed the Logical Guidance Layer (LGL), designed for seamless integration into autonomous driving trajectory planning frameworks, specifically tailored for highway scenarios. The LGL guides the trajectory planning with a local target area determined through scenario reasoning, scenario evaluation, and guidance area calculation. Integrating the Responsibility-Sensitive Safety (RSS) model, the LGL ensures formal safety guarantees while accommodating various user preferences defined by logical formulae. Experimental validation demonstrates the effectiveness of the LGL in achieving a balance between safety and efficiency, and meeting user preferences in autonomous highway driving scenarios.

en cs.RO, cs.LO
arXiv Open Access 2024
Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance

I-Hsiang Chen, Wei-Ting Chen, Yu-Wei Liu et al.

Crowd counting and localization have become increasingly important in computer vision due to their wide-ranging applications. While point-based strategies have been widely used in crowd counting methods, they face a significant challenge, i.e., the lack of an effective learning strategy to guide the matching process. This deficiency leads to instability in matching point proposals to target points, adversely affecting overall performance. To address this issue, we introduce an effective approach to stabilize the proposal-target matching in point-based methods. We propose Auxiliary Point Guidance (APG) to provide clear and effective guidance for proposal selection and optimization, addressing the core issue of matching uncertainty. Additionally, we develop Implicit Feature Interpolation (IFI) to enable adaptive feature extraction in diverse crowd scenarios, further enhancing the model's robustness and accuracy. Extensive experiments demonstrate the effectiveness of our approach, showing significant improvements in crowd counting and localization performance, particularly under challenging conditions. The source codes and trained models will be made publicly available.

en cs.CV, cs.AI
DOAJ Open Access 2023
Olympiad participation: Problem-solving skills in mathematically gifted disadvantaged learners

Beccy Stones, Jacobus G. Maree, Joyce Jordaan

Background: Gifted learners are South Africa’s future leaders and investment in the skills of disadvantaged learners would benefit the country. Objectives: This study investigated whether Olympiad participation could develop problem-solving skills in mathematically gifted disadvantaged learners. Methods: The methodology of the study was quantitative. A total of 100 mathematically gifted Grade 7 learners from two quintile two schools in the same disadvantaged area of South Africa were exposed either to Olympiad-style questions (South African Mathematics Challenge past papers), or traditional Department of Education worksheets. Five aspects of Study Orientation, including problem-solving behaviour, were assessed using the Study Orientation in Mathematics (SOM) before and after the intervention. Results: The findings revealed a correlation between success in traditional mathematics and study attitude, study habits, and overall study orientation, as well as an interaction between disadvantage and success in mathematics. The intervention did not increase problem-solving skills. Participants found the Olympiad-type questions unfamiliar and difficult, which is indicative of the limited enrichment opportunities for mathematically gifted learners in disadvantaged areas of South Africa. Conclusion: Poverty and giftedness were shown to interact: the gifted disadvantaged learners in this study were less disadvantaged by their surroundings than one would expect and conversely had higher mathematics anxiety than expected for their achievement level. Contribution: This study highlights the need to nurture the skills of mathematically gifted disadvantaged children.

Vocational guidance. Career development, Social Sciences
DOAJ Open Access 2023
Presenting and validating a model for developing entrepreneurship mentors in Iran

مصطفی اسلامبول چی, محمد عزیزی, سید رسول حسینی

Entrepreneurship education can help improve a person's ability to innovate, start a business, and create jobs. Having a mentor is therefore essential for an entrepreneur to succeed. Accordingly, the present study was conducted to present and validate a model for developing entrepreneurship mentors in Iran. The statistical population was entrepreneurship mentors in Iran, of whom 384 were selected as the sample using Cochran's formula. The measurement instrument was a researcher-made questionnaire whose validity was established using the expert opinions of professors and experienced entrepreneurship mentors. Its reliability was assessed by calculating Cronbach's alpha coefficient. After the data were collected and categorized, descriptive and inferential statistics were applied in SPSS20, and Smart PLS3 was used to derive the structural equation model. The results for the relationships among the components of the entrepreneurship mentor model showed that the strongest relationship was "family support - personality traits", with a path coefficient of 0.832, and the weakest was "promotional activities - market knowledge", with a path coefficient of 0.130. The findings indicate that two factors, mentoring knowledge and mentoring competence, had a positive and significant effect on developing entrepreneurship mentors at the 99 percent confidence level. Based on the results, less experienced entrepreneurs are advised to acquire professional mentoring capability and knowledge through continuous contact with senior mentors, drawing on their experience in starting and growing businesses, and building a strong network, so that mentor development can help revive mentorship in businesses.

Vocational guidance. Career development, Agriculture (General)
arXiv Open Access 2023
Temperature of the hot $α$-source: a guidance to probe the Bose-Einstein condensation of $α$ clusters in the heavy-ion collision

Jun Su, Long Zhu

Recent experimental efforts were made to explore the Bose-Einstein condensation in multi-$α$ systems using heavy-ion collisions. This Letter provides an explanation of why no signatures were observed and guidance for future experiments. More precisely, a harmonic oscillator model is developed to study the multi-$α$ source at or near zero temperature. The temperature and particle population of the multi-$α$ sources observed in the current experiments are extracted. It is found that almost no $α$ particles occupy the ground state in those multi-$α$ sources. The critical temperatures for the multi-$α$-condensed states are predicted, which provides guidance for future experiments to probe the Bose-Einstein condensation of $α$ clusters in heavy-ion collisions.

arXiv Open Access 2023
Improving Computational Efficiency for Powered Descent Guidance via Transformer-based Tight Constraint Prediction

Julia Briden, Trey Gurga, Breanna Johnson et al.

In this work, we present Transformer-based Powered Descent Guidance (T-PDG), a scalable algorithm for reducing the computational complexity of the direct optimization formulation of the spacecraft powered descent guidance problem. T-PDG uses data from prior runs of trajectory optimization algorithms to train a transformer neural network, which accurately predicts the relationship between problem parameters and the globally optimal solution for the powered descent guidance problem. The solution is encoded as the set of tight constraints corresponding to the constrained minimum-cost trajectory and the optimal final time of landing. By leveraging the attention mechanism of transformer neural networks, large sequences of time series data can be accurately predicted when given only the spacecraft state and landing site parameters. When applied to the real problem of Mars powered descent guidance, T-PDG reduces the time for computing the 3 degree of freedom fuel-optimal trajectory, when compared to lossless convexification, from an order of 1-8 seconds to less than 500 milliseconds. A safe and optimal solution is guaranteed by including a feasibility check in T-PDG before returning the final trajectory.

en math.OC, cs.LG
arXiv Open Access 2023
Phase-Specific Augmented Reality Guidance for Microscopic Cataract Surgery Using Long-Short Spatiotemporal Aggregation Transformer

Puxun Tu, Hongfei Ye, Haochen Shi et al.

Phacoemulsification cataract surgery (PCS) is a routine procedure conducted using a surgical microscope, heavily reliant on the skill of the ophthalmologist. While existing PCS guidance systems extract valuable information from surgical microscopic videos to enhance intraoperative proficiency, they suffer from non-phase-specific guidance, leading to redundant visual information. In this study, our major contribution is the development of a novel phase-specific augmented reality (AR) guidance system, which offers tailored AR information corresponding to the recognized surgical phase. Leveraging the inherent quasi-standardized nature of PCS procedures, we propose a two-stage surgical microscopic video recognition network. In the first stage, we implement a multi-task learning structure to segment the surgical limbus region and extract limbus region-focused spatial features for each frame. In the second stage, we propose the long-short spatiotemporal aggregation transformer (LS-SAT) network to model local fine-grained and global temporal relationships, and combine the extracted spatial features to recognize the current surgical phase. Additionally, we collaborate closely with ophthalmologists to design AR visual cues by utilizing techniques such as limbus ellipse fitting and regional restricted normal cross-correlation rotation computation. We evaluated the network on publicly available and in-house datasets, with comparison results demonstrating its superior performance compared to related works. Ablation results further validated the effectiveness of the limbus region-focused spatial feature extractor and the combination of temporal features. Furthermore, the developed system was evaluated in a clinical setup, with results indicating remarkable accuracy and real-time performance, underscoring its potential for clinical applications.

en cs.CV
