Results for "artificial intelligence"

Showing 20 of ~3571549 results · from DOAJ, Semantic Scholar, CrossRef

S2 Open Access 2025
DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning

DeepSeek-AI, Daya Guo, Dejian Yang et al.

General reasoning represents a long-standing and formidable challenge in artificial intelligence (AI). Recent breakthroughs, exemplified by large language models (LLMs) [1,2] and chain-of-thought (CoT) prompting [3], have achieved considerable success on foundational reasoning tasks. However, this success is heavily contingent on extensive human-annotated demonstrations and the capabilities of models are still insufficient for more complex problems. Here we show that the reasoning abilities of LLMs can be incentivized through pure reinforcement learning (RL), obviating the need for human-labelled reasoning trajectories. The proposed RL framework facilitates the emergent development of advanced reasoning patterns, such as self-reflection, verification and dynamic strategy adaptation. Consequently, the trained model achieves superior performance on verifiable tasks such as mathematics, coding competitions and STEM fields, surpassing its counterparts trained through conventional supervised learning on human demonstrations. Moreover, the emergent reasoning patterns exhibited by these large-scale models can be systematically used to guide and enhance the reasoning capabilities of smaller models. A new artificial intelligence model, DeepSeek-R1, is introduced, demonstrating that the reasoning abilities of large language models can be incentivized through pure reinforcement learning, removing the need for human-annotated demonstrations.

5353 citations en Medicine, Computer Science
S2 Open Access 2023
What if the devil is my guardian angel: ChatGPT as a case study of using chatbots in education

A. Tlili, Boulus Shehata, M. Adarkwah et al.

Artificial Intelligence (AI) technologies have been progressing constantly and becoming more visible in different aspects of our lives. One recent phenomenon is ChatGPT, a chatbot with a conversational artificial intelligence interface that was developed by OpenAI. As one of the most advanced artificial intelligence applications, ChatGPT has drawn much public attention across the globe. In this regard, this study examines ChatGPT in education, among early adopters, through a qualitative instrumental case study. Conducted in three stages, the first stage of the study reveals that the public discourse in social media is generally positive and there is enthusiasm regarding its use in educational settings. However, some voices urge caution about using ChatGPT in educational settings. The second stage of the study examines the case of ChatGPT through lenses of educational transformation, response quality, usefulness, personality and emotion, and ethics. In the third and final stage of the study, the investigation of user experiences through ten educational scenarios revealed various issues, including cheating, honesty and truthfulness of ChatGPT, privacy misleading, and manipulation. The findings of this study provide several research directions that should be considered to ensure a safe and responsible adoption of chatbots, specifically ChatGPT, in education.

1325 citations en Computer Science
S2 Open Access 2021
Machine Learning: Algorithms, Real-World Applications and Research Directions

Iqbal H. Sarker

In the current age of the Fourth Industrial Revolution (4IR or Industry 4.0), the digital world has a wealth of data, such as Internet of Things (IoT) data, cybersecurity data, mobile data, business data, social media data, health data, etc. To intelligently analyze these data and develop the corresponding smart and automated applications, knowledge of artificial intelligence (AI), particularly machine learning (ML), is key. Various types of machine learning algorithms such as supervised, unsupervised, semi-supervised, and reinforcement learning exist in the area. In addition, deep learning, which is part of a broader family of machine learning methods, can intelligently analyze data on a large scale. In this paper, we present a comprehensive view on these machine learning algorithms that can be applied to enhance the intelligence and the capabilities of an application. Thus, this study’s key contribution is explaining the principles of different machine learning techniques and their applicability in various real-world application domains, such as cybersecurity systems, smart cities, healthcare, e-commerce, agriculture, and many more. We also highlight the challenges and potential research directions based on our study. Overall, this paper aims to serve as a reference point for both academia and industry professionals as well as for decision-makers in various real-world situations and application areas, particularly from the technical point of view.

4210 citations en Computer Science, Medicine
S2 Open Access 2020
Reinforcement learning

F. Wörgötter, B. Porr

Observing celestial objects and advancing our scientific knowledge about them involves tedious planning, scheduling, data collection and data post-processing. Many of these operational aspects of astronomy are guided and executed by expert astronomers. Reinforcement learning is a mechanism where we (as humans and astronomers) can teach agents of artificial intelligence to perform some of these tedious tasks. In this paper, we will present a state of the art overview of reinforcement learning and how it can benefit astronomy.

3221 citations en Computer Science, Physics
S2 Open Access 2018
Representation Learning with Contrastive Predictive Coding

Aäron van den Oord, Yazhe Li, O. Vinyals

While supervised learning has enabled great progress in many applications, unsupervised learning has not seen such widespread adoption, and remains an important and challenging endeavor for artificial intelligence. In this work, we propose a universal unsupervised learning approach to extract useful representations from high-dimensional data, which we call Contrastive Predictive Coding. The key insight of our model is to learn such representations by predicting the future in latent space by using powerful autoregressive models. We use a probabilistic contrastive loss which induces the latent space to capture information that is maximally useful to predict future samples. It also makes the model tractable by using negative sampling. While most prior work has focused on evaluating representations for a particular modality, we demonstrate that our approach is able to learn useful representations achieving strong performance on four distinct domains: speech, images, text and reinforcement learning in 3D environments.

12953 citations en Computer Science, Mathematics
S2 Open Access 2015
An Introduction to Convolutional Neural Networks

K. O’Shea, Ryan Nash

The field of machine learning has taken a dramatic twist in recent times, with the rise of the Artificial Neural Network (ANN). These biologically inspired computational models are able to far exceed the performance of previous forms of artificial intelligence in common machine learning tasks. One of the most impressive forms of ANN architecture is that of the Convolutional Neural Network (CNN). CNNs are primarily used to solve difficult image-driven pattern recognition tasks and, with their precise yet simple architecture, offer a simplified method of getting started with ANNs. This document provides a brief introduction to CNNs, discussing recently published papers and newly formed techniques in developing these brilliantly fantastic image recognition models. This introduction assumes you are familiar with the fundamentals of ANNs and machine learning.

3963 citations en Computer Science
S2 Open Access 2010
The use of computational intelligence in intrusion detection systems: A review

Shelly Xiaonan Wu, W. Banzhaf

Intrusion detection based upon computational intelligence is currently attracting considerable interest from the research community. Characteristics of computational intelligence (CI) systems, such as adaptation, fault tolerance, high computational speed and error resilience in the face of noisy information, fit the requirements of building a good intrusion detection model. Here we want to provide an overview of the research progress in applying CI methods to the problem of intrusion detection. The scope of this review will encompass core methods of CI, including artificial neural networks, fuzzy systems, evolutionary computation, artificial immune systems, swarm intelligence, and soft computing. The research contributions in each field are systematically summarized and compared, allowing us to clearly define existing research challenges, and to highlight promising new research directions. The findings of this review should provide useful insights into the current IDS literature and be a good source for anyone who is interested in the application of CI approaches to IDSs or related fields.

804 citations en Computer Science
DOAJ Open Access 2026
The role of large language models in emergency care: a comprehensive benchmarking study

Borna Naderi, Longsha Liu, Anita Ghandehari et al.

With EDs increasingly overburdened, Large Language Models (LLMs) may help streamline workflow and decision-making. We evaluated their emergency medicine knowledge and performance in simulated ED tasks. This two-part study first tested factual knowledge of 18 LLMs using a curated MedMCQA subset covering 12 ED chief complaints, assessing accuracy, precision, and recall. Five models (GPT-5, GPT-4, Claude 3.5, Claude 4, and LLaMA 3.1) were then evaluated on patient summaries, Emergency Severity Index scoring, investigative questioning, management planning, and differential diagnosis across 12 simulated ED cases presented through four sequential information levels. Physicians rated outputs for accuracy, safety, and clinical relevance, with performance differences analyzed statistically. LLaMA-4 Maverick achieved the highest factual accuracy (90.7%), followed by LLaMA-3.1-70B (90.1%). In clinical tasks, GPT-5 outperformed all models (Level 2 onwards, p < 0.05), with performance stable or improving as complexity increased. Claude 3.5 ranked next, while Claude 4 performed slightly lower but stable with complexity. LLaMA-3.1 and GPT-4 ranked lowest and showed the greatest degradation. All models undertriaged except Claude 3.5, which initially overtriaged. GPT-5 demonstrated the strongest clinical reasoning and scalability with complexity, while LLaMA models excelled in factual recall. Findings suggest a generational leap in reasoning performance and support GPT-5 as a potential ED decision-support tool.

Information technology
DOAJ Open Access 2026
Performance of successive generative pretrained transformers (GPT) models in medical cases and board style questions

Anshum Patel, Het Contractor, Hayden Heninger et al.

Large language models (LLMs) are evolving rapidly, yet their performance trajectory in specialized medical domains remains incompletely characterized. We evaluated the diagnostic and knowledge-based accuracy of six successive generative pre-trained transformer (GPT) models to test the hypothesis that performance gains are beginning to plateau. We conducted a comparative evaluation of GPT-3.5 Turbo, GPT-4-Turbo, GPT-4o, GPT-4.1, GPT-o3, and GPT-5 using two datasets: 78 sleep medicine case vignettes to assess diagnostic reasoning, and 897 sleep medicine board-style multiple choice questions (MCQs) to assess domain knowledge. Diagnostic accuracy improved across model generations on clinical vignettes, from 74.4% (58/78) for GPT-3.5 Turbo to 93.6% (73/78) for GPT-o3 and 91.0% (71/78) for GPT-5. A similar trend occurred for MCQs, increasing from 56.9% for GPT-3.5 Turbo to 93.0% for GPT-5. Pairwise comparisons confirmed significant improvements for advanced models over earlier iterations on both tasks (P < 0.05), and the most recent models demonstrated high levels of clinical competency. These results suggest that the latest LLMs may be approaching a high level of performance in medical tasks of sleep medicine diagnosis and knowledge retrieval. Future progress may require incorporation of curated medical datasets and domain-specific training to achieve clinical-grade reliability.

Medicine, Science
DOAJ Open Access 2025
Enhanced soil salinity index prediction using hybrid stacking ensemble machine learning with explainable artificial intelligence (XAI) technique: a case study of the Nile Delta, Egypt

Satiprasad Sahoo, Chiranjit Singha, Ajit Govind

Soil salinity represents the leading form of land degradation in arid and semi-arid regions. This study employed five hybrid stacking ensemble (SE) machine learning models (SE-GBM, SE-RF, SE-SVM, SE-XGB, and SE-MARS) to map salinity distribution across Egypt’s Nile Delta for 2023 and projected conditions for 2030, using EC-Earth3 and MIROC6 CMIP6 climate scenarios under SSP2-4.5 and SSP5-8.5. Results reveal substantial differences between scenarios, with SSP5-8.5 indicating up to a 15% higher salinity increase in the eastern Delta compared to SSP2-4.5. This highlights its reliability for assessing future salinity dynamics across the Nile Delta. Model validation confirmed that the SE-GBM model achieved the highest accuracy in predicting soil salinity, with an R² of 0.396 and RMSE of 0.061. Except for the MARS model, which had low accuracy, all models indicated that the north-eastern, eastern, and south-eastern Nile Delta had the highest soil salinity in 2023. Salinization in these zones is driven by climate change, seawater intrusion, poor irrigation, and human pressures. Boruta analysis highlighted pH as the most influential predictor, while bulk density was least significant. SHAP (SHapley Additive exPlanations) results further showed precipitation and clay content as key drivers of salinity variability. These findings underline the robustness of machine learning models in capturing complex soil–climate interactions. Future work should expand applications globally, especially in resource-constrained regions.

Science (General)
DOAJ Open Access 2025
Research on evaluation methods for inference business capability of intelligent computing center

WU Zhenyu, ZHAO Zhanjun, BU Zhonggui et al.

The construction of artificial intelligence inference centers has become a hotspot in the current development of intelligent computing centers. Evaluating the inference business capability of intelligent computing centers solely based on the scale of intelligent computing power is no longer accurate. A quantitative evaluation method for the inference business capability of intelligent computing centers was proposed by establishing three models: a delay-insensitive business model, a delay-sensitive business model, and a user access business model. This approach aims to achieve alignment between construction and requirements during the construction phase, thereby improving investment efficiency.

Telecommunication, Technology

Page 42 of 178578