Agnivo Gosai, Shuvodeep De, Karun Thankachan
et al.
This paper presents a comprehensive survey of sentiment analysis methods for movie reviews, a benchmark task that has played a central role in advancing natural language processing. We review the evolution of techniques from early lexicon-based and classical machine learning approaches to modern deep learning architectures and large language models, covering widely used datasets such as IMDb, Rotten Tomatoes, and SST-2, and models ranging from Naive Bayes and support vector machines to LSTM networks, BERT, and attention-based transformers. Beyond summarizing prior work, this survey differentiates itself by offering a comparative, challenge-driven analysis of how these modeling paradigms address domain-specific issues such as sarcasm, negation, contextual ambiguity, and domain shift, which remain open problems in the existing literature. Unlike earlier reviews that focus primarily on text-only pipelines, we also synthesize recent advances in multimodal sentiment analysis that integrate textual, audio, and visual cues from movie trailers and clips. In addition, we examine emerging concerns related to interpretability, fairness, and robustness that are often underexplored in prior surveys, and we outline future research directions including zero-shot and few-shot learning, hybrid symbolic--neural models, and real-time deployment considerations. Overall, this survey provides a domain-focused roadmap that highlights both established solutions and unresolved challenges toward building more accurate, generalizable, and explainable sentiment analysis systems for movie review data.
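As a point of reference for the classical end of this spectrum, the sketch below shows a minimal bag-of-words Naive Bayes sentiment classifier of the kind the survey covers; the toy reviews are illustrative stand-ins, not samples from IMDb or SST-2.

```python
# Illustrative classical baseline: TF-IDF bag-of-words + Naive Bayes.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

reviews = [
    "A moving, beautifully acted film.",
    "Sharp writing and a terrific cast.",
    "Dull, predictable, and far too long.",
    "Not funny and not clever.",
]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
model.fit(reviews, labels)
print(model.predict(["The cast is terrific but the plot is dull."]))
```

Baselines of this kind have no notion of scope or context, which is precisely why sarcasm and negation remain the open problems the survey emphasizes.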
In modern display technology and visualization tools, downscaling images is one of the most important operations. It aims to preserve both visual authenticity and structural integrity while reducing an image's dimensions at a large scale to fit the dimensions of display devices. In this study, we propose a new image downscaling technique that uses co-occurrence learning to maintain structural and perceptual information while reducing resolution. The technique builds a data-driven co-occurrence profile from the input image that captures the frequency of intensity correlations in nearby neighborhoods. This profile acts as a content-adaptive range kernel that guides a refined filtering process: each input pixel's contribution is weighted by how closely its pair-wise intensity values resemble those of its neighbors. We validate the proposed technique on four datasets, DIV2K, BSD100, Urban100, and RealSR, to demonstrate its effective downscaling capacity. Our technique obtains up to 39.22 dB PSNR and a PIQE of up to 26.35 on the DIV2K dataset when downscaling by 8x and 16x, respectively. Extensive experimental findings attest to the ability of the proposed image downscaling method to outperform contemporary approaches in terms of both visual quality and performance measures. Unlike most existing methods, which do not address the large-scale image resizing scenario, we achieve high-quality downscaled images without texture loss or edge blurring. Our method, LSID (large-scale image downscaling), successfully preserves high-frequency structures such as edges, textures, and repeating patterns by focusing on statistically consistent pixels while reducing the aliasing and blurring artifacts that are typical of traditional downscaling techniques.
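A minimal sketch of our reading of this mechanism follows: build an intensity co-occurrence profile from the input, use it as a content-adaptive range kernel around each target pixel, and subsample. The function names, neighborhood definition, and all parameters are illustrative assumptions, not the authors' LSID implementation.

```python
# Sketch: co-occurrence profile as a content-adaptive range kernel (assumed
# grayscale input in [0, 1]; single horizontal-neighbor pairs for brevity).
import numpy as np

def cooccurrence_profile(img, bins=64):
    """Normalized histogram of (pixel, right-neighbor) intensity pairs."""
    q = np.clip((img * (bins - 1)).astype(int), 0, bins - 1)
    C = np.zeros((bins, bins))
    np.add.at(C, (q[:, :-1].ravel(), q[:, 1:].ravel()), 1)
    C = C + C.T                      # treat pairs symmetrically
    return C / C.sum()

def downscale(img, factor=8, radius=4, bins=64):
    C = cooccurrence_profile(img, bins)
    q = np.clip((img * (bins - 1)).astype(int), 0, bins - 1)
    H, W = img.shape
    out = np.zeros((H // factor, W // factor))
    for i in range(0, H - factor + 1, factor):
        for j in range(0, W - factor + 1, factor):
            ci, cj = i + factor // 2, j + factor // 2
            i0, i1 = max(ci - radius, 0), min(ci + radius + 1, H)
            j0, j1 = max(cj - radius, 0), min(cj + radius + 1, W)
            # Co-occurrence frequency with the center acts as the range weight,
            # favoring statistically consistent pixels.
            w = C[q[ci, cj], q[i0:i1, j0:j1]]
            out[i // factor, j // factor] = (w * img[i0:i1, j0:j1]).sum() / (w.sum() + 1e-12)
    return out

small = downscale(np.random.rand(256, 256), factor=8)
```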
From large language models to multi-modal agents, Generative Artificial Intelligence (AI) now underpins state-of-the-art systems. Despite their varied architectures, many share a common foundation in probabilistic latent variable models (PLVMs), where hidden variables explain observed data for density estimation, latent reasoning, and structured inference. This paper presents a unified perspective by framing both classical and modern generative methods within the PLVM paradigm. We trace the progression from classical flat models such as probabilistic PCA, Gaussian mixture models, latent class analysis, item response theory, and latent Dirichlet allocation, through their sequential extensions including Hidden Markov Models, Gaussian HMMs, and Linear Dynamical Systems, to contemporary deep architectures: Variational Autoencoders as Deep PLVMs, Normalizing Flows as Tractable PLVMs, Diffusion Models as Sequential PLVMs, Autoregressive Models as Explicit Generative Models, and Generative Adversarial Networks as Implicit PLVMs. Viewing these architectures under a common probabilistic taxonomy reveals shared principles, distinct inference strategies, and the representational trade-offs that shape their strengths. We offer a conceptual roadmap that consolidates generative AI's theoretical foundations, clarifies methodological lineages, and guides future innovation by grounding emerging architectures in their probabilistic heritage.
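To make the shared PLVM structure concrete, the sketch below implements the simplest flat case, a two-component Gaussian mixture, showing the three roles highlighted above: ancestral sampling from p(z)p(x|z), density estimation via the marginal p(x), and latent inference via the posterior p(z|x). The numbers are illustrative only.

```python
# A minimal flat PLVM: a discrete latent z "explains" each observation x.
import numpy as np

rng = np.random.default_rng(0)
pi = np.array([0.3, 0.7])     # p(z): mixing weights
mu = np.array([-2.0, 3.0])    # p(x|z): component means
sigma = np.array([0.5, 1.0])  # p(x|z): component std devs

# Ancestral sampling: draw z, then x given z.
z = rng.choice(2, size=1000, p=pi)
x = rng.normal(mu[z], sigma[z])

def log_px(x):
    """Density estimation: log of the marginal p(x) = sum_z p(z) p(x|z)."""
    comp = -0.5 * ((x[:, None] - mu) / sigma) ** 2 \
           - np.log(sigma * np.sqrt(2 * np.pi))
    return np.log(np.exp(comp) @ pi)

def posterior(x):
    """Latent inference: responsibilities p(z|x) via Bayes' rule."""
    joint = np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / sigma * pi
    return joint / joint.sum(axis=1, keepdims=True)

print(log_px(x).mean(), posterior(x[:3]))
```

The deep architectures surveyed above replace these closed-form densities with neural parameterizations, but the sample/marginalize/infer triad stays the same.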
Serry Sibaee, Omer Nacar, Yasser Al-Habashi
et al.
The rich linguistic landscape of the Arab world is characterized by a significant gap between Modern Standard Arabic (MSA), the language of formal communication, and the diverse regional dialects used in everyday life. This diglossia presents a formidable challenge for natural language processing, particularly machine translation. This paper introduces \textbf{SHAMI-MT}, a bidirectional machine translation system specifically engineered to bridge the communication gap between MSA and the Syrian dialect. We present two specialized models, one for MSA-to-Shami and another for Shami-to-MSA translation, both built upon the state-of-the-art AraT5v2-base-1024 architecture. The models were fine-tuned on the comprehensive Nabra dataset and rigorously evaluated on unseen data from the MADAR corpus. Our MSA-to-Shami model achieved an outstanding average quality score of \textbf{4.01 out of 5.0} when judged by OpenAI's GPT-4.1 model, demonstrating its ability to produce translations that are not only accurate but also dialectally authentic. This work provides a crucial, high-fidelity tool for a previously underserved language pair, advancing the field of dialectal Arabic translation and offering significant applications in content localization, cultural heritage, and intercultural communication.
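Since both models build on a Hugging Face seq2seq architecture, usage would presumably follow the standard transformers pattern sketched below; the checkpoint name is a placeholder, not an identifier published in the paper.

```python
# Hypothetical usage sketch for an AraT5v2-based MSA->Shami model.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

name = "org/shami-mt-msa2shami"  # placeholder checkpoint name
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

msa = "كيف حالك اليوم؟"  # MSA input ("How are you today?")
inputs = tokenizer(msa, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```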
Power is the primary design objective of large-scale integrated circuits (ICs), especially for complex modern processors (i.e., CPUs). Accurate CPU power evaluation requires designers to go through the whole time-consuming IC implementation process, easily taking months. At the early design stage (e.g., architecture-level), classical power models are notoriously inaccurate. Recently, ML-based architecture-level power models have been proposed to boost accuracy, but data availability is a severe challenge. Currently, there is no open-source dataset for this important ML application. A typical dataset generation process involves correct CPU design implementation and repetitive execution of power simulation flows, requiring significant design expertise, engineering effort, and execution time. Even private in-house datasets often fail to reflect realistic CPU design scenarios. In this work, we propose ArchPower, the first open-source dataset for architecture-level processor power modeling. We go through complex and realistic design flows to collect the CPU architectural information as features and the ground-truth simulated power as labels. Our dataset includes 200 CPU data samples, collected from 25 different CPU configurations when executing 8 different workloads. There are more than 100 architectural features in each data sample, including both hardware and event parameters. The label of each sample provides fine-grained power information, including the total design power and the power for each of the 11 components. Each power value is further decomposed into four fine-grained power groups: combinational logic power, sequential logic power, memory power, and clock power. ArchPower is available at https://github.com/hkust-zhiyao/ArchPower.
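A hypothetical consumption sketch for a dataset with this shape (roughly 200 samples, over 100 architectural features, per-sample power labels) is shown below; the file names and layout are assumptions, not the repository's actual format.

```python
# Sketch: fit an ML power model on ArchPower-shaped data (assumed .npy files).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

X = np.load("features.npy")     # (200, n_features) hardware + event parameters
y = np.load("total_power.npy")  # (200,) simulated total design power

model = GradientBoostingRegressor()
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print("5-fold R^2:", scores.mean())
```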
Akansha Kalra, Basavasagar Patil, Guanhong Tao
et al.
Learning from Demonstration (LfD) algorithms have shown promising results in robotic manipulation tasks, but their vulnerability to offline universal perturbation attacks remains underexplored. This paper presents a comprehensive study of adversarial attacks on both classic and recently proposed algorithms, including Behavior Cloning (BC), LSTM-GMM, Implicit Behavior Cloning (IBC), Diffusion Policy (DP), and Vector-Quantized Behavior Transformer (VQ-BET). We study the vulnerability of these methods to universal adversarial perturbations. Our experiments on several simulated robotic manipulation tasks reveal that most of the current methods are highly vulnerable to adversarial perturbations. We also show that these attacks are often transferable across algorithms, architectures, and tasks, exposing concerning security vulnerabilities to black-box attacks. To the best of our knowledge, we are the first to present a systematic study of the vulnerabilities of different LfD algorithms to both white-box and black-box attacks. Our findings highlight the vulnerabilities of modern BC algorithms, paving the way for future work in addressing such limitations.
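The sketch below illustrates the attack family studied: a single input-agnostic perturbation optimized offline over demonstration data to degrade a behavior-cloning policy. The `policy` callable and the demo tensors are placeholders; this is a schematic, not the paper's attack code.

```python
# Sketch of a universal (input-agnostic) perturbation against a BC policy.
import torch

def universal_perturbation(policy, obs, actions, eps=0.03, steps=100, lr=0.01):
    # One shared perturbation applied to every observation in the batch.
    delta = torch.zeros_like(obs[0], requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        pred = policy(obs + delta)               # broadcast over the batch
        loss = -torch.nn.functional.mse_loss(pred, actions)  # push actions away
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)              # keep the attack norm-bounded
    return delta.detach()
```

Because the perturbation is fixed at attack time, it can be precomputed offline and replayed against other policies, which is what makes the transferability results above a black-box concern.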
We revisit the longstanding electromagnetic mass problem from a modern quantum field theory perspective. Focusing on a system of two widely separated hydrogen atoms, one in an excited $nS$ state and the other in the ground $1S$ state, we isolate the electromagnetic contribution to the electron's total linear momentum by comparing the full energy-momentum tensor with the predictions of a point-like bound state model. Our analysis reveals that the leading perturbative correction introduces a factor $4/3$, which, along with subsequent corrections, indicates that the effective electromagnetic mass deviates from the conventional relation $E/c^2$. This discrepancy is attributed to the intrinsic nonlocality of the electromagnetic field, rather than to additional compensating mechanisms such as Poincaré stresses. We further contrast our quantum field theory results with the highly accurate predictions of the Schrödinger equation, which, despite neglecting higher-order terms, achieves an average error on the order of $10^{-5}\%$. Attempts to improve this accuracy via perturbative inclusion of the self-interaction of the electron's wave function instead increase the error, prompting a re-examination of the underlying perturbative assumptions. Our findings suggest that a non-perturbative treatment of the tree-level action may be required to fully capture the dynamics of bound states in quantum field theory.
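For orientation, the classical statement of the 4/3 discrepancy mentioned above can be written as follows (standard textbook form, quoted for context rather than the paper's QFT derivation):

```latex
% Effective electromagnetic mass of a slowly moving charge distribution:
% the field momentum exceeds U/c^2 by the famous factor 4/3.
\begin{equation}
  \mathbf{p}_{\mathrm{em}}
  = \frac{4}{3}\,\frac{U_{\mathrm{em}}}{c^{2}}\,\mathbf{v}
  \quad\Longrightarrow\quad
  m_{\mathrm{em}} = \frac{4}{3}\,\frac{U_{\mathrm{em}}}{c^{2}}
  \neq \frac{U_{\mathrm{em}}}{c^{2}},
  \qquad v \ll c .
\end{equation}
```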
Supervised models for Word Sense Disambiguation (WSD) currently yield state-of-the-art results on the most popular benchmarks. Despite the recent introduction of Word Embeddings and Recurrent Neural Networks to design powerful context-related features, the interest in improving WSD models using Semantic Lexical Resources (SLRs) is mostly restricted to knowledge-based approaches. In this paper, we enhance "modern" supervised WSD models exploiting two popular SLRs: WordNet and WordNet Domains. We propose an effective way to introduce semantic features into the classifiers, and we consider using the SLR structure to augment the training data. We study the effect of different types of semantic features, investigating their interaction with local contexts encoded by means of mixtures of Word Embeddings or Recurrent Neural Networks, and we extend the proposed model into a novel multi-layer architecture for WSD. A detailed experimental comparison in the recent Unified Evaluation Framework (Raganato et al., 2017) shows that the proposed approach leads to supervised models that compare favourably with the state of the art.
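As an illustration of the kind of SLR-derived signal involved, the sketch below extracts simple WordNet features (supersense labels and direct hypernyms) for a lemma using NLTK; the specific feature templates are our illustrative choice, not the paper's feature set.

```python
# Sketch: WordNet-derived semantic features for a supervised WSD classifier.
from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

def semantic_features(lemma, pos=wn.NOUN):
    feats = {}
    for synset in wn.synsets(lemma, pos=pos):
        # Supersense (lexicographer file) as a coarse semantic feature.
        feats[f"lexname={synset.lexname()}"] = 1
        for hyper in synset.hypernyms():
            feats[f"hypernym={hyper.name()}"] = 1
    return feats

print(semantic_features("bank"))
```

Feature dictionaries of this form can be vectorized and concatenated with the embedding- or RNN-based context encodings described above.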
As large language models achieve increasingly impressive results, questions arise about whether such performance stems from genuine generalization or mere data memorization. Thus, numerous data contamination detection methods have been proposed. However, these approaches are often validated with traditional benchmarks and early-stage LLMs, leaving uncertainty about their effectiveness when evaluating state-of-the-art LLMs on the contamination of more challenging benchmarks. To address this gap and provide a dual investigation of SOTA LLM contamination status and detection method robustness, we evaluate five contamination detection approaches with four state-of-the-art LLMs across eight challenging datasets often used in modern LLM evaluation. Our analysis reveals that (1) Current methods have non-trivial limitations in their assumptions and practical applications; (2) Notable difficulties exist in detecting contamination introduced during instruction fine-tuning with answer augmentation; and (3) Consistency across SOTA contamination detection techniques is limited. These findings highlight the complexity of contamination detection in advanced LLMs and the urgent need for further research on robust and generalizable contamination evaluation. Our code is available at https://github.com/vsamuel2003/data-contamination.
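One widely used family of likelihood-based contamination probes scores an example by the mean log-probability of its least-likely tokens (a Min-K%-style heuristic); a minimal sketch follows, with gpt2 as a stand-in model, without implying this is one of the five methods evaluated here.

```python
# Sketch: Min-K%-style contamination probe (higher score suggests the text
# is suspiciously well-memorized by the model under audit).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"  # stand-in for a model under audit
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name).eval()

def min_k_score(text, k=0.2):
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, :-1]            # predict token t+1 from t
    logp = torch.log_softmax(logits, dim=-1)
    token_logp = logp.gather(1, ids[0, 1:, None]).squeeze(1)
    worst = token_logp.topk(max(1, int(k * len(token_logp))), largest=False).values
    return worst.mean().item()

print(min_k_score("The quick brown fox jumps over the lazy dog."))
```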
Benjamin Warner, Antoine Chaffin, Benjamin Clavié
et al.
Encoder-only transformer models such as BERT offer a great performance-size tradeoff for retrieval and classification tasks relative to larger decoder-only models. Despite being the workhorse of numerous production pipelines, there have been limited Pareto improvements to BERT since its release. In this paper, we introduce ModernBERT, bringing modern model optimizations to encoder-only models and representing a major Pareto improvement over older encoders. Trained on 2 trillion tokens with a native 8192 sequence length, ModernBERT models exhibit state-of-the-art results on a large pool of evaluations encompassing diverse classification tasks and both single and multi-vector retrieval on different domains (including code). In addition to strong downstream performance, ModernBERT is also the most speed- and memory-efficient encoder and is designed for inference on common GPUs.
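For reference, loading the model for masked-token inference follows the standard transformers pattern; the sketch below assumes the publicly released answerdotai/ModernBERT-base checkpoint.

```python
# Sketch: masked-language inference with ModernBERT via transformers.
from transformers import pipeline

fill = pipeline("fill-mask", model="answerdotai/ModernBERT-base")
for pred in fill("Paris is the [MASK] of France.")[:3]:
    print(pred["token_str"], round(pred["score"], 3))
```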
Adam N. McCaughan, Bakhrom G. Oripov, Natesh Ganesh
et al.
We present multiplexed gradient descent (MGD), a gradient descent framework designed to easily train analog or digital neural networks in hardware. MGD utilizes zero-order optimization techniques for online training of hardware neural networks. We demonstrate its ability to train neural networks on modern machine learning datasets, including CIFAR-10 and Fashion-MNIST, and compare its performance to backpropagation. Assuming realistic timescales and hardware parameters, our results indicate that these optimization techniques can train a network on emerging hardware platforms orders of magnitude faster than the wall-clock time of training via backpropagation on a standard GPU, even in the presence of imperfect weight updates or device-to-device variations in the hardware. We additionally describe how it can be applied to existing hardware as part of chip-in-the-loop training, or integrated directly at the hardware level. Crucially, the MGD framework is highly flexible, and its gradient descent process can be optimized to compensate for specific hardware limitations such as slow parameter-update speeds or limited input bandwidth.
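The zero-order principle behind MGD can be sketched in a few lines: perturb all parameters simultaneously, measure the resulting cost change, and step against the estimated gradient. The toy quadratic cost and all constants below are illustrative, not the paper's hardware setup.

```python
# Sketch of a simultaneous-perturbation, zero-order update: no backprop,
# only forward cost evaluations (as a hardware-in-the-loop trainer would use).
import numpy as np

def mgd_step(w, cost, eps=1e-3, lr=1e-2, rng=np.random.default_rng()):
    delta = rng.choice([-1.0, 1.0], size=w.shape)  # shared random perturbation
    # Central-difference estimate of the directional derivative along delta;
    # multiplying by delta (entries +/-1) recovers a gradient estimate.
    g_hat = (cost(w + eps * delta) - cost(w - eps * delta)) / (2 * eps) * delta
    return w - lr * g_hat

# Toy usage: minimize a quadratic "cost" with no analytic gradients.
cost = lambda w: float(np.sum((w - 1.0) ** 2))
w = np.zeros(10)
for _ in range(2000):
    w = mgd_step(w, cost)
print(w.round(2))  # converges toward the minimizer at 1.0
```

Because only cost evaluations are needed, the same update can run against a physical device, which is what enables the chip-in-the-loop and on-chip integration modes described above.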
Hendrik Ranocha, Michael Schlottke-Lakemper, Jesse Chan
et al.
Many modern discontinuous Galerkin (DG) methods for conservation laws make use of summation by parts operators and flux differencing to achieve kinetic energy preservation or entropy stability. While these techniques increase the robustness of DG methods significantly, they are also computationally more demanding than standard weak form nodal DG methods. We present several implementation techniques to improve the efficiency of flux differencing DG methods that use tensor product quadrilateral or hexahedral elements, in 2D or 3D, respectively. Focus is mostly given to CPUs and DG methods for the compressible Euler equations, although these techniques are generally also useful for other physical systems including the compressible Navier-Stokes and magnetohydrodynamics equations. We present results using two open source codes, Trixi.jl written in Julia and FLUXO written in Fortran, to demonstrate that our proposed implementation techniques are applicable to different code bases and programming languages.
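For orientation, the volume terms of such schemes typically take the flux-differencing form below (a standard formulation, quoted for context; notation need not match either code base):

```latex
% Flux differencing in a nodal SBP/DG setting:
\begin{equation}
  \frac{\mathrm{d}u_i}{\mathrm{d}t}
  = -\,2 \sum_{j} D_{ij}\, f_{\mathrm{S}}(u_i, u_j)
    \;+\; \text{surface coupling terms},
\end{equation}
% where D is the SBP difference operator and f_S is a symmetric two-point
% flux chosen to be entropy conservative or kinetic energy preserving.
```

The double sum over two-point fluxes is what makes these volume terms more expensive than a standard weak-form evaluation, and it is the main target of the optimizations presented.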
Modern electron linear accelerators are often designed to produce smooth bunch distributions characterized by their macroscopic ensemble-average moments. However, an increasing number of accelerator applications call for finer control over the beam distribution, e.g., by requiring specific shapes for its projection along one coordinate. Ultimately, the control of the beam distribution at the single-particle level could enable new opportunities in accelerator science. This review discusses the recent progress toward controlling electron beam distributions on the "mesoscopic" scale with an emphasis on shaping the beam or introducing complex correlations required for some applications. This review emphasizes experimental and theoretical developments of electron-bunch shaping methods based on bounded external electromagnetic fields or via interactions with the self-generated velocity and radiation fields.
Object detection remains one of the most notorious open problems in computer vision. Despite large strides in accuracy in recent years, modern object detectors have started to saturate on popular benchmarks, raising the question of how far we can reach with deep learning tools and tricks. Here, by employing two state-of-the-art object detection benchmarks and analyzing more than 15 models over four large-scale datasets, we I) carefully determine the upper bound in AP, which is 91.6% on VOC (test2007), 78.2% on COCO (val2017), and 58.9% on OpenImages V4 (validation), regardless of the IOU threshold. These numbers are much higher than the mAP of the best model (47.9% on VOC, and 46.9% on COCO; IOUs=.5:.05:.95); II) characterize the sources of errors in object detectors, in a novel and intuitive way, and find that classification error (confusion with other classes and misses) explains the largest fraction of errors and weighs more than localization and duplicate errors; and III) analyze the invariance properties of models when the surrounding context of an object is removed, when an object is placed in an incongruent background, and when images are blurred or flipped vertically. We find that models generate many boxes on empty regions and that context is more important for detecting small objects than larger ones. Our work taps into the tight relationship between object detection and object recognition and offers insights for building better models. Our code is publicly available at https://github.com/aliborji/Deetctionupperbound.git.
Brian Thomas, Tim Jenness, Frossie Economou
et al.
The Flexible Image Transport System (FITS) standard has been a great boon to astronomy, allowing observatories, scientists, and the public to exchange astronomical information easily. The FITS standard, however, is showing its age. Developed in the late 1970s, the standard embodies a number of implementation choices that, while common at the time, now limit its utility with modern data. The authors of the FITS standard could not have anticipated the challenges we face today in astronomical computing. Difficulties we now face include, but are not limited to: handling an expanded range of specialized data product types (data models); supporting the networked exchange and storage of data; handling very large datasets; and capturing significantly more complex metadata and data relationships. There are members of the community today who find some or all of these limitations unworkable and have decided to move ahead with storing data in other formats. If this fragmentation continues, we risk abandoning the advantages of broad interoperability and ready archivability that the FITS format provides for astronomy. In this paper we detail selected important problems that exist within the FITS standard today. These problems may provide insight into deeper underlying issues that reside in the format, and we provide a discussion of some lessons learned. It is not our intention here to prescribe specific remedies; rather, it is to call the attention of the FITS and greater astronomical computing communities to these problems in the hope that it will spur action to address them.
Salvatore Capozziello, Mariafelicia De Laurentis, Orlando Luongo
Inflation and dark energy are two of the most relevant aspects of modern cosmology. These epochs correspond to the accelerated phases the universe passes through soon after the Big Bang and at the present stage of its evolution. In this review paper, we argue that both eras can, in principle, be described by a geometric picture within the framework of $f(R)$ gravity. We give the fundamental physical motivations and outline the main ingredients of $f(R)$ inflation, quintessence, and cosmography. This is intended as a quick summary of the $f(R)$ paradigm, without any claim of completeness.
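For reference, the $f(R)$ action takes the standard form below (conventions, such as the normalization of $\kappa$, may differ from those adopted in the review):

```latex
\begin{equation}
  S = \frac{1}{2\kappa}\int \mathrm{d}^{4}x \,\sqrt{-g}\, f(R)
      + S_{\mathrm{m}}\!\left[g_{\mu\nu}, \psi_{\mathrm{m}}\right],
\end{equation}
% General relativity is recovered for f(R) = R, and a cosmological
% constant corresponds to f(R) = R - 2\Lambda.
```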
In this first paper, we briefly retrace some historical pathways of 20th-century modern physics. In particular, we consider some key moments of cosmic ray physics and, above all, the early theoretical and experimental groundwork that would lead to the first precise measurements of the anomalous magnetic moment of the muon, one of the main high-precision tests of QED.