The Modified Fréchet-Exponentiated Exponential Distribution: Novel Model for Reliability and Survival Analysis
Merga Abdissa Aga, Shibiru Jabessa Dugasa, Habte Tadese
et al.
This study introduces a novel statistical model called the modified Fréchet-exponentiated exponential (MFrEE) distribution. The existing exponentiated exponential (EE) distribution, while useful for lifetime and reliability data, has limited flexibility in capturing diverse hazard shapes and may not adequately model extreme events or tail behavior. To address these limitations, the MFrEE distribution applies a modified Fréchet generator to the EE baseline, enhancing the model’s flexibility and robustness. Its survival and hazard functions, cumulative distribution function, and probability density function are derived, presented, and illustrated with plots for various parameter values. The study provides a comprehensive mathematical analysis of the distribution, deriving its moments, mean, variance, quantiles, and moment-generating function. Methodologically, the model is simulated using an accept–reject algorithm, and its parameters are estimated via maximum likelihood estimation (MLE). The performance of the estimators is assessed through Monte Carlo simulations using bias, mean squared error, and coverage probability (CP), with the CP results showing values close to the nominal 95% level across different parameter settings. Furthermore, the robustness and performance of the proposed method are evaluated using AIC, BIC, and AICc, demonstrating superior performance compared to baseline methods across three publicly available datasets. The study concludes by proposing this model as a significant contribution to probability theory and suggests two avenues for future research: applying the model to more real-world problems and using machine learning methods for parameter estimation to compare with the MLE approach used in this study.
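The three model-selection criteria named in this abstract have standard closed forms. A minimal sketch in Python follows; the function name and the numbers in the usage line are illustrative assumptions, not values from the study.

```python
import math

def information_criteria(log_likelihood, k, n):
    """Return (AIC, BIC, AICc) for a model with k fitted parameters and n observations.

    log_likelihood is the maximized log-likelihood of the fitted model.
    """
    aic = 2 * k - 2 * log_likelihood
    bic = k * math.log(n) - 2 * log_likelihood
    # AICc adds a small-sample correction to AIC; it requires n > k + 1.
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    return aic, bic, aicc

# Hypothetical fit: log-likelihood -100 with 3 parameters on 50 observations.
aic, bic, aicc = information_criteria(-100.0, 3, 50)
```

For all three criteria, a lower value indicates a better trade-off between goodness of fit and model complexity, which is how competing distributions are usually ranked on the same dataset.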
Probabilities. Mathematical statistics
Probabilities
Jean-Yves Ouvrard, Xavier Ouvrard
Probabilities is the English translation of the book Probabilités, Tome 1 and Tome 2. The mathematical content is authored by Prof. Jean-Yves Ouvrard, and the English version was prepared by his eldest son, Dr. Xavier Ouvrard. In this first version, only the first part is released. Part 1 contains 7 chapters and corresponds to bachelor level. The first part introduces the fundamentals of probability theory, including event algebras, random variables, independence, conditional probabilities, moments of discrete and continuous random variables, generating functions, and limit theorems. Disclaimer: The second part is given as such and still needs full review. Corrected versions will be released during the coming months of 2026; stay tuned!
Investigating the Implications of Goods and Services Tax Revenue on Economic Growth: Empirical Insight from Indian Economy
Shubham Garg, Sangeeta Mittal, Aman Garg
The current study investigates the impact of Goods and Services Tax (GST) revenue on the economic growth of the Indian economy. The study applies Autoregressive Distributed Lag (ARDL) modeling to data from August 2017 to March 2024. The results show that GST revenue has a positive impact on the economic growth of the Indian economy in both the short and the long run. Similarly, foreign direct investment and government expenditure also exert a positive impact on economic growth in India. Conversely, gross fiscal deficit and inflation have an adverse impact on the Indian economy. The findings suggest that policymakers should devise policies to curb inflation and the fiscal deficit to attain long-run economic growth for the Indian economy. Similarly, proper consideration should be given to boosting GST revenue and FDI inflow in the Indian economy. The findings have major implications for policymakers, the GST Council, and the government in boosting the economic growth and GST revenue of the nation.
Political institutions and public administration (General), Probabilities. Mathematical statistics
NIL DERIVATIONS AND d-IDEALS ON POLYNOMIAL RINGS
Ditha Lathifatul Mursyidah, Fitriani Fitriani, Bernadhita Herindri Samodera Utami
et al.
Let $R$ be a ring. An additive mapping $d : R \to R$ is called a derivation if it satisfies Leibniz's rule, i.e., $d(ab) = d(a)b + a\,d(b)$ for every $a, b \in R$. In the special case where, for each $a \in R$, there exists a positive integer $n$ depending on $a$ such that $d^{n}(a) = 0$, $d$ is called a nil derivation on $R$. A $d$-ideal is an ideal that remains stable under the derivation $d$. This research presents a systematic construction of nil derivations on polynomial rings and investigates their corresponding nilpotency indices. Unlike prior studies that often treat derivations in abstract terms, this work emphasizes explicit constructions, offering concrete examples and techniques for generating such derivations. A key focus is the relationship between nil derivations and general nilpotent derivations, including an analysis of their linear combinations. Furthermore, the study provides new insights into the behavior of nil derivations in the context of d-ideals, shedding light on their structural properties within ring theory. To enhance understanding, each theoretical development is supported by illustrative examples, reinforcing the applicability and significance of the results.
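A textbook example, not taken from the paper, may help separate "nil" from "nilpotent": formal differentiation $d$ on the polynomial ring $\mathbb{Q}[x]$.

```latex
d = \frac{d}{dx} : \mathbb{Q}[x] \to \mathbb{Q}[x], \qquad
d(fg) = d(f)\,g + f\,d(g), \qquad
d^{\,\deg f + 1}(f) = 0 \ \text{ for every nonzero } f \in \mathbb{Q}[x].
% The exponent depends on f, and no single exponent works for all f:
d^{\,n}(x^{n}) = n! \neq 0 \ \text{ for every } n \ge 1.
```

Here $d$ is a nil derivation, because each polynomial is annihilated by a sufficiently high power of $d$, yet it is not nilpotent, because no fixed power of $d$ is the zero map. An ideal $I$ of $\mathbb{Q}[x]$ is stable under this $d$ when $d(I) \subseteq I$.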
Probabilities. Mathematical statistics
Valid causal inference with unobserved confounding in high-dimensional settings
Moosavi Niloofar, Gorbach Tetiana, de Luna Xavier
Various methods have recently been proposed to estimate causal effects with confidence intervals that are uniformly valid over a set of data-generating processes when high-dimensional nuisance models are estimated by post-model-selection or machine learning estimators. These methods typically require that all the confounders are observed to ensure identification of the effects. We contribute by showing how valid semiparametric inference can be obtained in the presence of unobserved confounders and high-dimensional nuisance models. We propose uncertainty intervals that allow for unobserved confounding, and show that the resulting inference is valid when the amount of unobserved confounding is not arbitrarily large; the latter is formalized in terms of convergence rates. Simulation experiments illustrate the finite sample properties of the proposed intervals. Finally, a case study on the effect of smoking during pregnancy on birth weight is used to illustrate the use of the methods introduced to perform an informed sensitivity analysis to the presence of unobserved confounding.
Mathematics, Probabilities. Mathematical statistics
Streamlining business functions in official statistical production with Machine Learning
Sandra Barragán, Adrián Pérez-Bote, Carlos Sáez
et al.
We describe pilot and production experiences in streamlining some business functions of the official statistical production process using statistical learning models. Our approach is quality-oriented, seeking improvements in accuracy, cost-efficiency, timeliness, granularity, response burden reduction, and frequency. Pilot experiences have been conducted with data from real surveys at Statistics Spain (INE).
PREDICTION OF CRUDE OIL PRICES IN INDONESIA USING FOURIER SERIES ESTIMATOR AND ARIMA METHOD
Alma Khalisa Rahma, Qumadha Zaenal Abidin, Juan Krisfigo Prasetyo
et al.
Crude oil is a non-renewable natural resource that is crucial for countries around the world in driving economic development. However, the availability of crude oil is decreasing over time. High demand for crude oil results in scarcity, which causes price fluctuations. Low oil prices can reduce state revenues, disrupt development programs, and even trigger budget deficits. On the other hand, an increase in crude oil prices can make a positive contribution to state revenues. Crude oil exports become more profitable, which can increase state revenue through royalties and taxes levied on the oil and gas sector. This additional revenue can be used to support infrastructure development, social programs, and investment in key sectors of the economy. However, high oil prices can also harm the economy. Given the many impacts of crude oil prices, the government must be able to anticipate and prepare for them. The data used in this study are monthly crude oil prices in Indonesia from January 2018 to October 2023, sourced from the official website of the Ministry of Energy and Mineral Resources (ESDM) of the Republic of Indonesia. The study compares two methods: the Fourier series estimator and the ARIMA method. The results show that the best method for predicting crude oil prices in Indonesia is the Fourier series estimator with the Cos-Sin function, which yields an RMSE of 7.93 and a MAPE of 8.4%. The prediction results can be used as a reference for the government to anticipate price changes and to design programs or policies that are more focused and targeted toward their impacts.
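As a rough illustration of what a cos-sin Fourier series estimator and the two reported error measures involve, the sketch below fits a linear trend plus cosine and sine terms by least squares. The exact estimator used in the study is not specified in the abstract, so the trend term, the period of 12 months, the function names, and the synthetic series are all assumptions.

```python
import numpy as np

def fourier_design(t, n_harmonics, period):
    """Design matrix with an intercept, a linear trend, and cos/sin pairs."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t), t]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k / period
        cols.append(np.cos(w * t))
        cols.append(np.sin(w * t))
    return np.column_stack(cols)

def fit_fourier_series(t, y, n_harmonics, period):
    """Least-squares coefficients of the Fourier series regression."""
    X = fourier_design(t, n_harmonics, period)
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coef

def rmse(y, y_hat):
    """Root mean squared error."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mape(y, y_hat):
    """Mean absolute percentage error, in percent (y must be nonzero)."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(100.0 * np.mean(np.abs((y - y_hat) / y)))

# Synthetic monthly series with 70 points (NOT the ESDM price data).
t = np.arange(70)
y = 60.0 + 0.1 * t + 5.0 * np.cos(2 * np.pi * t / 12) + 3.0 * np.sin(2 * np.pi * t / 12)
coef = fit_fourier_series(t, y, n_harmonics=1, period=12)
y_hat = fourier_design(t, 1, 12) @ coef
```

Because the synthetic series is generated from the same basis without noise, the fit is essentially exact; on real prices, the number of harmonics would normally be chosen by a criterion such as out-of-sample RMSE.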
Probabilities. Mathematical statistics
Statistical algorithms for low-frequency diffusion data: A PDE approach
Matteo Giordano, Sven Wang
We consider the problem of making nonparametric inference in a class of multi-dimensional diffusions in divergence form, from low-frequency data. Statistical analysis in this setting is notoriously challenging due to the intractability of the likelihood and its gradient, and computational methods have thus far largely resorted to expensive simulation-based techniques. In this article, we propose a new computational approach which is motivated by PDE theory and is built around the characterisation of the transition densities as solutions of the associated heat (Fokker-Planck) equation. Employing optimal regularity results from the theory of parabolic PDEs, we prove a novel characterisation for the gradient of the likelihood. Using these developments, for the nonlinear inverse problem of recovering the diffusivity, we then show that the numerical evaluation of the likelihood and its gradient can be reduced to standard elliptic eigenvalue problems, solvable by powerful finite element methods. This enables the efficient implementation of a large class of popular statistical algorithms, including (i) preconditioned Crank-Nicolson and Langevin-type methods for posterior sampling, and (ii) gradient-based descent optimisation schemes to compute maximum likelihood and maximum-a-posteriori estimates. We showcase the effectiveness of these methods via extensive simulation studies in a nonparametric Bayesian model with Gaussian process priors, in which both the proposed optimisation and sampling schemes provide good numerical recovery. The reproducible code is available online at https://github.com/MattGiord/LF-Diffusion.
The Impact of COVID-19 on Relative Changes in Aggregated Mobility Using Mobile-phone Data
Georg Heiler, Allan Hanbury, Peter Filzmoser
Mobile-phone data can be used to investigate the mobility of a large part of a population in a given period. Here we analyze this information for Austria in the first half year of the COVID-19 pandemic. The period around the first lockdown is of particular interest, and our focus is on exploring possible differences between age groups and between females and males. The data are treated in two ways: from an absolute point of view, by analyzing the numbers as they are reported, and from a relative point of view, with the help of compositional data analysis tools. Our goal is to compare analyses of the absolute values and of the relative information in order to reveal possible differences between the groups formed by age and gender. It turns out that the two types of analyses provide different and partially complementary insights. This is also underlined when analyzing call-duration data or subsets of the data for specific Austrian districts.
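To make the absolute-versus-relative distinction concrete, the sketch below applies the centered log-ratio (clr) transform, a common tool in compositional data analysis. Whether the study uses clr specifically is an assumption, and the counts are made up.

```python
import math

def closure(parts):
    """Rescale strictly positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(parts):
    """Centered log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical trip counts for three age groups, before and during a lockdown.
before = [100, 200, 700]
during = [50, 100, 350]   # every group halves, so the shares are unchanged

clr_before = clr(closure(before))
clr_during = clr(closure(during))
```

In absolute terms mobility drops by half in every group, while the clr coordinates are identical, so the relative view reports no change between groups. This is the kind of complementary insight the two analyses can give.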
Probabilities. Mathematical statistics, Statistics
POISSON REGRESSION MODELING GENERALIZED IN MATERNAL MORTALITY CASES IN ACEH TAMIANG REGENCY
Riska Novita Sari, Ulya Nabilla, Riezky Purnama Sari
The Maternal Mortality Rate (MMR) is the number of maternal deaths due to pregnancy, childbirth, and the postpartum period, and it is used as an indicator of women's health status. The number of maternal deaths in Aceh Tamiang Regency in 2021 is a discrete random variable that follows a Poisson distribution. The purpose of this study is to determine the generalized Poisson regression model for MMR in Aceh Tamiang Regency in 2021 and to identify the factors that affect it. The research data were obtained from the Aceh Tamiang District Health Office. The research is quantitative and uses the Generalized Poisson Regression method. The data are maternal mortality rates and factors affecting MMR in Aceh Tamiang Regency in 2021. The candidate factors are the percentage of K1 visits by pregnant women (X1), the percentage of K4 visits by pregnant women (X2), the percentage of deliveries assisted by health workers (X3), TT immunization of pregnant women (X4), pregnant women who receive Fe tablets (X5), and puerperal care services (X6). Based on the results, the factor that affects the maternal mortality rate in Aceh Tamiang Regency in 2021 is TT immunization of pregnant women (X4), with a p-value of 0.009: each 1% increase in TT immunization of pregnant women is associated with a decrease in the average maternal mortality rate. The study also reports the fitted form of the generalized Poisson regression model.
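For readers unfamiliar with generalized Poisson regression, the sketch below shows one common parameterization (the Famoye form, with mean mu and dispersion alpha) together with a log link. The abstract does not state which parameterization the study uses, so this choice, the function names, and the coefficients in the test values are assumptions.

```python
import math

def generalized_poisson_pmf(y, mu, alpha):
    """P(Y = y) for a generalized Poisson variable with mean mu.

    alpha = 0 gives the ordinary Poisson distribution; alpha > 0 allows
    overdispersion, with variance mu * (1 + alpha * mu) ** 2.
    """
    log_p = (
        y * math.log(mu / (1.0 + alpha * mu))
        + (y - 1) * math.log(1.0 + alpha * y)
        - mu * (1.0 + alpha * y) / (1.0 + alpha * mu)
        - math.lgamma(y + 1)
    )
    return math.exp(log_p)

def mean_from_covariates(beta, x):
    """Log link: mu = exp(beta_0 + beta_1 * x_1 + ... + beta_p * x_p)."""
    eta = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return math.exp(eta)

# Probabilities for counts 0..199 with a hypothetical mean and dispersion.
probs = [generalized_poisson_pmf(y, 2.0, 0.1) for y in range(200)]
```

Under the log link, a negative coefficient on a covariate means the expected count decreases as that covariate increases, which is how a protective effect such as immunization coverage would appear in the fitted model.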
Probabilities. Mathematical statistics
On Some Non-local Boundary Value and Internal Boundary Value Problems for the String Oscillation Equation
A.Kh. Attaev
This work is devoted to formulating new boundary value and internal boundary value problems for hyperbolic equations. These formulations are examined using the wave equation as an example. The analysis relies on d'Alembert's method, the mean value theorem, and the method of successive approximations. The paper formulates and studies a number of non-local problems that generalize the classical Goursat and Darboux problems. Some of them are boundary value problems and the others are internal boundary value problems; in both cases, both characteristic and non-characteristic displacements are considered. A number of the problems discussed arose as special cases in the construction of the theory of well-posed problems for the model loaded string oscillation equation.
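As background for the d'Alembert method mentioned here, the following is the standard reduction of the normalized string equation to characteristic coordinates, followed by the explicit solution of the classical Goursat problem. The notation ($\varphi$, $\psi$, $F$, $G$) is chosen for this sketch and is not taken from the paper.

```latex
u_{tt} - u_{xx} = 0, \qquad \xi = x + t,\ \ \eta = x - t
\ \Longrightarrow\ u_{tt} - u_{xx} = -4\,u_{\xi\eta} = 0
\ \Longrightarrow\ u = F(\xi) + G(\eta).
% Goursat problem: data prescribed on the two characteristics through the origin
u(\xi, 0) = \varphi(\xi), \quad u(0, \eta) = \psi(\eta), \quad \varphi(0) = \psi(0)
\ \Longrightarrow\ u(\xi, \eta) = \varphi(\xi) + \psi(\eta) - \varphi(0).
```

Substituting $\eta = 0$ or $\xi = 0$ recovers the prescribed data, and the mixed derivative $u_{\xi\eta}$ vanishes because each term depends on only one characteristic variable.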
Analysis, Analytic mechanics
Reimagining Doctoral Training in Statistics: Is There a Role for a Professional Doctorate?
Camden L. Lopez
Modern demands of the statistics profession call for reimagining statistics training. The discipline needs to attract and develop students who are effective as real-world problem solvers, interdisciplinary collaborators, communicators, leaders, and teachers. Demand for statistics professionals with broad technical and non-technical skills has grown in a variety of settings, but especially in business and industry. Academic curricula, though, remain primarily oriented around a narrow, technical conception of statistics. Advanced graduate-level training essentially is limited to research doctorate (PhD) programs which tend to prioritize theoretical and methodological research over development of effective applied statisticians. Other professions, such as those of physicians and surgeons, have training oriented around a professional doctorate, as opposed to a research doctorate. The statistics profession should consider not only changes to PhD curricula, but also the potential for a professional doctorate, drawing ideas from the curricula of other professional degrees such as the MD.
On the Statistical Complexity of Sample Amplification
Brian Axelrod, Shivam Garg, Yanjun Han
et al.
The ``sample amplification'' problem formalizes the following question: Given $n$ i.i.d. samples drawn from an unknown distribution $P$, when is it possible to produce a larger set of $n+m$ samples which cannot be distinguished from $n+m$ i.i.d. samples drawn from $P$? In this work, we provide a firm statistical foundation for this problem by deriving generally applicable amplification procedures, lower bound techniques and connections to existing statistical notions. Our techniques apply to a large class of distributions including the exponential family, and establish a rigorous connection between sample amplification and distribution learning.
SOFT CLUSTERING WITH THE FUZZY K-MEANS ALGORITHM (CASE STUDY: GROUPING VILLAGES IN TIDORE KEPULAUAN CITY)
Muhamad Budiman Johra
Developing regions to reduce disparities and ensure equity is one of the seven development agendas of the RPJMN IV for 2020-2024. Each region naturally has different potential, both physical and non-physical. These differences form the basis for grouping villages so that village development becomes better targeted. In general, clustering methods can be divided into two groups: hard clustering and soft clustering. In hard clustering, each object is mapped to a single group. A popular hard clustering method is K-Means clustering. In soft clustering, by contrast, an object is not mapped to only one group. Fuzzy K-Means (FCM) is one soft clustering method and is an extension of K-Means clustering. FCM works by assigning each object probabilities that essentially describe the object's membership in each cluster.
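The membership idea can be sketched with the standard fuzzy c-means update rules, shown below in plain Python. That the study uses exactly these updates, the fuzzifier m = 2, and the toy points are assumptions; the function names are hypothetical.

```python
import math

def memberships(points, centers, m=2.0):
    """Return u[j][i], the degree to which point j belongs to cluster i.

    Each row sums to 1; closer centers receive larger memberships.
    """
    power = 2.0 / (m - 1.0)
    u = []
    for p in points:
        d = [math.dist(p, c) for c in centers]
        if 0.0 in d:
            # The point coincides with a center: full membership there.
            hits = [1.0 if di == 0.0 else 0.0 for di in d]
            u.append([h / sum(hits) for h in hits])
        else:
            inv = [di ** (-power) for di in d]
            u.append([v / sum(inv) for v in inv])
    return u

def update_centers(points, u, m=2.0):
    """Each center is the mean of all points weighted by membership ** m."""
    dim = len(points[0])
    centers = []
    for i in range(len(u[0])):
        w = [row[i] ** m for row in u]
        total = sum(w)
        centers.append([sum(wj * p[k] for wj, p in zip(w, points)) / total
                        for k in range(dim)])
    return centers

def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """Alternate membership and center updates for a fixed number of steps."""
    for _ in range(n_iter):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
    return centers, memberships(points, centers, m)

# Toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, u = fuzzy_c_means(points, [(0, 0), (10, 10)])
```

Unlike K-Means, every point keeps a nonzero membership in every cluster, so borderline objects can be identified by memberships that are close to each other.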
Probabilities. Mathematical statistics
Trends in Teaching Advanced Placement Statistics: Results from a National Survey
Hollylynne S. Lee, Taylor Harrison
This study provides a glimpse into the professional learning, beliefs, and practices of high school teachers of Advanced Placement (AP) Statistics. Data are from a survey of 445 AP Statistics teachers in late 2018. Results indicate many AP Statistics teachers have taken several statistics courses and engage in professional development related to statistics sponsored by the College Board (summer institutes, exam readings, and online community). They generally do not engage with resources developed by the American Statistical Association and the statistics education community. While AP Statistics teachers structure class time with student–student interaction and use student-centered activities, they generally do not use statistics-specific technology tools and rarely engage students with datasets larger than 100 cases or with multiple variables. Teachers’ beliefs about teaching statistics do not always reflect their teaching practices. Personal time to improve, time with students (especially those on a blocked semester schedule), structure of curriculum and exam schedule, and lack of access to technology often prevent teachers from making changes to their practices. Findings call for targeted efforts to reach high school statistics teachers, engage them more in the statistics education community, and encourage curriculum and instructional approaches that more closely align with recommendations and trends in college-level introductory statistics.
Probabilities. Mathematical statistics, Special aspects of education
New Numerical Solution for Two Parametric Surfaces Intersection Dragging Problem
Ramadhan A. M. Alsaidi
The problem of intersecting two parametric surfaces has been one of the main technical challenges in computer-aided design, computer graphics, solid modeling, and geometric modeling. This paper aims to reduce the time and space required to compute parametric surface intersections. To do this, a new numerically accelerated method based on a continuation technique is used: it first calculates a starting point and then traces successive points along the intersection curve using Broyden’s method. Two factors have been identified as influential in controlling component jumping: the initial points and the step size. Test examples of intersecting two parametric surfaces demonstrate that the method is efficient and fast. The intersection is returned as a sequence of points on the curve.
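The start-then-trace idea can be sketched as follows. This is a simplified illustration and not the paper's algorithm: one parameter of the first surface is stepped by hand, and Broyden's method solves for the remaining three parameters at each step. The two surfaces (a paraboloid and a plane), the step values, and all function names are assumptions.

```python
import numpy as np

def fd_jacobian(F, x, h=1e-6):
    """Forward-difference Jacobian, used only to initialize Broyden's method."""
    f0 = F(x)
    J = np.empty((f0.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = h
        J[:, j] = (F(x + step) - f0) / h
    return J

def broyden(F, x0, tol=1e-10, max_iter=50):
    """Solve F(x) = 0 with Broyden's rank-one Jacobian updates."""
    x = np.asarray(x0, dtype=float)
    J = fd_jacobian(F, x)
    f = F(x)
    for _ in range(max_iter):
        if np.linalg.norm(f) < tol:
            break
        dx = np.linalg.solve(J, -f)
        x = x + dx
        f_new = F(x)
        # Secant update: J += ((df - J dx) dx^T) / (dx^T dx)
        J += np.outer(f_new - f - J @ dx, dx) / (dx @ dx)
        f = f_new
    return x

def make_residual(u0):
    """S1(u0, v) - S2(s, t) with S1 = (u, v, u^2 + v^2) and S2 = (s, t, 1)."""
    def residual(x):
        v, s, t = x
        return np.array([u0, v, u0 ** 2 + v ** 2]) - np.array([s, t, 1.0])
    return residual

# Trace three points on the intersection (the unit circle at height 1),
# reusing each solution as the starting guess for the next step.
x = np.array([0.7, 0.5, 0.7])
trace = []
for u0 in (0.6, 0.5, 0.4):
    x = broyden(make_residual(u0), x)
    trace.append((u0, x.copy()))
```

Reusing the previous solution as the next initial guess is what keeps the trace on the same branch of the curve; a step that is too large is one way a tracer can jump to a different component.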
Probabilities. Mathematical statistics, Analysis
A Practical Two-Sample Test for Weighted Random Graphs
Mingao Yuan, Qian Wen
Network (graph) data analysis is a popular research topic in statistics and machine learning. In applications, one is frequently confronted with graph two-sample hypothesis testing, where the goal is to test the difference between two graph populations. Several statistical tests have been devised for this purpose in the context of binary graphs. However, many practical networks are weighted, and existing procedures cannot be directly applied to weighted graphs. In this paper, we study the weighted graph two-sample hypothesis testing problem and propose a practical test statistic. We prove that the proposed test statistic converges in distribution to the standard normal distribution under the null hypothesis and analyze its power theoretically. The simulation study shows that the proposed test has satisfactory performance and that it substantially outperforms the existing counterpart in the binary graph case. A real data application is provided to illustrate the method.
A class of admissible estimators of multiple regression coefficient with an unknown variance
Chengyuan Song, Dongchu Sun
Suppose that we observe ${\boldsymbol y} \mid {\boldsymbol \theta },\ \tau \sim N_p({\boldsymbol X}{\boldsymbol \theta },\tau ^{-1}{\boldsymbol I}_p)$, where ${\boldsymbol \theta }$ is an unknown vector with unknown precision τ. Estimating the regression coefficient ${\boldsymbol \theta }$ with known τ has been well studied. However, statistical properties such as admissibility in estimating ${\boldsymbol \theta }$ with unknown τ are not well studied. Han [(2009). Topics in shrinkage estimation and in causal inference (PhD thesis). Wharton School, University of Pennsylvania] appears to be the first to consider the problem, developing sufficient conditions for the admissibility of estimating means of multivariate normal distributions with unknown variance. We generalise the sufficient conditions for admissibility and apply these results to the normal linear regression model. 2-level and 3-level hierarchical models with unknown precision τ are investigated when a standard class of hierarchical priors leads to admissible estimators of ${\boldsymbol \theta }$ under the normalised squared error loss. One reason to consider this problem is the importance of admissibility in the hierarchical prior selection, and we expect that our study could be helpful in providing some reference for choosing hierarchical priors.
Probabilities. Mathematical statistics
Mathematical model of a root harvester after-cleaning system
R.B. Hevko, I.G. Tkachenko, Y.B. Hlado
et al.
A mathematical model is presented for the after-cleaning of root crops and their movement along the technological path for loading into transport vehicles. Based on this model, the impact of both controlled and uncontrolled factors on the efficient continuous transportation of root crops has been determined. The suggested mathematical model makes it possible to narrow the search for the most efficient design, kinematic, and dynamic parameters in experimental research on the efficient operation of harvester after-cleaning systems, with possible adjustment to the natural and climatic conditions of harvesting.
Analysis, Analytic mechanics
Introducing Data Science Techniques by Connecting Database Concepts and dplyr
Jennifer E. Broatch, Suzanne Dietrich, Don Goelman
Early exposure to data science skills, such as relational databases, is essential for students in statistics as well as many other disciplines in an increasingly data driven society. The goal of the presented pedagogy is to introduce undergraduate students to fundamental database concepts and to illuminate the connection between these database concepts and the functionality provided by the dplyr package for R. Specifically, students are introduced to relational database concepts using visualizations that are specifically designed for students with no data science or computing background. These educational tools, which are freely available on the Web, engage students in the learning process through a dynamic presentation that gently introduces relational databases and how to ask questions of data stored in a relational database. The visualizations are specifically designed for self-study by students, including a formative self-assessment feature. Students are then assigned a corresponding statistics lesson to utilize statistical software in R within the dplyr framework and to emphasize the need for these database skills. This article describes a pilot experience of introducing this pedagogy into a calculus-based introductory statistics course for mathematics and statistics majors, and provides a brief evaluation of the student perspective of the experience. Supplementary materials for this article are available online.
Special aspects of education, Probabilities. Mathematical statistics