B. Bolker
Results for "Probabilities. Mathematical statistics"
Showing 20 of ~1724753 results · from DOAJ, arXiv, CrossRef, Semantic Scholar
S. Fotopoulos
Shiyi Chen, Zhe Feng, Xiaolian Yi
L. Mandel, E. Wolf
S. Johansen
I. Csiszár
K. Koch
K. Pearson
Gaurav Kumar, R. Banerjee, Deepak Kr Singh et al.
Machine learning is the study of the algorithms and statistical models that computers use to perform a specific task through patterns and deduction [1]. It builds a mathematical model from sample data and may be either supervised or unsupervised. It is closely related to computational statistics, which lies at the interface between statistics and computer science. Linear algebra and probability theory are the two mathematical tools that form the basis of machine learning. In general, statistics is the science of collecting, analysing, and interpreting data. Data are facts and figures that can be classified as either quantitative or qualitative. From a given set of data, we can predict the expected observation, the difference between the outcomes of two observations, and what the data look like, all of which support better decision making [2]. Descriptive and inferential statistics are the two methods of data analysis. Descriptive statistics summarize raw data into information from which the typical value and the variation of the data can be read. They also provide graphical methods for visualizing a data sample and gaining a qualitative understanding of the observations, whereas inferential statistics refers to drawing conclusions from data. Inferences are made within the framework of probability theory. Understanding the data and interpreting the results are therefore two important aspects of machine learning. In this paper, we review the different methods of ML, the mathematics behind ML, its applications in day-to-day life, and its future aspects.
M. Bourguignon, Rodrigo B. Silva, G. Cordeiro
The Weibull distribution is the most important distribution for problems in reliability. We study some mathematical properties of the new, wider Weibull-G family of distributions. Some special models in the new family are discussed. The properties derived hold for any distribution in this family. We obtain general explicit expressions for the quantile function, ordinary and incomplete moments, generating function and order statistics. We discuss the estimation of the model parameters by maximum likelihood and illustrate the potential of the extended family with two applications to real data.
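For orientation, the construction behind this family can be written compactly. The abstract itself does not state the formulas, so what follows is a sketch of the form in which the Weibull-G family is usually given, with notation chosen here as an assumption: a baseline cdf G with parameter vector ξ, and positive parameters α and β. The quantile function follows by inverting the cdf.

```latex
% Weibull-G cdf built from a baseline cdf G(x;\xi); \alpha, \beta > 0 (notation assumed)
F(x;\alpha,\beta,\xi) = 1 - \exp\left\{ -\alpha \left[ \frac{G(x;\xi)}{1 - G(x;\xi)} \right]^{\beta} \right\}

% Quantile function, obtained by solving F(x) = u for x, with 0 < u < 1
Q(u) = G^{-1}\!\left( \frac{t}{1+t} \right),
\qquad t = \left[ -\frac{\log(1-u)}{\alpha} \right]^{1/\beta}
```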
B. Finetti, A. Machi, Adrian J. Smith
Donald L. Burkholder, R. Graham, W. Johnson et al.
G. Winkler
Natalie Gerhart, Elham Rastegari, Emma Cole
Analytical skills are important for all business majors. Because of the statistical underpinnings, statisticians are a natural fit for the delivery of these skills. Unfortunately, there is a lack of common understanding of what skills and competencies are needed. This article has two primary objectives: determine how business analytics course offerings delineate course content (i.e., where statistical and analytical topics are currently offered in business disciplines); and determine the industry analytical needs of various business domains. We performed two phases of analysis. First, we collected course listings and descriptions from the top thirty undergraduate and graduate business programs in the United States to determine what analytics courses are currently offered. Second, we collected industry data to determine what skills are needed across business disciplines based on job postings that identify analytics. Our results indicate there is some overlap of content between current courses. We also find that there are varying analytical skills and competencies needed in other business domains based on job postings. These results can help guide the structure of analytics offerings in business schools.
Dianliang Deng, Xiaoqing Zhang
This paper introduces the generalized Lindley binomial (GLB) distribution, a novel model for analyzing proportional data with excessive endpoint observations. The GLB distribution is derived by compounding the binomial distribution with a generalized three-parameter Lindley distribution, itself defined as a mixture of two gamma distributions with distinct rate parameters. We establish the probabilistic properties of the GLB distribution, including its probability mass function, factorial moments, mean, variance, moment generating function, and dispersion index, demonstrating its flexibility in modeling both under- and over-dispersed data as well as unimodal and bimodal shapes. Likelihood-based inference is developed for the GLB model, with and without covariates, using Fisher scoring and expectation-maximization (EM) algorithms. To improve estimation stability, a penalized EM algorithm incorporating Bayes-inspired penalties is proposed. Model diagnostics are addressed through Pearson and deviance residuals, as well as randomized quantile residual plots. Simulation studies are conducted to evaluate the performance of the estimation procedures under different scenarios. Finally, the practical utility of the GLB regression model is illustrated with the whitefly dataset, where it is shown to provide superior fit compared to existing endpoint-inflated binomial models.
W. Stewart
Qoria Yudi Pratama, Mohammad Isa Irawan
Monkeypox is a zoonotic disease that can be transmitted from animals to humans. It is caused by the monkeypox virus, which belongs to the Orthopoxvirus genus. Although the mortality rate of monkeypox is not as high as that of COVID-19, the virus could cause the next global pandemic if the epidemic worsens. It is therefore very important to carry out proper surveillance and prevention to stop the spread of this disease. In this study, the researchers developed another method to detect monkeypox based on its symptoms, using the classification by association (CBA) method. CBA integrates the advantages of classification and association analysis, supporting both classification and a deeper understanding of the strength of the relationships between features in the dataset through metrics such as support and confidence. The experiments in this study yielded an accuracy of 68.64%, a precision of 92.21%, and a sensitivity of 71.09%. The accuracy is still low, but the other metrics show that the CBA model performs fairly well in predicting the positive class with high precision.
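The support and confidence metrics mentioned in the abstract can be illustrated with a small sketch. This is not the study's code or data: the symptom names, the five records, and the function name `rule_support_confidence` are all hypothetical, chosen only to show how a class association rule of the form "symptoms -> label" is scored.

```python
from typing import FrozenSet, List, Tuple

# Each record pairs a set of observed symptoms with a class label.
Record = Tuple[FrozenSet[str], str]


def rule_support_confidence(
    records: List[Record], antecedent: FrozenSet[str], label: str
) -> Tuple[float, float]:
    """Score the rule `antecedent -> label` on a list of labelled records."""
    # Records whose symptom set contains every item of the antecedent.
    covered = [r for r in records if antecedent <= r[0]]
    # Covered records that also carry the predicted label.
    matched = [r for r in covered if r[1] == label]
    support = len(matched) / len(records)
    confidence = len(matched) / len(covered) if covered else 0.0
    return support, confidence


# Hypothetical toy data, invented for illustration only.
records: List[Record] = [
    (frozenset({"fever", "rash"}), "positive"),
    (frozenset({"fever", "rash", "swollen_lymph_nodes"}), "positive"),
    (frozenset({"fever"}), "negative"),
    (frozenset({"rash"}), "positive"),
    (frozenset({"fever", "rash"}), "negative"),
]

support, confidence = rule_support_confidence(
    records, frozenset({"fever", "rash"}), "positive"
)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data the rule covers three of the five records and is right for two of them, so its support is 2/5 and its confidence is 2/3. A CBA-style classifier would keep only rules above chosen support and confidence thresholds and rank them to build the final classifier.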
Shafa Fitria Aqilah Khansa, Nurissaidah Ulinnuha, Wika Dianita Utami
Parkinson's disease is a neurodegenerative disorder affecting motor abilities, with a prevalence of 329 cases per 100,000 individuals. Early diagnosis is crucial to prevent complications. This study classifies Parkinson's disease using the Extreme Gradient Boosting (XGBoost) algorithm with hyperparameter tuning via Grid Search and Random Search. The dataset from Kaggle consists of 2105 records from 2024 and includes 32 clinical and demographic features such as age, gender, BMI, medical history, and Parkinson's symptoms. The XGBoost method effectively manages large and complex data and reduces. Tuning was performed with 5-fold cross-validation for result validity. After tuning with Grid Search, the model achieved 93.35% accuracy in 44 minutes 51 seconds, with optimal parameters gamma=5, max depth=3, learning rate=0.3, n estimators=100, and subsample=0.7. Meanwhile, Random Search with 50 iterations achieved 93.97% accuracy in 3 minutes 4 seconds with optimal parameters gamma=5, max depth=3, learning rate=0.262, n estimators=58, and subsample=0.631. Random Search also shows better time efficiency than Grid Search, although with relatively similar accuracy. The results of this study confirm that hyperparameter tuning using Random Search not only produces competitive accuracy performance but also minimizes computation time, making it a more optimal choice for Parkinson's disease classification.
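The difference between the two tuning strategies can be sketched without any machine learning library. The code below is an illustration under stated assumptions, not a reproduction of the study: `toy_score` is an invented stand-in for a cross-validated accuracy, and the grid values and sampling ranges are hypothetical. Grid search scores every combination on a fixed grid, whereas random search scores a fixed number of randomly drawn configurations.

```python
import itertools
import random
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]


def toy_score(params: Params) -> float:
    """Invented objective standing in for a cross-validated accuracy."""
    return (
        1.0
        - (params["learning_rate"] - 0.25) ** 2
        - (params["subsample"] - 0.65) ** 2
        - 0.01 * abs(params["max_depth"] - 3)
    )


def grid_search(
    grid: Dict[str, List[float]], score: Callable[[Params], float]
) -> Tuple[Params, int]:
    """Score every combination in the grid; return the best one and the count."""
    names = list(grid)
    candidates = [
        dict(zip(names, values))
        for values in itertools.product(*(grid[n] for n in names))
    ]
    return max(candidates, key=score), len(candidates)


def random_search(
    samplers: Dict[str, Callable[[random.Random], float]],
    score: Callable[[Params], float],
    n_iter: int,
    seed: int = 0,
) -> Tuple[Params, int]:
    """Score n_iter randomly drawn configurations; return the best and the count."""
    rng = random.Random(seed)
    candidates = [
        {name: draw(rng) for name, draw in samplers.items()} for _ in range(n_iter)
    ]
    return max(candidates, key=score), len(candidates)


# Hypothetical search spaces, not the ones used in the study.
grid = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
    "subsample": [0.5, 0.7, 1.0],
}
samplers = {
    "max_depth": lambda rng: rng.randint(3, 7),
    "learning_rate": lambda rng: rng.uniform(0.01, 0.3),
    "subsample": lambda rng: rng.uniform(0.5, 1.0),
}

grid_best, grid_evals = grid_search(grid, toy_score)
rand_best, rand_evals = random_search(samplers, toy_score, n_iter=50)
print(grid_evals, grid_best)
print(rand_evals, rand_best)
```

The grid's cost is the product of the per-parameter list lengths (27 here) and grows multiplicatively with each added parameter or value, while random search costs exactly `n_iter` evaluations and can land on values between grid points. In a real setting each evaluation would be a full cross-validated model fit, which is where a runtime gap of the kind reported in the abstract comes from.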
Paul N Zivich
Within the biological, physical, and social sciences, there are two broad quantitative traditions: statistical and mathematical modeling. Both traditions have the common pursuit of advancing our scientific knowledge, but these traditions have developed largely independently using distinct languages and inferential frameworks. This paper uses the notion of identification from causal inference, a field originating from the statistical modeling tradition, to develop a shared language. I first review foundational identification results for statistical models and then extend these ideas to mathematical models. Central to this framework is the use of bounds, ranges of plausible numerical values, to analyze both statistical and mathematical models. I discuss the implications of this perspective for the interpretation, comparison, and integration of different modeling approaches, and illustrate the framework with a simple pharmacodynamic model for hypertension. To conclude, I describe areas where the approach taken here should be extended in the future. By formalizing connections between statistical and mathematical modeling, this work contributes to a shared framework for quantitative science. My hope is that this work will advance interactions between these two traditions.
Sukma Anindita, Rahmadi Yotenka
Current market demand, under ever-changing situations and conditions, must be managed properly in order to anticipate future market demand. Rumah Warna Yogyakarta is a manufacturing company that has experienced fluctuating market demand, with a tendency to decline from the end of 2021 to mid-2022. Data covering November 2021 to November 2022 were obtained from the company database and direct interviews with Rumah Warna in Yogyakarta. This study aims to predict product demand for Rumah Warna Yogyakarta in the next period so that the company can plan production in a way that minimizes production costs. Product demand is predicted using the GM(1,1) model of the Grey System method, followed by heuristic aggregate planning focused on overtime control and subcontracting control. Based on the analysis, the Grey System GM(1,1) method yields good prediction accuracy of 9.231%. The best aggregate planning method is overtime control, with which Rumah Warna Yogyakarta can reduce costs by Rp 351,258,758 compared with the subcontracting control method.
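A minimal sketch of a textbook GM(1,1) forecast follows, to make the method named in the abstract concrete. It is not the study's implementation: the demand series is made up, the function names are hypothetical, and the use of the mean absolute percentage error (MAPE) as the error measure is an assumption, since the abstract does not say which measure the 9.231% figure refers to.

```python
import math
from typing import List, Tuple


def fit_gm11(x0: List[float]) -> Tuple[float, float]:
    """Estimate the development coefficient a and grey input b of GM(1,1)."""
    # Accumulated generating operation (1-AGO): running sum of the raw series.
    x1: List[float] = []
    total = 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = x0[1:]
    # Ordinary least squares for y = slope * z + b, where slope = -a.
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(u * v for u, v in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    b = (sy - slope * sz) / m
    return -slope, b


def predict_gm11(x0: List[float], a: float, b: float, horizon: int = 0) -> List[float]:
    """Return fitted values for x0 followed by `horizon` forecasts (assumes a != 0)."""
    n = len(x0) + horizon
    c = b / a
    # Time response of the accumulated series, then differencing back.
    x1_hat = [(x0[0] - c) * math.exp(-a * k) + c for k in range(n)]
    return [x0[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, n)]


def mape(actual: List[float], predicted: List[float]) -> float:
    """Mean absolute percentage error over the length of `actual`."""
    return 100.0 * sum(abs(p - t) / abs(t) for t, p in zip(actual, predicted)) / len(actual)


# Made-up demand series for illustration; not the Rumah Warna data.
demand = [100.0, 110.0, 121.0, 133.1, 146.41]
a, b = fit_gm11(demand)
fitted = predict_gm11(demand, a, b, horizon=1)
error = mape(demand, fitted)
print(f"a={a:.4f}, b={b:.3f}, next-period forecast={fitted[-1]:.2f}, MAPE={error:.3f}%")
```

GM(1,1) needs only a short, positive series, which is why it is often chosen for small samples such as a year of monthly demand. The forecast from a sketch like this would then feed the aggregate planning step as the demand to be met.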