This methodological article presents the type I Pareto distribution in a clear, illustrative manner for social researchers and provides R scripts for its practical application. This continuous distribution, with its inverted-J shape, right skew, and heavy right tail, is an effective probability model for social variables such as wealth and income, and for behaviors that are highly frequent in a few individuals and infrequent in the majority. After a brief historical overview, the type I distribution, which has a scale parameter x_m and a shape parameter α, is introduced, and its density, cumulative distribution, tail, moment, and characteristic functions are presented. The article then covers descriptive measures, estimators based on the method of moments and maximum likelihood, relationships with other distributions, and goodness-of-fit tests. This material is applied in two examples with simulated data: one on calculating probabilities and descriptive measures, the other on parameter estimation and fit testing with the Kolmogorov-Smirnov and Anderson-Darling tests. The corresponding calculations are performed with scripts written in R, a freely available software environment. Finally, suggestions for the use of the distribution are provided.
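For orientation, the main functions named above can be written out as follows. This is a sketch in standard textbook notation, assumed rather than confirmed to match the article's own: the scale parameter is written x_m, the shape parameter α, and the sample size n and observations x_i are symbols introduced here for illustration.

```latex
% Type I Pareto distribution with scale x_m > 0 and shape \alpha > 0
% (standard textbook forms; the notation is an assumption, not taken from the article)
\begin{align*}
  f(x) &= \frac{\alpha\, x_m^{\alpha}}{x^{\alpha+1}}, \quad x \ge x_m
        && \text{(density)} \\
  F(x) &= 1 - \left(\frac{x_m}{x}\right)^{\alpha}, \quad x \ge x_m
        && \text{(cumulative distribution)} \\
  S(x) &= 1 - F(x) = \left(\frac{x_m}{x}\right)^{\alpha}, \quad x \ge x_m
        && \text{(tail function)} \\
  \mathrm{E}[X] &= \frac{\alpha\, x_m}{\alpha - 1}, \quad \alpha > 1
        && \text{(mean)} \\
  \mathrm{Var}[X] &= \frac{\alpha\, x_m^{2}}{(\alpha - 1)^{2}(\alpha - 2)}, \quad \alpha > 2
        && \text{(variance)} \\
  \hat{x}_m &= \min_{1 \le i \le n} x_i, \qquad
  \hat{\alpha} = \frac{n}{\sum_{i=1}^{n} \ln\!\left(x_i / \hat{x}_m\right)}
        && \text{(maximum likelihood estimators)}
\end{align*}
```

The mean exists only for α > 1 and the variance only for α > 2, so descriptive measures for heavy-tailed variables such as wealth may be undefined when the estimated shape parameter is small.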