A Compound of Generalized Negative Binomial and Shanker Distribution

Bakang P. Tlhaloganyang; David R. Mooketsi; Leano Leinanyane; Remi Sakia

Department of Statistics, Faculty of Social-Science, University of Botswana, Gaborone, Botswana

*Corresponding Author: Bakang P. Tlhaloganyang
Department of Statistics, Faculty of Social-Science, University of Botswana, Gaborone, Botswana.
E-mail: tlhaloganyangs@gmail.com

Received date: May 10, 2018; Accepted date: July 02, 2018; Published date: July 06, 2018

Visit for more related articles at Research & Reviews: Journal of Statistics and Mathematical Sciences

Abstract

The objective of this paper is to provide an alternative to the existing discrete distributions used to fit count data. We propose a compound of the Generalized Negative Binomial and Shanker distributions, namely the Generalized Negative Binomial-Shanker (GNB-SH) distribution. The GNB-SH distribution can be used to fit count data while retaining characteristics similar to those of the traditional negative binomial. This new formulation generalizes two new mixture distributions, namely the Negative Binomial-Shanker (NB-SH) distribution and the Binomial-Shanker (BI-SH) distribution. Some mathematical properties of this distribution and of its special cases are studied. Parameter estimation for the GNB-SH and NB-SH distributions is also carried out using maximum likelihood.

Keywords

Generalized Negative Binomial-Shanker (GNB-SH), Binomial-Shanker (BI-SH)

Introduction

Existing discrete distributions often fail to fit count data well for various reasons, such as variation within the data, the shape of the distribution and the assumptions underlying these distributions [1]. As a result, the poor fit of existing discrete models in the analysis of count data is a major concern in fields such as medicine, transport, engineering and agriculture. Researchers are therefore striving to develop new discrete distributions that provide a better fit to observed count data than existing models. For that reason, we propose a new distribution, namely the Generalized Negative Binomial-Shanker (GNB-SH) distribution, which is obtained by compounding the GNB(m,p,β) distribution, where p = exp(−λ), with the SH(θ) distribution. The expectation is that the GNB-SH distribution should provide a better fit to observed count data than competing distributions such as the traditional NB distribution.

Various researchers have used the concept of mixing distributions to explore new flexible distributions that perform better than standard well-known models. In many cases, mixed Poisson and mixed Negative Binomial (NB) distributions provide a better fit than other existing distributions. When the data are over-dispersed, the NB distribution tends to outperform the Poisson distribution because its assumptions are more flexible.

Among previous studies, Zamani et al. [2] mixed the NB and Lindley distributions, a formulation that has since been extended and applied in many count data analyses [3-6]. Saengthong et al. obtained a mixture of the NB and Crack distributions which contains three special cases, namely Negative Binomial-Inverse Gaussian (NB-IG), Negative Binomial-Birnbaum-Saunders (NB-BS) and Negative Binomial-Length Biased Inverse Gaussian (NB-LBIG) [7]. These results were later extended to obtain a distribution suitable for zero-inflated count data [8].

Gerstenkorn compounded the Generalized Negative Binomial (GNB) distribution and the Generalized Beta (GB) distribution, a formulation that was later modified by Rashid et al. to study zero-truncated count data [9,10]. Rashid et al. also studied a mixture of the Generalized Negative Binomial with the Generalized Exponential (GNB-GE) distribution, which includes the mixture of the Negative Binomial with the Generalized Exponential (NB-GE) [11,12]. Owing to its appealing performance, a zero-inflation parameter was added to the NB-GE distribution to make it more suitable for count data with an excess number of zeros [13].

Among mixtures related to the Poisson distribution, Sankaran [14] introduced a mixture of the one-parameter Lindley distribution [15] with the Poisson distribution. Some extensions and modifications of this formulation can be found in [16-20]. Other Poisson mixtures include the Poisson-Shanker mixture [1], the Poisson-Amarendra mixture [21] and the Poisson-Sujatha mixture [22-25].

In this work, Section 2 presents the concept of compounding distributions and the distributions involved in formulating the GNB-SH distribution. That section ends by mixing the GNB distribution with the Shanker distribution and gives the resulting special cases. Section 3 covers the mathematical properties of this distribution, including those of its special cases. Section 4 deals with parameter estimation for NB-SH and GNB-SH using maximum likelihood. Section 5 concludes the paper and outlines our future plans.

Materials and Methods

Generalized Negative-Binomial Distribution

A discrete random variable X is said to follow a Generalized Negative-Binomial (GNB) distribution if its pmf is given as:

Equation (1)

for x = 0, 1, 2, 3, ..., and zero otherwise, where

Equation (a)

Equation (b)

When β = 0 and m ∈ N, the pmf in equation (1) reduces to that of the Binomial distribution, and when β = 1 it reduces to the pmf of the NB distribution, whose mean and factorial moment are respectively given as:

Equation (2)

where Γ(·) is the Gamma function, see [23-25]. In the GNB distribution, the parameters m, p and β are constants, but here it is assumed that p = e^−λ, where λ is a random variable following the Shanker distribution.

Shanker distribution

As an extension of the Lindley distribution [15], the Shanker (SH) distribution was proposed by Shanker [26], who also provided its mathematical properties. This distribution is a mixture of an Exponential distribution with scale parameter θ and a Gamma distribution with shape parameter 2 and scale parameter θ. It has shown a better fit than the Exponential and Lindley distributions in modelling lifetime data. The density function of this distribution is given as:

$$g(\lambda;\theta) = \frac{\theta^{2}}{\theta^{2}+1}\,(\theta+\lambda)\,e^{-\theta\lambda} \tag{3}$$

for λ > 0 and zero otherwise, where θ > 0. Its moment generating function (mgf) is given as:

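$$M_{\lambda}(t) = \frac{\theta^{2}\left(\theta^{2}-\theta t+1\right)}{\left(\theta^{2}+1\right)(\theta-t)^{2}}, \qquad t<\theta \tag{4}$$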
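As a quick numerical sanity check, the following sketch codes the standard form of the Shanker density, θ²/(θ²+1)·(θ+λ)·e^(−θλ), and its closed-form mgf, and compares the mgf with a direct numerical integral. The function names, the Simpson integrator, the truncation point 40 and the parameter values are illustrative assumptions, not part of the original paper.

```python
import math

def shanker_pdf(lam, theta):
    """Shanker density theta^2/(theta^2+1) * (theta+lam) * exp(-theta*lam) for lam >= 0."""
    if lam < 0:
        return 0.0
    return theta**2 / (theta**2 + 1) * (theta + lam) * math.exp(-theta * lam)

def shanker_mgf(t, theta):
    """Closed-form mgf E[exp(t*lam)]; it is finite only for t < theta."""
    if t >= theta:
        raise ValueError("mgf requires t < theta")
    return theta**2 * (theta**2 - theta * t + 1) / ((theta**2 + 1) * (theta - t) ** 2)

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

theta, t = 2.0, 0.5  # illustrative values only
# The upper limit 40 truncates a tail that is negligible for these values.
total_mass = simpson(lambda l: shanker_pdf(l, theta), 0.0, 40.0)
mgf_numeric = simpson(lambda l: math.exp(t * l) * shanker_pdf(l, theta), 0.0, 40.0)
print(total_mass, mgf_numeric, shanker_mgf(t, theta))
```

The density should integrate to approximately one, and the two mgf values should agree up to integration error.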

Compound Distribution

According to the definitions provided by Gurland [27], compounding of distributions occurs when all or some of the parameters of a distribution (the parent distribution) are treated as random variables following another probability distribution, called the compounding distribution. In compounding, the support of the parent distribution determines the support of the compound distribution [27,28]: if the parent distribution is discrete (continuous), then the compound distribution is also discrete (continuous).

Compounding played an important role in the revival of the NB distribution, which is a compound of the Poisson distribution with its parameter λ treated as a Gamma variable. Considering the case of one discrete variable, the definitions and relations provided by Gurland [27] for compounding a distribution are as follows:

Let X be a discrete random variable with pmf f(x|λ), where the parameter λ is a random variable with probability density function g(λ); then the compound distribution h(x) is defined as:

$$h(x) = \int f(x\mid\lambda)\,g(\lambda)\,d\lambda \tag{5}$$
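The classical example mentioned above, a Poisson distribution whose rate follows a Gamma distribution, can be used to illustrate the compounding integral numerically. The sketch below integrates the Poisson pmf against a Gamma density and compares the result with the closed-form NB pmf. The shape/rate parameterization of the Gamma, the helper names and the parameter values are assumptions chosen for illustration.

```python
import math

def poisson_pmf(x, lam):
    """Parent distribution: Poisson pmf with rate lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)

def gamma_pdf(lam, shape, rate):
    """Compounding distribution: Gamma density in the shape/rate parameterization."""
    return rate**shape * lam ** (shape - 1) * math.exp(-rate * lam) / math.gamma(shape)

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def compound_pmf(x, shape, rate, upper=60.0):
    """h(x) = integral of f(x | lam) * g(lam) over lam, truncated at `upper`."""
    return simpson(lambda l: poisson_pmf(x, l) * gamma_pdf(l, shape, rate), 0.0, upper)

def nb_pmf(x, shape, rate):
    """Closed-form NB pmf obtained from the Poisson-Gamma compound."""
    p = rate / (rate + 1.0)
    coef = math.gamma(shape + x) / (math.gamma(shape) * math.factorial(x))
    return coef * p**shape * (1 - p) ** x

for x in range(6):
    print(x, compound_pmf(x, 2.0, 1.5), nb_pmf(x, 2.0, 1.5))
```

The two columns should agree up to integration error, which is the sense in which the NB distribution is a compound of the Poisson distribution.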

Compounding of Generalized Negative-Binomial distribution with the Shanker Distribution

Definition: A random variable X is said to follow the GNB-SH(m,β,θ) distribution, denoted by X ~ GNB-SH(m,β,θ), if X has a GNB distribution with parameters m, β and p = e^−λ, where λ is distributed as SH with parameter θ > 0, i.e. X|λ ~ GNB(m,β,p = e^−λ) and λ ~ SH(θ).

Theorem: Let X ~ GNB-SH(m,β,θ); then the pmf of X is given as:

Equation (6)

for x = 0, 1, 2, 3, ..., and zero otherwise, where

0 ≤ p < 1, m > 0, pβ < 1, β ≥ 1, θ > 0 (a)

0 ≤ θ ≤ 1, m ∈ N, β = 0, θ > 0 (b)

Proof: If X|λ ~ GNB(m,β,p = e^−λ) as defined in equation (1) and λ ~ SH(θ) as defined in equation (3), then using equation (5), the pmf of X can be obtained by:

Equation (7)

Substituting p = e^−λ in equation (1), f(x|λ) becomes

Equation

using binomial expansion we obtain

Equation (8)

By substituting equation (3) and equation (8) into equation (7) we obtain

Equation (9)

Substituting the moment generating function of the SH distribution in equation (4) into equation (9), the pmf of the GNB-SH(m,β,θ) distribution is finally given as

Equation

Next we provide the special cases of GNB-SH and their probability mass functions. Note that these special cases can simply be proven by substituting in the assumed values provided for each case.

Corollary: If β = 1, then the GNB-SH pmf in equation (6) reduces to a mixture of the NB and Shanker distributions, denoted as X ~ NB-SH(m,θ), with pmf

Equation (10)
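The NB-SH case can be explored numerically with a short sketch. It assumes, loudly, that the conditional NB pmf is parameterized as C(m+x−1, x) p^m (1−p)^x with p = e^−λ; this convention is inferred from the later use of the mgf at t = r − k and may differ from the paper's equation (10). Under that assumption, a binomial expansion of (1−e^−λ)^x gives an alternating sum of Shanker mgf values at −(m+j), which is compared below with direct numerical integration. All function names and parameter values are hypothetical.

```python
import math

def shanker_pdf(lam, theta):
    """Shanker density for lam >= 0."""
    return theta**2 / (theta**2 + 1) * (theta + lam) * math.exp(-theta * lam)

def shanker_mgf(t, theta):
    """Closed-form Shanker mgf, finite only for t < theta."""
    if t >= theta:
        raise ValueError("mgf requires t < theta")
    return theta**2 * (theta**2 - theta * t + 1) / ((theta**2 + 1) * (theta - t) ** 2)

def nb_coef(m, x):
    """Generalized binomial coefficient C(m+x-1, x), valid for real m > 0."""
    return math.gamma(m + x) / (math.gamma(m) * math.factorial(x))

def nb_sh_pmf(x, m, theta):
    """Assumed NB-SH pmf: coefficient times an alternating sum of mgf values at -(m+j)."""
    s = sum((-1) ** j * math.comb(x, j) * shanker_mgf(-(m + j), theta) for j in range(x + 1))
    return nb_coef(m, x) * s

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def nb_sh_pmf_numeric(x, m, theta, upper=40.0):
    """Same pmf by integrating the conditional NB pmf against the Shanker density."""
    def integrand(l):
        return nb_coef(m, x) * math.exp(-m * l) * (1 - math.exp(-l)) ** x * shanker_pdf(l, theta)
    return simpson(integrand, 0.0, upper)

for x in range(6):
    print(x, nb_sh_pmf(x, 2.5, 3.0), nb_sh_pmf_numeric(x, 2.5, 3.0))
```

For large x the alternating sum suffers from floating-point cancellation, so this sketch is only suitable for small counts.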

Corollary: If β = 0 and m ∈ N, then the GNB-SH pmf in equation (6) reduces to a mixture of the Binomial and Shanker distributions, denoted as X ~ BI-SH(m,θ), with pmf

Equation (11)

The Properties of the GNB-SH Distribution

This section provides the factorial moments of the distributions. The ordinary (crude) moments of the GNB-SH distribution can be obtained by using the formula

Equation

where S^l_k stands for the Stirling numbers of the second kind [29]. Therefore, only the factorial moments of the mixture of NB and Shanker distribution will be considered.
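The conversion from factorial moments to ordinary moments can be sketched in a few lines, using the standard identity E[X^k] = Σ_l S(k,l)·μ_[l] with μ_[0] = 1. The check below uses the Poisson distribution, whose factorial moment of order l is the l-th power of its rate, purely as a well-known test case; the function names are illustrative.

```python
def stirling2(k, l):
    """Stirling number of the second kind via S(k,l) = l*S(k-1,l) + S(k-1,l-1)."""
    if k == l:
        return 1
    if l == 0 or l > k:
        return 0
    return l * stirling2(k - 1, l) + stirling2(k - 1, l - 1)

def raw_moment(k, factorial_moment):
    """Ordinary moment of order k from a function returning factorial moments."""
    return sum(stirling2(k, l) * factorial_moment(l) for l in range(k + 1))

# Test case: Poisson with rate 2 has factorial moments 2**l.
rate = 2.0
poisson_raw = [raw_moment(k, lambda l: rate**l) for k in range(1, 5)]
print(poisson_raw)
```

Any function returning factorial moments, such as those of the NB-SH distribution, could be passed in place of the Poisson test case.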


Definition: If X ~ NB − SH (m,θ) , then the factorial moment polynomial

Equation

is called the factorial moment of order r of a mixture of NB with Shanker distribution (NB-SH), where μ_r (x|λ) is the factorial moment of NB distribution.

Theorem: The factorial moment of order r of NB-SH distribution is given by

Equation (12)

for r = 1, 2, 3, ..., where m, θ > 0.

Proof: From the factorial moment of NB distribution in equation (2), if we let p = e^−λ then the factorial moment of order r of NB-SH distribution is given as

Equation

using binomial expansion we obtain

Equation

Substituting the moment generating function of the SH distribution with t = r − k we get

Equation

From the factorial moments of NB-SH distribution in equation (12), for convenience we let

Equation

Then, the first four moments about zero are respectively given as

Equation
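The factorial moment derivation can be checked numerically under stated assumptions. The sketch assumes the NB factorial moment has the form Γ(m+r)/Γ(m)·((1−p)/p)^r, so that with p = e^−λ it becomes Γ(m+r)/Γ(m)·(e^λ−1)^r; expanding the power and taking expectations gives an alternating sum of Shanker mgf values at t = r − k. Because the mgf is finite only for t < θ, this sketch requires θ > r. The function names and parameter values are hypothetical.

```python
import math

def shanker_pdf(lam, theta):
    """Shanker density for lam >= 0."""
    return theta**2 / (theta**2 + 1) * (theta + lam) * math.exp(-theta * lam)

def shanker_mgf(t, theta):
    """Closed-form Shanker mgf, finite only for t < theta."""
    if t >= theta:
        raise ValueError("mgf requires t < theta")
    return theta**2 * (theta**2 - theta * t + 1) / ((theta**2 + 1) * (theta - t) ** 2)

def nb_sh_factorial_moment(r, m, theta):
    """Assumed closed form: Gamma(m+r)/Gamma(m) * sum_k C(r,k) (-1)^k M(r-k)."""
    s = sum((-1) ** k * math.comb(r, k) * shanker_mgf(r - k, theta) for k in range(r + 1))
    return math.gamma(m + r) / math.gamma(m) * s

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def numeric_factorial_moment(r, m, theta, upper=40.0):
    """Same quantity by integrating (exp(lam) - 1)^r against the Shanker density."""
    expectation = simpson(lambda l: math.expm1(l) ** r * shanker_pdf(l, theta), 0.0, upper)
    return math.gamma(m + r) / math.gamma(m) * expectation

m, theta = 2.0, 6.0  # illustrative values with theta > r for r = 1, 2, 3
for r in (1, 2, 3):
    print(r, nb_sh_factorial_moment(r, m, theta), numeric_factorial_moment(r, m, theta))
```

With r = 1 the closed form reduces to m·(M(1) − 1), which is the mean under this parameterization.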

Parameter Estimation

In this section, maximum likelihood is used to estimate the parameters of the GNB-SH(m,β,θ) distribution and of its special case, the NB-SH(m,θ) distribution.

Estimation of GNB-SH distribution parameters using full likelihood function

Let X1, X2, X3, ..., Xn be a random sample of size n from the GNB-SH distribution with observed values x1, x2, x3, ..., xn. We seek the values of m, β and θ that maximize the likelihood function (the joint pmf of the sample) of GNB-SH. Parameter estimates are more easily obtained by maximizing the logarithm of the likelihood function with respect to m, β and θ, since the product is replaced by a sum. Consider the likelihood function of the GNB-SH distribution defined by

Equation

with corresponding log-likelihood function given as:

Equation

Maximum likelihood estimators of m, β and θ can be obtained by maximizing log L(x; m, β, θ) with respect to m, β and θ. That is

Equation

Estimation of NB-SH distribution parameters using full likelihood function

Consider the log-likelihood function of NB-SH distribution defined by

Equation

and the partial derivatives of this log-likelihood with respect to m and θ are given as

Equation

where ψ(k) = Γ′(k)/Γ(k) is the digamma function [29,30].

The above derivative equations cannot be solved analytically; we therefore use the Newton-Raphson method, a simple and powerful technique for solving equations numerically. The parameter estimates are thus obtained by maximizing the log-likelihood function with this numerical iterative method.
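A minimal sketch of such a numerical iteration is given below. It is a simplified illustration rather than the estimation procedure of the paper: m is held fixed so that only θ is estimated, the score and second derivative are approximated by central finite differences instead of the analytic expressions, a step-halving safeguard is added, and the NB-SH pmf again assumes the conditional NB form C(m+x−1, x) p^m (1−p)^x with p = e^−λ. The data are made-up counts.

```python
import math

def shanker_mgf(t, theta):
    """Closed-form Shanker mgf, finite only for t < theta."""
    if t >= theta:
        raise ValueError("mgf requires t < theta")
    return theta**2 * (theta**2 - theta * t + 1) / ((theta**2 + 1) * (theta - t) ** 2)

def nb_sh_logpmf(x, m, theta):
    """Log of the assumed NB-SH pmf (alternating sum of mgf values at -(m+j))."""
    log_coef = math.lgamma(m + x) - math.lgamma(m) - math.lgamma(x + 1)
    s = sum((-1) ** j * math.comb(x, j) * shanker_mgf(-(m + j), theta) for j in range(x + 1))
    return log_coef + math.log(s)

def loglik(theta, data, m):
    """Log-likelihood of theta for an i.i.d. sample, with m treated as known."""
    return sum(nb_sh_logpmf(x, m, theta) for x in data)

def newton_raphson_theta(data, m, theta0, h=1e-4, tol=1e-8, max_iter=100):
    """Safeguarded Newton-Raphson ascent on the log-likelihood in theta."""
    theta = theta0
    for _ in range(max_iter):
        f0 = loglik(theta, data, m)
        f_plus = loglik(theta + h, data, m)
        f_minus = loglik(theta - h, data, m)
        score = (f_plus - f_minus) / (2 * h)        # finite-difference first derivative
        hess = (f_plus - 2 * f0 + f_minus) / h**2   # finite-difference second derivative
        # Newton step when the curvature is negative, plain ascent step otherwise.
        step = -score / hess if hess < 0 else score
        # Step halving: keep theta above h and never accept a lower log-likelihood.
        while theta + step <= h or loglik(theta + step, data, m) < f0:
            step /= 2
            if abs(step) < tol:
                return theta
        theta += step
        if abs(step) < tol:
            break
    return theta

data = [0, 1, 0, 2, 1, 0, 3, 1, 0, 0, 2, 1]  # made-up counts for illustration
m, theta0 = 2.0, 3.0                         # m assumed known; theta0 is a starting guess
theta_hat = newton_raphson_theta(data, m, theta0)
print(theta_hat, loglik(theta_hat, data, m))
```

Estimating m and θ jointly, as described in the text, would replace the scalar derivatives with a gradient vector and a Hessian matrix; the one-dimensional version is used here only to keep the sketch short.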

Conclusion

This paper proposed a new distribution obtained by mixing the GNB distribution with the Shanker distribution. The NB-SH and Binomial-Shanker distributions were found to be its special cases. Some mathematical properties relating to its special case were provided, and parameter estimation for GNB-SH and NB-SH using maximum likelihood was presented. Finally, our future interest is in comparing the efficiency of this distribution with that of the Poisson and NB distributions using real data sets.

References

[1] Shanker R. The discrete Poisson-Shanker distribution. J of Biostat. 2015a.
[2] Zamani H, et al. Negative binomial-Lindley distribution and its application. J Math Stat. 2010;6(1):4-9.
[3] Lord D, et al. The negative binomial–Lindley distribution as a tool for analyzing crash data characterized by a large amount of zeros. Accid Anal Prev. 2011;43(5):1738-1742.
[4] Geedipally SR, et al. The negative binomial-Lindley generalized linear model: Characteristics and application using crash data. Accid Anal Prev. 2012;45:258-265.
[5] Shirazi M, et al. A methodology to design heuristics for model selection based on the characteristics of data: Application to investigate when the negative binomial Lindley (NB-L) is preferred over the negative binomial (NB). Accid Anal Prev. 2017;107:186-194.
[6] Pudprommarat C, et al. Stochastic orders comparisons of negative binomial distribution with negative binomial-Lindley distribution. Open J Stat. 2012;2(02):208.
[7] Saengthong P, et al. Negative binomial-Crack (NB-CR) distribution. Int J Pure Appl Math. 2013;84(3):213-230.
[8] Saengthong P, et al. The zero inflated negative binomial–Crack distribution: some properties and parameter estimation. J Sci Technol. 2015;37(6):701-711.
[9] Gerstenkorn T. A compound of the generalized negative binomial distribution with the generalized beta distribution. Open Math. 2004;2(4):527-537.
[10] Rashid A, et al. A compound of zero truncated generalized negative binomial distribution with the generalized beta distribution. J Rel Stat Stud. 2013;6(1):11-19.
[11] Rashid A, et al. A mixture of generalized negative binomial distribution with generalized exponential distribution. J Stat App Pro. 2014;3(3):451.
[12] Aryuyuen S, et al. The negative binomial-generalized exponential (NB-GE) distribution. Appl Math Sci. 2013;7(22):1093-1105.
[13] Aryuyuen S, et al. Zero inflated negative binomial-generalized exponential distribution and its applications. Songklanakarin J Sci Technol. 2014;36(4).
[14] Sankaran M. The discrete Poisson-Lindley distribution. Biometrics. 1970;145-149.
[15] Lindley DV. Fiducial distributions and Bayes’ theorem. J R Stat Soc Series B Stat Methodol. 1958;102-107.
[16] Ghitany M, et al. Size-biased Poisson-Lindley distribution and its application. Met Inter J Stat. 2008a;66(3):299-311.
[17] Ghitany M, et al. Zero-truncated Poisson–Lindley distribution and its application. Math Comput Simul. 2008b;79(3):279-287.
[18] Mahmoudi E, et al. Generalized Poisson–Lindley distribution. Commun Stat Simul Comput. 2010;39(10):1785-1798.
[19] Asgharzadeh A, et al. Pareto Poisson–Lindley distribution with applications. J App Stat. 2013;40(8):1717-1734.
[20] Shanker R, et al. A two-parameter Poisson-Lindley distribution. Int J Stat Sys. 2014;9(1):79-85.
[21] Shanker R. The discrete Poisson-Amarendra distribution. IJSDA. 2016a.
[22] Shanker R. The discrete Poisson-Sujatha distribution. Int J Pro Stat. 2016b;5(1):1-9.
[23] Consul PC, et al. Lagrangian probability distributions. Springer. 2006.
[24] Balakrishnan N, et al. A primer on statistical distributions. John Wiley & Sons. 2004.
[25] Johnson N, et al. Univariate discrete distributions. 1992.
[26] Shanker R. Shanker distribution and its applications. Int J Stat App. 2015b;5(6):338-348.
[27] Gurland J. Some interrelations among compound and generalized distributions. Biometrika. 1957;44(1/2):265-268.
[28] Ahmad Z, et al. A new discrete compound distribution with application. J Stat Appl Pro. 2017;6:233-241.
[29] Gupta PL, et al. On the moments and factorial moments of a MPSD. In Statistical Distributions in Scientific Work. Springer. 1981;189-195.
[30] Abramowitz M, et al. Handbook of mathematical functions: with formulas, graphs, and mathematical tables, volume 55. Courier Corporation. 1964.