negbinomial {VGAM} | R Documentation |
Maximum likelihood estimation of the two parameters of a negative binomial distribution.
negbinomial(lmu = "loge", lsize = "loge", emu = list(), esize = list(), imu = NULL, isize = NULL, quantile.probs = 0.75, nsimEIM = 100, cutoff = 0.995, Maxiter = 5000, deviance.arg = FALSE, imethod = 1, parallel = FALSE, shrinkage.init = 0.95, zero = -2) polya(lprob = "logit", lsize = "loge", eprob = list(), esize = list(), iprob = NULL, isize = NULL, quantile.probs = 0.75, nsimEIM = 100, deviance.arg = FALSE, imethod = 1, shrinkage.init = 0.95, zero = -2)
lmu, lsize, lprob |
Link functions applied to the mu, k
and p parameters.
See |
emu, esize, eprob |
List. Extra argument for each of the links.
See |
imu, isize, iprob |
Optional initial values for the mean and k and p.
For k, if failure to converge occurs then try different values
(and/or use |
quantile.probs |
Passed into the |
nsimEIM |
This argument is used
for computing the diagonal element of the
expected information matrix (EIM) corresponding to k.
See |
cutoff |
Used in the finite series approximation.
A numeric which is close to 1 but never exactly 1.
Used to specify how many terms of the infinite series
for computing the second diagonal element of the
EIM are actually used.
The sum of the probabilites are added until they reach this value or more
(but no more than |
Maxiter |
Used in the finite series approximation. Integer. The maximum number of terms allowed when computing the second diagonal element of the EIM. In theory, the value involves an infinite series. If this argument is too small then the value may be inaccurate. |
deviance.arg |
Logical. If |
imethod |
An integer with value |
parallel |
See |
shrinkage.init |
How much shrinkage is used when initializing mu.
The value must be between 0 and 1 inclusive, and
a value of 0 means the individual response values are used,
and a value of 1 means the median or mean is used.
This argument is used in conjunction with |
zero |
Integer valued vector, usually assigned -2 or 2 if used
at all. Specifies which of the two linear/additive predictors are
modelled as an intercept only. By default, the k parameter
(after |
The negative binomial distribution can be motivated in several ways,
e.g., as a Poisson distribution with a mean that is gamma
distributed.
There are several common parametrizations of the negative binomial
distribution.
The one used by negbinomial()
uses the
mean mu and an index parameter
k, both which are positive.
Specifically, the density of a random variable Y is
f(y;mu,k) = C_{y}^{y + k - 1} [mu/(mu+k)]^y [k/(k+mu)]^k
where y=0,1,2,…,
and mu > 0 and k > 0.
Note that the dispersion parameter is
1/k, so that as k approaches infinity the negative
binomial distribution approaches a Poisson distribution.
The response has variance Var(Y)=mu*(1+mu/k).
When fitted, the fitted.values
slot of the object contains
the estimated value of the mu parameter, i.e., of the mean
E(Y).
It is common for some to use alpha=1/k as the
ancillary or heterogeneity parameter;
so common alternatives for lsize
are
nloge
and
reciprocal
.
For polya
the density is
f(y;p,k) = C_{y}^{y + k - 1} [1 - p]^y p^k
where y=0,1,2,…, and 0 < p < 1 and k > 0.
The negative binomial distribution can be coerced into the
classical GLM framework with one of the parameters being
of interest and the other treated as a nuisance/scale
parameter (this is implemented in the MASS library). The
VGAM family function negbinomial
treats both
parameters on the same footing, and estimates them both
by full maximum likelihood estimation. Simulated Fisher
scoring is employed as the default (see the nsimEIM
argument).
The parameters mu and k are independent (diagonal EIM), and the confidence region for k is extremely skewed so that its standard error is often of no practical use. The parameter 1/k has been used as a measure of aggregation.
These VGAM family functions handle
multivariate responses, so that a matrix can be
used as the response. The number of columns is the number
of species, say, and setting zero = -2
means that
all species have a k equalling a (different)
intercept only.
An object of class "vglmff"
(see vglmff-class
).
The object is used by modelling functions such as vglm
,
rrvglm
and vgam
.
The Poisson model corresponds to k equalling
infinity. If the data is Poisson or close to Poisson,
numerical problems will occur. Possibly choosing a
log-log link may help in such cases, otherwise use
poissonff
or quasipoissonff
.
These functions are fragile; the maximum likelihood
estimate of the index parameter is fraught (see Lawless,
1987). In general, the quasipoissonff
is
more robust. Other alternatives to negbinomial
are
to fit a NB-1 or RR-NB model; see Yee (2011). Assigning
values to the isize
argument may lead to a local
solution, and smaller values are preferred over large
values when using this argument.
Yet to do: write a family function which uses the methods of moments estimator for k.
These two functions implement two common parameterizations
of the negative binomial (NB). Some people called the
NB with integer k the Pascal distribution,
whereas if k is real then this is the Polya
distribution. I don't. The one matching the details of
rnbinom
in terms of p
and k is polya()
.
For polya()
the code may fail when p is close
to 0 or 1. It is not yet compatible with cqo
or cao
.
Suppose the response is called ymat
.
For negbinomial()
the diagonal element of the expected information matrix
(EIM) for parameter k
involves an infinite series; consequently simulated Fisher scoring
(see nsimEIM
) is the default. This algorithm should definitely be
used if max(ymat)
is large, e.g., max(ymat) > 300
or there
are any outliers in ymat
.
A second algorithm involving a finite series approximation can be
invoked by setting nsimEIM = NULL
.
Then the arguments
Maxiter
and
cutoff
are pertinent.
Regardless of the algorithm used,
convergence problems may occur, especially when the response has large
outliers or is large in magnitude.
If convergence failure occurs, try using arguments
(in recommended decreasing order)
nsimEIM
,
shrinkage.init
,
imethod
,
Maxiter
,
cutoff
,
isize
,
zero
.
The function negbinomial
can be used by the fast algorithm in
cqo
, however, setting EqualTolerances = TRUE
and
ITolerances = FALSE
is recommended.
In the first example below (Bliss and Fisher, 1953), from each of 6 McIntosh apple trees in an orchard that had been sprayed, 25 leaves were randomly selected. On each of the leaves, the number of adult female European red mites were counted.
There are two special uses of negbinomial
for handling count data.
Firstly,
when used by rrvglm
this
results in a continuum of models in between and
inclusive of quasi-Poisson and negative binomial regression.
This is known as a reduced-rank negative binomial model (RR-NB).
It fits a negative binomial log-linear regression with variance function
Var(Y) = mu + delta1 * mu^delta2
where delta1
and delta2
are parameters to be estimated by MLE.
Confidence intervals are available for delta2,
therefore it can be decided upon whether the
data are quasi-Poisson or negative binomial, if any.
Secondly,
the use of negbinomial
with parallel = TRUE
inside vglm
can result in a model similar to quasipoissonff
.
This is named the NB-1 model.
The dispersion parameter is estimated by MLE whereas
glm
uses the method of moments.
In particular, it fits a negative binomial log-linear regression
with variance function
Var(Y) = phi0 * mu
where phi0
is a parameter to be estimated by MLE.
Confidence intervals are available for phi0.
Thomas W. Yee
Lawless, J. F. (1987) Negative binomial and mixed Poisson regression. The Canadian Journal of Statistics 15, 209–225.
Hilbe, J. M. (2007) Negative Binomial Regression. Cambridge: Cambridge University Press.
Bliss, C. and Fisher, R. A. (1953) Fitting the negative binomial distribution to biological data. Biometrics 9, 174–200.
Yee, T. W. (2011) Two-parameter reduced-rank vector generalized linear models. In preparation.
quasipoissonff
,
poissonff
,
zinegbinomial
,
posnegbinomial
,
invbinomial
,
rnbinom
,
nbolf
,
rrvglm
,
cao
,
cqo
,
CommonVGAMffArguments
.
# Example 1: apple tree data appletree <- data.frame(y = 0:7, w = c(70, 38, 17, 10, 9, 3, 2, 1)) fit <- vglm(y ~ 1, negbinomial, appletree, weights = w) summary(fit) coef(fit, matrix = TRUE) Coef(fit) # Example 2: simulated data with multivariate response ndata <- data.frame(x2 = runif(nn <- 500)) ndata <- transform(ndata, y1 = rnbinom(nn, mu = exp(3+x2), size = exp(1)), y2 = rnbinom(nn, mu = exp(2-x2), size = exp(0))) fit1 <- vglm(cbind(y1, y2) ~ x2, negbinomial, ndata, trace = TRUE) coef(fit1, matrix = TRUE) # Example 3: large counts so definitely use the nsimEIM argument ndata <- transform(ndata, y3 = rnbinom(nn, mu = exp(12+x2), size = exp(1))) with(ndata, range(y3)) # Large counts fit2 <- vglm(y3 ~ x2, negbinomial(nsimEIM = 100), ndata, trace = TRUE) coef(fit2, matrix = TRUE) # Example 4: a NB-1 to estimate a negative binomial with Var(Y) = phi0 * mu nn <- 1000 # Number of observations phi0 <- 10 # Specify this; should be greater than unity delta0 <- 1 / (phi0 - 1) mydata <- data.frame(x2 = runif(nn), x3 = runif(nn)) mydata <- transform(mydata, mu = exp(2 + 3 * x2 + 0 * x3)) mydata <- transform(mydata, y3 = rnbinom(nn, mu = mu, size = delta0 * mu)) ## Not run: plot(y3 ~ x2, data = mydata, pch = "+", col = 'blue', main = paste("Var(Y) = ", phi0, " * mu", sep = ""), las = 1) ## End(Not run) nb1 <- vglm(y3 ~ x2 + x3, negbinomial(parallel = TRUE, zero = NULL), mydata, trace = TRUE) # Extracting out some quantities: cnb1 <- coef(nb1, matrix = TRUE) mydiff <- (cnb1["(Intercept)", "log(size)"] - cnb1["(Intercept)", "log(mu)"]) delta0.hat <- exp(mydiff) (phi.hat <- 1 + 1 / delta0.hat) # MLE of phi summary(nb1) # Obtain a 95 percent confidence interval for phi0: myvec <- rbind(-1, 1, 0, 0) (se.mydiff <- sqrt(t(myvec) %*% vcov(nb1) %*% myvec)) ci.mydiff <- mydiff + c(-1.96, 1.96) * se.mydiff ci.delta0 <- ci.exp.mydiff <- exp(ci.mydiff) (ci.phi0 <- 1 + 1 / rev(ci.delta0)) # The 95 percent conf. interval for phi0 confint_nb1(nb1) # Quick way to get it summary(glm(y3 ~ x2 + x3, quasipoisson, mydata))$disper # cf. moment estimator