dirichlet {VGAM}    R Documentation

Fitting a Dirichlet Distribution

Description

Fits a Dirichlet distribution to a matrix of compositions.

Usage

dirichlet(link = "loge", earg = list(), parallel = FALSE, zero = NULL)

Arguments

link

Link function applied to each of the M (positive) shape parameters alpha_j. See Links for more choices. The default gives eta_j=log(alpha_j).

earg

List. Extra argument for the link. See earg in Links for general information.

parallel, zero

See CommonVGAMffArguments for more information.

Details

In this help file the response is assumed to be an M-column matrix of positive values whose rows each sum to unity. Such data can be thought of as compositional data. There are M linear/additive predictors eta_j.
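The requirement on the response can be checked before fitting. The following is a minimal base R sketch, assuming the data start out as raw positive measurements; the matrix raw and its values are made up for illustration and do not come from VGAM.

```r
# Hypothetical raw measurements: 4 samples, 3 components (made-up numbers).
raw <- matrix(c(2, 5, 3,
                1, 1, 2,
                4, 4, 2,
                3, 6, 1),
              ncol = 3, byrow = TRUE)

# Divide each row by its total so that every row sums to 1.
# (The length-4 vector rowSums(raw) is recycled down the columns.)
comp <- raw / rowSums(raw)

# Check the two requirements on the response: positivity and unit row sums.
stopifnot(all(comp > 0))
stopifnot(isTRUE(all.equal(rowSums(comp), rep(1, nrow(comp)))))
```

A matrix such as comp could then be used as the response in a call like vglm(comp ~ 1, dirichlet).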

The Dirichlet distribution is commonly used to model compositional data, including applications in genetics. Suppose (Y_1,…,Y_M)^T is the response. Then it has a Dirichlet distribution if (Y_1,…,Y_{M-1})^T has density

(Gamma(alpha_+) / prod_{j=1}^M Gamma(alpha_j)) prod_{j=1}^M y_j^(alpha_j - 1)

where alpha_+ = alpha_1 + … + alpha_M, alpha_j > 0, and the density is defined on the unit simplex

Delta_M = { (y_1,…,y_M)^T : y_1 > 0, …, y_M > 0, ∑_{j=1}^M y_j = 1 }.
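As an illustration of the density above, it can be evaluated on the log scale with base R. The helper dirichlet_logdens is hypothetical (it is not a VGAM function) and is only a sketch for a single observation.

```r
# Hypothetical helper (not part of VGAM): Dirichlet log-density at one point.
# y:     vector of M positive proportions summing to 1
# alpha: vector of M positive shape parameters
dirichlet_logdens <- function(y, alpha) {
  stopifnot(length(y) == length(alpha), all(y > 0), all(alpha > 0),
            abs(sum(y) - 1) < 1e-8)
  # log Gamma(alpha_+) - sum_j log Gamma(alpha_j) + sum_j (alpha_j - 1) log y_j
  lgamma(sum(alpha)) - sum(lgamma(alpha)) + sum((alpha - 1) * log(y))
}

# With alpha = (1, 1, 1) the density is constant on the simplex and equals
# Gamma(3) = 2, so the log-density is log(2) at any valid point.
dirichlet_logdens(c(0.2, 0.3, 0.5), alpha = c(1, 1, 1))
```

Working with lgamma rather than gamma avoids overflow when the shape parameters are large.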

One has E(Y_j) = alpha_j / alpha_+, and these means are returned as the fitted values. For this distribution Fisher scoring corresponds to Newton-Raphson.
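The relationship between the shape parameters, the linear predictors under the default log link, and the means can be sketched in base R. The shape values are illustrative only; they are the ones used to simulate data in the Examples section.

```r
alpha <- exp(c(-1, 1, 0))     # illustrative shape parameters
eta   <- log(alpha)           # linear predictors under the default log link
mu    <- alpha / sum(alpha)   # E(Y_j) = alpha_j / alpha_+

eta            # -1, 1, 0 up to floating-point rounding
round(mu, 3)   # the means
sum(mu)        # the means sum to 1, so they lie on the simplex too
```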

The Dirichlet distribution can be motivated by considering random variables (G_1,…,G_M)^T that are mutually independent, where G_j has a gamma distribution with shape alpha_j and unit scale, that is, density f(g_j) = g_j^(alpha_j - 1) e^(-g_j) / Gamma(alpha_j). Then the Dirichlet distribution arises when Y_j = G_j / (G_1 + … + G_M).
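The gamma construction can be tried directly in base R. This is a sketch only: the shape values, sample size and seed are arbitrary choices for illustration, and it is not claimed to be how rdiric is implemented.

```r
set.seed(123)          # arbitrary seed, for reproducibility only
alpha <- c(2, 3, 5)    # illustrative shape parameters
n <- 5000

# Column j holds n independent Gamma(shape = alpha_j, rate = 1) draws.
G <- sapply(alpha, function(a) rgamma(n, shape = a, rate = 1))

# Normalise each row by its total to obtain Dirichlet draws.
Y <- G / rowSums(G)

colMeans(Y)            # sample means, should be close to the values below
alpha / sum(alpha)     # theoretical means: 0.2, 0.3, 0.5
```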

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, rrvglm and vgam.

When fitted, the fitted.values slot of the object contains the M-column matrix of means.

Note

The response should be a matrix of positive values whose rows each sum to unity. For count data, a multinomial logit model (see multinomial) may be appropriate instead. A related distribution is the Dirichlet-multinomial (see dirmultinomial).

Author(s)

Thomas W. Yee

References

Lange, K. (2002) Mathematical and Statistical Methods for Genetic Analysis, 2nd ed. New York: Springer-Verlag.

Evans, M., Hastings, N. and Peacock, B. (2000) Statistical Distributions, 3rd ed. New York: Wiley-Interscience.

See Also

rdiric, dirmultinomial, multinomial, simplex.

Examples

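set.seed(1)  # for reproducibility of the simulated data
y <- rdiric(n = 1000, shape = exp(c(-1, 1, 0)))  # simulate 3-part compositions
fit <- vglm(y ~ 1, dirichlet, trace = TRUE, crit = "coef")
Coef(fit)                  # estimated shape parameters alpha_j
coef(fit, matrix = TRUE)   # coefficients on the link (log) scale
head(fitted(fit))          # fitted means alpha_j / alpha_+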

[Package VGAM version 0.8-4 Index]