estimateCommonDisp {edgeR}    R Documentation

Estimates the Negative Binomial Common Dispersion by Maximizing the Negative Binomial Conditional Common Likelihood

Description

Maximizes the negative binomial conditional common likelihood to give the estimate of the common dispersion across all tags for the unadjusted counts provided.

Usage

estimateCommonDisp(object, tol=1e-06, rowsum.filter=5)

Arguments

object

DGEList object with (at least) the elements counts (table of unadjusted counts), samples (vector indicating group), and lib.size (vector of library sizes)

tol

numeric scalar providing the tolerance to be passed to optimize; default value is 1e-06

rowsum.filter

numeric scalar giving a value for the filtering out of low abundance tags in the estimation of the common dispersion. Only tags with total sum of counts above this value are used in the estimation of the common dispersion. Low abundance tags can adversely affect the estimation of the common dispersion, so this argument allows the user to select an appropriate filter threshold for the tag abundance.
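The effect of rowsum.filter can be sketched in base R. This is only an illustration, assuming the threshold is applied as a strict inequality on each tag's total count, as the description above suggests. The object names counts and keep are hypothetical and are not part of edgeR.

```r
# Toy count table: 4 tags (rows) x 4 libraries (columns)
counts <- matrix(c( 0,  1,  0,  2,
                   10, 12,  9, 11,
                    1,  1,  1,  2,
                   30, 25, 28, 33),
                 ncol = 4, byrow = TRUE)

rowsum.filter <- 5                        # default threshold shown in Usage
keep <- rowSums(counts) > rowsum.filter   # TRUE for tags that pass the filter
counts[keep, , drop = FALSE]              # rows that would inform the estimate
```

In this toy table the third tag has a total of exactly 5, so under the strict-inequality assumption it is excluded along with the first tag.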

Details

The method of conditional maximum likelihood assumes that library sizes are equal, which is not true in general, so pseudocounts (counts adjusted so that the library sizes are equal) need to be calculated. The function equalizeLibSizes is called to adjust the counts using a quantile-to-quantile method, but this requires a fixed value for the common dispersion parameter. To obtain a good estimate for the common dispersion, pseudocounts are calculated under the Poisson model (dispersion is zero) and these pseudocounts are used to give an estimate of the common dispersion. This estimate of the common dispersion is then used to recalculate the pseudocounts, which are used to provide a final estimate of the common dispersion.
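The core estimation step can be sketched in base R for the simplest case: a single group with equal library sizes, so that no pseudocounts are needed. This is a minimal illustration of conditional maximum likelihood as described in the reference below, not the code edgeR runs. The function name common.cond.loglik, the search interval, and the choice to optimize over delta = phi/(1+phi) are assumptions made here to keep the search bounded.

```r
set.seed(1)

# Simulated counts: 500 tags x 4 libraries, one group, equal library sizes.
# True dispersion is 1/size = 0.2.
y <- matrix(rnbinom(2000, mu = 10, size = 5), ncol = 4)

# Conditional log-likelihood summed over tags, conditioning on each tag's
# total count z. r = 1/phi is the NB size parameter; terms that do not
# involve r are dropped because they do not affect the maximizer.
common.cond.loglik <- function(delta, y) {
  phi <- delta / (1 - delta)   # back-transform from delta = phi / (1 + phi)
  r <- 1 / phi
  n <- ncol(y)
  z <- rowSums(y)
  sum(rowSums(lgamma(y + r)) + lgamma(n * r) - lgamma(z + n * r) - n * lgamma(r))
}

# Maximize over delta, which lies in (0, 1)
opt <- optimize(common.cond.loglik, interval = c(1e-4, 100 / 101), y = y,
                maximum = TRUE, tol = 1e-06)
phi.hat <- opt$maximum / (1 - opt$maximum)
phi.hat
```

With unequal library sizes, the same maximization would be applied to pseudocounts rather than raw counts, and with several groups the per-group log-likelihoods would be added before maximizing.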

Value

estimateCommonDisp produces an object of class DGEList with the following components.

common.dispersion

estimate of the common dispersion; the value for phi, the dispersion parameter in the NB model, that maximizes the negative binomial common likelihood on the phi scale

counts

table of unadjusted counts

group

vector indicating the group to which each library belongs

lib.size

vector containing the unadjusted size of each library

pseudo.alt

table of adjusted counts; quantile-to-quantile method (see q2qnbinom) used to adjust the raw counts so that library sizes are equal; adjustment here done under the alternative hypothesis that there is a true difference between groups

conc

list containing the estimates of the concentration of each tag in the underlying sample; conc$p.common gives estimates under the null hypothesis of no difference between groups; conc$p.group gives the estimate of the concentration for each tag within each group; concentration is a measure of abundance and thus expression level for the tags

common.lib.size

the common library size to which the count libraries have been adjusted

Author(s)

Mark Robinson, Davis McCarthy

References

Robinson MD and Smyth GK (2008). Small-sample estimation of negative binomial dispersion, with applications to SAGE data. Biostatistics, 9, 321-332.

See Also

estimateTagwiseDisp can be used to estimate a value for the dispersion parameter for each tag/transcript. The estimates are stabilized by squeezing the estimates towards the common value calculated by estimateCommonDisp.

Examples

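library(edgeR)

# Simulate 250 tags x 4 libraries from a negative binomial model.
# True dispersion is 1/size = 1/5 = 0.2
y <- matrix(rnbinom(1000, mu=10, size=5), ncol=4)
d <- DGEList(counts=y, group=c(1,1,2,2), lib.size=1000:1003)

# Estimate the common dispersion and print it
cmdisp <- estimateCommonDisp(d)
cmdisp$common.dispersion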
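The example uses size=5, which corresponds to a dispersion of 1/5 = 0.2 because rnbinom with arguments mu and size has variance mu + mu^2/size, that is, mu + phi*mu^2 with phi = 1/size. The base R sketch below checks this relationship with a simple moment-based estimate. It is a sanity check on the parameterization only, not the conditional likelihood method used by estimateCommonDisp, and the names phi.true and phi.mom are hypothetical.

```r
set.seed(2)
mu <- 10
size <- 5
x <- rnbinom(1e5, mu = mu, size = size)

phi.true <- 1 / size                          # dispersion implied by size
phi.mom  <- (var(x) - mean(x)) / mean(x)^2    # solve var = mu + phi * mu^2 for phi
c(true = phi.true, moments = phi.mom)
```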

[Package edgeR version 2.4.3 Index]