multiecdf {geneplotter}    R Documentation

Multiple empirical cumulative distribution functions (ecdf) and densities

Description

Plot multiple empirical cumulative distribution functions (ecdf) and densities with a user interface similar to that of boxplot. How useful multidensity is depends on the data, and its curves can be distorted by smoothing artifacts; multiecdf will in many cases be preferable. Please see Details.

Usage

multiecdf(x, ...)
## S3 method for class 'formula'
multiecdf(formula, data = NULL, xlab, na.action = NULL, ...)
## S3 method for class 'matrix'
multiecdf(x, xlab, ...) 
## S3 method for class 'list'
multiecdf(x,
          xlim,
          col = brewer.pal(9, "Set1"),
          main = "ecdf",
          xlab,
          do.points = FALSE,
          subsample = 1000L,
          legend = list(
            x = "right",
            legend = if(is.null(names(x))) paste(seq(along=x)) else names(x),
            fill = col),
          ...)

multidensity(x, ...)
## S3 method for class 'formula'
multidensity(formula, data = NULL, xlab, na.action = NULL, ...)
## S3 method for class 'matrix'
multidensity(x, xlab, ...) 
## S3 method for class 'list'
multidensity(x,
             bw = "nrd0",
             xlim,
             ylim,
             col  = brewer.pal(9, "Set1"),
             main = if(length(x)==1) "density" else "densities",
             xlab,
             lty  = 1L,
             legend = list(
               x = "topright",
               legend = if(is.null(names(x))) paste(seq(along=x)) else names(x),
               fill = col),
             ...)

Arguments

formula

a formula, such as y ~ grp, where y is a numeric vector of data values to be split into groups according to the grouping variable grp (usually a factor).

data

a data.frame (or list) from which the variables in formula should be taken.

na.action

a function which indicates what should happen when the data contain NAs. The default is to ignore missing values in either the response or the group.

x

methods exist for: formula, matrix, data.frame, list of numeric vectors.

bw

the smoothing bandwidth, see the manual page for density. The length of bw needs to be either 1 (in which case the same is used for all groups) or the same as the number of groups in x (in which case the corresponding value of bw is used for each group).

xlim

Range of the x axis. If missing, the data range is used.

ylim

Range of the y axis. If missing, the range of the density estimates is used.

col, lty

Line colors and line type.

main

Plot title.

xlab

x-axis label.

do.points

logical; if TRUE, also draw points at the knot locations.

subsample

numeric or logical of length 1. If numeric, and larger than 0, subsamples of that size are used to compute and plot the ecdf for those elements of x with more than that number of observations. If logical and TRUE, a value of 1000 is used for the subsample size.

legend

a list of arguments that is passed to the function legend.

...

Further arguments that get passed on to the plot functions.
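
The list method and the subsample, do.points and legend arguments described above can be exercised as in the following minimal sketch. It assumes that geneplotter is installed; the simulated data, the group names small and large, and the two colors are illustrative choices and not part of the package.

```r
library("geneplotter")

set.seed(1)  # reproducible simulated data

# Hypothetical groups of very different size
x <- list(small = rnorm(200), large = rnorm(50000, mean = 1))
cols <- c("steelblue", "firebrick")  # arbitrary colors, one per group

# Only 'large' has more than 1000 observations, so only its ecdf is
# computed from a subsample; 'small' is used in full.
multiecdf(x,
          col = cols,
          subsample = 1000L,
          do.points = FALSE,
          legend = list(x = "bottomright", legend = names(x), fill = cols))
```

Passing explicit colors keeps the legend fill aligned with the curves when there are fewer groups than colors in the default palette.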

Details

The choice of the smoothing bandwidths in multidensity can be problematic, in particular if the groups differ in range and/or number of data points. If curves look excessively wiggly or overly smooth, try varying the arguments xlim and bw; note that the argument bw can be a vector, in which case it is expected to align with the groups.
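
As a rough illustration of the advice above, the following sketch passes one bandwidth per group. It assumes that geneplotter is installed; the simulated groups narrow and wide and the bandwidth values 0.05 and 1 are arbitrary and chosen only for demonstration.

```r
library("geneplotter")

set.seed(2)  # reproducible simulated data

# Hypothetical groups that differ in spread and in sample size
x <- list(narrow = rnorm(2000, sd = 0.2), wide = rnorm(100, sd = 3))

# One bandwidth per group, in the same order as the elements of x;
# xlim restricts the plotted range.
d <- multidensity(x, bw = c(0.05, 1), xlim = c(-8, 8))
```

A single value of bw would be applied to both groups, which tends to oversmooth the narrow group or undersmooth the wide one.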

Value

For the multidensity functions, a list of density objects.
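
The returned list can be inspected like any other density object. The sketch below assumes that geneplotter is installed; the simulated groups a and b are hypothetical.

```r
library("geneplotter")

set.seed(3)  # reproducible simulated data

# Hypothetical groups with different means
x <- list(a = rnorm(500), b = rnorm(500, mean = 2))

# Keep the return value: one density object per group
d <- multidensity(x)

# Numeric bandwidth that density() selected for each group under the
# default rule bw = "nrd0"
bws <- vapply(d, function(z) z$bw, numeric(1))
print(bws)

# Each element can be replotted on its own
plot(d[[1]], main = "first group only")
```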

Author(s)

Wolfgang Huber

See Also

boxplot, ecdf, density

Examples

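  library("geneplotter")

  ## Simulated data: each distinct word of the package description is one group
  words = strsplit(packageDescription("geneplotter")$Description, " ")[[1]]
  factr = factor(sample(words, 2000, replace = TRUE))
  ## Normal values whose mean is the integer code of the group
  x = rnorm(length(factr), mean=as.integer(factr))
  
  ## One curve per group, using the formula interface
  multiecdf(x ~ factr)
  multidensity(x ~ factr)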

[Package geneplotter version 1.32.1 Index]