ddply {plyr} | R Documentation |
For each subset of a data frame, apply a function, then combine the results into a data frame.
ddply(.data, .variables, .fun = NULL, ..., .progress = "none", .drop = TRUE, .parallel = FALSE)
.fun |
function to apply to each piece |
... |
other arguments passed on to .fun |
.progress |
name of the progress bar to use; the default, "none", shows no progress bar |
.data |
data frame to be processed |
.variables |
variables to split data frame by, as quoted variables, a formula or character vector |
.drop |
should combinations of variables that do not appear in the input data be preserved (FALSE) or dropped (TRUE, default) |
.parallel |
if TRUE, apply .fun to the pieces in parallel (default FALSE) |
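The effect of .drop is easiest to see on a small example. The sketch below uses a hypothetical toy data frame (`df`, with columns `g`, `h` and `v`, none of which come from plyr) whose grouping columns are factors, with one combination of levels that never occurs in the data.

```r
library(plyr)

# Hypothetical toy data: both grouping columns are factors, and the
# combination g = "b", h = "y" never occurs in the rows.
df <- data.frame(
  g = factor(c("a", "a", "b")),
  h = factor(c("x", "y", "x")),
  v = c(1, 2, 3)
)

# Default (.drop = TRUE): only combinations present in the data are returned.
dropped <- ddply(df, .(g, h), summarise, n = length(v))

# .drop = FALSE: combinations that do not appear in the data are preserved.
kept <- ddply(df, .(g, h), summarise, n = length(v), .drop = FALSE)

# Compare the number of groups in each result.
nrow(dropped)
nrow(kept)
```

Preserving unobserved combinations can be useful when every group must appear in the output, for example before plotting counts that may be zero.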
A data frame, as described in the output section.
This function splits data frames by variables.
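As a brief illustration of the three ways to give .variables, the sketch below splits the `baseball` data (shipped with plyr) by league using a quoted variable, a formula and a character vector. The result names `by_quoted`, `by_formula` and `by_string` are chosen here for illustration only.

```r
library(plyr)

# Quoted variable, created with .()
by_quoted <- ddply(baseball, .(lg), "nrow")

# One-sided formula
by_formula <- ddply(baseball, ~ lg, "nrow")

# Character vector of column names
by_string <- ddply(baseball, "lg", "nrow")

# The three calls are expected to describe the same split.
all.equal(by_quoted, by_formula)
all.equal(by_quoted, by_string)
```

The character form tends to be convenient when column names are held in a variable, while the quoted form also accepts expressions.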
The most unambiguous behaviour is achieved when .fun returns a data frame; in that case pieces will be combined with rbind.fill. If .fun returns an atomic vector of fixed length, it will be rbinded together and converted to a data frame. Any other values will result in an error.
If there are no results, then this function will return a data frame with zero rows and columns (data.frame()).
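The output rules above can be sketched on a hypothetical toy data frame (`df`, with columns `g` and `v`, not part of plyr). The anonymous functions are illustrative only; one returns a data frame, one a named atomic vector of fixed length, and one returns NULL for every piece.

```r
library(plyr)

# Hypothetical toy data with two groups.
df <- data.frame(g = c("a", "a", "b"), v = c(1, 2, 3))

# .fun returns a data frame: pieces are combined with rbind.fill, so a
# column that is missing from one piece is filled with NA.
as_df <- ddply(df, .(g), function(piece) {
  if (piece$g[1] == "a") {
    data.frame(total = sum(piece$v))
  } else {
    data.frame(total = sum(piece$v), n = nrow(piece))
  }
})

# .fun returns an atomic vector of fixed length: the vectors are bound
# row-wise and converted to a data frame.
as_vec <- ddply(df, .(g), function(piece) {
  c(lo = min(piece$v), hi = max(piece$v))
})

# .fun returns NULL for every piece: there are no results to combine.
none <- ddply(df, .(g), function(piece) NULL)
dim(none)
```

Returning a data frame from .fun is usually the safer choice, since column names and types are then explicit.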
Hadley Wickham (2011). The Split-Apply-Combine Strategy for Data Analysis. Journal of Statistical Software, 40(1), 1-29. http://www.jstatsoft.org/v40/i01/.
Other data frame input: daply, dlply
Other data frame output: adply, ldply
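library(plyr)

# Number of records per year
ddply(baseball, .(year), "nrow")
# Number of rows and columns per league
ddply(baseball, .(lg), c("nrow", "ncol"))

# Mean runs batted in per year, plotted as a line
rbi <- ddply(baseball, .(year), summarise,
  mean_rbi = mean(rbi, na.rm = TRUE))
with(rbi, plot(year, mean_rbi, type = "l"))

# Add each player's career year (1 = first season played)
base2 <- ddply(baseball, .(id), transform,
  career_year = year - min(year) + 1
)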
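A further sketch shows how extra arguments reach .fun through `...`. The helper `col_mean` and its arguments `col` and `na.rm` are hypothetical names introduced here for illustration; they are not part of plyr.

```r
library(plyr)

# Hypothetical helper: mean of one column of a piece, returned as a
# one-row data frame.
col_mean <- function(piece, col, na.rm = FALSE) {
  data.frame(mean = mean(piece[[col]], na.rm = na.rm))
}

# col and na.rm are not arguments of ddply, so they are passed on to
# col_mean through ...
rbi_by_lg <- ddply(baseball, .(lg), col_mean, col = "rbi", na.rm = TRUE)
head(rbi_by_lg)
```

Passing settings this way keeps the helper reusable across columns without wrapping it in an anonymous function for each call.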