df.subset {misty} | R Documentation |
Subsetting Data Frames
Description
This function returns a subset of a data frame, keeping the selected variables and the rows that meet a specified condition.
Usage
df.subset(data, ..., subset = NULL, drop = TRUE, check = TRUE)
Arguments
data |
a data frame. |
... |
an expression indicating variables to select from the data frame
specified in |
subset |
a logical expression indicating rows to keep, e.g., |
drop |
logical: if |
check |
logical: if |
Details
The argument ... is used to specify an expression indicating the
variables to select from the data frame specified in data, e.g.,
df.subset(dat, x1, x2, x3). There are seven operators which
can be used in the expression ...:
- Dot (.) Operator: The dot operator is used to select all variables from the data frame specified in data. For example, df.subset(dat, .) selects all variables in dat. Note that this operator is similar to the function everything() from the tidyselect package.
- Plus (+) Operator: The plus operator is used to select variables matching a prefix from the data frame specified in data. For example, df.subset(dat, +x) selects all variables with the prefix x. Note that this operator is equivalent to the function starts_with() from the tidyselect package.
- Minus (-) Operator: The minus operator is used to select variables matching a suffix from the data frame specified in data. For example, df.subset(dat, -y) selects all variables with the suffix y. Note that this operator is equivalent to the function ends_with() from the tidyselect package.
- Tilde (~) Operator: The tilde operator is used to select variables containing a word from the data frame specified in data. For example, df.subset(dat, ~al) selects all variables whose names contain the word al. Note that this operator is equivalent to the function contains() from the tidyselect package.
- Colon (:) Operator: The colon operator is used to select a range of consecutive variables from the data frame specified in data. For example, df.subset(dat, x:z) selects all variables from x to z. Note that this operator is equivalent to the : operator from the select function in the dplyr package.
- Double Colon (::) Operator: The double colon operator is used to select numbered variables from the data frame specified in data. For example, df.subset(dat, x1::x3) selects the variables x1, x2, and x3. Note that this operator is similar to the function num_range() from the tidyselect package.
- Exclamation Point (!) Operator: The exclamation point operator is used to drop variables from the data frame specified in data or to take the complement of a set of variables. For example, df.subset(dat, ., !x) selects all variables in dat but x, df.subset(dat, ., !~x) selects all variables but those containing the word x, and df.subset(dat, x:z, !x1:x3) selects all variables from x to z but excludes all variables from x1 to x3. Note that this operator is equivalent to the ! operator from the select function in the dplyr package.
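The name matching described for each operator can be approximated in base R. The sketch below is only an illustration of the documented behaviour, not the code misty uses internally; the object names (nm, prefix, suffix, and so on) are made up for this example.

```r
# Base R approximations of the selection operators, applied to column names.
# These mirror the descriptions above; they are not misty's internal code.
nm <- names(iris)

all_vars <- nm                                   # dot:   .
prefix   <- nm[startsWith(nm, "Petal")]          # plus:  +Petal
suffix   <- nm[endsWith(nm, "Width")]            # minus: -Width
contains <- nm[grepl("al", nm, fixed = TRUE)]    # tilde: ~al

# colon: Sepal.Width:Petal.Width, a range of consecutive columns
rng <- nm[match("Sepal.Width", nm):match("Petal.Width", nm)]

# double colon: x1::x3, numbered variables sharing a stem (columns of anscombe)
numbered <- paste0("x", 1:3)

# exclamation point: ., !Sepal.Width, the complement of a selection
dropped <- setdiff(all_vars, "Sepal.Width")

# Use the resulting name vectors for ordinary column indexing
iris[1:3, prefix]
anscombe[1:3, numbered]
```

Each operator reduces to a character vector of column names, which is why the selections can be mixed freely in one call.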
Note that operators can be combined within the same function call. For example,
df.subset(dat, +x, -y, !x2:x4, z) selects all variables with the prefix x
and with the suffix y but excludes variables from x2 to x4 and selects
variable z.
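A combined call such as df.subset(dat, +x, -y, !x2:x4, z) can be read as set operations on column names. The following base R sketch assumes that separate arguments are combined as a union and that exclusions are removed from that union; the data frame dat and the resulting column order are hypothetical and may differ from what df.subset returns.

```r
# Hypothetical data frame used only for this illustration
dat <- data.frame(x1 = 1:2, x2 = 3:4, x3 = 5:6, x4 = 7:8, x5 = 9:10,
                  ay = 11:12, by = 13:14, z = 15:16, w = 17:18)
nm <- names(dat)

# +x and -y: variables with prefix "x" together with variables with suffix "y"
keep <- union(nm[startsWith(nm, "x")], nm[endsWith(nm, "y")])

# !x2:x4: exclude the consecutive range from x2 to x4
excl <- nm[match("x2", nm):match("x4", nm)]

# z: add the single variable z (assumed union semantics)
keep <- union(setdiff(keep, excl), "z")

dat[, keep]
```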
Value
Returns a data frame containing the variables selected in the argument
... and the rows selected in the argument subset.
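For comparison, the same kind of result (selected variables and selected rows) can be produced with base R alone. This sketch uses base::subset() and bracket indexing; it illustrates the shape of the returned data frame and does not claim that df.subset matches it in every detail, such as row names or the handling of the drop argument.

```r
# Rows where Species is "setosa", keeping two variables, using base::subset()
res1 <- subset(iris, subset = Species == "setosa",
               select = c(Sepal.Length, Petal.Width))

# The same selection written with bracket indexing
res2 <- iris[iris$Species == "setosa", c("Sepal.Length", "Petal.Width")]

dim(res1)
names(res1)
```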
Author(s)
Takuya Yanagida takuya.yanagida@univie.ac.at
References
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
See Also
df.duplicated, df.merge, df.move, df.rbind, df.rename, df.sort
Examples
## Not run:
#----------------------------------------------------------------------------
# Select single variables
# Example 1: Select 'Sepal.Length' and 'Petal.Width'
df.subset(iris, Sepal.Length, Petal.Width)
#----------------------------------------------------------------------------
# Select rows
# Example 2a: Select all variables, select rows with 'Species' equal to 'setosa'
df.subset(iris, subset = Species == "setosa")
# Example 2b: Select all variables, select rows with 'Petal.Length' smaller than 1.2
df.subset(iris, subset = Petal.Length < 1.2)
#----------------------------------------------------------------------------
# Select variables matching a prefix using the + operator
# Example 3: Select variables with prefix 'Petal'
df.subset(iris, +Petal)
#----------------------------------------------------------------------------
# Select variables matching a suffix using the - operator
# Example 4: Select variables with suffix 'Width'
df.subset(iris, -Width)
#----------------------------------------------------------------------------
# Select variables containing a word using the ~ operator
# Example 5: Select variables containing 'al'
df.subset(iris, ~al)
#----------------------------------------------------------------------------
# Select consecutive variables using the : operator
# Example 6: Select all variables from 'Sepal.Width' to 'Petal.Width'
df.subset(iris, Sepal.Width:Petal.Width)
#----------------------------------------------------------------------------
# Select numbered variables using the :: operator
# Example 7: Select all variables from 'x1' to 'x3' and 'y1' to 'y3'
df.subset(anscombe, x1::x3, y1::y3)
#----------------------------------------------------------------------------
# Drop variables using the ! operator
# Example 8a: Select all variables but 'Sepal.Width'
df.subset(iris, ., !Sepal.Width)
# Example 8b: Select all variables but 'Sepal.Width' to 'Petal.Width'
df.subset(iris, ., !Sepal.Width:Petal.Width)
#----------------------------------------------------------------------------
# Combine +, -, !, and : operators
# Example 9: Select variables with prefix 'x' and suffix '3', but exclude
# variables from 'x2' to 'x3'
df.subset(anscombe, +x, -3, !x2:x3)
## End(Not run)