Subset a dataset to unique spatial localities or locality-taxon combinations.
Arguments
- dat
A data.frame or matrix containing taxon names, coordinates, and any associated variables; or a list of such structures.
- xy
A vector of two elements, specifying the name or numeric position of columns in dat containing coordinates, e.g. longitude and latitude. Coordinates for any shared sampling sites should be identical, and where sites are raster cells, coordinates are usually expected to be cell centroids.
- taxVar
The name or numeric position of the column containing taxonomic identifications. taxVar must be of the same class as xy, e.g. a numeric column position if xy is given as a vector of numeric positions.
- na.rm
Should records missing information be removed? Default is yes.
Value
An object with the same class and columns as dat, containing the subset of rows representing unique coordinates (if only xy supplied) or unique taxon-site combinations (if taxVar is also supplied). The first record at each spatial locality is retained, or if taxVar is specified, the first record of each taxon at a locality.
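The "first record is retained" rule can be pictured with base R alone. The sketch below is an illustration only, not the package's internal code: it assumes that negating duplicated() gives the same row selection, and the object names firstSite and firstTaxon are hypothetical.

```r
# Illustration in base R only; this is not divvy's implementation
obs <- data.frame(
  x  = rep(1, 10),
  y  = c(rep(1, 5), 2:6),
  sp = c(rep(letters[1:3], 2), rep(letters[4:5], 2))
)

# duplicated() flags every repeat after the first occurrence,
# so negating it keeps the first record of each combination
firstSite  <- obs[!duplicated(obs[, c("x", "y")]), ]        # unique localities
firstTaxon <- obs[!duplicated(obs[, c("x", "y", "sp")]), ]  # unique taxon-site pairs

rownames(firstSite)   # "1" "6" "7" "8" "9" "10"
rownames(firstTaxon)  # "1" "2" "3" "6" "7" "8" "9" "10"
```

Any column outside those used for matching is carried along from the retained row only, so values attached to later duplicates do not appear in the result.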
Details
The na.rm argument applies to coordinate values and, if taxVar is supplied, to taxon values. If na.rm = FALSE, any NA values will be retained and treated as their own value. Note that divvy ignores any rows with missing coordinates for the subsampling functions cookies(), clustr(), and bandit().
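To make the two settings concrete, here is a rough base R analogue. It is a sketch under assumptions, not the function's own code: the data frame occ and the objects dropNA and keepNA are invented for illustration, and complete.cases() plus duplicated() stand in for whatever uniqify() does internally.

```r
# Hypothetical data with missing coordinates and a missing taxon name
occ <- data.frame(
  x  = c(1, 1, NA, NA, 2),
  y  = c(1, 1, 3, 3, 2),
  sp = c("a", NA, "b", "b", "a")
)
vars <- c("x", "y", "sp")

# Analogue of na.rm = TRUE: drop incomplete records, then deduplicate
complete <- occ[stats::complete.cases(occ[, vars]), ]
dropNA   <- complete[!duplicated(complete[, vars]), ]

# Analogue of na.rm = FALSE: duplicated() regards missing values as equal
# to each other, so NA behaves as a value of its own
keepNA <- occ[!duplicated(occ[, vars]), ]

rownames(dropNA)  # "1" "5"
rownames(keepNA)  # "1" "2" "3" "5"
```

In this toy case, row 2 survives with na.rm = FALSE because a missing taxon name counts as distinct from "a", while row 4 is dropped as a repeat of row 3.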
Examples
# generate occurrence data
x <- rep(1, 10)
y <- c(rep(1, 5), 2:6)
sp <- c(rep(letters[1:3], 2),
rep(letters[4:5], 2))
obs <- data.frame(x, y, sp)
# compare original and unique datasets:
# rows 4 and 5 removed as duplicates of rows 1 and 2, respectively
obs
#>    x y sp
#> 1  1 1  a
#> 2  1 1  b
#> 3  1 1  c
#> 4  1 1  a
#> 5  1 1  b
#> 6  1 2  c
#> 7  1 3  d
#> 8  1 4  e
#> 9  1 5  d
#> 10 1 6  e
uniqify(obs, taxVar = 3, xy = 1:2)
#>    x y sp
#> 1  1 1  a
#> 2  1 1  b
#> 3  1 1  c
#> 6  1 2  c
#> 7  1 3  d
#> 8  1 4  e
#> 9  1 5  d
#> 10 1 6  e
# using taxon identifications or other third variable is optional
uniqify(obs, xy = c('x', 'y'))
#>    x y sp
#> 1  1 1  a
#> 6  1 2  c
#> 7  1 3  d
#> 8  1 4  e
#> 9  1 5  d
#> 10 1 6  e
# caution - data outside the taxon and occurrence variables
# will be lost where associated with duplicate occurrences
obs$notes <- letters[11:20]
uniqify(obs, 1:2, 3)
#>    x y sp notes
#> 1  1 1  a     k
#> 2  1 1  b     l
#> 3  1 1  c     m
#> 6  1 2  c     p
#> 7  1 3  d     q
#> 8  1 4  e     r
#> 9  1 5  d     s
#> 10 1 6  e     t
# the notes 'n' and 'o' are absent in the output data