Find unique (taxon) occurrence records

Subset a dataset to unique spatial localities or locality-taxon combinations.

Usage

uniqify(dat, xy, taxVar = NULL, na.rm = TRUE)

Arguments

dat: A data.frame or matrix containing taxon names, coordinates, and any associated variables; or a list of such structures.
xy: A vector of two elements, specifying the names or numeric positions of the columns in dat containing coordinates, e.g. longitude and latitude. Coordinates for any shared sampling sites should be identical, and where sites are raster cells, coordinates are usually expected to be cell centroids.
taxVar: The name or numeric position of the column containing taxonomic identifications. taxVar must be of the same class as xy, e.g. a numeric column position if xy is given as a vector of numeric positions. Default is NULL, in which case only coordinates are used to identify unique records.
na.rm: Logical. Should records with missing coordinates or, if taxVar is supplied, missing taxon values be removed? Default is TRUE.

Value

An object with the same class and columns as dat, containing the subset of rows representing unique coordinates (if only xy supplied) or unique taxon-site combinations (if taxVar is also supplied). The first record at each spatial locality is retained, or if taxVar is specified, the first record of each taxon at a locality.
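
The "first record" rule described above is the same convention base R's duplicated() follows: it flags every repeat after the first appearance of a row. The block below is a minimal sketch of that idea for data.frame input. The helper name firstUnique is hypothetical and the code is an illustration of the rule, not the source of uniqify().

```r
# Hypothetical helper illustrating first-record retention with base R;
# this is not the divvy implementation.
firstUnique <- function(dat, xy, taxVar = NULL) {
  keyCols <- c(xy, taxVar)                      # columns defining a unique record
  dupes <- duplicated(dat[, keyCols, drop = FALSE])
  dat[!dupes, , drop = FALSE]                   # keep first row of each combination
}

obs <- data.frame(x  = rep(1, 10),
                  y  = c(rep(1, 5), 2:6),
                  sp = c(rep(letters[1:3], 2), rep(letters[4:5], 2)))

# unique sites only, then unique taxon-site combinations
sites <- firstUnique(obs, xy = c('x', 'y'))
occs  <- firstUnique(obs, xy = c('x', 'y'), taxVar = 'sp')
rownames(sites)
rownames(occs)
```

Row names are carried over from the input, so they show which original records were retained.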

Details

The na.rm argument applies to coordinate values and, if taxVar is supplied, to taxon values. If na.rm = FALSE, any NA values will be retained and treated as their own value. Note that the divvy subsampling functions cookies(), clustr(), and bandit() ignore any rows with missing coordinates.
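
To make the two settings concrete without relying on package internals, the sketch below mimics the described behaviour in base R. It assumes complete.cases() as a stand-in for dropping incomplete records and duplicated() as a stand-in for deduplication, where NA matches other NA values. The toy data and object names are invented for illustration.

```r
# Toy data: row 2 lacks a coordinate, rows 3 and 4 lack a taxon name
obsNA <- data.frame(x  = c(1, NA, 1, 1, 2),
                    y  = c(1, 1, 1, 1, 2),
                    sp = c('a', 'a', NA, NA, 'b'))
keyCols <- c('x', 'y', 'sp')

# Analogue of na.rm = TRUE: drop incomplete records, then deduplicate
complete <- obsNA[complete.cases(obsNA[, keyCols]), ]
dropNA <- complete[!duplicated(complete[, keyCols]), ]

# Analogue of na.rm = FALSE: NA counts as its own value, so rows 3 and 4
# are duplicates of each other and only row 3 is kept
keepNA <- obsNA[!duplicated(obsNA[, keyCols]), ]

rownames(dropNA)  # "1" "5"
rownames(keepNA)  # "1" "2" "3" "5"
```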

Examples

# generate occurrence data
x  <- rep(1, 10)
y  <- c(rep(1, 5), 2:6)
sp <- c(rep(letters[1:3], 2),
        rep(letters[4:5], 2))
obs <- data.frame(x, y, sp)

# compare original and unique datasets:
# rows 4 and 5 removed as duplicates of rows 1 and 2, respectively
obs
#>    x y sp
#> 1  1 1  a
#> 2  1 1  b
#> 3  1 1  c
#> 4  1 1  a
#> 5  1 1  b
#> 6  1 2  c
#> 7  1 3  d
#> 8  1 4  e
#> 9  1 5  d
#> 10 1 6  e
uniqify(obs, taxVar = 3, xy = 1:2)
#>    x y sp
#> 1  1 1  a
#> 2  1 1  b
#> 3  1 1  c
#> 6  1 2  c
#> 7  1 3  d
#> 8  1 4  e
#> 9  1 5  d
#> 10 1 6  e

# supplying taxon identifications (or another third variable) is optional
uniqify(obs, xy = c('x', 'y'))
#>    x y sp
#> 1  1 1  a
#> 6  1 2  c
#> 7  1 3  d
#> 8  1 4  e
#> 9  1 5  d
#> 10 1 6  e

# caution: values in columns other than the taxon and coordinate columns
# are lost for records that are dropped as duplicates
obs$notes <- letters[11:20]
uniqify(obs, 1:2, 3)
#>    x y sp notes
#> 1  1 1  a     k
#> 2  1 1  b     l
#> 3  1 1  c     m
#> 6  1 2  c     p
#> 7  1 3  d     q
#> 8  1 4  e     r
#> 9  1 5  d     s
#> 10 1 6  e     t
# the notes 'n' and 'o' are absent in the output data
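
If the information in such extra columns matters, one possible workaround is to collapse it per taxon-site combination instead of discarding it. The sketch below uses base R's aggregate() and is only a suggestion; it is not part of divvy, and the object name merged and the '; ' separator are arbitrary choices.

```r
# Rebuild the example data, including the extra notes column
obs <- data.frame(x  = rep(1, 10),
                  y  = c(rep(1, 5), 2:6),
                  sp = c(rep(letters[1:3], 2), rep(letters[4:5], 2)))
obs$notes <- letters[11:20]

# Paste together the notes of all records sharing coordinates and taxon,
# giving one row per unique taxon-site combination
merged <- aggregate(notes ~ x + y + sp, data = obs,
                    FUN = paste, collapse = '; ')

# notes from the duplicate records are now kept alongside the first record
merged$notes[merged$y == 1 & merged$sp == 'a']
```

Note that aggregate() reorders rows by the grouping variables and resets row names, so the output is not row-for-row comparable with the result of uniqify().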