Summarise the geographic scope and position of occurrence data, and optionally estimate diversity and evenness
Usage
sdSumry(
dat,
xy,
taxVar,
crs = "epsg:4326",
collections = NULL,
quotaQ = NULL,
quotaN = NULL,
omitDom = FALSE
)
Arguments
- dat
A data.frame or matrix containing taxon names, coordinates, and any associated variables; or a list of such structures.
- xy
A vector of two elements, specifying the name or numeric position of columns in dat containing coordinates, e.g. longitude and latitude. Coordinates for any shared sampling sites should be identical, and where sites are raster cells, coordinates are usually expected to be cell centroids.
- taxVar
The name or numeric position of the column containing taxonomic identifications. taxVar must be of the same class as xy, e.g. a numeric column position if xy is given as a vector of numeric positions.
- crs
Coordinate reference system as a GDAL text string, EPSG code, or object of class crs. Default is latitude-longitude (EPSG:4326).
- collections
The name or numeric position of the column containing unique collection IDs, e.g. 'collection_no' in PBDB data downloads.
- quotaQ
A numeric value for the coverage (quorum) level at which to perform coverage-based rarefaction (shareholder quorum subsampling).
- quotaN
A numeric value for the quota of taxon occurrences to subsample in classical rarefaction.
- omitDom
If omitDom = TRUE and quotaQ or quotaN is supplied, remove the most common taxon prior to rarefaction. The nTax and evenness returned are unaffected.
Value
A matrix of spatial and optional diversity metrics. If dat is a
list of data.frame objects, output rows correspond to input elements.
Details
sdSumry() compiles metadata about a sample or list of samples,
before or after spatial subsampling. The function counts the number
of collections (if requested), taxon presences (excluding repeat incidences
of a taxon at a given site), and unique spatial sites;
it also calculates site centroid coordinates, latitudinal range (degrees),
great circle distance (km), mean pairwise distance (km), and summed
minimum spanning tree length (km).
Coordinates and their distances are computed with respect to the original
coordinate reference system if supplied, except in calculation of latitudinal
range, for which projected coordinates are transformed to geodetic ones.
If crs is unspecified, by default points are assumed to be given in
latitude-longitude and distances are calculated with spherical geometry.
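The distance metrics described above can be illustrated with a minimal base R sketch. It assumes longitude-latitude coordinates in degrees on a sphere of radius 6371 km, takes the great circle distance to be the maximum pairwise distance between sites, and uses Prim's algorithm for the minimum spanning tree. These are illustrative assumptions, and the helper names (haversineKm, pairDistKm, mstLength) are hypothetical rather than part of sdSumry().

```r
# Haversine great-circle distance (km) between lon/lat points given in degrees.
# Assumes a spherical Earth with radius r = 6371 km (an assumption, not
# necessarily the geometry used internally by sdSumry()).
haversineKm <- function(lon1, lat1, lon2, lat2, r = 6371) {
  toRad <- pi / 180
  dLat <- (lat2 - lat1) * toRad
  dLon <- (lon2 - lon1) * toRad
  a <- sin(dLat / 2)^2 +
    cos(lat1 * toRad) * cos(lat2 * toRad) * sin(dLon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Symmetric matrix of pairwise distances among sites (columns: lon, lat)
pairDistKm <- function(coords) {
  n <- nrow(coords)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) {
    d[i, ] <- haversineKm(coords[i, 1], coords[i, 2], coords[, 1], coords[, 2])
  }
  d
}

# Total length of the minimum spanning tree, via Prim's algorithm
mstLength <- function(d) {
  n <- nrow(d)
  if (n < 2) return(0)
  inTree <- c(TRUE, rep(FALSE, n - 1))
  best <- d[1, ]                  # cheapest link from the tree to each site
  total <- 0
  for (k in seq_len(n - 1)) {
    best[inTree] <- Inf           # never re-add a site already in the tree
    nxt <- which.min(best)
    total <- total + best[nxt]
    inTree[nxt] <- TRUE
    best <- pmin(best, d[nxt, ])  # update links using the newly added site
  }
  total
}

# Hypothetical occurrences; repeat records at a site are collapsed first
sites <- unique(data.frame(lon = c(0, 0, 1, 1, 0), lat = c(0, 1, 0, 1, 0)))
d <- pairDistKm(as.matrix(sites))
maxDist  <- max(d)                  # largest great circle distance
meanPair <- mean(d[upper.tri(d)])   # mean pairwise distance
mstLen   <- mstLength(d)            # summed minimum spanning tree length
```

Collapsing to unique sites before measuring distances mirrors the way the summary counts each site once, so heavily sampled localities do not pull the mean pairwise distance towards zero.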
The first two diversity variables returned are the raw count of observed taxa and the Summed Common species/taxon Occurrence Rate (SCOR). SCOR reflects the degree to which taxa are common/widespread and is decoupled from richness or abundance (Hannisdal et al. 2012). SCOR is calculated as the sum across taxa of the log probability of incidence, \(\lambda\). For a given taxon, \(\lambda = -\ln(1 - p)\), where \(p\) is estimated as the fraction of occupied sites. Very widespread taxa make a large contribution to an assemblage SCOR, while rare taxa have relatively little influence.
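The SCOR formula can be written out in a few lines of base R. This is a sketch of the calculation as described here, not the package's internal code; the function name scor and the column names site and taxon are hypothetical.

```r
# SCOR from occurrence records; 'site' and 'taxon' are illustrative names
scor <- function(site, taxon) {
  nSite <- length(unique(site))
  # count each taxon at most once per site (incidence, not abundance)
  inc <- unique(data.frame(site, taxon))
  p <- table(inc$taxon) / nSite   # fraction of sites occupied by each taxon
  lambda <- -log(1 - p)           # infinite if a taxon occupies every site
  sum(lambda)
}

# Hypothetical data: four sites, four taxa, one repeat record of 'a' at s1
occ <- data.frame(
  site  = c("s1", "s1", "s2", "s2", "s3", "s4", "s1"),
  taxon = c("a",  "b",  "a",  "c",  "c",  "d",  "a")
)
scorValue <- scor(occ$site, occ$taxon)
```

In this toy data set, taxa a and c each occupy half of the sites and contribute about 0.69 apiece, whereas b and d each occupy a quarter of the sites and contribute about 0.29, which shows how widespread taxa dominate the sum.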
If quotaQ is supplied, sdSumry() rarefies richness at the
given coverage value and returns the point estimate of richness (Hill number 0)
and its 95% confidence interval, as well as estimates of evenness (Pielou's J)
and frequency-distribution sample coverage (given by iNEXT$DataInfo).
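For orientation, Pielou's J is Shannon entropy divided by its maximum possible value for the observed number of taxa. The base R sketch below assumes the input is a vector of per-taxon occurrence frequencies; which frequencies sdSumry() uses internally is not stated here, and the function name pielouJ is hypothetical.

```r
# Pielou's evenness J = H / ln(S), from a vector of taxon frequencies
pielouJ <- function(freq) {
  freq <- freq[freq > 0]
  S <- length(freq)
  if (S < 2) return(NA_real_)   # J is undefined for a single taxon
  p <- freq / sum(freq)
  H <- -sum(p * log(p))         # Shannon entropy (natural log)
  H / log(S)
}

jEven   <- pielouJ(c(10, 10, 10, 10))  # all taxa equally frequent
jUneven <- pielouJ(c(37, 1, 1, 1))     # one taxon dominates
```

J equals 1 when all taxa are equally frequent and falls towards 0 as a single taxon comes to dominate the sample.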
If quotaN is supplied, sdSumry() rarefies richness to the given
number of taxon occurrences and returns the point estimate of richness
and its 95% confidence interval.
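The idea behind classical rarefaction can be sketched with the analytic hypergeometric expectation: the expected number of taxa in a random subsample of n occurrences drawn without replacement. This is a stand-in for illustration only; it is not the iNEXT routine that sdSumry() calls, and it gives neither confidence intervals nor extrapolation. The function name rarefyRichness is hypothetical.

```r
# Expected richness in a subsample of n occurrences, drawn without replacement
# from a sample whose taxa have occurrence counts 'freq'
rarefyRichness <- function(freq, n) {
  freq <- freq[freq > 0]
  N <- sum(freq)
  stopifnot(n >= 1, n <= N)     # interpolation only; no extrapolation here
  # probability that a given taxon is absent from the subsample
  pAbsent <- exp(lchoose(N - freq, n) - lchoose(N, n))
  sum(1 - pAbsent)
}

freq <- c(5, 3, 2)               # hypothetical occurrence counts for 3 taxa
rareFull <- rarefyRichness(freq, sum(freq))  # full sample: all taxa recovered
rareOne  <- rarefyRichness(freq, 1)          # one occurrence: exactly one taxon
rareMid  <- rarefyRichness(freq, 4)          # intermediate quota
```

Working on the log scale with lchoose() avoids overflow in the binomial coefficients when samples are large.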
Coverage-based and classical rarefaction are both calculated with
iNEXT::estimateD() internally. For details, such as how diversity
is extrapolated if sample coverage is insufficient to achieve a specified
rarefaction level, consult Chao and Jost (2012) and Hsieh et al. (2016).
Examples
# generate occurrences
set.seed(9)
x <- sample(rep(1:5, 10))
y <- sample(rep(1:5, 10))
# make some species 2x or 4x as common
abund <- c(rep(4, 5), rep(2, 5), rep(1, 10))
sp <- sample(letters[1:20], 50, replace = TRUE, prob = abund)
obs <- data.frame(x, y, sp)
# minimum sample data returned
sdSumry(obs, c('x','y'), 'sp')
#> nOcc nLoc centroidX centroidY latRange greatCircDist meanPairDist
#> [1,] 45 22 3.045289 2.909986 4 628.5192 297.7363
#> minSpanTree SCOR nTax
#> [1,] 2332.473 2.234449 17
# also calculate evenness and coverage-based rarefaction diversity estimates
sdSumry(obs, xy = c('x','y'), taxVar = 'sp', quotaQ = 0.7)
#> nOcc nLoc centroidX centroidY latRange greatCircDist meanPairDist minSpanTree
#> 1 45 22 3.045289 2.909986 4 628.5192 297.7363 2332.473
#> SCOR nTax evenness coverage SQSdiv SQSlow95 SQSupr95
#> 1 2.234449 17 0.9405151 0.8708 12.17405 8.323042 16.02506