Summarise the geographic scope and position of occurrence data, and optionally estimate diversity and evenness
Usage
sdSumry(
dat,
xy,
taxVar,
crs = "epsg:4326",
collections = NULL,
quotaQ = NULL,
quotaN = NULL,
omitDom = FALSE
)
Arguments
- dat
A data.frame or matrix containing taxon names, coordinates, and any associated variables; or a list of such structures.
- xy
A vector of two elements, specifying the name or numeric position of columns in dat containing coordinates, e.g. longitude and latitude. Coordinates for any shared sampling sites should be identical, and where sites are raster cells, coordinates are usually expected to be cell centroids.
- taxVar
The name or numeric position of the column containing taxonomic identifications. taxVar must be of the same class as xy, e.g. a numeric column position if xy is given as a vector of numeric positions.
- crs
Coordinate reference system as a GDAL text string, EPSG code, or object of class crs. Default is latitude-longitude (EPSG:4326).
- collections
The name or numeric position of the column containing unique collection IDs, e.g. 'collection_no' in PBDB data downloads.
- quotaQ
A numeric value for the coverage (quorum) level at which to perform coverage-based rarefaction (shareholder quorum subsampling).
- quotaN
A numeric value for the quota of taxon occurrences to subsample in classical rarefaction.
- omitDom
If omitDom = TRUE and quotaQ or quotaN is supplied, remove the most common taxon prior to rarefaction. The nTax and evenness returned are unaffected.
Value
A matrix of spatial and optional diversity metrics. If dat is a list of data.frame objects, output rows correspond to input elements.
Details
sdSumry() compiles metadata about a sample or list of samples, before or after spatial subsampling. The function counts the number of collections (if requested), taxon presences (excluding repeat incidences of a taxon at a given site), and unique spatial sites; it also calculates site centroid coordinates, latitudinal range (degrees), great circle distance (km), mean pairwise distance (km), and summed minimum spanning tree length (km).
Coordinates and their distances are computed with respect to the original coordinate reference system if supplied, except in calculation of latitudinal range, for which projected coordinates are transformed to geodetic ones. If crs is unspecified, by default points are assumed to be given in latitude-longitude and distances are calculated with spherical geometry.
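The distance metrics listed above can be sketched in base R for latitude-longitude points. This is an illustration only, not the code sdSumry() runs: the helper names (haversine_km, mst_length), the 6371 km sphere radius, and the reading of "great circle distance" as the maximum pairwise distance are all assumptions made for the sketch.

```r
# Great circle (haversine) distance in km on a sphere.
# radius_km = 6371 is an assumed mean Earth radius.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Summed minimum spanning tree length from a symmetric distance
# matrix, using Prim's algorithm.
mst_length <- function(d) {
  n <- nrow(d)
  in_tree <- c(TRUE, rep(FALSE, n - 1))
  total <- 0
  for (k in seq_len(n - 1)) {
    # distances from sites already in the tree to sites not yet in it
    sub <- d[in_tree, !in_tree, drop = FALSE]
    total <- total + min(sub)
    new_col <- which(sub == min(sub), arr.ind = TRUE)[1, "col"]
    in_tree[which(!in_tree)[new_col]] <- TRUE
  }
  total
}

# Three hypothetical sites on the equator at longitudes 0, 1 and 3
lon <- c(0, 1, 3)
lat <- c(0, 0, 0)
idx <- seq_along(lon)
d <- outer(idx, idx, function(i, j) {
  haversine_km(lon[i], lat[i], lon[j], lat[j])
})

lat_range      <- diff(range(lat))      # degrees
max_dist       <- max(d)                # km, largest pairwise distance
mean_pair_dist <- mean(d[upper.tri(d)]) # km
span_tree      <- mst_length(d)         # km
```

Because the sites sit on one line, the spanning tree simply chains neighbouring sites, so its length equals the largest pairwise distance in this toy case.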
The first two diversity variables returned are the raw count of observed taxa and the Summed Common species/taxon Occurrence Rate (SCOR). SCOR reflects the degree to which taxa are common/widespread and is decoupled from richness or abundance (Hannisdal et al. 2012). SCOR is calculated as the sum across taxa of the log probability of incidence, \(\lambda\). For a given taxon, \(\lambda = -\ln(1 - p)\), where \(p\) is estimated as the fraction of occupied sites. Very widespread taxa make a large contribution to an assemblage SCOR, while rare taxa have relatively little influence.
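As a worked illustration of the SCOR formula, the following base-R sketch applies \(\lambda = -\ln(1 - p)\) to a small hypothetical data set. The object names (occ, pres, site_id) are invented for the example, and the steps follow the description above rather than the internals of sdSumry(), which may differ in detail.

```r
# Hypothetical occurrences at 4 sites; taxon "a" is recorded twice
# at site (1, 1), a repeat incidence that should count only once.
occ <- data.frame(
  x  = c(1, 1, 2, 2, 3, 1),
  y  = c(1, 1, 1, 1, 3, 4),
  sp = c("a", "a", "a", "b", "c", "c")
)

# One row per taxon-site presence
pres <- unique(occ[, c("x", "y", "sp")])

# Number of unique sites
site_id <- paste(pres$x, pres$y)
n_sites <- length(unique(site_id))

# p: fraction of sites occupied by each taxon
p <- as.numeric(table(pres$sp)) / n_sites

# lambda = -ln(1 - p); note this is infinite for a taxon present at every site
lambda <- -log(1 - p)
scor <- sum(lambda)
```

Here taxa "a" and "c" each occupy half of the sites and contribute ln(2) apiece, while "b" occupies a quarter and contributes only ln(4/3), which shows how widespread taxa dominate the sum.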
If quotaQ is supplied, sdSumry() rarefies richness at the given coverage value and returns the point estimate of richness (Hill number 0) and its 95% confidence interval, as well as estimates of evenness (Pielou's J) and frequency-distribution sample coverage (given by iNEXT$DataInfo).
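For readers unfamiliar with Pielou's J, it is the Shannon entropy of the taxon frequencies divided by its maximum possible value, ln(S), for S taxa. The sketch below computes it from a vector of per-taxon counts. The function name pielou_j is hypothetical, and using occurrence counts per taxon as the input is an assumption about how the frequencies are tallied.

```r
# Pielou's J = H / ln(S), where H is Shannon entropy and S is richness.
pielou_j <- function(counts) {
  counts <- counts[counts > 0]
  s <- length(counts)
  if (s < 2) return(NA_real_)  # undefined for a single taxon
  prop <- counts / sum(counts)
  h <- -sum(prop * log(prop))
  h / log(s)
}

# Hypothetical occurrence counts for three taxa
freq <- c(a = 4, b = 2, c = 2)
j_uneven <- pielou_j(freq)

# Perfectly even frequencies reach the maximum of 1
j_even <- pielou_j(c(3, 3, 3))
```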
If quotaN is supplied, sdSumry() rarefies richness to the given number of occurrence counts and returns the point estimate of richness and its 95% confidence interval.
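To give a sense of what classical rarefaction estimates, the sketch below uses the standard analytic expectation for the number of taxa in a random subsample of n occurrences drawn without replacement. It is a stand-in for intuition only: the function name rarefy_expected is hypothetical, the sketch covers interpolation (n no larger than the sample) and gives no confidence interval, and it is not the routine sdSumry() calls.

```r
# Expected richness in a subsample of n occurrences:
# E[S_n] = sum_i (1 - choose(N - N_i, n) / choose(N, n)),
# where N_i is the count for taxon i and N is the total count.
rarefy_expected <- function(counts, n) {
  counts <- counts[counts > 0]
  total <- sum(counts)
  stopifnot(n >= 1, n <= total)
  # work on the log scale to avoid overflow with large counts
  p_absent <- exp(lchoose(total - counts, n) - lchoose(total, n))
  sum(1 - p_absent)
}

# Two equally common hypothetical taxa, subsampled to 2 occurrences
s_two <- rarefy_expected(c(2, 2), n = 2)

# Subsampling to the full sample size returns the observed richness
s_full <- rarefy_expected(c(4, 2, 2), n = 8)
```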
Coverage-based and classical rarefaction are both calculated with iNEXT::estimateD() internally. For details, such as how diversity is extrapolated if sample coverage is insufficient to achieve a specified rarefaction level, consult Chao and Jost (2012) and Hsieh et al. (2016).
References
Chao A, Jost L (2012). “Coverage-based rarefaction and extrapolation: standardizing samples by completeness rather than size.” Ecology, 93(12), 2533–2547. doi:10.1890/11-1952.1.
Hannisdal B, Henderiks J, Liow LH (2012). “Long-term evolutionary and ecological responses of calcifying phytoplankton to changes in atmospheric CO2.” Global Change Biology, 18(12), 3504–3516. doi:10.1111/gcb.12007.
Hsieh TC, Ma KH, Chao A (2016). “iNEXT: an R package for rarefaction and extrapolation of species diversity (Hill numbers).” Methods in Ecology and Evolution, 7(12), 1451–1456. doi:10.1111/2041-210X.12613.
Examples
# generate occurrences
set.seed(9)
x <- sample(rep(1:5, 10))
y <- sample(rep(1:5, 10))
# make some species 2x or 4x as common
abund <- c(rep(4, 5), rep(2, 5), rep(1, 10))
sp <- sample(letters[1:20], 50, replace = TRUE, prob = abund)
obs <- data.frame(x, y, sp)
# minimum sample data returned
sdSumry(obs, c('x','y'), 'sp')
#> nOcc nLoc centroidX centroidY latRange greatCircDist meanPairDist
#> [1,] 45 22 3.045289 2.909986 4 628.5192 297.7363
#> minSpanTree SCOR nTax
#> [1,] 2332.473 2.234449 17
# also calculate evenness and coverage-based rarefaction diversity estimates
sdSumry(obs, xy = c('x','y'), taxVar = 'sp', quotaQ = 0.7)
#> nOcc nLoc centroidX centroidY latRange greatCircDist meanPairDist minSpanTree
#> 1 45 22 3.045289 2.909986 4 628.5192 297.7363 2332.473
#> SCOR nTax evenness coverage SQSdiv SQSlow95 SQSupr95
#> 1 2.234449 17 0.9405151 0.8708 12.17405 8.323042 16.02506