Simulate species data for creating Area of Habitat data (Brooks et al. 2019). Specifically, data are simulated to define species geographic ranges, summary information, and habitat preferences.
Usage
simulate_spp_data(
n,
boundary_data,
habitat_data = NULL,
elevation_data = NULL,
crosswalk_data = NULL,
rf_scale_min = 0.5,
rf_scale_max = 0.7,
cache_dir = tempdir(),
habitat_version = "latest",
force = FALSE,
omit_habitat_codes = iucn_habitat_codes_marine(),
verbose = TRUE
)
Arguments
- n
integer
Number of species to simulate.
- boundary_data
sf::st_sf()
Spatial object delineating the spatial extent and boundary for simulating species ranges.
- habitat_data
terra::rast()
Raster data indicating the presence of different habitat classes across the world (e.g., Jung et al. 2020a,b; Lumbierres et al. 2021). Each grid cell should contain an integer value that specifies which habitat class is present within the cell (based on the argument to crosswalk_data). Defaults to NULL such that data are automatically obtained (using get_lumb_cgls_habitat_data()).
- elevation_data
terra::rast()
Raster data delineating the worldwide elevation data (e.g., Robinson et al. 2014). Defaults to NULL such that data are automatically obtained (using get_global_elevation_data()). If the data are obtained automatically, then a preprocessed version of the habitat data will be used to reduce processing time.
- crosswalk_data
data.frame()
Table containing data that indicate which grid cell values in the argument to habitat_data correspond to which IUCN habitat classification codes. The argument should contain a code column that specifies a set of IUCN habitat classification codes (see iucn_habitat_data()), and a value column that specifies the corresponding values in the argument to habitat_data. Defaults to NULL such that the crosswalk for the default habitat data is used (i.e., crosswalk_lumb_cgls_data()).
- rf_scale_min
numeric
Minimum scaling parameter used to control the smallest possible level of spatial auto-correlation for simulated species ranges. Defaults to 0.5.
- rf_scale_max
numeric
Maximum scaling parameter used to control the largest possible level of spatial auto-correlation for simulated species ranges. Defaults to 0.7.
- cache_dir
character
Folder path for downloading and caching data. By default, a temporary directory is used (i.e., tempdir()). To avoid downloading the same data multiple times, it is strongly recommended to specify a persistent storage location (see Examples below).
- habitat_version
character
Version of the habitat dataset that should be used. See documentation for the version parameter in the get_lumb_cgls_habitat_data() function for further details. This parameter is only used if habitat data are obtained automatically (i.e., the argument to habitat_data is NULL). Defaults to "latest" such that the most recent version of the dataset is used if data need to be obtained.
- force
logical
Should the data be downloaded even if the data are already available? Defaults to FALSE.
- omit_habitat_codes
character
Habitat classification codes to omit from resulting Area of Habitat data. Please see the IUCN Red List Habitat Classification Scheme for the full range of habitat classification codes. For example, if the aim is to identify natural places that contain suitable conditions, then areas classified as anthropogenically modified (iucn_habitat_codes_artificial()), introduced vegetation (iucn_habitat_codes_introduced()), or unknown habitat (iucn_habitat_codes_misc()) should be excluded. Defaults to iucn_habitat_codes_marine(), such that marine habitats are excluded.
- verbose
logical
Should progress be displayed while processing data? Defaults to TRUE.
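As an illustration of the crosswalk_data format described above, the following is a minimal sketch of a custom crosswalk table built with base R. The habitat codes are taken from the example output further down this page; the raster cell values (10, 20, 30, 40) are hypothetical and would need to match whatever values a custom habitat_data raster actually contains.

```r
# minimal sketch of a custom crosswalk table (assumed values, for illustration)
# code:  IUCN habitat classification codes, stored as character strings
# value: hypothetical integer cell values in a custom habitat raster
custom_crosswalk <- data.frame(
  code = c("3.4", "4.4", "14.1", "14.3"),
  value = c(10L, 20L, 30L, 40L),
  stringsAsFactors = FALSE
)

# basic sanity checks before passing the table to crosswalk_data
stopifnot(
  all(c("code", "value") %in% names(custom_crosswalk)),
  is.character(custom_crosswalk$code),
  !anyDuplicated(custom_crosswalk$value)
)

print(custom_crosswalk)
```

Storing the codes as character strings avoids numeric coercion problems (e.g., a code such as "5.10" being read as 5.1). The duplicate check assumes each raster value maps to a single habitat code, which may not hold for every crosswalk.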
Value
A list object containing simulated data that are formatted following conventions used by the International Union for Conservation of Nature (IUCN) Red List of Threatened Species. It contains the following elements:
- spp_range_data
A sf::st_sf() object containing the species' geographic range data.
- spp_summary_data
A tibble::tibble() object containing summary information about the species (including elevational limit information).
- spp_habitat_data
A tibble::tibble() object containing habitat preferences for the species.
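The three elements share an id_no column (as seen in the example output in the Examples section), so they can be combined per species. The sketch below uses small mock data frames, with rows copied from that example output, as stand-ins for spp_summary_data and spp_habitat_data. It is only an illustration of one possible way to link the tables, not part of the package.

```r
# mock stand-in for the spp_habitat_data element (rows from the example output)
spp_habitat_data <- data.frame(
  id_no = c(799L, 799L, 799L, 2102L, 2102L, 2102L),
  code = c("4.4", "14.3", "14.1", "4.4", "3.4", "5.15"),
  stringsAsFactors = FALSE
)

# mock stand-in for the spp_summary_data element
spp_summary_data <- data.frame(
  id_no = c(799L, 2102L),
  scientific_name = c("Simulus spp. 799", "Simulus spp. 2102"),
  stringsAsFactors = FALSE
)

# collapse the habitat codes into one string per species
codes_per_spp <- aggregate(
  code ~ id_no,
  data = spp_habitat_data,
  FUN = function(z) paste(z, collapse = ", ")
)

# attach the collapsed codes to the summary information via id_no
result <- merge(spp_summary_data, codes_per_spp, by = "id_no")
print(result)
```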
References
Brooks TM, Pimm SL, Akçakaya HR, Buchanan GM, Butchart SHM, Foden W, Hilton-Taylor C, Hoffmann M, Jenkins CN, Joppa L, Li BV, Menon V, Ocampo-Peñuela N, Rondinini C (2019) Measuring terrestrial Area of Habitat (AOH) and its utility for the IUCN Red List. Trends in Ecology & Evolution, 34, 977–986. doi:10.1016/j.tree.2019.06.009
See also
See create_spp_aoh_data() for creating Area of Habitat maps using data for real or simulated species.
Examples
# please ensure that the fields and smoothr packages are installed
# to run these examples
# \dontrun{
# define persistent storage location
download_dir <- rappdirs::user_data_dir("aoh")
# create download directory if needed
if (!file.exists(download_dir)) {
  dir.create(download_dir, showWarnings = FALSE, recursive = TRUE)
}
# specify file path for boundary data
boundary_path <- system.file("shape/nc.shp", package = "sf")
# import boundary data to simulate species data
boundary_data <- sf::st_union(sf::read_sf(boundary_path))
# set random number generator seed for reproducibility
set.seed(500)
# simulate data for 5 species
x <- simulate_spp_data(
  n = 5, boundary_data = boundary_data, cache_dir = download_dir
)
#> ℹ importing global elevation data
#> ✔ importing global elevation data [7.6s]
#>
#> ℹ importing global habitat data
#> ✔ importing global habitat data [1m 18.3s]
#>
# preview species range data
print(x$spp_range_data)
#> Simple feature collection with 17 features and 26 fields
#> Geometry type: GEOMETRY
#> Dimension: XY
#> Bounding box: xmin: -84.31763 ymin: 33.88392 xmax: -75.45658 ymax: 36.5881
#> Geodetic CRS: WGS 84
#> First 10 features:
#> id_no seasonal presence origin geometry
#> 1 799 1 1 1 POLYGON ((-78.71844 34.0082...
#> 2 799 1 3 2 POLYGON ((-78.40331 34.2545...
#> 3 2102 1 1 1 MULTIPOLYGON (((-76.49551 3...
#> 4 2102 3 1 1 MULTIPOLYGON (((-76.86322 3...
#> 5 2102 4 1 1 MULTIPOLYGON (((-82.74959 3...
#> 6 2102 3 4 4 MULTIPOLYGON (((-77.15074 3...
#> 7 2102 3 3 3 MULTIPOLYGON (((-77.22261 3...
#> 8 4082 1 1 1 MULTIPOLYGON (((-78.57828 3...
#> 9 4082 1 1 6 MULTIPOLYGON (((-78.49659 3...
#> 10 4082 1 4 2 MULTIPOLYGON (((-78.2597 35...
#> binomial compiler yrcompiled citation subspecies subpop source
#> 1 Simulus spp. 799 Simulation NA <NA> <NA> <NA> <NA>
#> 2 Simulus spp. 799 Simulation NA <NA> <NA> <NA> <NA>
#> 3 Simulus spp. 2102 Simulation NA <NA> <NA> <NA> <NA>
#> 4 Simulus spp. 2102 Simulation NA <NA> <NA> <NA> <NA>
#> 5 Simulus spp. 2102 Simulation NA <NA> <NA> <NA> <NA>
#> 6 Simulus spp. 2102 Simulation NA <NA> <NA> <NA> <NA>
#> 7 Simulus spp. 2102 Simulation NA <NA> <NA> <NA> <NA>
#> 8 Simulus spp. 4082 Simulation NA <NA> <NA> <NA> <NA>
#> 9 Simulus spp. 4082 Simulation NA <NA> <NA> <NA> <NA>
#> 10 Simulus spp. 4082 Simulation NA <NA> <NA> <NA> <NA>
#> island tax_comm dist_comm generalisd legend kingdom phylum class order_
#> 1 <NA> <NA> <NA> NA <NA> <NA> <NA> <NA> <NA>
#> 2 <NA> <NA> <NA> NA <NA> <NA> <NA> <NA> <NA>
#> 3 <NA> <NA> <NA> NA <NA> <NA> <NA> <NA> <NA>
#> 4 <NA> <NA> <NA> NA <NA> <NA> <NA> <NA> <NA>
#> 5 <NA> <NA> <NA> NA <NA> <NA> <NA> <NA> <NA>
#> 6 <NA> <NA> <NA> NA <NA> <NA> <NA> <NA> <NA>
#> 7 <NA> <NA> <NA> NA <NA> <NA> <NA> <NA> <NA>
#> 8 <NA> <NA> <NA> NA <NA> <NA> <NA> <NA> <NA>
#> 9 <NA> <NA> <NA> NA <NA> <NA> <NA> <NA> <NA>
#> 10 <NA> <NA> <NA> NA <NA> <NA> <NA> <NA> <NA>
#> family genus category marine terrestial freshwater
#> 1 <NA> Simulus LC false true false
#> 2 <NA> Simulus LC false true false
#> 3 <NA> Simulus LC false true false
#> 4 <NA> Simulus LC false true false
#> 5 <NA> Simulus LC false true false
#> 6 <NA> Simulus LC false true false
#> 7 <NA> Simulus LC false true false
#> 8 <NA> Simulus VU false true false
#> 9 <NA> Simulus VU false true false
#> 10 <NA> Simulus VU false true false
# preview species habitat preference data
print(x$spp_habitat_data)
#> # A tibble: 21 × 6
#> id_no code habitat suitability season majorimportance
#> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 799 4.4 Grassland - Temperate Suitable Resid… NA
#> 2 799 14.3 Plantations Suitable Resid… NA
#> 3 799 14.1 Arable Land Suitable Resid… NA
#> 4 2102 4.4 Grassland - Temperate Suitable Resid… NA
#> 5 2102 3.4 Shrubland - Temperate Suitable Resid… NA
#> 6 2102 5.15 Wetlands (inland) - Seasonal/… Suitable Resid… NA
#> 7 2102 3.4 Shrubland - Temperate Suitable Non-b… NA
#> 8 2102 14.1 Arable Land Suitable Non-b… NA
#> 9 2102 14.3 Plantations Suitable Non-b… NA
#> 10 2102 14.3 Plantations Suitable Passa… NA
#> # ℹ 11 more rows
# preview species summary data
print(x$spp_summary_data)
#> # A tibble: 5 × 31
#> id_no taxonid scientific_name kingdom phylum class order family genus
#> <int> <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 799 799 Simulus spp. 799 NA NA NA NA NA Simulus
#> 2 2102 2102 Simulus spp. 2102 NA NA NA NA NA Simulus
#> 3 4082 4082 Simulus spp. 4082 NA NA NA NA NA Simulus
#> 4 5167 5167 Simulus spp. 5167 NA NA NA NA NA Simulus
#> 5 5479 5479 Simulus spp. 5479 NA NA NA NA NA Simulus
#> # ℹ 22 more variables: main_common_name <chr>, authority <chr>,
#> # published_year <chr>, assessment_date <chr>, category <chr>,
#> # criteria <chr>, population_trend <chr>, marine_system <chr>,
#> # freshwater_system <chr>, terrestrial_system <chr>, assessor <chr>,
#> # reviewer <chr>, aoo_km2 <chr>, eoo_km2 <chr>, elevation_upper <dbl>,
#> # elevation_lower <dbl>, depth_upper <dbl>, depth_lower <dbl>,
#> # errata_flag <chr>, errata_reason <chr>, amended_flag <chr>, …
# }