Prioritize funding for conservation projects with species weights and using exact algorithms. Unlike other algorithms for solving the 'Project Prioritization Protocol' (Joseph, Maloney & Possingham 2009), this method can identify solutions that are guaranteed to be optimal (or within a pre-specified optimality gap; see Underhill 1994; Rodrigues & Gaston 2002). As a consequence, it is strongly recommended to use this method for developing project prioritizations.

ppp_exact_spp_solution(x, y, spp, budget, project_column_name,
  success_column_name, action_column_name, cost_column_name,
  species_column_name, weight_column_name, locked_in_column_name = NULL,
  locked_out_column_name = NULL, gap = 1e-06, threads = 1L,
  number_solutions = 1L, time_limit = .Machine$integer.max,
  number_approx_points = 300, verbose = FALSE)

Arguments

x

data.frame or tibble table containing project data. Here, each row should correspond to a different project and columns should contain data that correspond to each project. This object should contain data that denote (i) the name of each project (specified in the argument to project_column_name), (ii) the probability that each project will succeed if all of its actions are funded (specified in the argument to success_column_name), (iii) the enhanced probability that each species will persist if it is funded, and (iv) and which actions are associated with which projects (specified in the action names in the argument to y). To account for the combined benefits of multiple actions (e.g. baiting and trapping different invasive species in the same area), additional projects should be created that indicate the combined cost and corresponding species' persistence probabilities. Furthermore, this object must have a baseline project, with a zero cost, that represents the probability that each species will persist if no other conservation project is funded.

y

data.frame or tibble table containing the action data. Here, each row should correspond to a different action and columns should contain data that correspond to each action. This object should contain data that denote (i) the name of each action (specified in the argument to action_column_name), (ii) the cost of each action (specified in the argument to cost_column_name). If certain actions should be locked in or out of the solution, then this object should also contain data that denote (iii) which actions should be locked in (specified using the argument to locked_in_column_name if relevant) and (iv) which actions should be locked out (specified using the argument to locked_out_column_name if relevant).

spp

data.frame or tibble table containing the species data. Here, each row should correspond to a different species and columns should contain data that correspond to each species. This object should contain data that denote (i) the name of each species (specified in the argument to species_column_name). It may also contain (ii) the weight for each species (specified in the argument to weight_column_name if relevant).

budget

numeric value that represents the total budget available for funding conservation actions.

project_column_name

character name of column that contains the name for each conservation project. This argument corresponds to the argument to x. Note that the project names must not contain any duplicates or missing values.

success_column_name

character name of column that denotes the probability that each project will succeed. This argument corresponds to the argument to x. This column must have numeric values which lay between zero and one. No missing values are permitted.

action_column_name

character name of column that contains the name for each conservation action. This argument corresponds to the argument to y. Note that the project names must not contain any duplicates or missing values.

cost_column_name

character name of column that indicates the cost for funding each action. This argument corresponds to the argument to y. This column must have numeric values which are equal to or greater than zero. No missing values are permitted.

species_column_name

character name of the column that contains the name for each species. This argument corresponds to the argument to spp.

weight_column_name

character name of the column that contains the weight for each species. This argument corresponds to the argument to spp. This argument defaults to NULL, such that all species are assigned an equal weighting.

locked_in_column_name

character name of column that indicates which actions should be locked into the funding scheme. This argument corresponds to the argument to y. For example, it may be desirable to mandate that projects for iconic species are funded in the prioritization. This column should contain logical values, and projects associated with TRUE values are locked into the solution. No missing values are permitted. Defaults to NULL such that no projects are locked into the solution.

locked_out_column_name

character name of column that indicates which actions should be locked out of the funding scheme. This argument corresponds to the argument to y. For example, it may be desirable to lock out projects for certain species that are expected to have little support from the public. This column should contain logical values, and projects associated with TRUE values are locked out of the solution. No missing values are permitted. Defaults to NULL such that no projects are locked out of the solution.

gap

numeric optimality gap. This gap should be expressed as a proportion. For example, to find a solution that is within 10 % of optimality, then 0.1 should be supplied. No missing values are permitted. Defaults to 0, so that the optimal solution will be returned.

threads

numeric number of threads for computational processing. No missing values are permitted. Defaults to 1.

number_solutions

numeric number of solutions to return. If the argument is greater than 1, then the output will contain the set number of solutions that are closest to optimality. No missing values are permitted. Defaults to 1.

time_limit

numeric maximum number of seconds that should be spent searching for a solution after formatting the data. Effectively, defaults to no time limit (but specifically is .Machine$integer.max). No missing values are permitted.

number_approx_points

numeric number of points to use for approximating the probability that branches will go extent. Larger values increase the precision of these calculations. No missing values are permitted. Defaults to 300.

verbose

logical should information be printed while solving the problem? No missing values are permitted. Defaults to FALSE.

Value

A tibble object containing the solution(s) data. Each row corresponds to a different solution, and each column describes a different property of the solution. The object contains a column for each project (based on the argument to project_column_name) which contains logical values indicating if the project was prioritized for funded (TRUE) or not (FALSE) in a given solution. Additionally, the object also contains the following columns:

"solution"

integer solution identifier.

"method"

character name of method used to produce the solution(s).)

"budget"

numeric budget used for generating each of the of the solution(s).

"obj"

numeric objective value. If phylogenetic data were input, then this column contains the expected phylogenetic diversity (Faith 2008) associated with each of the solutions. Otherwise, this column contains the expected weighted species richness (i.e. the sum of the product between the species' persistence probabilities and their weights.

"cost"

numeric total cost associated with each of of the solution(s).

"optimal"

logical indicating if each of the solution(s) is known to be optimal (TRUE) or not (FALSE). Missing values (NA) indicate that optimality is unknown (i.e. because the method used to produce the solution(s) does not provide any bounds on their quality).

Details

This function works by formulating the 'Project Prioritization Protocol' as a mixed integer programming problem (MIP) and solving it using the Gurobi optimization software suite. Although Gurobi is a commercial software, academics can obtain a special license for no cost. After downloading and installing the hrefhttps://www.gurobi.comGurobi software suite, the gurobi package will also need to be installed (see instructions for Linux, Mac OSX, and Windows operating systems). Finally, the gurobi package will also need to be installed (see instructions for Linux, Mac OSX, and Windows operating systems).

This problem aims to maximize expected species weighted richness given a budget. Let \(S\) denote the set of species (indexed by \(s\)), and let \(W_s\) denote the weight for each species. Additionally, let E_s denote the probability that each species will go extinct given the funded conservation projects. The objective can be expressed as: $$ \sum_{s}^{S} (1 - E_s) W_s $$ For the complete mathematical formulation, please refer to the formulation for maximizing expected phylogenetic diversity (i.e. ppp_exact_phylo_solution). This is because maximizing expected weighted species richness is merely a special-case of expected phylogenetic diversity---instead of using a complete phylogeny, expected weighted species richness simply uses a star phylogeny with branch lengths set according to the species' weights.

References

Faith DP (2008) Threatened species and the potential loss of phylogenetic diversity: conservation scenarios based on estimated extinction probabilities and phylogenetic risk analysis. Conservation Biology, 22: 1461--1470.

Joseph LN, Maloney RF & Possingham HP (2009) Optimal allocation of resources among threatened species: A project prioritization protocol. Conservation Biology, 23, 328--338.

Rodrigues AS & Gaston KJ (2002) Optimisation in reserve selection procedures---why not? Biological Conservation, 107: 123-129.

Underhill LG (1994) Optimal and suboptimal reserve selection algorithms. Biological Conservation, 70: 85--87.

See also

For other methods for solving the 'Project Prioritization Protocol' problem, see ppp_heuristic_phylo_solution, ppp_manual_phylo_solution, and ppp_random_phylo_solution. To visualize the effectiveness of a particular solution, see ppp_plot_phylo_solution.

Examples

# load built-in data data(sim_project_data, sim_action_data, sim_species_data) # print simulated project data set print(sim_project_data)
#> # A tibble: 6 x 13 #> name success S1 S2 S3 S4 S5 S1_action S2_action S3_action #> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <lgl> <lgl> <lgl> #> 1 S1_p~ 0.919 0.791 0 0 0 0 TRUE FALSE FALSE #> 2 S2_p~ 0.923 0 0.888 0 0 0 FALSE TRUE FALSE #> 3 S3_p~ 0.829 0 0 0.502 0 0 FALSE FALSE TRUE #> 4 S4_p~ 0.848 0 0 0 0.690 0 FALSE FALSE FALSE #> 5 S5_p~ 0.814 0 0 0 0 0.617 FALSE FALSE FALSE #> 6 base~ 1 0.298 0.250 0.0865 0.249 0.182 FALSE FALSE FALSE #> # ... with 3 more variables: S4_action <lgl>, S5_action <lgl>, #> # baseline_action <lgl>
# print simulated action data print(sim_action_data)
#> # A tibble: 6 x 4 #> name cost locked_in locked_out #> <chr> <dbl> <lgl> <lgl> #> 1 S1_action 94.4 FALSE FALSE #> 2 S2_action 101. FALSE FALSE #> 3 S3_action 103. TRUE FALSE #> 4 S4_action 99.2 FALSE FALSE #> 5 S5_action 99.9 FALSE TRUE #> 6 baseline_action 0 FALSE FALSE
# print simulated species data print(sim_species_data)
#> # A tibble: 5 x 2 #> name weight #> <chr> <dbl> #> 1 S3 0.211 #> 2 S1 0.211 #> 3 S5 0.221 #> 4 S2 0.630 #> 5 S4 1.59
# verify if guorbi package is installed if (!require(gurobi, quietly = TRUE)) stop("the gurobi R package is not installed.") # find a solution that meets a budget of 300 s1 <- ppp_exact_spp_solution(sim_project_data, sim_action_data, sim_species_data, 300, "name", "success", "name", "cost", "name", "weight") # print solution print(s1)
#> # A tibble: 1 x 12 #> solution method obj budget cost optimal S1_action S2_action S3_action #> <int> <chr> <dbl> <dbl> <dbl> <lgl> <lgl> <lgl> <lgl> #> 1 1 exact 1.66 300 295. TRUE TRUE TRUE FALSE #> # ... with 3 more variables: S4_action <lgl>, S5_action <lgl>, #> # baseline_action <lgl>
# plot solution ppp_plot_spp_solution(sim_project_data, sim_action_data, sim_species_data, s1, "name", "success", "name", "cost", "name", "weight")
# find a solution that meets a budget of 300 and allocates # funding for the "S3_action" project. For instance, species "S3" might # be an iconic species that has cultural and economic importance. sim_action_data2 <- sim_action_data sim_action_data2$locked_in <- sim_action_data2$name == "S3_action" s2 <- ppp_exact_spp_solution(sim_project_data, sim_action_data2, sim_species_data, 300, "name", "success", "name", "cost", "name", "weight", locked_in_column_name = "locked_in") # print solution print(s2)
#> # A tibble: 1 x 12 #> solution method obj budget cost optimal S1_action S2_action S3_action #> <int> <chr> <dbl> <dbl> <dbl> <lgl> <lgl> <lgl> <lgl> #> 1 1 exact 1.37 300 297. TRUE TRUE FALSE TRUE #> # ... with 3 more variables: S4_action <lgl>, S5_action <lgl>, #> # baseline_action <lgl>
# plot solution ppp_plot_spp_solution(sim_project_data, sim_action_data2, sim_species_data, s2, "name", "success", "name", "cost", "name", "weight")
# find a solution that meets a budget of 300 and does not allocate # funding for the "S2_action" project. For instance, species "S2" # might have very little cultural or economic importance. Broadly speaking, # though, it is better to "lock in" "important" species rather than # "lock out" unimportant species. sim_action_data3 <- sim_action_data sim_action_data3$locked_out <- sim_action_data3$name == "S2_action" s3 <- ppp_exact_spp_solution(sim_project_data, sim_action_data3, sim_species_data, 300, "name", "success", "name", "cost", "name", "weight", locked_out_column_name = "locked_out") # print solution print(s3)
#> # A tibble: 1 x 12 #> solution method obj budget cost optimal S1_action S2_action S3_action #> <int> <chr> <dbl> <dbl> <dbl> <lgl> <lgl> <lgl> <lgl> #> 1 1 exact 1.37 300 294. TRUE TRUE FALSE FALSE #> # ... with 3 more variables: S4_action <lgl>, S5_action <lgl>, #> # baseline_action <lgl>
# plot solution ppp_plot_spp_solution(sim_project_data, sim_action_data3, sim_species_data, s3, "name", "success", "name", "cost", "name", "weight")
# find the top solutions s4 <- ppp_exact_spp_solution(sim_project_data, sim_action_data, sim_species_data, 300, "name", "success", "name", "cost", "name", "weight", number_solutions = 1000)
#> Warning: although 1000 requested, only 18 solutions exist.
# print solution print(s4)
#> # A tibble: 18 x 12 #> solution method obj budget cost optimal S1_action S2_action S3_action #> <int> <chr> <dbl> <dbl> <dbl> <lgl> <lgl> <lgl> <lgl> #> 1 1 exact 1.66 300 295. TRUE TRUE TRUE FALSE #> 2 2 exact 1.60 300 295. FALSE TRUE TRUE FALSE #> 3 3 exact 1.57 300 200. FALSE FALSE TRUE FALSE #> 4 4 exact 1.45 300 200. FALSE FALSE TRUE FALSE #> 5 5 exact 1.37 300 294. FALSE TRUE FALSE FALSE #> 6 6 exact 1.37 300 297. FALSE TRUE FALSE TRUE #> 7 7 exact 1.30 300 194. FALSE TRUE FALSE FALSE #> 8 8 exact 1.28 300 199. FALSE FALSE FALSE FALSE #> 9 9 exact 1.28 300 202. FALSE FALSE FALSE TRUE #> 10 10 exact 1.21 300 99.2 FALSE FALSE FALSE FALSE #> 11 11 exact 1.20 300 294. FALSE TRUE FALSE FALSE #> 12 12 exact 1.20 300 295. FALSE TRUE TRUE FALSE #> 13 13 exact 1.19 300 299. FALSE TRUE TRUE TRUE #> 14 14 exact 1.17 300 297. FALSE TRUE FALSE TRUE #> 15 15 exact 1.12 300 195. FALSE TRUE TRUE FALSE #> 16 16 exact 1.10 300 201. FALSE FALSE TRUE FALSE #> 17 17 exact 1.10 300 204. FALSE FALSE TRUE TRUE #> 18 18 exact 1.08 300 194. FALSE TRUE FALSE FALSE #> # ... with 3 more variables: S4_action <lgl>, S5_action <lgl>, #> # baseline_action <lgl>