A guide to Bayesian model selection for ecologists

Published on Feb 1, 2015 in Ecological Monographs (7.698)
DOI: 10.1890/14-0661.1
Mevin B. Hooten
Estimated H-index: 30
N.T. Hobbs
Estimated H-index: 1
(CSU: Colorado State University)
The steady upward trend in the use of model selection and Bayesian methods in ecological research has made it clear that both approaches to inference are important for modern analysis of models and data. However, in teaching Bayesian methods and in working with our research colleagues, we have noticed a general dissatisfaction with the available literature on Bayesian model selection and multimodel inference. Students and researchers new to Bayesian methods quickly find that the published advice on model selection is often preferential in its treatment of options for analysis, frequently advocating one particular method above others. The recent appearance of many articles and textbooks on Bayesian modeling has provided welcome background on relevant approaches to model selection in the Bayesian framework, but most of these are either very narrowly focused in scope or inaccessible to ecologists. Moreover, the methodological details of Bayesian model selection approaches are spread thinly throughout the literature, appearing in journals from many different fields. Our aim with this guide is to condense the large body of literature on Bayesian approaches to model selection and multimodel inference and present it specifically for quantitative ecologists as neutrally as possible. We also bring to light a few important and fundamental concepts relating directly to model selection that seem to have gone unnoticed in the ecological literature. Throughout, we provide only a minimal discussion of philosophy, preferring instead to examine the breadth of approaches as well as their practical advantages and disadvantages. This guide serves as a reference for ecologists using Bayesian methods, so that they can better understand their options and can make an informed choice that is best aligned with their goals for inference.
  • References (80)
  • Citations (288)
#1 Bani K. Mallick, H-Index: 36
SUMMARY. A simple method for subset selection of independent variables in regression models is proposed. We expand the usual regression equation to an equation that incorporates all possible subsets of predictors by adding indicator variables as parameters. The vector of indicator variables dictates which predictors to include. Several choices of priors can be employed for the unknown regression coefficients and the unknown indicator parameters. The posterior distribution of the indicator vector...
340 Citations
#1 Andrew Gelman (Columbia University), H-Index: 79
#2 Jessica Hwang (Harvard University), H-Index: 2
Last. Aki Vehtari (Aalto University), H-Index: 30
515 Citations
#1 Darryl I. MacKenzie, H-Index: 1
Occupancy estimation and modeling
166 Citations
#1 Richard J. Barker (University of Otago), H-Index: 26
#2 William A. Link (Patuxent Wildlife Research Center), H-Index: 5
Bayesian multimodel inference treats a set of candidate models as the sample space of a latent categorical random variable, sampled once; the data at hand are modeled as having been generated according to the sampled model. Model selection and model averaging are based on the posterior probabilities for the model set. Reversible-jump Markov chain Monte Carlo (RJMCMC) extends ordinary MCMC methods to this meta-model. We describe a version of RJMCMC that intuitively represents the process as Gibbs...
21 Citations
#1 Devin S. Johnson (NOAA: National Oceanic and Atmospheric Administration), H-Index: 24
#2 Paul B. Conn (NOAA: National Oceanic and Atmospheric Administration), H-Index: 19
Last. Bruce A. Pond (Ontario Ministry of Natural Resources), H-Index: 15
Since its development, occupancy modeling has become a popular and useful tool for ecologists wishing to learn about the dynamics of species occurrence over time and space. Such models require presence–absence data to be collected at spatially indexed survey units. However, only recently have researchers recognized the need to correct for spatially induced overdispersion by explicitly accounting for spatial autocorrelation in occupancy probability. Previous efforts to incorporate such autocorrel...
84 Citations
#1 Andrew Gelman (Columbia University), H-Index: 79
#2 Cosma Rohilla Shalizi (CMU: Carnegie Mellon University), H-Index: 25
A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and th...
253 Citations
#1 Herb Sutter, H-Index: 1
The major processor manufacturers and architectures, from Intel and AMD to Sparc and PowerPC, have run out of room with most of their traditional approaches to boosting CPU performance. Instead of driving clock speeds and straight-line instruction throughput ever higher, they are instead turning en masse to hyperthreading and multicore architectures. Both of these features are already available on chips today; in particular, multicore is available on current PowerPC and Sparc IV processors, and ...
597 Citations
#1 Sumio Watanabe (TITech: Tokyo Institute of Technology), H-Index: 20
A statistical model or a learning machine is called regular if the map taking a parameter to a probability distribution is one-to-one and if its Fisher information matrix is always positive definite. If otherwise, it is called singular. In regular statistical models, the Bayes free energy, which is defined by the minus logarithm of Bayes marginal likelihood, can be asymptotically approximated by the Schwarz Bayes information criterion (BIC), whereas in singular models such approximation does not...
157 Citations
#1 Howard D. Bondell (NCSU: North Carolina State University), H-Index: 19
#2 Brian J. Reich (NCSU: North Carolina State University), H-Index: 28
For high-dimensional data, particularly when the number of predictors greatly exceeds the sample size, selection of relevant predictors for regression is a challenging problem. Methods such as sure screening, forward selection, or penalized regressions are commonly used. Bayesian variable selection methods place prior distributions on the parameters along with a prior over model space, or equivalently, a mixture prior on the parameters having mass at zero. Since exhaustive enumeration is not fea...
50 Citations
Cited By (288)
#1 Will Sowersby, H-Index: 1
Last. Björn Rogell, H-Index: 14
Hierarchical models are used to study the relationship between a response variable and a predictor in structured data. Random effects are meant to capture the structured part of variability among groups of observations. In ecology, random effects are usually incorporated into the intercept. Their application to the other parameters of the curve, especially in nonlinear curves, has been understudied. However, applying random effects to different parameters of the function is of interest,...
#1 Michael J. Hooker (UGA: University of Georgia), H-Index: 1
#2 Richard B. Chandler (UGA: University of Georgia), H-Index: 24
Last. Michael J. Chamberlain (UGA: University of Georgia), H-Index: 14
#2 Daniel C. Gwinn (UWA: University of Western Australia), H-Index: 1
Last. Robert A. McCleery, H-Index: 15
The invasive Burmese python (Python molurus bivittatus) is causing declines in the numbers and diversity of native mammals in the Greater Everglades Ecosystem (GEE). However, limited evidence suggests that some species may be less susceptible to pythons than others. This difference in susceptibility may be a function of different life-history traits. We analysed incidence data with a multi-species hierarchical occupancy model to evaluate the influence of pythons on native mammals and examine the...
#1 Ruichang Zhang (University of Tübingen), H-Index: 1
#2 Katja Tielbörger (University of Tübingen), H-Index: 28
Facilitation studies typically compare plants under differential stress levels with and without neighbors, while the density of neighbors has rarely been addressed. However, recent empirical studies indicate that facilitation may be density-dependent too and peak at intermediate neighbor densities. Here, we propose a conceptual model to incorporate density-dependence into theory about changes of plant–plant interactions under stress. To test our predictions, we combine an individual-based model ...
#1 Katharine M. Banner (MSU: Montana State University), H-Index: 2
#2 Kathryn M. Irvine (USGS: United States Geological Survey), H-Index: 13
Last. Thomas J. Rodhouse (NPS: National Park Service), H-Index: 12
#1 Quresh S. Latif, H-Index: 7
#2 Victoria A. Saab, H-Index: 25
Last. Kim Mellen-McLean, H-Index: 4
Salvage logging in burned forests can negatively affect habitat for white-headed woodpeckers (Dryobates albolarvatus), a species of conservation concern, but also meets socioeconomic demands for timber and human safety. Habitat suitability index (HSI) models can inform forest management activities to help meet habitat conservation objectives. Informing post-fire forest management, however, involves model application at new locations as wildfires occur, requiring evaluation of predictive performa...
#1 Scott Schlossberg, H-Index: 3
#2 Kathleen S. Gobush (UW: University of Washington), H-Index: 6
Last. Edward M. Kohi (Tanzania Wildlife Research Institute), H-Index: 3
Populations of African savannah elephants (Loxodonta africana) have been declining due to poaching, human-elephant conflict, and habitat loss. Understanding the causes of these declines could aid in stabilizing elephant populations. We used data from the Great Elephant Census, a 19-country aerial survey of savannah elephants conducted in 2014 and 2015, to examine effects of a suite of variables on elephant mortality. Independent variables included spatially explicit measures of natural processes...
#1 Ann Raiho (ND: University of Notre Dame), H-Index: 1
#2 Michael C. Dietze (BU: Boston University), H-Index: 36
Last. Jason S. McLachlan (ND: University of Notre Dame), H-Index: 20
Predictions from ecological models necessarily include five different uncertainties: demographic stochasticity, initial conditions, external forcing (i.e., drivers/covariates), parameters, and model processes. However, most predictions from process-based ecological models only account for a subset of these uncertainties (e.g. only demographic stochasticity). This underestimation of uncertainty runs the risk of producing precise, but inaccurate predictions. To address these limitations, we create...