Thinking about prior distributions 

Bayesian analysis requires priors. I see that as a huge practical advantage, especially when producing information for management: we can incorporate information derived from past research over more than 100 years. The tradition in ecology, however, is to use priors which give minimal information. Truly 'uninformative' priors do not exist (with the possible exception of a uniform prior for a probability), so care is needed to provide information which makes biological sense within the context of your own analysis.

As a colleague has pointed out, there are many specific types of priors described in the literature, and rarely a proper discussion of the reasons for choosing that particular distribution and those parameters. This is understandable with large complex models, such as multispecies occupancy models, with many priors and hyperpriors, but remains a problem.

This post aims to give you some ways to explore possible prior distributions and assess their suitability for your analysis.

Stick to what you know

Set priors on quantities that have biological meaning. You will have an intuitive feel for the plausible range of values. They may be transformed (perhaps as a log or a logit) in the model, but set priors for the real values.

This applies specifically to standard deviations (SDs) for normal and t distributions, where JAGS uses precision.
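
As a rough illustration of setting the prior on the SD and converting afterwards, here is a minimal Python sketch using only the standard library. The post itself works with JAGS, so the language, the helper names `sd_to_precision` and `precision_to_sd`, and the Uniform(0, 5) prior on the SD are all assumptions made for this example. It relies on the normal-distribution relationship precision = 1 / SD².

```python
import random


def sd_to_precision(sd):
    """Convert a standard deviation to a precision (1 / variance)."""
    return 1.0 / sd ** 2


def precision_to_sd(tau):
    """Convert a precision back to a standard deviation."""
    return tau ** -0.5


rng = random.Random(1)

# Hypothetical prior: SD ~ Uniform(0, 5) on the scale we have a feel for.
# 1 - random() lies in (0, 1], which avoids dividing by zero below.
sd_draws = [5.0 * (1.0 - rng.random()) for _ in range(10_000)]

# The implied draws on the precision scale that the model would use.
tau_draws = [sd_to_precision(sd) for sd in sd_draws]

print("smallest implied precision:", min(tau_draws))
print("largest implied precision: ", max(tau_draws))
```

The flat prior on the SD is far from flat on the precision scale, which is why it is easier to reason about the SD directly.
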

When is uniform informative?

Probabilities have to be between 0 and 1, so a uniform probability distribution is proper, and it should not be informative, right? But once you transform your probabilities to the logit scale, they look very different:
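
The transformation can be checked by simulation. This is a minimal Python sketch (standard library only; the language, seed and sample size are assumptions, not taken from the post). It draws probabilities from Uniform(0, 1), moves them to the logit scale and reports the SD there. The logit of a uniform variable follows a standard logistic distribution, whose SD is π/√3, about 1.81.

```python
import math
import random
import statistics


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


rng = random.Random(42)

# Uniform(0, 1) draws; drop an exact 0 in the unlikely event one appears.
p_draws = [p for p in (rng.random() for _ in range(100_000)) if 0.0 < p < 1.0]

# The same draws on the logit scale.
logit_draws = [logit(p) for p in p_draws]

sd_logit = statistics.stdev(logit_draws)
sd_theory = math.pi / math.sqrt(3.0)  # SD of the standard logistic

print("simulated SD on the logit scale:", round(sd_logit, 2))
print("theoretical SD (pi / sqrt(3)):  ", round(sd_theory, 2))
```
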
On the logit scale our "uninformative" uniform prior becomes a really tight distribution with an SD of 1.8, which would be judged to be highly informative! For a prior on the logit scale, we'd want a normal prior with SD 5 or 10. Let's see what that means on the probability scale:
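
Going the other way can be sketched in the same manner. The Python below (standard library only; the choice of SD 10 from the two values mentioned, the seed and the cut-offs 0.05, 0.25, 0.75 and 0.95 are assumptions for illustration) draws from a Normal(0, 10) prior on the logit scale, converts to probabilities and measures how much of the prior sits near 0 or 1 compared with the middle.

```python
import math
import random


def inv_logit(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def share(values, lo, hi):
    """Proportion of values falling in the closed interval [lo, hi]."""
    return sum(lo <= v <= hi for v in values) / len(values)


rng = random.Random(7)

# Broad normal prior on the logit scale: mean 0, SD 10.
logit_draws = [rng.gauss(0.0, 10.0) for _ in range(100_000)]
p_draws = [inv_logit(x) for x in logit_draws]

extreme = share(p_draws, 0.0, 0.05) + share(p_draws, 0.95, 1.0)
middle = share(p_draws, 0.25, 0.75)

print("share of prior mass below 0.05 or above 0.95:", round(extreme, 2))
print("share of prior mass between 0.25 and 0.75:   ", round(middle, 2))
```
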
Now our minimally-informative, broad prior on the logit scale becomes highly informative, and nonsensical, on the probability scale. I can't think of an example where values near 0 and 1 are most plausible and those in the middle are least plausible. In a logistic regression model the intercept is the logit-scale equivalent of the probability when all covariates = 0. With centred covariates that corresponds to something meaningful, and you should put a prior on the probability then convert to the logit form. For example:
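
One way to sketch this in Python (standard library only): draw the probability from its prior, then derive the intercept from it. The names `p0_draws` and `b0_draws` and the Beta(1, 1) prior, which is the uniform, are hypothetical choices for this illustration and are not taken from the post.

```python
import math
import random
import statistics


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Logistic function, the inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


rng = random.Random(3)

# Hypothetical prior on the probability at the mean covariate values:
# Beta(1, 1), i.e. uniform. Exact 0 or 1 is dropped if it ever occurs.
raw = (rng.betavariate(1.0, 1.0) for _ in range(10_000))
p0_draws = [p for p in raw if 0.0 < p < 1.0]

# The intercept is derived from the probability, not given its own prior.
b0_draws = [logit(p) for p in p0_draws]

print("mean of prior on the probability:", round(statistics.mean(p0_draws), 2))
print("SD of implied prior on intercept:", round(statistics.stdev(b0_draws), 2))
```

Changing the two shape parameters of the beta distribution lets you express real prior knowledge about the probability while the regression still works on the logit scale.
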
Similar logic applies to log-transformed density or abundance: put a prior on the biological quantity and convert as necessary for the regression.

Look at the distribution you use

A key part of assessing the suitability of a prior distribution is to plot it. I also find it helpful to generate 20-30 random values from the distribution and plot them as a rug. In a recent post on cross-validation we used a half-Cauchy hyperprior with scale parameter 2.25 for the coefficients of
In the plot we see that most of the mass for the Cauchy distribution is below 5, but the tails allow for very large values: we have values in the thousands, which are surely preposterous. The t4 distribution gives much more sensible values. The t-distribution is coded in
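
The comparison described above can be reproduced numerically. This is a minimal Python sketch (standard library only), not the code behind the original plot. It assumes that "t4" means a t distribution with 4 degrees of freedom and that the half-t uses the same scale of 2.25 as the half-Cauchy; the seed, the sample sizes and the function names are also assumptions. It draws 30 values from each distribution, as you would for a rug, plus a larger sample to summarise the tails.

```python
import math
import random

SCALE = 2.25  # scale parameter quoted in the post for the half-Cauchy
DF = 4        # assumption: "t4" means 4 degrees of freedom


def half_cauchy(rng, scale):
    """One draw from a half-Cauchy via the inverse-CDF method."""
    return scale * abs(math.tan(math.pi * (rng.random() - 0.5)))


def half_t(rng, scale, df):
    """One draw from a half-t: |normal / sqrt(chi-square / df)| times scale."""
    z = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return scale * abs(z / math.sqrt(v / df))


def share_below(values, limit):
    """Proportion of values below the limit."""
    return sum(v < limit for v in values) / len(values)


rng = random.Random(2019)

# Small samples, as for a rug plot.
rug_cauchy = sorted(half_cauchy(rng, SCALE) for _ in range(30))
rug_t = sorted(half_t(rng, SCALE, DF) for _ in range(30))

# Larger samples to summarise the body and the tails.
big_cauchy = [half_cauchy(rng, SCALE) for _ in range(100_000)]
big_t = [half_t(rng, SCALE, DF) for _ in range(100_000)]

print("half-Cauchy rug:", [round(x, 1) for x in rug_cauchy])
print("half-t rug:     ", [round(x, 1) for x in rug_t])
print("share below 5:  ", round(share_below(big_cauchy, 5.0), 2),
      "vs", round(share_below(big_t, 5.0), 2))
print("largest draw:   ", round(max(big_cauchy)), "vs", round(max(big_t)))
```
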

Updated 28 Oct 2019 by Mike Meredith 