Probabilistic model of species co-occurrence

The probabilistic model of species co-occurrence is a straightforward way of testing the statistical significance of species co-occurrence patterns. This paper (Veech_2006_J_Biogeography) was an initial step in developing the probabilistic model, although it still relied on data randomization (i.e., null models) to test for non-random co-occurrence between species. The model was more fully developed and presented in these papers (Veech_2013_GEB and Veech_2014_J_Biogeography) as an entirely analytical solution that requires neither data randomization nor reference to a statistical distribution. The model relies solely on basic probability theory that involves combinatorics. For a given observed frequency of co-occurrence of two species among a set of sampling or survey sites, the model provides the probability that the two species would co-occur at a greater (or lesser) number of sites if they were distributed independently of one another. In this way, the model can be used to identify species pairs that are positively, negatively, or randomly associated. The model is essentially equivalent to Fisher’s exact test (FET), developed in the early 1920s by the famous statistician Sir Ronald Fisher. Although the FET has been around for a long time, it had rarely been used to test for non-random species associations. Daniel Griffith, Charlie Marsh, and I developed an R package, cooccur, for running the model (test) and displaying graphical results (see Griffith_et_al_2016). The package continues to be widely used, including in applications beyond ecological studies.
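As a rough illustration of the combinatorics involved, the sketch below computes the probability of each possible number of shared sites under independence, using the hypergeometric form that underlies Fisher's exact test. It is not the code of the R package: the function names, argument names, and the choice to include the observed value in both tail probabilities are assumptions made for this example.

```python
from math import comb


def cooccurrence_pmf(n_sites, n1, n2, j):
    """Probability that two independently distributed species share exactly j sites.

    n_sites: total number of sites; n1, n2: sites occupied by species 1 and 2.
    (Hypothetical helper, written only for this illustration.)
    """
    lowest = max(0, n1 + n2 - n_sites)   # fewest shared sites that is possible
    highest = min(n1, n2)                # most shared sites that is possible
    if j < lowest or j > highest:
        return 0.0
    return comb(n1, j) * comb(n_sites - n1, n2 - j) / comb(n_sites, n2)


def cooccurrence_test(n_sites, n1, n2, observed):
    """Return (p_lesser, p_greater, expected) for an observed co-occurrence count.

    Assumption: both tails include the observed value, i.e.
    p_lesser = P(X <= observed) and p_greater = P(X >= observed).
    """
    lowest = max(0, n1 + n2 - n_sites)
    highest = min(n1, n2)
    p_lesser = sum(cooccurrence_pmf(n_sites, n1, n2, j)
                   for j in range(lowest, observed + 1))
    p_greater = sum(cooccurrence_pmf(n_sites, n1, n2, j)
                    for j in range(observed, highest + 1))
    expected = n1 * n2 / n_sites         # mean number of shared sites under independence
    return p_lesser, p_greater, expected


# Made-up example: 10 sites, each species occupies 5, and they share 4.
p_lesser, p_greater, expected = cooccurrence_test(10, 5, 5, 4)
print(p_lesser, p_greater, expected)
```

A small p_greater suggests a positive association, a small p_lesser a negative one, and neither being small is consistent with a random association.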

In another collaboration, Giovanni Strona (now at the University of Helsinki) and I modified the model so that it can be used to examine whether two species share a greater or lesser number of “partner” species than would be expected by chance. In this way, the model forms the basis of a new metric for measuring structure in ecological networks. Giovanni came up with this idea several years ago and recruited me to help out. We have published papers describing the metric (Strona_and_Veech_MEE_2015, Strona_and_Veech_2017).

In a separate collaboration, Dan McGarvey (Virginia Commonwealth University) and I used the probabilistic model to build and analyze fish co-occurrence networks. The extent to which a pair of species co-occurs across a set of sampling or survey sites can be used to link the species in a network (McGarvey and Veech 2018).
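One way such a network could be assembled is sketched below: every pair of species is tested, and a link is drawn when the pair co-occurs at more sites than expected under independence. The linking rule (positive associations only, with a 0.05 threshold), the function names, and the species data are all assumptions made for illustration and are not taken from the published analysis.

```python
from itertools import combinations
from math import comb


def p_greater(n_sites, n1, n2, observed):
    """P(shared sites >= observed) for two independently distributed species."""
    highest = min(n1, n2)
    ways = sum(comb(n1, j) * comb(n_sites - n1, n2 - j)
               for j in range(observed, highest + 1))
    return ways / comb(n_sites, n2)


def cooccurrence_edges(presence, n_sites, alpha=0.05):
    """Link species pairs whose co-occurrence is significantly greater than expected.

    presence: dict mapping species name -> set of site ids where it occurs.
    alpha: hypothetical significance threshold chosen for this sketch.
    Returns a list of (species_a, species_b, shared_sites, p_value) tuples.
    """
    edges = []
    for a, b in combinations(sorted(presence), 2):
        shared = len(presence[a] & presence[b])
        p = p_greater(n_sites, len(presence[a]), len(presence[b]), shared)
        if p < alpha:
            edges.append((a, b, shared, p))
    return edges


# Made-up data: 12 sites, three species occupying 6 sites each.
presence = {
    "darter": set(range(0, 6)),
    "shiner": set(range(0, 6)),
    "chub": set(range(6, 12)),
}
edges = cooccurrence_edges(presence, n_sites=12)
print(edges)
```

The resulting edge list can be handed to any graph library for further network analysis; negative associations could be added as a second edge type by testing the lower tail in the same way.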