Not seeing the forest for the trees: Generalised linear model out-performs random forest in species distribution modelling for Southeast Asian felids
Paper by Luca Chiaverini, David W. Macdonald, Andrew J. Hearn, Żaneta Kaszta, Eric Ash, Helen M. Bothwell, Özgün Emre Can, Phan Channa, Gopalasamy Reuben Clements, Iding Achmad Haidir, Pyae Phyoe Kyaw, Jonathan H. Moore, Akchousanh Rasphone, Cedric Kai Wei Tan, Samuel A. Cushman.
Habitat suitability models are a valuable tool for informing conservation actions: they highlight the fundamental environmental factors driving species’ presence and show where species’ suitable habitats occur. One of the premises of habitat suitability modelling is that the modelling framework adopted will unambiguously and objectively resolve the species-habitat relationships. However, this is not what we observed in our recent work published in Ecological Informatics, where we compared two algorithms (generalised linear model and random forest) and four modelling frameworks to model habitat suitability for Southeast Asian felid species.
For the same species, the differences between the outputs of the different modelling frameworks were substantial, and in some cases the predictions were even negatively correlated. We then assessed whether, among those that we tested, one algorithm would consistently produce statistically stronger models than the other. We started from the assumption that, in recent years, machine learning algorithms such as random forest have often been shown to outperform parametric algorithms such as generalised linear models. Here too, however, we were surprised to find that models produced with generalised linear models showed higher validation metrics in more cases than those produced with random forest.
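To make the idea of negatively correlated predictions concrete, here is a minimal sketch of how two suitability surfaces for the same species could be compared cell by cell. It is an illustration only: the function name `pearson`, the two score lists and their values are all invented for this example and are not data or code from the paper, and a Pearson correlation is assumed here as one simple way to compare maps.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical suitability scores (0 to 1) predicted for the same six grid
# cells by two different modelling frameworks. Values are made up.
framework_a = [0.10, 0.25, 0.40, 0.55, 0.70, 0.90]
framework_b = [0.80, 0.75, 0.50, 0.45, 0.30, 0.15]

r = pearson(framework_a, framework_b)
print(f"correlation between the two predicted surfaces: {r:.2f}")
```

A value of `r` close to 1 would mean the two frameworks rank the cells similarly; a negative value, as in this invented example, would mean that cells rated as highly suitable by one framework tend to be rated as poorly suitable by the other.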
These results are a cause for concern: if there are large and systematic differences between the suitability predictions produced by different algorithms, irrespective of the underlying ecological relationships, then there is likely a much higher degree of uncertainty in habitat suitability modelling than previously appreciated. Our first recommendation is that much more care be taken when choosing the modelling framework and algorithm for habitat suitability modelling, and that multiple methods be compared so that the one most appropriate and effective for a specific system and species can be selected.
Link to full article: Not seeing the forest for the trees: Generalised linear model out-performs random forest in species distribution modelling for Southeast Asian felids – ScienceDirect