Literature thesis: Building a framework for retrieving information on multispecies interactions from published literature
Generating new global hypotheses to understand the current biodiversity crisis requires a large amount of information. Our knowledge in Ecology is contained principally in published articles. This global body of literature holds a significant amount of primary data on species distributions and interactions across large geographical and temporal scales. In this literature review, I explore the use of computational tools from text mining and machine learning to facilitate searching for and classifying articles that contain information on species interactions. Semantic relatedness between otherwise unrelated scientific articles partially limits the automatic classification of literature, and further research is needed on this point. Automatic extraction of specific entities, such as species names or geographical locations, reduces the effort a researcher must spend to manually identify articles of interest in a pool of literature. In general, text mining and machine learning frameworks applied to Ecology offer an efficient and reproducible alternative for conducting literature reviews. Open Access and free mining access to journals would improve the implementation of such frameworks and contribute to a global understanding of biodiversity on Earth.
The online version of this document is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.