Literature thesis: Building a framework for retrieving information on multispecies interactions from published literature
Gabriel Muñoz, Universiteit van Amsterdam, MSc. Biological Sciences, track Ecology and Evolution
Supervisor: Dr. Emiel van Loon, Universiteit van Amsterdam, Instituut voor Biodiversiteit en Ecosysteem Dynamica (IBED)
Examiner: Dr. Crystal McMichael, Universiteit van Amsterdam, Instituut voor Biodiversiteit en Ecosysteem Dynamica (IBED)
Generating new global hypotheses to understand the current biodiversity crisis requires a large amount of information. Our knowledge in Ecology is contained principally in published articles. This global body of literature holds a significant amount of primary data on species distributions and interactions across large geographical and temporal scales. In this literature review I explore the use of computational tools from text mining and machine learning to facilitate searching for and classifying articles that contain information on species interactions. Semantic similarity between otherwise unrelated scientific articles partially limits the automatic classification of literature, but further research on this limitation is needed. Automatic extraction of specific entities, such as species names or geographical locations, reduces the effort a researcher needs to manually identify articles of interest in a pool of literature. In general, text mining and machine learning frameworks applied to Ecology offer a new way to conduct literature reviews efficiently and reproducibly. Open Access and free mining access to journals will definitely improve the implementation of such frameworks and will contribute to a general, global understanding of biodiversity on Earth.
The online version of this document is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.