Health Metrics and the Spread of Infectious Diseases

Machine Learning Applications and Spatial Modelling Analysis with R

Author

Federica Gazzelloni

Published

June 9, 2023

Preface

Last updated: 2024-08-27 10:58:02 CEST

❗️The book is continuously being refined and expanded.


This book will teach you about health metrics such as Disability-Adjusted Life Years (DALYs), Years of Life Lost (YLLs), and Years Lived with Disability (YLDs), among others. It explains how to calculate these metrics and discusses their components in detail. You will explore the machine learning framework and learn how to apply it to study the influence of infectious disease dynamics on trends in these metrics. The book equips you with the tools needed for data collection, analysis, visualization, and modelling.

Think of it as a toolbox for assessing the health of a country and comparing it with others. You will also learn how to select the best tools for predicting future health trends using statistics, visualizing data, and working with maps in the R programming language.

Consider this book your guide to understanding the health status of a population at both global and country levels, using a range of statistical tools.
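To give a flavour of the calculations mentioned above, here is a minimal sketch in R of how the three metrics relate: DALYs are the sum of YLLs and YLDs. It assumes the simplified forms YLL = deaths × remaining standard life expectancy at age of death, and YLD = prevalent cases × disability weight. All numbers, age groups, and variable names are hypothetical and chosen only for illustration; they are not taken from the book or from GBD data.

```r
# Hypothetical inputs for one cause in one population and year (not real data)
deaths          <- c(`0-49` = 120, `50-69` = 340, `70+` = 910)  # deaths by age group
life_expectancy <- c(`0-49` = 55,  `50-69` = 25,  `70+` = 10)   # remaining years at death

prevalent_cases   <- 15000  # people living with the condition
disability_weight <- 0.2    # severity on a 0 (full health) to 1 (death) scale

# Years of Life Lost: deaths weighted by the years each death removes
yll <- sum(deaths * life_expectancy)

# Years Lived with Disability: prevalence weighted by severity
yld <- prevalent_cases * disability_weight

# Disability-Adjusted Life Years: mortality and morbidity burden combined
daly <- yll + yld

print(c(YLL = yll, YLD = yld, DALY = daly))
```

Keeping the two components separate before summing makes it easy to see whether a disease burden is driven mainly by premature death or by time lived in ill health.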

Audience and Utility of the Book

This book serves as both a manual and a textbook for introductory courses in health metrics data analysis. Additionally, it provides valuable source code for practitioners and data scientists. The book offers a comprehensive set of tools for analyzing various models through tailored case studies. It focuses on health data, providing an overview of the burden of diseases to facilitate comparisons between populations’ health status. By combining theoretical insights with practical applications, the book aims to equip readers with the necessary skills to conduct health data analyses and make informed decisions.

Prerequisites

Before delving into the book, it is beneficial for readers to have a basic understanding of health concepts and terminologies. Familiarity with fundamental statistical concepts can aid in comprehending the metrics discussed. Additionally, a grasp of basic epidemiological principles and awareness of global health challenges will enhance the reader’s engagement. However, it is important to note that if this knowledge is not already in place, a dedicated section in the book provides the necessary background.

A basic knowledge of the R programming language is required for those interested in the technical aspects of health data analysis, which cover a large part of this book. An open mindset and curiosity about the evolving field of health metrics are key prerequisites, as the book covers a spectrum from historical perspectives to modern machine learning applications.

Overall, a multidisciplinary approach, combining aspects of health sciences, statistics, and technology, will enrich the reader’s experience.

Acknowledgements

A big thank you to all my friends and colleagues for their support throughout this journey. Your encouragement and belief in my work have been invaluable.

This book is the result of nearly four years of dedicated work, research, and continuous learning in the field of data science and public health. My journey began as a consequence of the COVID-19 outbreak. Witnessing the global impact of this pandemic inspired me to investigate the spread of infectious diseases and contribute to the understanding and management of public health crises.

I would also like to acknowledge the invaluable assistance of ChatGPT, an AI language model developed by OpenAI, which provided essential inputs, helped refine complex ideas, and offered suggestions that greatly enhanced the content of this book. The ability to leverage such advanced technology has significantly contributed to the clarity and depth of the material presented.

Finally, I would like to express my profound gratitude to the researchers, data scientists, and public health experts whose work and insights have been referenced and built upon throughout this book. Your pioneering efforts and dedication to improving public health have laid the foundation for the analyses and models discussed herein. Your contributions have been a source of inspiration and guidance.

About the Author

The author of this book is Federica Gazzelloni [1], Actuary and Statistician by education and training. She is also a collaborator [2] at the Institute for Health Metrics and Evaluation (IHME), which inspired this work to serve as a manual providing formulas and code for working with health metrics. Federica began focusing on modeling health data, particularly infectious diseases, following the COVID-19 outbreak, when this deadly disease was rapidly spreading worldwide. Before that, she briefly worked in corporate and academic environments, serving as a research practitioner actuary and teaching mathematics to high school students and computer science to university students. She is actively involved with the open source tech community, collaborating with organizations such as ACTEX Learning, The Carpentries, Bioconductor, and the R Consortium. More about her ongoing projects is available at https://federicagazzelloni.com/

Data Sources

All data used in this book come from three sources: the Institute for Health Metrics and Evaluation (IHME), GBD Results, Seattle, WA: IHME, University of Washington, 2020 (https://vizhub.healthdata.org/gbd-results/, accessed January 2023); the World Health Organization (WHO) Global Health Observatory data repository (https://apps.who.int/gho/data/); and the European Centre for Disease Prevention and Control (ECDC) (https://opendata.ecdc.europa.eu/covid19/).

In particular, for reproducibility, all data used throughout the book are collected in the hmsidwR R package. The years span 2000 to 2021, and both the GBD 2019 and GBD 2021 studies are used. Data extracted before a new GBD study is released should be treated with care, because after each release all data are updated to reflect the estimates of the latest study. In some parts of the book, data are extracted through the IHME API and the WHO GHO and Athena APIs; the code is provided in the book.
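As a starting point for working with the hmsidwR package mentioned above, the following sketch lists the datasets it ships with. It assumes the package is already installed (for example with `install.packages("hmsidwR")`, if it is available on CRAN for your R version); the object name `available` is just an illustrative choice. The code uses only base R functions and skips the listing when the package is missing.

```r
# Assumption: hmsidwR is already installed; if not, the listing is skipped.
available <- NULL

if (requireNamespace("hmsidwR", quietly = TRUE)) {
  # data(package = ...) returns an object whose $results matrix has the
  # columns "Package", "LibPath", "Item" and "Title"
  available <- data(package = "hmsidwR")$results[, c("Item", "Title"), drop = FALSE]
  print(head(available))
} else {
  message("hmsidwR is not installed; install it before running the book's examples.")
}
```

Each name in the `Item` column can then be loaded with `data()` or accessed with the `hmsidwR::` prefix.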

Code of Conduct

Please note that this book is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.


  1. To learn more about the author’s ongoing projects, please visit: https://www.federicagazzelloni.com

  2. To learn more about who the GBD Collaborators are, see: https://www.healthdata.org/research-analysis/gbd/collaborator-network