Preface



“The problems are solved, not by giving new information, but by arranging what we have known since long.”

Ludwig Wittgenstein, Philosophical Investigations


The quote above shows that our world is a very complex place. Scientific research is no exception; in most research fields, we are often faced with a seemingly insurmountable body of previous research. Evidence from different studies can be conflicting, and it can be difficult to make sense of the various sources of information.

Evidence synthesis methods therefore play a crucial role in many disciplines, for example the social sciences, medicine, biology, or econometrics. Meta-analysis, a statistical procedure used to combine the results of different studies or analyses, has become an indispensable tool in many research areas. Meta-analyses can be of enormous importance, especially if they guide practical decision-making or future research efforts. Many applied researchers therefore already have meta-analysis skills in their “statistical toolbox”, while others want to learn how to perform meta-analyses in their own research field. Meta-analysis has become so common that many undergraduate and graduate students already learn how to perform one as part of their thesis, with varying levels of depth.

Like statistical computing in general, meta-analysis methods have changed considerably over the past decades. This is due to the rapid rise of open source statistical software, most notably the R programming language. The R ecosystem allows researchers and statisticians everywhere to build their own packages and make them available to the community for free. This has led to a considerable increase in the number of statistical packages available for R. At the time of writing, the CRAN Task View lists more than 130 packages dedicated to meta-analysis alone.

With R, you can do literally anything. It is a full programming language, so if you cannot find a function you need, you can easily write it yourself. For meta-analyses, however, there is hardly any need to do this anymore. A small collection of R packages already provides all the functionality you can find in current “state-of-the-art” meta-analysis programs, and it is free. Even more so, there are many meta-analysis methods that can currently only be applied in R. In short: the R environment gives researchers more tools for their meta-analyses. In the best case, this allows us to draw more robust conclusions from our data, and thus to better inform decision-making.

This raises the question of why not everyone uses R for meta-analyses. We think there are two main reasons: convenience and anxiety (and sometimes a mixture of both). Both reasons are very understandable. Most people who perform meta-analyses are applied researchers, not statisticians or programmers. The thought of learning an obscure and seemingly complicated programming language can act as a major barrier. The same is true for meta-analytic methods, with their special theoretical background, myriad analytic choices, and different statistics that have to be interpreted correctly.

With this book, we want to show that many of these concerns are unfounded, and that learning how to do a meta-analysis in R is well worth the effort. We hope that this book will help you learn the skills needed to master your own meta-analysis project in R. We also hope that it will make it easier for you to understand when to apply meta-analytic methods, and why we apply them. Last but not least, we see this book as an attempt to show you that meta-analysis methods and R programming are not mere inconveniences, but can become a fascinating topic to explore.


This Book Is for Mortals


This guide was not written for meta-analysis experts or statisticians. We do not assume that you have any special background knowledge on meta-analytic methods. Only basic knowledge of fundamental mathematical and statistical concepts is needed. For example, we assume that you have heard before what things like a “mean”, “standard deviation”, “correlation”, “regression”, “\(p\)-value” or a “normal distribution” are. If these terms ring a bell, you should be good to go. If you are really starting from scratch, you may want to first have a look at Robert Stinerock’s statistics beginner’s guide (Stinerock 2018) for a thorough introduction including hands-on examples in R – or some other introductory statistics textbook of your choice.

Although we tried to keep it as minimal as possible, we will use mathematical formulas and statistical notation at times. But do not panic. Formulas and Greek letters can seem confusing at first glance, but they are often a very good way to precisely describe the idea behind some meta-analysis methods. Having seen those formulas, and knowing what they represent, will also make it easier for you to understand more advanced texts you may want to read further down the line. And of course, we tried our best to always explain in detail what certain symbols or letters stand for, and what a specific formula wants to tell us. In Appendix C of this book, you can find a list of the symbols we use, and what they stand for. In later chapters, especially the Advanced Methods section, we need to become a little more technical to explain the idea behind some of the applied techniques. Nevertheless, we made sure to always include some background information on the mathematical and statistical concepts used in these sections.

No prior knowledge of R (or programming in general) is required. In the guide, we try to provide a gentle introduction to the basic R skills you need to code your own meta-analysis. We also provide references to adequate resources to keep on learning. Furthermore, we will show you how you can set up a free computer program which allows you to use R conveniently on your PC or Mac.

As it says in the title, our book focuses on the “doing” part of meta-analysis. Our guide aims to be an accessible resource which meets the needs of applied researchers, students and data scientists who want to get going with their analyses using R. Meta-analysis, however, is a vast and multi-faceted topic, so it is natural that not everything can be covered in this guide. For this book, limitations particularly pertain to three areas:

  • Although we provide a short primer on these topics, we do not cover in detail how to define research questions, systematically search and include studies for your meta-analysis, or how to assess their quality. Each of these topics merits a book of its own, and luckily many helpful resources already exist. We therefore only give an overview of important considerations and pitfalls when collecting the data for your meta-analysis, and will refer you to adequate resources dealing with the nitty-gritty details.

  • The second limitation of this guide pertains to its level of technicality. This book is decidedly written for “mortals”. We aim to show you when, how and why to apply certain meta-analytic techniques, along with their pitfalls. We also try to provide an easily accessible, conceptual understanding of the techniques we cover, resorting to more technical details only if it benefits this mission. Quite naturally, this means that some parts of the guide will not contain a deep dive into technicalities that expert-level meta-analysts and statisticians may desire. Nevertheless, we include references to more advanced resources and publications in each chapter for the interested reader.

  • The contents of a book will always to some extent reflect the background and experience of its authors. We are confident that the methods we cover here are applicable and relevant to a vast range of research areas and disciplines. Nevertheless, we want to disclose that the four authors of this book are primarily versed in current research in psychology, psychiatry, medicine, and intervention research. “Real-world” use cases and examples we cover in the book therefore concentrate on topics where we know our way around. The good news is that meta-analytic methods (provided some assumptions, which we will cover) are largely agnostic to the research field from which data stem, and can be used for various types of outcome measures. Nonetheless, and despite our best intentions to make this guide applicable to as many applied research disciplines as possible, some of the methods covered in this book may still be more relevant for some research areas than for others.


Topics Covered in the Book


Among other things, this guide will cover the following topics:

  • What a meta-analysis is, and why it was invented.

  • Advantages and common problems with meta-analysis.

  • How research questions for meta-analyses are specified, and how the search for studies can be conducted.

  • How you can set up R, and a computer program which allows you to use R in a convenient way.

  • How you can import your meta-analysis data into R, and how to manipulate it through code.

  • What effect sizes are, and how they are calculated.

  • How to pool effect sizes in fixed-effect and random-effects meta-analyses.

  • How to analyze the heterogeneity of your meta-analysis, and how to explore it using subgroup analyses and meta-regression.

  • Problems with selective outcome reporting, and how to tackle them.

  • How to perform advanced types of meta-analytic techniques, such as “multilevel” meta-analysis, meta-analytic structural equation modeling, network meta-analysis, or Bayesian meta-analysis.

  • How to report your meta-analysis results, and make them reproducible.


How to Use This Book


Work Flow


This book is intended to be read in a “linear” fashion. We recommend that you start with the first chapters on meta-analysis and R basics, and then work your way through the book one chapter after another. Jumping straight to the hands-on chapters may be tempting, but it is not generally recommended. Teaching students and researchers how to perform their first meta-analyses, we found that a basic familiarity with this technique, as well as the R Studio environment, is a necessary evil to avoid frustrations later on. This is particularly true if you have no previous experience with meta-analysis and R programming. Experienced R users may skip Chapter 2, which introduces R and R Studio. However, it will certainly do no harm to work through the chapter anyway as a quick refresher.

While all chapters are virtually self-contained, we do sometimes make references to topics covered in previous chapters. Chapters in the Advanced Methods section in particular assume that you are familiar with theoretical concepts we have covered before.

The last section of this book contains helpful tools for your meta-analysis. This does not mean that these topics are the final things you have to consider when performing a meta-analysis. We simply put these chapters in the end because they primarily serve as reference works for your own meta-analysis projects. We link to these tools throughout the book in sections where they are thematically relevant.


Companion R Package


This book comes with a companion R package called {dmetar}. This package mainly serves two functions. First, it aims to make your life easier. Although there are fantastic R packages for meta-analysis out there with a vast range of functionality, there are still a few things which are currently not easy to implement in R, at least for beginners.

The {dmetar} package aims to bridge this gap by providing a few extra functions facilitating exactly those things. Secondly, the package also contains all the data sets we are using for the hands-on examples shown in this book. In chapter 2.3, the {dmetar} package is introduced in detail, and we show you how to install the package step by step. Although we will make sure that there are no substantial changes, {dmetar} is still under active development, so it may be helpful to have a look at the package website now and then to check if there are new or improved functionalities which you can use for your meta-analysis.

Although advised, it is not essential that you install the package. Wherever we make use of {dmetar} in the book, we will also provide you with the raw code for the function, or a download link to the data set we are using.


Text Boxes


Throughout the book, a set of text boxes is used.

General Note

General notes contain relevant background information, insights, anecdotes, considerations or take-home messages pertaining to the covered topic.

Important Information

These boxes contain information on caveats, problems, drawbacks or pitfalls you have to keep in mind.

Questions

After each chapter, this box will contain a few questions through which you can test your knowledge. Answers to these questions can be found at the end of the book in Appendix A.

{dmetar} Note

The {dmetar} note boxes appear whenever functions or data sets contained in the companion R package are used. These boxes also contain URLs to the function code, or data set download links for readers who did not install the package.

How Can I Report This?

These boxes contain recommendations on how you can report R output in your thesis or research article.


Conventions


A few conventions are followed throughout the book.

{packages}

All R packages are written in bold and are put into curly brackets. This is a common way to write package names in the R community.

R Code

All R code or objects we define in R are written in this monospace font.

## R Output

The same monospace font is used for the output we receive after running R code. However, we use two number signs (hashes) to differentiate it from R input.

\(Formula\)

This serif font is reserved for formulas, statistics, and other forms of mathematical notation.
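
As a small illustration of the code and output conventions described above, the following sketch shows R input followed by the output it produces. The object name `effect_sizes` and its values are made up for this example and do not come from the book's data sets.

```r
# R input: a made-up vector of three effect sizes (illustrative values only).
effect_sizes <- c(0.2, 0.4, 0.6)

# Calculate their arithmetic mean.
mean(effect_sizes)
## [1] 0.4
```

The line starting with two number signs is the output R prints to the console; everything else is code you would type yourself.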


What to Do When You Are Stuck


Undeniably, the road to doing meta-analyses in R can be a rocky path at times. Although we think this is sometimes exaggerated, R’s learning curve is steep. Statistics is hard. We did our best to make your experience of learning how to perform meta-analyses using R as painless as possible. Nevertheless, this will not shield you from being frustrated sometimes. This is only natural. We all had to start from scratch somewhere down the line. From our own experience, we can assure you that we have never met anyone who was not able to learn R, or how to do a meta-analysis. It only takes practice, and the understanding that there will be no point in time when you are “done” learning. We believe in you.

If you are looking for something a little more practical than this motivational message: here are a few things you can do once you stumble upon things that this guide cannot answer.


Do Not Panic


Making their first steps in R, many people are terrified when the first red error messages start popping up. That is not necessary. Everyone gets error messages all the time. Instead of becoming panicky or throwing your computer out the window, take a deep breath and take a closer look at the error message. Very often, it only takes a few tweaks to make the error messages disappear. Have you misspelled something in your code? Have you forgotten to close a bracket, or to put something into quotation marks?

Also, make sure that your output actually is an error message. R distinguishes between Errors, Warnings, and plain messages. Only the first means that your code could not be executed. Warnings mean that your code did run, but that something may have gone awry. Messages mean that your code did run completely, and are usually shown when a function simply wants to bring your attention to something it has done for you under the hood. For this reason, they are also called diagnostic messages.
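
The difference between the three kinds of output can be tried out in a few lines of base R. The sketch below is only an illustration: `condition_type()` is a hypothetical helper written for this example (it is not part of R or of any package), and it simply reports which kind of condition an expression signals first.

```r
# Hypothetical helper: evaluate an expression and report which kind of
# condition (if any) R signals first.
condition_type <- function(expr) {
  tryCatch(
    {
      expr      # the expression is evaluated here, inside tryCatch()
      "none"    # reached only if nothing was signalled
    },
    message = function(m) "message",
    warning = function(w) "warning",
    error   = function(e) "error"
  )
}

# A plain message: the code would have run completely.
condition_type(message("Loading data..."))

# A warning: as.numeric() still returns a value (NA), but flags a problem.
condition_type(as.numeric("abc"))

# An error: log() cannot be executed on a character string.
condition_type(log("abc"))

# Ordinary code that signals nothing at all.
condition_type(1 + 1)
```

In everyday work you would not wrap your code like this; the helper only makes visible which of the three categories a given piece of output belongs to.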


Google


A software developer friend once told the first author this joke about his profession: “A programmer is someone who can google better than Average Joe”. This observation certainly also applies to R programming. If you find yourself in a situation in which you cannot make sense of an error or warning message you receive, do not hesitate to simply copy and paste it, and do a Google search. Adding “R” to your search is often helpful to improve the results. Most content on the Internet is in English; so if your error message in R is in another language, run Sys.setenv(LANGUAGE = "en") and then rerun your code.
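
A minimal sketch of this workflow might look as follows. The failing call `log("abc")` is only a stand-in for whatever code produced your error, and the use of `try()` is an assumption made here so that the message text can be stored and copied; whether the language actually switches within a running session can depend on your R installation.

```r
# Remember the current setting so that it can be restored afterwards.
old_language <- Sys.getenv("LANGUAGE")

# Ask R to produce its messages in English.
Sys.setenv(LANGUAGE = "en")

# Rerun the failing code. try() keeps the script running and stores the error.
result <- try(log("abc"), silent = TRUE)

# Extract the text of the error message, ready to paste into a search engine.
error_text <- conditionMessage(attr(result, "condition"))
cat(error_text, "\n")

# Optionally restore the previous language setting.
Sys.setenv(LANGUAGE = old_language)
```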

There is a large R community out there, and it is very likely that someone had the same problem as you before. Google is also helpful if there is something specific you want to do with your data, but do not know what R commands you can use to do this. Even for experts, it is absolutely normal to use Google dozens of times when writing R code. Do not hesitate to do the same whenever you get stuck.


StackOverflow & CrossValidated


When searching for R-related questions on Google, you will soon find out that many of the first hits will link you to a website called StackOverflow. StackOverflow is a large community-based forum for questions related to programming in general. On StackOverflow, everyone (including you) can ask and answer questions.

In contrast to many other forums on the Internet, answers you get on StackOverflow are usually goal-oriented and helpful. If searching Google did not help you to solve your problem, addressing it there might be a good solution. However, there are a few things to keep in mind. First, when asking a question, always tag your question with [R] so that people know which programming language you are talking about. Also, run sessionInfo() in R and attach the output you get to your question. This lets people know which R and package versions you are using, and might be helpful to locate the problem.

Lastly, do not expect overwhelming kindness. Many StackOverflow users are experienced programmers who may be willing to point you at certain solutions; but do not expect anyone to solve your problem for you. It is also possible that someone will simply inform you that this topic has already been covered elsewhere, send you the link, and then move on. Nevertheless, using StackOverflow is usually the best way to get high-quality support for specific problems you are dealing with.
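
The advice above about attaching your session information can be sketched in a few lines. This is just one possible way to do it; the object names `info` and `info_text` are chosen for this example.

```r
# Collect information on the R version, operating system and loaded packages.
info <- sessionInfo()

# Turn the printed output into plain text lines that can be pasted
# at the end of a question.
info_text <- capture.output(print(info))
cat(info_text, sep = "\n")

# The R version alone is also available as a single character string.
R.version.string
```

Pasting the printed lines as a code block at the end of your question is usually enough for others to see which versions you are working with.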

StackOverflow, by the way, is primarily for questions on programming. If your question also has a statistics background, you can use CrossValidated instead. CrossValidated works like StackOverflow, but is primarily used by statistics and machine learning experts.


Contact Us


If you have the feeling that your problem has something to do with this guide itself, you can also contact us. This particularly pertains to issues with the companion R package for this guide, {dmetar}. If you have trouble installing the package, or using some of its functions, you can go to our website, where you can find ways to report your issue. When certain problems come up frequently, we usually try to have a look at them and search for fixes. Known issues will also be displayed in the Corrections & Remarks section in the online version of the guide (see Work Flow section). Please do not be disappointed if we do not answer your question personally, or if it takes long until we get back to you. We receive many questions related to meta-analysis and our package every day, so it is sometimes not possible to directly answer each and every one.


Acknowledgments


We would like to thank David Grubbs and Chapman & Hall/CRC Press for approaching us with the wonderful idea of turning our online guide into the printed book you are reading right now, and for their invaluable editorial support.

Many researchers and students have shared their feedback and experiences working with this guide with us since we began writing a preliminary online version of it in late 2018. This feedback has been incredibly valuable, and has helped us considerably to tailor this book further to the needs of the ones reading it. Thank you to all of you.

We owe a great debt of gratitude to all researchers involved in the development of the R meta-analysis infrastructure presented in this guide; but first and foremost to Guido Schwarzer and Wolfgang Viechtbauer, maintainers of the {meta} and {metafor} packages, respectively. This guide, like the whole R meta-analysis community, would not exist without your effort and dedication.

Furthermore, particular thanks go to Luke McGuinness, author of the gorgeous {robvis} package, for writing an additional chapter on Risk of Bias visualization, which you can find on this book’s companion website. Luke, we are incredibly grateful for your continuous support for this project.

Last but not least, we want to thank Lea Schuurmans and Paula Kuper for supporting us in the development and compilation of this book.


Erlangen, Amsterdam, Kyoto and Munich

Mathias, Pim, Toshi & David