Chapter 3 Missing data

Basic analysis of missing data

Missing data overview

The analysis of missing data reveals critical gaps for some indicators and years.

Number of countries/economies with collected data:

## [1] 194

Some countries have data only for basic economic indicators such as GDP per capita. Which rule should we use to decide which series to impute, and how many missing values are acceptable?

  • Countries with at least 3 data points available in every series:
## [1] 69
  • Countries with at least 2 data points available in every series:
## [1] 107
  • Countries with at least 1 data point available in every series:
## [1] 129
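
The counts above can be reproduced with a simple filter. The following is a minimal Python sketch, assuming the collected data is stored in long format as (ISO3, series, year, value) tuples with `None` for missing values. The function name `countries_meeting_rule` and the toy records are hypothetical and are not taken from the report's own code.

```python
from collections import defaultdict

def countries_meeting_rule(records, all_series, min_points):
    """Return countries with at least `min_points` non-missing values in every series."""
    counts = defaultdict(lambda: defaultdict(int))
    countries = set()
    for iso3, series, year, value in records:
        countries.add(iso3)
        if value is not None:
            counts[iso3][series] += 1
    return sorted(
        c for c in countries
        if all(counts[c][s] >= min_points for s in all_series)
    )

# Toy data, invented for illustration only.
records = [
    ("AAA", "S1", 2019, 1.0), ("AAA", "S1", 2020, 1.1),
    ("AAA", "S2", 2019, 5.0), ("AAA", "S2", 2020, 5.5),
    ("BBB", "S1", 2019, 2.0), ("BBB", "S1", 2020, None),
    ("BBB", "S2", 2020, 7.0),
    ("CCC", "S1", 2019, 3.0), ("CCC", "S2", 2019, None),
]
all_series = ["S1", "S2"]

for k in (2, 1):
    print(k, countries_meeting_rule(records, all_series, k))
```

Lowering `min_points` can only add countries to the selection, which matches the pattern of the reported counts (69, 107, 129).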

Number of fully missing series by country (countries with no fully missing series are not listed):

##    ISO3 count_NA
## 1   BIH        1
## 2   DJI        1
## 3   FJI        1
## 4   GNB        1
## 5   JAM        1
## 6   KHM        1
## 7   LBN        1
## 8   MKD        1
## 9   MLT        1
## 10  NZL        1
## 11  PNG        1
## 12  QAT        1
## 13  SGP        1
## 14  STP        1
## 15  SWZ        1
## 16  SYR        1
## 17  TLS        1
## 18  VNM        1
## 19  AFG        2
## 20  BHR        2
## 21  CPV        2
## 22  HTI        2
## 23  KWT        2
## 24  OMN        2
## 25  SAU        2
## 26  SLB        2
## 27  SOM        2
## 28  YEM        2
## 29  ZMB        2
## 30  BHS        3
## 31  BRB        3
## 32  GNQ        3
## 33  GUY        3
## 34  LCA        3
## 35  SUR        3
## 36  TON        3
## 37  TTO        3
## 38  VEN        3
## 39  VUT        3
## 40  WSM        3
## 41  BLZ        4
## 42  BRN        4
## 43  LBY        4
## 44  SSD        4
## 45  VCT        4
## 46  CUB        5
## 47  PSE        5
## 48  HKG        6
## 49  TKM        7
## 50  ERI        8
## 51  MAC       10
## 52  SYC       13
## 53  DMA       14
## 54  KIR       14
## 55  GRD       15
## 56  MHL       17
## 57  PLW       17
## 58  TUV       18
## 59  ABW       20
## 60  GUM       20
## 61  BMU       21
## 62  CYM       21
## 63  AND       22
## 64  SMR       22
## 65  CUW       24
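
A table of this kind can be derived by counting, for each country, the series that have no observed value at all. The sketch below is in Python and assumes the same hypothetical long format of (ISO3, series, year, value) tuples; `fully_missing_counts` and the toy records are illustrative names, not the report's code. The final check uses only figures stated in this chapter: 194 countries in total, 65 of them listed with at least one fully missing series, leaving the 129 countries that have at least 1 data point in every series.

```python
def fully_missing_counts(records, all_series):
    """Map each country to its number of series without any non-missing value.

    Countries with no fully missing series are left out of the result.
    """
    observed = {}  # ISO3 -> set of series with at least one observed value
    for iso3, series, year, value in records:
        observed.setdefault(iso3, set())
        if value is not None:
            observed[iso3].add(series)
    counts = {c: len(set(all_series) - seen) for c, seen in observed.items()}
    return {c: n for c, n in counts.items() if n > 0}

# Toy data, invented for illustration only.
records = [
    ("AAA", "S1", 2020, 1.0), ("AAA", "S2", 2020, 5.0),
    ("BBB", "S1", 2020, 2.0), ("BBB", "S2", 2020, None),
    ("CCC", "S1", 2020, None),
]
missing = fully_missing_counts(records, ["S1", "S2"])
print(missing)

# Consistency check on the figures reported in this chapter.
total_countries = 194
countries_with_fully_missing_series = 65
complete_countries = total_countries - countries_with_fully_missing_series
print(complete_countries)
```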

Countries with exactly one fully missing series, and the series that is missing:

##    ISO3    Series_missing
## 1   BIH SE.ENR.SECO.FM.ZS
## 2   DJI       SE.SEC.ENRR
## 3   FJI      FB_BNK_ACCSS
## 4   GNB      FB_BNK_ACCSS
## 5   JAM       MAR_AGE_MAL
## 6   KHM       SI.POV.LMIC
## 7   LBN SE.ENR.SECO.FM.ZS
## 8   MKD       MAR_AGE_MAL
## 9   MLT NY.ADJ.NNTY.PC.KD
## 10  NZL       SI.POV.LMIC
## 11  PNG      FB_BNK_ACCSS
## 12  QAT             PALMA
## 13  SGP       SI.POV.LMIC
## 14  STP      FB_BNK_ACCSS
## 15  SWZ    LP.LPI.OVRL.XQ
## 16  SYR       EG_EGY_PRIM
## 17  TLS      FB_BNK_ACCSS
## 18  VNM SE.ENR.SECO.FM.ZS

By indicator

Indicators with lowest coverage include:

  • Indicator 2.7: Proportion of adults (15 years and older) with an account at a financial institution or with a mobile money-service company.
  • Indicator 3.8: Average marriage age by sex
  • Indicator 2.1: Logistics performance indicator
  • Indicator 2.6: Universal health coverage - only available for a few years.
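
One way to rank indicators by coverage is to count, for each series, the countries that have at least one observed value. The Python sketch below assumes the hypothetical long format of (ISO3, series, year, value) tuples; the function name and the toy records are illustrative only, and the report may define coverage differently (for example, by year).

```python
def countries_with_data_by_series(records):
    """Return (series, number of countries with data) pairs, lowest coverage first."""
    seen = {}  # series -> set of countries with at least one observed value
    for iso3, series, year, value in records:
        seen.setdefault(series, set())
        if value is not None:
            seen[series].add(iso3)
    return sorted(
        ((s, len(countries)) for s, countries in seen.items()),
        key=lambda item: (item[1], item[0]),
    )

# Toy data, invented for illustration only.
records = [
    ("AAA", "S1", 2020, 1.0), ("BBB", "S1", 2020, 2.0),
    ("AAA", "S2", 2020, 5.0), ("BBB", "S2", 2020, None),
    ("CCC", "S3", 2020, None),
]
coverage = countries_with_data_by_series(records)
print(coverage)
```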

By country

Large economies such as China and India were not included in previous editions due to missing data. Data availability and coverage have since improved enough for both countries to be included.

3.0.1 China

China was not included in the SDG Pulse analysis. What is the situation with missing data?

  • PALMA ratio available for 2 data points

3.0.2 India

India was not included in the SDG Pulse analysis. What is the situation with missing data?

  • PALMA ratio available for 3 data points

3.0.3 Other most populated countries (United States, Indonesia, Pakistan, Nigeria)

All four are included in the SDG Pulse 2022.

Data availability for other countries is available on SharePoint.

A comparison between collected and imputed data by country is also available on SharePoint.

SDG Pulse 2022

Number of countries included in the SDG Pulse 2022:

## [1] 97

Countries included in the SDG Pulse 2022 but not selected under the rule of “at least 2 data points available”:

##  [1] "Algeria"            "Australia"         
##  [3] "Bhutan"             "Myanmar"           
##  [5] "El Salvador"        "Djibouti"          
##  [7] "Iceland"            "Jordan"            
##  [9] "Nigeria"            "Panama"            
## [11] "Paraguay"           "Russian Federation"
## [13] "Singapore"          "Thailand"          
## [15] "Türkiye"            "Uruguay"

Countries included in the SDG Pulse 2022 but not selected under the rule of “at least 1 data point available”:

## [1] "Djibouti"  "Singapore"

Proposed decision rule:

  • keep countries with “at least 1 data point available” in every series
  • check the selection against the countries included in the SDG Pulse 2022
  • all problematic countries are resolved except Singapore
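
The check against the SDG Pulse 2022 list amounts to a set difference. The Python sketch below uses invented placeholder codes and assumes the comparison direction is "countries in the reference list that a rule fails to select"; `not_selected` is a hypothetical helper, not part of the report's code.

```python
def not_selected(reference, selected):
    """Return reference-list countries that are missing from the rule's selection."""
    return sorted(set(reference) - set(selected))

# Placeholder country codes, invented for illustration only.
pulse_countries = ["AAA", "BBB", "CCC", "DDD"]
selected_min_2 = ["AAA"]                # stricter rule: at least 2 data points
selected_min_1 = ["AAA", "BBB", "CCC"]  # looser rule: at least 1 data point

gap_min_2 = not_selected(pulse_countries, selected_min_2)
gap_min_1 = not_selected(pulse_countries, selected_min_1)
print(gap_min_2)
print(gap_min_1)
```

Because every country selected under the stricter rule is also selected under the looser one, the gap can only shrink when the rule is relaxed.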

Countries to investigate

[Figure output: 16 plots; images not reproduced here.]

Rows with missing values removed by `geom_point()` in plots 1 to 16: 241, 256, 239, 275, 169, 294, 189, 233, 245, 207, 185, 211, 263, 199, 211, 170.