Chapter 2 Introduction

When deriving a Visual Predictive check (VPC) you must:

  • Have both observed and simulated datasets that include x & y variables, typically TIME & DV.

  • Compute Prediction Intervals on Simulated versus Observed Data

When deriving a VPC you may want to:

  • Stratify over variables in your model.

  • Censor data below LLOQ.

  • Perform prediction correction (pcVPC).

The tidyvpc package makes these steps fast and easy:

  • By providing readable syntax using the %>% operator from magrittr.

  • It uses efficient backend computation, taking advantage of data.table parallelization.

  • By providing traditional binning methods and new binless methods using additive quantile regression and loess for pcVPC.

  • By using ggplot2 graphics engine to visualize the results of the VPC.

This document introduces you to tidyvpc’s set of tools, and shows you how to apply them to tidyvpcobj to derive VPC.

All of the tidyvpc functions take a tidyvpcobj as the first argument, with the exception of the first function observed() in the piping chain, which takes a data.frame or data.table of the observed dataset. Rather than forcing the user to either save intermediate objects or nest functions, tidyvpc provides the %>% operator from magrittr. The result from one step is then “piped” into the next step, with the final function in the piping chain always vpcstats(). You can use the pipe to rewrite multiple operations that you can read left-to-right, top-to-bottom (reading the pipe operator as “then”).

2.1 Data

To explore the functionality of tidyvpc, we’ll use an altered version of obs_data(vpc::simple_data$obs) & sim_data(vpc::simple_data$sim) from the vpc package. These datasets contains all necessary variables to explore the functionality of tidyvpc including:

  • DV (y variable)

  • TIME (x variable)

  • NTIME (nominal time for binning on x-variable)

  • GENDER (gender variable for stratification, “M”, “F”)

  • STUDY (study for stratification, “Study A”, “Study B”)

  • PRED (prediction variable for pcVPC)

  • MDV (Missing DV)

## 1  1 0.0000000  0.0 150  150   1  0.00      M Study A
## 2  1 0.2157624 37.3   0  150   0  0.25      M Study A
## 3  1 0.4694366 62.2   0  150   0  0.50      M Study A
## 4  1 0.8271844 74.1   0  150   0  1.00      M Study A
## 5  1 1.7724895 75.1   0  150   0  1.50      M Study A
## 6  1 1.7142415 58.3   0  150   0  2.00      M Study A

2.1.1 Preprocessing data

First we’ll need to subset our data by filtering MDV == 0 which removes rows where both DV == 0 & TIME == 0.

Next we’ll add the prediction variable from the first replicate of simulated data into our observed data.

Now that we have our data ready to derive VPC, proceed to the next chapter to learn about using the various functions in the tidyvpc package.