
Add expected records as new observations for each 'by group' in which the dataset is missing them.

Usage

derive_expected_records(
  dataset,
  dataset_expected_obs,
  by_vars = NULL,
  set_values_to = NULL
)

Arguments

dataset

Input dataset

A data frame. The variables in dataset_expected_obs and the variables specified by the by_vars argument are expected to be in the dataset.

dataset_expected_obs

Expected observations dataset

Data frame with the expected observations, e.g., all the expected combinations of PARAMCD, PARAM, AVISIT, AVISITN, ...

by_vars

Grouping variables

For each group defined by by_vars, the observations from dataset_expected_obs that have no corresponding observation in the input dataset are added to the output dataset.

set_values_to

Variables to be set

The specified variables are set to the specified values for the new observations.

A list of variable name-value pairs is expected.

  • LHS refers to a variable.

  • RHS refers to the value to assign to the variable. This can be a string, a symbol, a numeric value or NA, e.g., exprs(PARAMCD = "TDOSE", PARCAT1 = "OVERALL"). More general expressions are not allowed.
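To illustrate the kinds of right-hand side listed above, the following sketch builds such a list with rlang::exprs() and inspects how each value is captured. It assumes the rlang package is installed; the variable names AVALC, SRCSEQ and SRCVIS are hypothetical and used only for illustration.

```r
library(rlang)

# Hypothetical variables: a string, an NA, a numeric value and a symbol
new_values <- exprs(
  DTYPE = "DERIVED",
  AVALC = NA_character_,
  SRCSEQ = 1,
  SRCVIS = AVISIT
)

# Strings, numbers and NA are captured as constants, a symbol as a name
vapply(new_values, function(x) class(x)[1], character(1))

# A more general expression is captured as a call, which is the kind of
# value the description above says is not allowed
general <- exprs(AVAL = AVAL * 2)
is.call(general$AVAL)
```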

Value

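The input dataset with the missing expected observations added for each by group. Note that a variable is populated in the new records only if it is part of dataset_expected_obs or is specified in by_vars or set_values_to. All other variables (e.g., AVAL in the examples below) are NA in the new records.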

Details

For each group (the variables specified in the by_vars parameter), those records from dataset_expected_obs that are missing in the input dataset are added to the output dataset.
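The logic described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the function's actual implementation: the helper add_expected_sketch() is hypothetical, it takes by_vars as a character vector rather than a list of expressions, and it does not handle set_values_to.

```r
# Hypothetical helper illustrating the idea; not admiral's implementation
add_expected_sketch <- function(dataset, expected, by_vars) {
  dataset <- as.data.frame(dataset)
  expected <- as.data.frame(expected)

  # One row per by group present in the input
  groups <- unique(dataset[, by_vars, drop = FALSE])

  # Expected records for each group: join on the shared variables,
  # or build the Cartesian product if there are none
  shared <- intersect(names(groups), names(expected))
  wanted <- merge(groups, expected, by = shared)

  # Keep only the expected records without a match in the input
  keys <- c(by_vars, setdiff(names(wanted), by_vars))
  key_of <- function(df) {
    do.call(paste, c(unname(as.list(df[keys])), sep = "\r"))
  }
  missing <- wanted[!key_of(wanted) %in% key_of(dataset), , drop = FALSE]

  # Variables not covered by the keys stay NA in the new records
  for (v in setdiff(names(dataset), keys)) {
    missing[[v]] <- rep(NA, nrow(missing))
  }

  # Append the new records and sort by the key variables
  out <- rbind(dataset, missing[, names(dataset), drop = FALSE])
  out <- out[do.call(order, unname(as.list(out[keys]))), , drop = FALSE]
  rownames(out) <- NULL
  out
}

adqs_df <- data.frame(
  USUBJID = c("1", "1", "2", "2"),
  PARAMCD = c("a", "b", "a", "b"),
  AVISITN = c(1, 1, 2, 2),
  AVISIT = c("WEEK 1", "WEEK 1", "WEEK 2", "WEEK 2"),
  AVAL = c(10, 11, 12, 14)
)
visit_ref <- data.frame(AVISITN = c(1, 2), AVISIT = c("WEEK 1", "WEEK 2"))

result <- add_expected_sketch(adqs_df, visit_ref, c("USUBJID", "PARAMCD"))
result
```

In the sketch, the variables shared between the by groups and the expected observations decide whether the expected records apply to every group or only to matching groups, which corresponds to the parameter-independent and parameter-dependent cases in the examples.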

Examples

library(admiral)
library(tibble)

adqs <- tribble(
  ~USUBJID, ~PARAMCD, ~AVISITN, ~AVISIT, ~AVAL,
  "1",      "a",             1, "WEEK 1",   10,
  "1",      "b",             1, "WEEK 1",   11,
  "2",      "a",             2, "WEEK 2",   12,
  "2",      "b",             2, "WEEK 2",   14
)

# Example 1. visit variables are parameter independent
parm_visit_ref <- tribble(
  ~AVISITN, ~AVISIT,
  1,        "WEEK 1",
  2,        "WEEK 2"
)

derive_expected_records(
  dataset = adqs,
  dataset_expected_obs = parm_visit_ref,
  by_vars = exprs(USUBJID, PARAMCD),
  set_values_to = exprs(DTYPE = "DERIVED")
)
#> # A tibble: 8 x 6
#>   USUBJID PARAMCD AVISITN AVISIT  AVAL DTYPE  
#>   <chr>   <chr>     <dbl> <chr>  <dbl> <chr>  
#> 1 1       a             1 WEEK 1    10 NA     
#> 2 1       a             2 WEEK 2    NA DERIVED
#> 3 1       b             1 WEEK 1    11 NA     
#> 4 1       b             2 WEEK 2    NA DERIVED
#> 5 2       a             1 WEEK 1    NA DERIVED
#> 6 2       a             2 WEEK 2    12 NA     
#> 7 2       b             1 WEEK 1    NA DERIVED
#> 8 2       b             2 WEEK 2    14 NA     

# Example 2. visit variables are parameter dependent
parm_visit_ref <- tribble(
  ~PARAMCD, ~AVISITN, ~AVISIT,
  "a",             1, "WEEK 1",
  "a",             2, "WEEK 2",
  "b",             1, "WEEK 1"
)

derive_expected_records(
  dataset = adqs,
  dataset_expected_obs = parm_visit_ref,
  by_vars = exprs(USUBJID, PARAMCD),
  set_values_to = exprs(DTYPE = "DERIVED")
)
#> # A tibble: 7 x 6
#>   USUBJID PARAMCD AVISITN AVISIT  AVAL DTYPE  
#>   <chr>   <chr>     <dbl> <chr>  <dbl> <chr>  
#> 1 1       a             1 WEEK 1    10 NA     
#> 2 1       a             2 WEEK 2    NA DERIVED
#> 3 1       b             1 WEEK 1    11 NA     
#> 4 2       a             1 WEEK 1    NA DERIVED
#> 5 2       a             2 WEEK 2    12 NA     
#> 6 2       b             1 WEEK 1    NA DERIVED
#> 7 2       b             2 WEEK 2    14 NA