
Add the first or last observation of each by group as a new observation. This can be used, for example, to add the maximum or minimum value as a separate visit. All variables of the selected observation are kept. This distinguishes derive_extreme_records() from derive_summary_records(), where only the by variables are populated for the new records.

Usage

derive_extreme_records(
  dataset,
  by_vars = NULL,
  order,
  mode,
  check_type = "warning",
  filter = NULL,
  set_values_to
)

Arguments

dataset

Input dataset

The variables specified by the order and by_vars parameters are expected.

by_vars

Grouping variables

Default: NULL

Permitted Values: list of variables created by exprs()

order

Sort order

Within each by group the observations are ordered by the specified order.

Permitted Values: list of variables or desc(<variable>) function calls created by exprs(), e.g., exprs(ADT, desc(AVAL))

mode

Selection mode (first or last)

If "first" is specified, the first observation of each by group is added to the input dataset. If "last" is specified, the last observation of each by group is added to the input dataset.

Permitted Values: "first", "last"
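When the order variables contain ties, selecting the first observation of a descending sort is not the same as selecting the last observation of an ascending sort. The following is a minimal sketch of that difference using plain dplyr verbs as a stand-in for the selection step; it is not the function's own implementation, and the data mirror the example dataset used further down this page.

```r
library(dplyr, warn.conflicts = FALSE)
library(tibble)

adlb <- tribble(
  ~USUBJID, ~AVISITN, ~AVAL, ~LBSEQ,
  "1",      1,        113,   1,
  "1",      2,        113,   2,
  "1",      3,        117,   3,
  "2",      1,        101,   1,
  "2",      2,        101,   2,
  "2",      3,         95,   3
)

# Analogue of order = exprs(desc(AVAL), AVISITN), mode = "first":
# among tied maxima, the earliest visit wins.
first_desc <- adlb %>%
  arrange(USUBJID, desc(AVAL), AVISITN) %>%
  group_by(USUBJID) %>%
  slice(1) %>%
  ungroup()

# Analogue of order = exprs(AVAL, AVISITN), mode = "last":
# among tied maxima, the latest visit wins.
last_asc <- adlb %>%
  arrange(USUBJID, AVAL, AVISITN) %>%
  group_by(USUBJID) %>%
  slice(n()) %>%
  ungroup()

first_desc$LBSEQ
last_asc$LBSEQ
```

For subject "2" the maximum of 101 occurs at two visits, so the two approaches pick different source records even though AVAL is identical.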

check_type

Check uniqueness?

If "warning" or "error" is specified, a message of that type is issued if the observations of the input dataset are not unique with respect to the by variables and the order.

Default: "warning"

Permitted Values: "none", "warning", "error"
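To illustrate what "not unique with respect to the by variables and the order" means, the sketch below counts records per combination of by variable and order variable with dplyr. It only demonstrates the condition being checked; how the function performs the check internally is not shown here, and the data are a small made-up example.

```r
library(dplyr, warn.conflicts = FALSE)
library(tibble)

adlb <- tribble(
  ~USUBJID, ~AVISITN, ~AVAL,
  "1",      1,        113,
  "1",      2,        113,
  "1",      3,        117,
  "2",      1,        101,
  "2",      2,        101,
  "2",      3,         95
)

# With by_vars = exprs(USUBJID) and order = exprs(AVAL) alone,
# some records share the same USUBJID and AVAL, so the order is ambiguous.
dups_aval <- adlb %>%
  count(USUBJID, AVAL) %>%
  filter(n > 1)

# Adding AVISITN to the order makes every record unique.
dups_aval_visit <- adlb %>%
  count(USUBJID, AVAL, AVISITN) %>%
  filter(n > 1)

nrow(dups_aval)
nrow(dups_aval_visit)
```

Extending the order with a tie-breaking variable, as the examples on this page do with AVISITN, avoids the ambiguity.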

filter

Filter for observations to consider

Only observations fulfilling the specified condition are taken into account for selecting the first or last observation. If the parameter is not specified, all observations are considered.

Default: NULL

Permitted Values: a condition

set_values_to

Variables to be set

The specified variables are set to the specified values for the new observations.

A list of variable name-value pairs is expected.

  • LHS refers to a variable.

  • RHS refers to the value to assign to the variable. This can be a string, a symbol, a numeric value or NA, e.g., exprs(PARAMCD = "TDOSE", PARCAT1 = "OVERALL"). More general expressions are not allowed.
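As a small illustration of the name-value pairs described above, the sketch below builds such a list with rlang::exprs() and inspects the type of each right-hand side. It assumes the rlang package is available; the variable names AVALU and AVALCAT1 are hypothetical and only serve to show an NA and a symbol.

```r
library(rlang)

values <- exprs(
  PARAMCD = "TDOSE",   # string
  AVISITN = 97,        # numeric value
  AVALU   = NA,        # NA (hypothetical variable)
  AVALC   = AVALCAT1   # symbol referring to another variable (hypothetical)
)

names(values)
is.character(values$PARAMCD)
is.numeric(values$AVISITN)
is.na(values$AVALU)
is.symbol(values$AVALC)
```

The result is an ordinary named list, so the left-hand sides are the list names and the right-hand sides are stored unevaluated.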

Value

The input dataset with the first or last observation of each by group added as new observations.

Details

  1. The input dataset is restricted as specified by the filter parameter.

  2. For each group (with respect to the variables specified for the by_vars parameter) the first or last observation (with respect to the order specified for the order parameter and the mode specified for the mode parameter) is selected.

  3. The variables specified by the set_values_to parameter are added to the selected observations.

  4. The observations are added to the input dataset.
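The four steps above can be sketched with plain dplyr verbs. This is an approximation for illustration only, not the function's actual implementation; it hard-codes by_vars = exprs(USUBJID), order = exprs(AVAL, AVISITN), mode = "first", and the same filter and set_values_to as the first example below.

```r
library(dplyr, warn.conflicts = FALSE)
library(tibble)

adlb <- tribble(
  ~USUBJID, ~AVISITN, ~AVAL, ~LBSEQ,
  "1",      1,        113,   1,
  "1",      2,        113,   2,
  "1",      3,        117,   3,
  "2",      1,        101,   1,
  "2",      2,        101,   2,
  "2",      3,         95,   3
)

new_records <- adlb %>%
  filter(!is.na(AVAL)) %>%                   # step 1: restrict by filter
  arrange(USUBJID, AVAL, AVISITN) %>%        # step 2: sort within by group
  group_by(USUBJID) %>%
  slice(1) %>%                               #         mode = "first"
  ungroup() %>%
  mutate(AVISITN = 97, DTYPE = "MINIMUM")    # step 3: set_values_to

# step 4: append the new records to the input dataset
result <- bind_rows(adlb, new_records)
result
```

Because the original records have no DTYPE variable, bind_rows() fills it with NA for them, which matches the output shown in the examples.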

Examples

library(tibble)
library(admiral)

adlb <- tribble(
  ~USUBJID, ~AVISITN, ~AVAL, ~LBSEQ,
  "1",      1,          113,      1,
  "1",      2,          113,      2,
  "1",      3,          117,      3,
  "2",      1,          101,      1,
  "2",      2,          101,      2,
  "2",      3,           95,      3
)

# Add a new record for each USUBJID storing the minimum value (first AVAL).
# If multiple records meet the minimum criterion, take the first value by
# AVISITN. Set AVISITN = 97 and DTYPE = MINIMUM for these new records.
derive_extreme_records(
  adlb,
  by_vars = exprs(USUBJID),
  order = exprs(AVAL, AVISITN),
  mode = "first",
  filter = !is.na(AVAL),
  set_values_to = exprs(
    AVISITN = 97,
    DTYPE = "MINIMUM"
  )
)
#> # A tibble: 8 x 5
#>   USUBJID AVISITN  AVAL LBSEQ DTYPE  
#>   <chr>     <dbl> <dbl> <dbl> <chr>  
#> 1 1             1   113     1 NA     
#> 2 1             2   113     2 NA     
#> 3 1             3   117     3 NA     
#> 4 2             1   101     1 NA     
#> 5 2             2   101     2 NA     
#> 6 2             3    95     3 NA     
#> 7 1            97   113     1 MINIMUM
#> 8 2            97    95     3 MINIMUM

# Add a new record for each USUBJID storing the maximum value (first AVAL in
# descending order). If multiple records meet the maximum criterion, take the
# first value by AVISITN. Set AVISITN = 98 and DTYPE = MAXIMUM for these new
# records.
derive_extreme_records(
  adlb,
  by_vars = exprs(USUBJID),
  order = exprs(desc(AVAL), AVISITN),
  mode = "first",
  filter = !is.na(AVAL),
  set_values_to = exprs(
    AVISITN = 98,
    DTYPE = "MAXIMUM"
  )
)
#> # A tibble: 8 x 5
#>   USUBJID AVISITN  AVAL LBSEQ DTYPE  
#>   <chr>     <dbl> <dbl> <dbl> <chr>  
#> 1 1             1   113     1 NA     
#> 2 1             2   113     2 NA     
#> 3 1             3   117     3 NA     
#> 4 2             1   101     1 NA     
#> 5 2             2   101     2 NA     
#> 6 2             3    95     3 NA     
#> 7 1            98   117     3 MAXIMUM
#> 8 2            98   101     1 MAXIMUM

# Add a new record for each USUBJID storing the last value (by AVISITN).
# Set AVISITN = 99 and DTYPE = LOV for these new records.
derive_extreme_records(
  adlb,
  by_vars = exprs(USUBJID),
  order = exprs(AVISITN),
  mode = "last",
  set_values_to = exprs(
    AVISITN = 99,
    DTYPE = "LOV"
  )
)
#> # A tibble: 8 x 5
#>   USUBJID AVISITN  AVAL LBSEQ DTYPE
#>   <chr>     <dbl> <dbl> <dbl> <chr>
#> 1 1             1   113     1 NA   
#> 2 1             2   113     2 NA   
#> 3 1             3   117     3 NA   
#> 4 2             1   101     1 NA   
#> 5 2             2   101     2 NA   
#> 6 2             3    95     3 NA   
#> 7 1            99   117     3 LOV  
#> 8 2            99    95     3 LOV