Merge a categorization variable from a dataset to the input dataset. The observations to merge can be selected by a condition and/or by selecting the first or last observation for each by group.
Usage
derive_var_merged_cat(
  dataset,
  dataset_add,
  by_vars,
  order = NULL,
  new_var,
  source_var,
  cat_fun,
  filter_add = NULL,
  mode = NULL,
  missing_value = NA_character_
)
Arguments
- dataset
Input dataset
The variables specified by the by_vars argument are expected.
- dataset_add
Additional dataset
The variables specified by the by_vars, the source_var, and the order arguments are expected.
- by_vars
Grouping variables
The input dataset and the selected observations from the additional dataset are merged by the specified by variables. The by variables must be a unique key of the selected observations. Variables from the additional dataset can be renamed by naming the element, i.e., by_vars = exprs(<name in input dataset> = <name in additional dataset>), similar to the dplyr joins.
Permitted Values: list of variables created by exprs()
- order
Sort order
If the argument is set to a non-null value, for each by group the first or last observation from the additional dataset is selected with respect to the specified order.
Default: NULL
Permitted Values: list of variables or desc(<variable>) function calls created by exprs(), e.g., exprs(ADT, desc(AVAL)) or NULL
- new_var
New variable
The specified variable is added to the additional dataset and set to the categorized values, i.e., cat_fun(<source variable>).
- source_var
Source variable
- cat_fun
Categorization function
A function must be specified for this argument which expects the values of the source variable as input and returns the categorized values.
- filter_add
Filter for additional dataset (dataset_add)
Only observations fulfilling the specified condition are taken into account for merging. If the argument is not specified, all observations are considered.
Default: NULL
Permitted Values: a condition
- mode
Selection mode
Determines if the first or last observation is selected. If the order argument is specified, mode must be non-null.
If the order argument is not specified, the mode argument is ignored.
Default: NULL
Permitted Values: "first", "last", NULL
- missing_value
Values used for missing information
The new variable is set to the specified value for all by groups without observations in the additional dataset.
Default: NA_character_
Value
The output dataset contains all observations and variables of the
input dataset and additionally the variable specified for new_var derived
from the additional dataset (dataset_add).
Details
1. The additional dataset is restricted to the observations matching the filter_add condition.
2. The categorization variable is added to the additional dataset.
3. If order is specified, for each by group the first or last observation (depending on mode) is selected.
4. The categorization variable is merged to the input dataset.
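The steps described in Details can be approximated with plain dplyr verbs. The following is only a sketch of the semantics, not the implementation used by the package; the toy datasets `dm` and `vs` and all their values are made up for illustration, and the function's checks (for example, that the by variables form a unique key) are not reproduced.

```r
library(dplyr, warn.conflicts = FALSE)

# Toy data (hypothetical values, for illustration only)
dm <- tibble(USUBJID = c("01", "02", "03"))
vs <- tibble(
  USUBJID  = c("01", "01", "02", "02"),
  VSTESTCD = c("WEIGHT", "WEIGHT", "WEIGHT", "HEIGHT"),
  VSSEQ    = c(1, 2, 1, 2),
  VSSTRESN = c(48, 52, 95, 180)
)

# Categorization function: takes the source values, returns the categories
wgt_cat <- function(wgt) {
  case_when(
    wgt < 50 ~ "low",
    wgt > 90 ~ "high",
    TRUE ~ "normal"
  )
}

selected <- vs %>%
  filter(VSTESTCD == "WEIGHT") %>%          # step 1: restrict (filter_add)
  mutate(WGTBLCAT = wgt_cat(VSSTRESN)) %>%  # step 2: categorize (cat_fun on source_var)
  arrange(USUBJID, VSSEQ) %>%               # step 3: sort by the order variables ...
  group_by(USUBJID) %>%
  slice_tail(n = 1) %>%                     #         ... and keep the last per by group
  ungroup() %>%
  select(USUBJID, WGTBLCAT)

result <- dm %>%
  left_join(selected, by = "USUBJID") %>%   # step 4: merge to the input dataset
  # by groups without any selected observation get the missing value
  mutate(WGTBLCAT = if_else(USUBJID %in% selected$USUBJID, WGTBLCAT, "MISSING"))

result
```

In this sketch subject "01" receives the category of the last weight record, subject "02" keeps its only weight record, and subject "03", which has no record in `vs`, receives the value playing the role of missing_value.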
See also
General Derivation Functions for all ADaMs that return a variable appended to the dataset:
derive_var_extreme_flag(),
derive_var_joined_exist_flag(),
derive_var_last_dose_amt(),
derive_var_last_dose_date(),
derive_var_last_dose_grp(),
derive_var_merged_character(),
derive_var_merged_exist_flag(),
derive_var_merged_summary(),
derive_var_obs_number(),
derive_var_relative_flag(),
derive_vars_joined(),
derive_vars_last_dose(),
derive_vars_merged_lookup(),
derive_vars_merged(),
derive_vars_transposed(),
get_summary_records()
Examples
library(admiral)
library(admiral.test)
library(dplyr, warn.conflicts = FALSE)
data("admiral_dm")
data("admiral_vs")
wgt_cat <- function(wgt) {
  case_when(
    wgt < 50 ~ "low",
    wgt > 90 ~ "high",
    TRUE ~ "normal"
  )
}
derive_var_merged_cat(
  admiral_dm,
  dataset_add = admiral_vs,
  by_vars = exprs(STUDYID, USUBJID),
  order = exprs(VSDTC, VSSEQ),
  filter_add = VSTESTCD == "WEIGHT" & substr(VISIT, 1, 9) == "SCREENING",
  new_var = WGTBLCAT,
  source_var = VSSTRESN,
  cat_fun = wgt_cat,
  mode = "last"
) %>%
  select(STUDYID, USUBJID, AGE, AGEU, WGTBLCAT)
#> # A tibble: 306 x 5
#>    STUDYID      USUBJID       AGE AGEU  WGTBLCAT
#>    <chr>        <chr>       <dbl> <chr> <chr>
#>  1 CDISCPILOT01 01-701-1015    63 YEARS normal
#>  2 CDISCPILOT01 01-701-1023    64 YEARS normal
#>  3 CDISCPILOT01 01-701-1028    71 YEARS high
#>  4 CDISCPILOT01 01-701-1033    74 YEARS normal
#>  5 CDISCPILOT01 01-701-1034    77 YEARS normal
#>  6 CDISCPILOT01 01-701-1047    85 YEARS normal
#>  7 CDISCPILOT01 01-701-1057    59 YEARS NA
#>  8 CDISCPILOT01 01-701-1097    68 YEARS normal
#>  9 CDISCPILOT01 01-701-1111    81 YEARS normal
#> 10 CDISCPILOT01 01-701-1115    84 YEARS normal
#> # … with 296 more rows
# defining a value for missing VS data
derive_var_merged_cat(
  admiral_dm,
  dataset_add = admiral_vs,
  by_vars = exprs(STUDYID, USUBJID),
  order = exprs(VSDTC, VSSEQ),
  filter_add = VSTESTCD == "WEIGHT" & substr(VISIT, 1, 9) == "SCREENING",
  new_var = WGTBLCAT,
  source_var = VSSTRESN,
  cat_fun = wgt_cat,
  mode = "last",
  missing_value = "MISSING"
) %>%
  select(STUDYID, USUBJID, AGE, AGEU, WGTBLCAT)
#> # A tibble: 306 x 5
#>    STUDYID      USUBJID       AGE AGEU  WGTBLCAT
#>    <chr>        <chr>       <dbl> <chr> <chr>
#>  1 CDISCPILOT01 01-701-1015    63 YEARS normal
#>  2 CDISCPILOT01 01-701-1023    64 YEARS normal
#>  3 CDISCPILOT01 01-701-1028    71 YEARS high
#>  4 CDISCPILOT01 01-701-1033    74 YEARS normal
#>  5 CDISCPILOT01 01-701-1034    77 YEARS normal
#>  6 CDISCPILOT01 01-701-1047    85 YEARS normal
#>  7 CDISCPILOT01 01-701-1057    59 YEARS MISSING
#>  8 CDISCPILOT01 01-701-1097    68 YEARS normal
#>  9 CDISCPILOT01 01-701-1111    81 YEARS normal
#> 10 CDISCPILOT01 01-701-1115    84 YEARS normal
#> # … with 296 more rows
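
One point about wgt_cat() as defined in the examples: a missing weight matches neither comparison and therefore falls through to the TRUE branch, so it is categorized as "normal". If a missing source value should instead stay missing, the categorization function can handle NA explicitly. The following is a minimal sketch; the name `wgt_cat_na` is hypothetical and not part of the package.

```r
library(dplyr, warn.conflicts = FALSE)

# Variant of wgt_cat() that keeps missing weights missing
# (hypothetical helper, for illustration only)
wgt_cat_na <- function(wgt) {
  case_when(
    is.na(wgt) ~ NA_character_,  # checked first, before the catch-all branch
    wgt < 50 ~ "low",
    wgt > 90 ~ "high",
    TRUE ~ "normal"
  )
}

wgt_cat_na(c(45, 70, 95, NA))
#> [1] "low"    "normal" "high"   NA
```

A function like this can be passed as cat_fun in the same way as wgt_cat. Note that missing_value only applies to by groups without observations in the additional dataset, so an NA returned by the categorization function for an existing observation is not replaced by it.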
