Derive Query Variables
Arguments
- dataset
Input dataset.
- dataset_queries
A dataset containing the required columns VAR_PREFIX, QUERY_NAME, TERM_LEVEL, TERM_NAME, TERM_ID, and the optional columns QUERY_ID, QUERY_SCOPE, QUERY_SCOPE_NUM. The content of the dataset will be verified by assert_valid_queries(). create_query_data() can be used to create the dataset.
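As a rough illustration of the expected shape, a query dataset could be put together by hand as shown here. The object name my_queries and every value in it are hypothetical, and the column check is only a stand-in: it does not reproduce what assert_valid_queries() verifies.

```r
# Column names taken from the argument description above
required_cols <- c("VAR_PREFIX", "QUERY_NAME", "TERM_LEVEL", "TERM_NAME", "TERM_ID")

# Hypothetical query dataset; all values are made up for illustration
my_queries <- data.frame(
  VAR_PREFIX      = c("CQ01", "SMQ02", "SMQ02"),
  QUERY_NAME      = c("Hypothetical custom query", "Hypothetical SMQ", "Hypothetical SMQ"),
  QUERY_ID        = c(NA_integer_, 20000001L, 20000001L),
  QUERY_SCOPE     = c(NA_character_, "BROAD", "BROAD"),
  QUERY_SCOPE_NUM = c(NA_real_, 1, 1),
  TERM_LEVEL      = c("AEDECOD", "AEDECOD", "AELLTCD"),
  TERM_NAME       = c("Some query", "Alveolar proteinosis", NA_character_),
  TERM_ID         = c(NA_integer_, NA_integer_, 1L),
  stringsAsFactors = FALSE
)

# Simple presence check for the required columns
missing_cols <- setdiff(required_cols, names(my_queries))
stopifnot(length(missing_cols) == 0)
```

In practice create_query_data() is the documented way to build this dataset; the manual version only shows which columns are involved.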
Details
This function can be used to derive CDISC variables such as SMQzzNAM, SMQzzCD, SMQzzSC, SMQzzSCN, and CQzzNAM in ADAE and ADMH, and variables such as SDGzzNAM, SDGzzCD, and SDGzzSC in ADCM. An example usage of this function can be found in the OCCDS vignette.
A query dataset is expected as an input to this function. See the Queries Dataset Documentation vignette for descriptions, or call data("queries") for an example of a query dataset.
For each unique element in VAR_PREFIX, the corresponding "NAM" variable will be created. For each unique VAR_PREFIX, if QUERY_ID is not "" or NA, then the corresponding "CD" variable is created; similarly, if QUERY_SCOPE is not "" or NA, then the corresponding "SC" variable will be created; if QUERY_SCOPE_NUM is not "" or NA, then the corresponding "SCN" variable will be created.
For each record in dataset, the "NAM" variable takes the value of QUERY_NAME if the value of TERM_NAME or TERM_ID in dataset_queries matches the value of the variable named in the respective TERM_LEVEL in dataset.
Note that TERM_NAME in dataset_queries may be NA only when TERM_ID is non-NA, and vice versa.
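The matching rule for the "NAM" variable can be sketched in base R as follows. This is a simplified illustration, not the function's actual implementation: the data are hypothetical, the comparison is an exact equality test, and any normalisation the function may apply before comparing is not modelled.

```r
# Hypothetical input dataset
adae <- data.frame(
  USUBJID = c("01", "02", "03"),
  AEDECOD = c("Alveolar proteinosis", "Basedow's disease", "Some query"),
  AELLTCD = c(NA_integer_, 1L, NA_integer_),
  stringsAsFactors = FALSE
)

# Hypothetical query dataset: one term matched by ID, one by name
queries <- data.frame(
  VAR_PREFIX = c("SMQ02", "CQ01"),
  QUERY_NAME = c("Hypothetical SMQ", "Hypothetical custom query"),
  TERM_LEVEL = c("AELLTCD", "AEDECOD"),
  TERM_NAME  = c(NA_character_, "Some query"),
  TERM_ID    = c(1L, NA_integer_),
  stringsAsFactors = FALSE
)

for (i in seq_len(nrow(queries))) {
  q <- queries[i, ]
  nam_var <- paste0(q$VAR_PREFIX, "NAM")
  # Create the "NAM" column on first use, filled with NA
  if (is.null(adae[[nam_var]])) adae[[nam_var]] <- NA_character_
  # Use TERM_NAME when it is available, otherwise fall back to TERM_ID
  term <- if (is.na(q$TERM_NAME)) q$TERM_ID else q$TERM_NAME
  # Compare against the column of adae named in TERM_LEVEL
  hit <- !is.na(adae[[q$TERM_LEVEL]]) & adae[[q$TERM_LEVEL]] == term
  adae[[nam_var]][hit] <- q$QUERY_NAME
}

adae[, c("USUBJID", "SMQ02NAM", "CQ01NAM")]
```

Here subject "02" picks up SMQ02NAM through AELLTCD, and subject "03" picks up CQ01NAM through AEDECOD; all other cells stay NA.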
The "CD", "SC", and "SCN" variables are derived accordingly based on QUERY_ID, QUERY_SCOPE, and QUERY_SCOPE_NUM respectively, whenever not missing.
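To make the rules for which variables get created more concrete, the sketch below lists the variable names one would expect for a given query dataset. The helper names is_present() and expected_vars() are hypothetical, the data are made up, and the sketch assumes that a variable is created for a prefix when at least one of its rows has a non-missing value; the function's own behaviour may differ in detail.

```r
# Hypothetical query dataset, reduced to the columns that matter here
queries <- data.frame(
  VAR_PREFIX      = c("SMQ02", "SMQ02", "CQ01"),
  QUERY_ID        = c(20000001L, 20000001L, NA),
  QUERY_SCOPE     = c("BROAD", "BROAD", NA),
  QUERY_SCOPE_NUM = c(1, 1, NA),
  stringsAsFactors = FALSE
)

# TRUE when a value is neither NA nor the empty string
is_present <- function(x) !is.na(x) & x != ""

# Variable names expected per prefix (assumption: "any row present" rule)
expected_vars <- function(q) {
  unlist(lapply(unique(q$VAR_PREFIX), function(prefix) {
    rows <- q[q$VAR_PREFIX == prefix, ]
    c(
      paste0(prefix, "NAM"),
      if (any(is_present(rows$QUERY_ID))) paste0(prefix, "CD"),
      if (any(is_present(rows$QUERY_SCOPE))) paste0(prefix, "SC"),
      if (any(is_present(rows$QUERY_SCOPE_NUM))) paste0(prefix, "SCN")
    )
  }))
}

vars <- expected_vars(queries)
vars
```

With this input the SMQ02 prefix yields all four variables, while CQ01 yields only CQ01NAM because its identifier and scope columns are missing.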
See also
create_query_data(), assert_valid_queries()
OCCDS Functions: derive_var_trtemfl(), derive_vars_atc(), get_terms_from_db()
Examples
library(admiral) # provides derive_vars_query() and the example `queries` dataset
library(tibble)
data("queries")

# Small example ADAE dataset
adae <- tribble(
  ~USUBJID, ~ASTDTM, ~AETERM, ~AESEQ, ~AEDECOD, ~AELLT, ~AELLTCD,
  "01", "2020-06-02 23:59:59", "ALANINE AMINOTRANSFERASE ABNORMAL",
  3, "Alanine aminotransferase abnormal", NA_character_, NA_integer_,
  "02", "2020-06-05 23:59:59", "BASEDOW'S DISEASE",
  5, "Basedow's disease", NA_character_, 1L,
  "03", "2020-06-07 23:59:59", "SOME TERM",
  2, "Some query", "Some term", NA_integer_,
  "05", "2020-06-09 23:59:59", "ALVEOLAR PROTEINOSIS",
  7, "Alveolar proteinosis", NA_character_, NA_integer_
)

# Add the query variables defined in `queries` to `adae`
derive_vars_query(adae, queries)
#> # A tibble: 4 x 24
#> USUBJID ASTDTM AETERM AESEQ AEDECOD AELLT AELLTCD SMQ02NAM SMQ02CD SMQ02SC
#> <chr> <chr> <chr> <dbl> <chr> <chr> <int> <chr> <int> <chr>
#> 1 01 2020-0… ALANINE… 3 Alanine… NA NA NA NA NA
#> 2 02 2020-0… BASEDOW… 5 Basedow… NA 1 NA NA NA
#> 3 03 2020-0… SOME TE… 2 Some qu… Some… NA NA NA NA
#> 4 05 2020-0… ALVEOLA… 7 Alveola… NA NA NA NA NA
#> # … with 14 more variables: SMQ02SCN <dbl>, SMQ03NAM <chr>, SMQ03CD <int>,
#> # SMQ03SC <chr>, SMQ03SCN <dbl>, SMQ05NAM <chr>, SMQ05CD <int>,
#> # SMQ05SC <chr>, SMQ05SCN <dbl>, CQ01NAM <chr>, CQ04NAM <chr>, CQ04CD <int>,
#> # CQ06NAM <chr>, CQ06CD <int>