Introduction
To support the safety analysis, it is quite common to define specific grouping of events. One of the most common ways is to group events or medications by a specific medical concept such as a Standard MedDRA Queries (SMQs) or WHO-Drug Standardized Drug Groupings (SDGs).
To help with the derivation of these variables, the {admiral}
function derive_vars_query() can be used. This function
takes as input the dataset (dataset) where the grouping
must occur (e.g ADAE) and a dataset containing the required
information to perform the derivation of the grouping variables
(dataset_queries).
The dataset passed to the dataset_queries argument of
the derive_vars_query() function can be created by the
create_query_data() function. For SMQs and SDGs
company-specific functions for accessing the SMQ and SDG database need
to be passed to the create_query_data() function (see the
description of the get_terms_fun argument for details).
This vignette describes the expected structure and content of the
dataset passed to the dataset_queries argument in the
derive_vars_query() function.
Structure of the Query Dataset
Variables
| Variable | Scope | Type | Example Value |
|---|---|---|---|
| VAR_PREFIX | The prefix used to define the grouping variables | Character | "SMQ01" |
| QUERY_NAME | The value provided to the grouping variables name | Character | "Immune-Mediated Guillain-Barre Syndrome" |
| TERM_LEVEL | The variable used to define the grouping. Used in conjunction with TERM_NAME | Character | "AEDECOD" |
| TERM_NAME | A term used to define the grouping. Used in conjunction with TERM_LEVEL | Character | "GUILLAIN-BARRE SYNDROME" |
| TERM_ID | A code used to define the grouping. Used in conjunction with TERM_LEVEL | Integer | 10018767 |
| QUERY_ID | Id number of the query. This could be a SMQ identifier | Integer | 20000131 |
| QUERY_SCOPE | Scope (Broad/Narrow) of the query | Character |
BROAD, NARROW, NA
|
| QUERY_SCOPE_NUM | Scope (Broad/Narrow) of the query | Integer |
1, 2, NA
|
| VERSION | The version of the dictionary | Character | "20.1" |
Bold variables are required in
dataset_queries: an error is issued if any of these
variables is missing. Other variables are optional.
The VERSION variable is not used by
derive_vars_query() but can be used to check if the
dictionary version of the queries dataset and the analysis dataset are
in line.
Required Content
Each row must be unique within the dataset.
As described above, the variables VAR_PREFIX,
QUERY_NAME, TERM_LEVEL, TERM_NAME
and TERM_ID are required. The combination of these
variables will allow the creation of the grouping variable.
Input
VAR_PREFIXmust be a character string starting with 2 or 3 letters, followed by a 2-digits number (e.g. “CQ01”).QUERY_NAMEmust be a non missing character string and it must be unique withinVAR_PREFIX.-
TERM_LEVELmust be a non missing character string.Each value in
TERM_LEVELrepresents a variable fromdatasetused to define the grouping variables (e.g.AEDECOD,AEBODSYS,AELLTCD).The function
derive_vars_query()will check that each value given inTERM_LEVELhas a corresponding variable in the inputdatasetand issue an error otherwise.Different
TERM_LEVELvariables may be specified within aVAR_PREFIX.
TERM_NAMEmust be a character string. This must be populated ifTERM_IDis missing.TERM_IDmust be an integer. This must be populated ifTERM_NAMEis missing.
Output
-
VAR_PREFIXwill be used to create the grouping variable appending the suffix “NAM”. This variable will now be referred to asABCzzNAM: the name of the grouping variable.E.g.
VAR_PREFIX == "SMQ01"will create theSMQ01NAMvariable.For each
VAR_PREFIX, a newABCzzNAMvariable is created indataset.
QUERY_NAMEwill be used to populate the correspondingABCzzNAMvariable.TERM_LEVELwill be used to identify the variables fromdatasetused to perform the grouping (e.g.AEDECOD,AEBODSYS,AELLTCD).TERM_NAME(for character variables),TERM_ID(for numeric variables) will be used to identify the records meeting the criteria indatasetbased on the variable defined inTERM_LEVEL.-
Result:
For each record in
dataset, where the variable defined byTERM_LEVELmatch a term from theTERM_NAME(for character variables) orTERM_ID(for numeric variables) in thedatasets_queries,ABCzzNAMis populated withQUERY_NAME.Note: The type (numeric or character) of the variable defined in
TERM_LEVELis checked indataset. If the variable is a character variable (e.g.AEDECOD), it is expected thatTERM_NAMEis populated, if it is a numeric variable (e.g.AEBDSYCD), it is expected thatTERM_IDis populated, otherwise an error is issued.
Example
In this example, one standard MedDRA query
(VAR_PREFIX = "SMQ01") and one customized query
(VAR_PREFIX = "CQ02") are defined to analyze the adverse
events.
The standard MedDRA query variable
SMQ01NAM[VAR_PREFIX] will be populated with “Standard Query 1” [QUERY_NAME] if any preferred term (AEDECOD) [TERM_LEVEL] indatasetis equal to “AE1” or “AE2” [TERM_NAME]The customized query (
CQ02NAM) [VAR_PREFIX] will be populated with “Query 2” [QUERY_NAME] if any Low Level Term Code (AELLTCD) [TERM_LEVEL] indatasetis equal to 10 [TERM_ID] or any preferred term (AEDECOD) [TERM_LEVEL] indatasetis equal to “AE4” [TERM_NAME].
Query Dataset (ds_query)
| VAR_PREFIX | QUERY_NAME | TERM_LEVEL | TERM_NAME | TERM_ID | |
|---|---|---|---|---|---|
| SMQ01 | Standard Query 1 | AEDECOD | AE1 | ||
| SMQ01 | Standard Query 1 | AEDECOD | AE2 | ||
| CQ02 | Query 2 | AELLTCD | 10 | ||
| CQ02 | Query 2 | AEDECOD | AE4 |
Adverse Event Dataset (ae)
| USUBJID | AEDECOD | AELLTCD |
|---|---|---|
| 0001 | AE1 | 101 |
| 0001 | AE3 | 10 |
| 0001 | AE4 | 120 |
| 0001 | AE5 | 130 |
Output Dataset
Generated by calling
derive_vars_query(dataset = ae, dataset_queries = ds_query).
| USUBJID | AEDECOD | AELLTCD | SMQ01NAM | CQ02NAM |
|---|---|---|---|---|
| 0001 | AE1 | 101 | Standard Query 1 | |
| 0001 | AE3 | 10 | Query 2 | |
| 0001 | AE4 | 120 | Query 2 | |
| 0001 | AE5 | 130 |
Subject 0001 has one event meeting the Standard Query 1 criteria
(AEDECOD = "AE1") and two events meeting the customized
query (AELLTCD = 10 and AEDECOD = "AE4").
Optional Content
When standardized MedDRA Queries are added to the dataset, it is
expected that the name of the query (ABCzzNAM) is populated
along with its number code (ABCzzCD), and its Broad or
Narrow scope (ABCzzSC).
The following variables can be added to queries_datset
to derive this information.
Input
QUERY_IDmust be an integer.QUERY_SCOPEmust be a character string. Possible values are: “BROAD”, “NARROW” orNA.QUERY_SCOPE_NUMmust be an integer. Possible values are:1,2orNA.
Output
-
QUERY_ID,QUERY_SCOPEandQUERY_SCOPE_NUMwill be used in the same way asQUERY_NAME(see here) and will help in the creation of theABCzzCD,ABCzzSCandABCzzSCNvariables.
Output Variables
These variables are optional and if not populated in
dataset_queries, the corresponding output variable will not
be created:
| VAR_PREFIX | QUERY_NAME | QUERY_ID | QUERY_SCOPE | QUERY_SCOPE_NUM | Variables created |
|---|---|---|---|---|---|
| SMQ01 | Query 1 | XXXXXXXX | NARROW | 2 |
SMQ01NAM, SMQ01CD, SMQ01SC,
SMQ01SCN
|
| SMQ02 | Query 2 | XXXXXXXX | BROAD |
SMQ02NAM, SMQ02CD,
SMQ02SC
|
|
| SMQ03 | Query 3 | XXXXXXXX | 1 |
SMQ03NAM, SMQ03CD,
SMQ03SCN
|
|
| SMQ04 | Query 4 | XXXXXXXX |
SMQ04NAM, SMQ04CD
|
||
| SMQ05 | Query 5 | SMQ05NAM |
