| Title: | Creation of Mock Observational Medical Outcomes Partnership Common Data Model |
|---|---|
| Description: | Creates mock data for testing and package development for the Observational Medical Outcomes Partnership common data model. The package offers functions crafted with pipeline-friendly implementation, enabling users to effortlessly include only the necessary tables for their testing needs. |
| Authors: | Mike Du [aut, cre] (ORCID: <https://orcid.org/0000-0002-9517-8834>), Martí Català [aut] (ORCID: <https://orcid.org/0000-0003-3308-9905>), Edward Burn [aut] (ORCID: <https://orcid.org/0000-0002-9286-1128>), Nuria Mercade-Besora [aut] (ORCID: <https://orcid.org/0009-0006-7948-3747>), Xihang Chen [aut] (ORCID: <https://orcid.org/0009-0001-8112-8959>) |
| Maintainer: | Mike Du <[email protected]> |
| License: | Apache License (>= 2) |
| Version: | 0.6.2.9000 |
| Built: | 2026-05-29 08:15:22 UTC |
| Source: | https://github.com/ohdsi/omock |
List the available datasets
availableMockDatasets()
A character vector with the available datasets.
library(omock)

availableMockDatasets()
Download an OMOP Synthetic dataset.
downloadMockDataset(datasetName = "GiBleed", path = NULL, overwrite = NULL)
datasetName |
Name of the mock dataset. See |
path |
Path where to download the dataset. |
overwrite |
Whether to overwrite the dataset if it is already downloaded. If NULL, the user is asked whether to overwrite. |
The path to the downloaded dataset.
library(omock)

isMockDatasetDownloaded("GiBleed")
downloadMockDataset("GiBleed")
isMockDatasetDownloaded("GiBleed")
Check if a certain dataset is downloaded.
isMockDatasetDownloaded(datasetName = "GiBleed")
datasetName |
Name of the mock dataset. See |
Whether the dataset is available or not.
library(omock)

isMockDatasetDownloaded("GiBleed")
downloadMockDataset("GiBleed")
isMockDatasetDownloaded("GiBleed")
Create a local cdm_reference from a dataset.
mockCdmFromDataset(
  datasetName = "GiBleed",
  source = "local",
  cdmVersion = NULL
)
datasetName |
Name of the mock dataset. See |
source |
Choice between |
cdmVersion |
Version of the OMOP CDM, either '5.3' or '5.4'. If the version is not specified by datasetName, cdmVersion defaults to '5.4'. |
A local cdm_reference object.
library(omock)

mockDatasetsFolder(tempdir())
downloadMockDataset(datasetName = "GiBleed")
cdm <- mockCdmFromDataset(datasetName = "GiBleed")
cdm
This function takes an existing CDM reference (which can be empty) and a list of additional named tables to create a more complete mock CDM object. It ensures that all provided observations fit within their respective observation periods and that all individual records are consistent with the entries in the person table. This is useful for creating reliable and realistic healthcare data simulations for development and testing within the OMOP CDM framework.
mockCdmFromTables(
  cdm = mockCdmReference(),
  tables = list(),
  maxObservationalPeriodEndDate = as.Date("01-01-2024", "%d-%m-%Y"),
  seed = NULL
)
cdm |
A local |
tables |
A named list of data frames representing additional tables to be integrated into the CDM. These tables can include both standard OMOP tables such as 'drug_exposure' or 'condition_occurrence', as well as cohort-specific tables that are not part of the standard OMOP model but are necessary for specific analyses. Each table should be named according to its intended table name in the CDM structure. |
maxObservationalPeriodEndDate |
A |
seed |
An optional integer used to set the random seed for reproducibility. If |
A modified cdm_reference object.
library(omock)
library(dplyr)

# Create a mock cohort table
cohort <- tibble(
  cohort_definition_id = c(1, 1, 2, 2, 1, 3, 3, 3, 1, 3),
  subject_id = c(1, 4, 2, 3, 5, 5, 4, 3, 3, 1),
  cohort_start_date = as.Date(c(
    "2020-04-01", "2021-06-01", "2022-05-22", "2010-01-01", "2019-08-01",
    "2019-04-07", "2021-01-01", "2008-02-02", "2009-09-09", "2021-01-01"
  )),
  cohort_end_date = cohort_start_date
)

# Generate a mock CDM from preexisting CDM structure and cohort table
cdm <- mockCdmFromTables(cdm = mockCdmReference(), tables = list(cohort = cohort))

# Access the newly integrated cohort table and the standard person table in the CDM
cdm$cohort |> glimpse()
cdm$person |> glimpse()
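The consistency guarantee described for mockCdmFromTables() can be checked directly. The sketch below builds a CDM from a three-row cohort table and tests whether every cohort subject has a row in the generated person table. The seed value and the check itself are illustrative assumptions, not part of the package documentation.

```r
library(omock)
library(dplyr)

# Small illustrative cohort table (values are made up for this sketch)
cohort <- tibble(
  cohort_definition_id = c(1, 1, 2),
  subject_id = c(1, 2, 3),
  cohort_start_date = as.Date(c("2020-04-01", "2021-06-01", "2019-08-01")),
  cohort_end_date = cohort_start_date
)

cdm <- mockCdmFromTables(
  cdm = mockCdmReference(),
  tables = list(cohort = cohort),
  seed = 1
)

# Every subject in the cohort table should also appear in the person table
personIds <- cdm$person |> pull("person_id")
subjectIds <- cdm$cohort |> pull("subject_id")
all(subjectIds %in% personIds)
```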
This function initializes an empty CDM reference with a specified name and populates it with mock vocabulary tables based on the provided vocabulary set. It is particularly useful for setting up a simulated environment for testing and development purposes within the OMOP CDM framework.
mockCdmReference(
  cdmName = "mock database",
  vocabularySet = "mock",
  conceptSet = NULL,
  includeRelated = TRUE
)
cdmName |
A character string specifying the name of the CDM object to be created. This name can be used to identify the CDM object within a larger simulation or testing framework. Default is "mock database". |
vocabularySet |
A character string specifying the name of the vocabulary set to be used when creating the vocabulary tables for the CDM. Options are "mock" or "eunomia":
|
conceptSet |
An optional numeric vector of concept IDs used to subset the vocabulary tables attached to the returned CDM. |
includeRelated |
Whether to retain vocabulary concepts directly related
to |
Returns a CDM object that is initially empty but includes mock vocabulary tables. The object structure is compliant with OMOP CDM standards, making it suitable for further population with mock data like person, visit, and observation records.
library(omock)

# Create a new empty mock CDM reference
cdm <- mockCdmReference()

# Display the structure of the newly created CDM
cdm
This function generates synthetic cohort data and adds it to a given CDM (Common Data Model) reference. It allows for creating multiple cohorts with specified properties and simulates the frequency of observations for individuals.
mockCohort(
  cdm,
  name = "cohort",
  numberCohorts = 1,
  cohortName = paste0("cohort_", seq_len(numberCohorts)),
  recordPerson = 1,
  seed = NULL
)
cdm |
A local |
name |
A string specifying the name of the table within the CDM where the cohort data will be stored. Defaults to "cohort". This name will be used to reference the new table in the CDM. |
numberCohorts |
An integer specifying the number of different cohorts to create within the table. Defaults to 1. This parameter allows for the creation of multiple cohorts, each with a unique identifier. |
cohortName |
A character vector specifying the names of the cohorts to
be created. If not provided, default names based on a sequence
(e.g., "cohort_1", "cohort_2", ...) will be generated. The length of this
vector must match the value of |
recordPerson |
An integer or a vector of integers specifying the
expected number of records per person within each cohort. If a single
integer is provided, it applies to all cohorts. If a vector is provided, its
length must match the value of |
seed |
An optional integer used to set the random seed for reproducibility. If |
A modified cdm_reference object.
library(omock)

cdm <- mockCdmReference() |>
  mockPerson(nPerson = 100) |>
  mockObservationPeriod() |>
  mockCohort(
    name = "omock_example",
    numberCohorts = 2,
    cohortName = c("omock_cohort_1", "omock_cohort_2")
  )

cdm
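The recordPerson argument also accepts one value per cohort. The following sketch requests roughly one record per person in the first cohort and two in the second, then counts the rows generated for each cohort. The seed values are arbitrary, and the realised counts depend on the simulated observation periods, so no expected output is shown.

```r
library(omock)
library(dplyr)

cdm <- mockCdmReference() |>
  mockPerson(nPerson = 100, seed = 1) |>
  mockObservationPeriod(seed = 1) |>
  mockCohort(
    name = "omock_example",
    numberCohorts = 2,
    cohortName = c("omock_cohort_1", "omock_cohort_2"),
    recordPerson = c(1, 2), # one value per cohort
    seed = 1
  )

# Count the generated records in each cohort
cdm$omock_example |>
  collect() |>
  count(cohort_definition_id)
```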
mockConcepts(cdm, conceptSet, domain = "Condition", seed = NULL)
cdm |
A local |
conceptSet |
A numeric vector of concept IDs to be added or updated in the concept table. These IDs should be unique within the context of the provided domain to avoid unintended overwriting unless that is the intended effect. |
domain |
A character string specifying the domain of the concepts being added. Only accepts "Condition", "Drug", "Measurement", or "Observation". This defines under which category the concepts fall and affects which vocabulary is used for them. |
seed |
An optional integer used to set the random seed for reproducibility. If |
This function inserts new concept entries into a specified domain within the concept table of a CDM object. It supports four domains: Condition, Drug, Measurement, and Observation. Existing entries with the same concept IDs will be overwritten, so caution should be used when adding data to prevent unintended data loss.
A modified cdm_reference object.
library(omock)
library(dplyr)

# Create a mock CDM reference and add concepts in the 'Condition' domain
cdm <- mockCdmReference() |>
  mockConcepts(
    conceptSet = c(100, 200),
    domain = "Condition"
  )

# View the updated concept entries for the 'Condition' domain
cdm$concept |>
  filter(domain_id == "Condition")
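Because existing entries with the same concept IDs are overwritten, it can help to inspect exactly the rows that were requested. This sketch adds two concepts in the "Drug" domain and filters the concept table by ID rather than by domain. The IDs 100 and 200 and the seed are arbitrary illustrative values.

```r
library(omock)
library(dplyr)

newIds <- c(100, 200)

cdm <- mockCdmReference() |>
  mockConcepts(conceptSet = newIds, domain = "Drug", seed = 1)

# Inspect only the rows for the requested concept IDs
cdm$concept |>
  filter(concept_id %in% newIds) |>
  select("concept_id", "concept_name", "domain_id")
```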
This function simulates condition occurrences for individuals within a specified cohort. It helps create a realistic dataset by generating condition records for each person, based on the number of records specified per person. The generated data are aligned with the existing observation periods to ensure that all conditions are recorded within valid observation windows.
mockConditionOccurrence(cdm, recordPerson = 1, seed = NULL)
cdm |
A local |
recordPerson |
Numeric multiplier used to determine how many condition
occurrence records to generate relative to the number of
people in |
seed |
An optional integer used to set the random seed for reproducibility. If |
A modified cdm_reference object.
library(omock)
library(dplyr)

# Create a mock CDM reference and add condition occurrences
cdm <- mockCdmReference() |>
  mockPerson() |>
  mockObservationPeriod() |>
  mockConditionOccurrence(recordPerson = 2)

# View the generated condition occurrence data
cdm$condition_occurrence |> glimpse()
These are the mock OMOP CDM Synthetic Datasets that are available to download
using the omock package.
mockDatasets
A data frame with 9 variables:
Name of the dataset.
URL to download the dataset.
Name of the cdm reference created.
OMOP CDM version of the dataset.
Size in bytes of the dataset.
Size in megabytes of the dataset.
Number of individuals in the dataset.
Total number of records in the dataset.
Number of distinct concepts in the dataset.
mockDatasets
Deprecated
mockDatasetsFolder(path = NULL)
path |
Path to a folder to store the synthetic datasets. If NULL the current OMOP_DATASETS_FOLDER is returned. |
The dataset folder.
mockDatasetsFolder()
mockDatasetsFolder(file.path(tempdir(), "OMOP_DATASETS"))
mockDatasetsFolder()
Check the availability of the OMOP CDM datasets.
mockDatasetsStatus()
A message with the availability of the datasets.
library(omock)

mockDatasetsStatus()
This function simulates death records for individuals within a specified cohort. It creates a realistic dataset by generating death records according to the specified number of records per person. The function ensures that each death record is associated with a valid person within the observation period to maintain the integrity of the data.
mockDeath(cdm, recordPerson = 1, seed = NULL)
cdm |
A local |
recordPerson |
An integer specifying the expected number of death records to generate per person. This parameter helps simulate varying frequencies of death occurrences among individuals in the cohort, reflecting the variability seen in real-world medical data. Typically, this would be set to 1 or 0, assuming most datasets would only record a single death date per individual if at all. |
seed |
An optional integer used to set the random seed for reproducibility. If |
A modified cdm_reference object.
library(omock)
library(dplyr)

# Create a mock CDM reference and add death records
cdm <- mockCdmReference() |>
  mockPerson() |>
  mockObservationPeriod() |>
  mockDeath(recordPerson = 1)

# View the generated death data
cdm$death |> glimpse()
This function simulates drug exposure records for individuals within a specified cohort. It creates a realistic dataset by generating drug exposure records based on the specified number of records per person. Each drug exposure record is correctly associated with an individual within valid observation periods, ensuring the integrity of the data.
mockDrugExposure(cdm, recordPerson = 1, seed = NULL)
cdm |
A local |
recordPerson |
An integer specifying the expected number of drug exposure records to generate per person. This parameter allows for the simulation of varying drug usage frequencies among individuals in the cohort, reflecting real-world variability in medication administration. |
seed |
An optional integer used to set the random seed for reproducibility. If |
A modified cdm_reference object.
library(omock)
library(dplyr)

# Create a mock CDM reference and add drug exposure records
cdm <- mockCdmReference() |>
  mockPerson() |>
  mockObservationPeriod() |>
  mockDrugExposure(recordPerson = 3)

# View the generated drug exposure data
cdm$drug_exposure |> glimpse()
This function simulates measurement records for individuals within a specified cohort. It creates a realistic dataset by generating measurement records based on the specified number of records per person. Each measurement record is correctly associated with an individual within valid observation periods, ensuring the integrity of the data.
mockMeasurement(cdm, recordPerson = 1, seed = NULL)
cdm |
A local |
recordPerson |
An integer specifying the expected number of measurement records to generate per person. This parameter allows for the simulation of varying frequencies of health measurements among individuals in the cohort, reflecting real-world variability in patient monitoring and diagnostic testing. |
seed |
An optional integer used to set the random seed for reproducibility. If |
A modified cdm_reference object.
library(omock)
library(dplyr)

# Create a mock CDM reference and add measurement records
cdm <- mockCdmReference() |>
  mockPerson() |>
  mockObservationPeriod() |>
  mockMeasurement(recordPerson = 5)

# View the generated measurement data
cdm$measurement |> glimpse()
This function simulates observation records for individuals within a specified cohort. It creates a realistic dataset by generating observation records based on the specified number of records per person. Each observation record is correctly associated with an individual within valid observation periods, ensuring the integrity of the data.
mockObservation(cdm, recordPerson = 1, seed = NULL)
cdm |
A local |
recordPerson |
An integer specifying the expected number of observation records to generate per person. This parameter allows for the simulation of varying frequencies of healthcare observations among individuals in the cohort, reflecting real-world variability in patient monitoring and health assessments. |
seed |
An optional integer used to set the random seed for reproducibility. If |
A modified cdm_reference object.
library(omock)
library(dplyr)

# Create a mock CDM reference and add observation records
cdm <- mockCdmReference() |>
  mockPerson() |>
  mockObservationPeriod() |>
  mockObservation(recordPerson = 3)

# View the generated observation data
cdm$observation |> glimpse()
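Since each clinical-table function takes a CDM and returns a modified CDM, the functions can be chained to build only the tables a test needs. The sketch below combines four of them in one pipeline and reports how many rows each table received. The recordPerson and seed values are arbitrary, and the row counts are not shown because they depend on the simulated data.

```r
library(omock)

cdm <- mockCdmReference() |>
  mockPerson(nPerson = 50, seed = 1) |>
  mockObservationPeriod(seed = 1) |>
  mockConditionOccurrence(recordPerson = 2, seed = 1) |>
  mockDrugExposure(recordPerson = 3, seed = 1) |>
  mockMeasurement(recordPerson = 5, seed = 1) |>
  mockObservation(recordPerson = 3, seed = 1)

# Number of rows generated in each clinical table
clinicalTables <- c(
  "condition_occurrence", "drug_exposure", "measurement", "observation"
)
sapply(clinicalTables, function(tableName) nrow(cdm[[tableName]]))
```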
This function simulates observation periods for individuals based on their date of birth recorded in the 'person' table of the CDM object. It assigns random start and end dates for each observation period within a realistic timeframe up to a specified or default maximum date.
mockObservationPeriod(cdm, seed = NULL)
cdm |
A local |
seed |
An optional integer used to set the random seed for reproducibility. If |
A modified cdm_reference object.
library(omock)
library(dplyr)

# Create a mock CDM reference and add observation periods
cdm <- mockCdmReference() |>
  mockPerson(nPerson = 100) |>
  mockObservationPeriod()

# View the generated observation period data
cdm$observation_period |> glimpse()
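A simple way to sanity-check the generated periods is to confirm that no period ends before it starts. The sketch below summarises the observation_period table with that check. The column names follow the standard OMOP observation_period table, and the seed values are arbitrary.

```r
library(omock)
library(dplyr)

cdm <- mockCdmReference() |>
  mockPerson(nPerson = 100, seed = 1) |>
  mockObservationPeriod(seed = 1)

# Summarise the periods and check that start dates never follow end dates
cdm$observation_period |>
  collect() |>
  summarise(
    n_periods = n(),
    all_ordered = all(
      observation_period_start_date <= observation_period_end_date
    )
  )
```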
This function creates a mock person table with specified characteristics for each individual, including a randomly assigned date of birth within a given range and gender based on specified proportions. It populates the CDM object's person table with these entries, ensuring each record is uniquely identified.
mockPerson(
  cdm = mockCdmReference(),
  nPerson = 10,
  birthRange = as.Date(c("1950-01-01", "2000-12-31")),
  proportionFemale = 0.5,
  seed = NULL
)
cdm |
A local |
nPerson |
An integer specifying the number of mock persons to create in the person table. This defines the scale of the simulation and allows for the creation of datasets with varying sizes. |
birthRange |
A date range within which the birthdays of the mock persons will be randomly generated.
This should be provided as a vector of two dates ( |
proportionFemale |
A numeric value between 0 and 1 indicating the proportion of the persons who are female. For example, a value of 0.5 means approximately 50% of the generated persons will be female. This helps simulate realistic demographic distributions. |
seed |
An optional integer used to set the random seed for reproducibility. If |
A modified cdm_reference object.
library(omock)
library(dplyr)

cdm <- mockPerson(cdm = mockCdmReference(), nPerson = 10)

# View the generated person data
cdm$person |> glimpse()
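The birthRange, proportionFemale, and seed arguments can be combined to get a reproducible population with a narrow age band. In this sketch, makePerson() is a hypothetical helper defined only for the example. It is called twice with the same seed so the two results can be compared. The year_of_birth column name follows the standard OMOP person table.

```r
library(omock)
library(dplyr)

# Hypothetical helper for this sketch: builds a person table with fixed settings
makePerson <- function() {
  cdm <- mockPerson(
    cdm = mockCdmReference(),
    nPerson = 50,
    birthRange = as.Date(c("1980-01-01", "1990-12-31")),
    proportionFemale = 0.7,
    seed = 42
  )
  cdm$person |> collect()
}

personA <- makePerson()
personB <- makePerson()

# Birth years should fall inside the requested range
range(personA$year_of_birth)

# The same seed should give the same birth years on both runs
identical(personA$year_of_birth, personB$year_of_birth)
```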
This function simulates procedure occurrences for individuals within a specified cohort. It helps create a realistic dataset by generating procedure records for each person, based on the number of records specified per person. The generated data are aligned with the existing observation periods to ensure that all procedures are recorded within valid observation windows.
mockProcedureOccurrence(cdm, recordPerson = 1, seed = NULL)
cdm |
A local |
recordPerson |
An integer specifying the expected number of procedure records to generate per person. This parameter allows the simulation of varying frequencies of procedure occurrences among individuals in the cohort, reflecting the variability seen in real-world medical data. |
seed |
An optional integer used to set the random seed for reproducibility. If |
A modified cdm_reference object.
library(omock)
library(dplyr)

# Create a mock CDM reference and add procedure occurrences
cdm <- mockCdmReference() |>
  mockPerson() |>
  mockObservationPeriod() |>
  mockProcedureOccurrence(recordPerson = 2)

# View the generated procedure occurrence data
cdm$procedure_occurrence |> glimpse()
mockVisitOccurrence(cdm, seed = NULL, visitDetail = FALSE)
cdm |
A local |
seed |
An optional integer used to set the random seed for reproducibility. If |
visitDetail |
Logical (TRUE/FALSE). If TRUE, the corresponding visit_detail table is also added for the mock visit occurrences created. |
A modified cdm_reference object.
library(omock)
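The example above only loads the package, so a fuller call may be useful. The sketch below adds a visit_occurrence table at the end of a pipeline, assuming that visits are derived from clinical records already present in the CDM (here, condition occurrences). That assumption, the seed values, and the choice of visitDetail = TRUE are illustrative and not stated in the documentation above.

```r
library(omock)
library(dplyr)

cdm <- mockCdmReference() |>
  mockPerson(nPerson = 20, seed = 1) |>
  mockObservationPeriod(seed = 1) |>
  mockConditionOccurrence(recordPerson = 2, seed = 1) |>
  mockVisitOccurrence(seed = 1, visitDetail = TRUE)

# List the tables now present and inspect the generated visits
names(cdm)
cdm$visit_occurrence |> glimpse()
```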
This function adds the specified vocabulary tables to a CDM object. It can either populate the tables with provided data frames or initialize empty tables if no data is provided. This is useful for setting up a testing environment with controlled vocabulary data.
mockVocabularySet(
  cdm = mockCdmReference(),
  vocabularySet = "GiBleed",
  conceptSet = NULL,
  includeRelated = TRUE,
  keepDomains = c("Unit", "Visit", "Gender")
)
cdm |
A local |
vocabularySet |
A character string that specifies a prefix or a set name used to initialize mock data tables. This allows for customization of the source data or structure names when generating vocabulary tables. |
conceptSet |
An optional numeric vector of concept IDs used to subset the loaded vocabulary tables. |
includeRelated |
Whether to retain vocabulary concepts directly related
to |
keepDomains |
Character vector of |
A modified cdm_reference object.
library(omock)

# Create a mock CDM reference and populate it with mock vocabulary tables
cdm <- mockCdmReference() |>
  mockVocabularySet(vocabularySet = "GiBleed")

# View the names of the tables added to the CDM
names(cdm)
mockVocabularyTables(
  cdm = mockCdmReference(),
  vocabularySet = "mock",
  cdmSource = NULL,
  concept = NULL,
  vocabulary = NULL,
  conceptRelationship = NULL,
  conceptSynonym = NULL,
  conceptAncestor = NULL,
  drugStrength = NULL,
  conceptSet = NULL,
  includeRelated = TRUE,
  keepDomains = c("Unit", "Visit", "Gender")
)
cdm |
A local |
vocabularySet |
A character string specifying the name of the vocabulary set to be used when creating the vocabulary tables for the CDM. Options are "mock" or "eunomia":
|
cdmSource |
An optional data frame representing the CDM source table.
If provided, it will be used directly; otherwise, a mock table will be generated based on the |
concept |
An optional data frame representing the concept table. If provided, it will be used directly; if NULL, a mock table will be generated. |
vocabulary |
An optional data frame representing the vocabulary table. If provided, it will be used directly; if NULL, a mock table will be generated. |
conceptRelationship |
An optional data frame representing the concept relationship table. If provided, it will be used directly; if NULL, a mock table will be generated. |
conceptSynonym |
An optional data frame representing the concept synonym table. If provided, it will be used directly; if NULL, a mock table will be generated. |
conceptAncestor |
An optional data frame representing the concept ancestor table. If provided, it will be used directly; if NULL, a mock table will be generated. |
drugStrength |
An optional data frame representing the drug strength table. If provided, it will be used directly; if NULL, a mock table will be generated. |
conceptSet |
An optional numeric vector of concept IDs used to subset the vocabulary after it has been assembled. When supplied, the function keeps the requested concepts and directly related vocabulary rows such as synonyms, relationships, ancestors, and drug strength records. |
includeRelated |
Whether to retain vocabulary concepts directly related
to |
keepDomains |
Character vector of |
This function adds specified vocabulary tables to a CDM object. It can either populate the tables with provided data frames or initialize empty tables if no data is provided. This is useful for setting up a testing environment with controlled vocabulary data.
A modified cdm_reference object.
library(omock)

# Create a mock CDM reference and populate it with mock vocabulary tables
cdm <- mockCdmReference() |>
  mockVocabularyTables(vocabularySet = "mock")

# View the names of the tables added to the CDM
names(cdm)
Restricts the vocabulary tables in a cdm_reference to a target concept set
while optionally retaining directly related concepts and selected
domain_id values. Non-vocabulary OMOP tables are then filtered so rows that
reference removed concepts are also dropped.
subsetVocabularyTables(
  cdm = NULL,
  conceptSet = NULL,
  cdmTables = NULL,
  includeRelated = TRUE,
  keepDomains = c("Unit", "Visit", "Gender", "Type Concept")
)
cdm |
A |
conceptSet |
Numeric vector of concept IDs to retain. |
cdmTables |
Optional named list of vocabulary tables to subset instead
of a full |
includeRelated |
Whether to retain concepts directly related to
|
keepDomains |
Character vector of |
A modified cdm_reference object.
cdm <- mockCdmFromDataset()
cdm <- cdm |>
  subsetVocabularyTables(conceptSet = c(35208414))