Title: | Assess Study Cohorts Using a Common Data Model |
---|---|
Description: | Phenotype study cohorts in data mapped to the Observational Medical Outcomes Partnership Common Data Model. Diagnostics are run at the database, code list, cohort, and population level to assess whether study cohorts are ready for research. |
Authors: | Edward Burn [aut, cre] , Marti Catala [aut] , Xihang Chen [aut] , Marta Alcalde-Herraiz [aut] , Albert Prats-Uribe [aut] |
Maintainer: | Edward Burn <[email protected]> |
License: | Apache License (>= 2) |
Version: | 0.1.0 |
Built: | 2024-12-14 12:39:33 UTC |
Source: | https://github.com/ohdsi/phenotyper |
'addCodelistAttribute()' attaches a codelist to a cohort table in an OMOP CDM reference.
This matters for 'codelistDiagnostics()', which assumes that the cohort it receives has a cohort_codelist attribute attached.
addCodelistAttribute(cohort, codelist, cohortName = names(codelist))
cohort | Cohort table in a cdm reference. |
codelist | Named list of concepts. |
cohortName | For each element of the codelist, the name of the cohort in 'cohort' to which the codelist refers. |
A cohort
library(PhenotypeR)

cdm <- mockPhenotypeR()

# Attach a codelist to the cohort named "cohort_1"
cohort <- addCodelistAttribute(cohort = cdm$my_cohort, codelist = list("cohort_1" = 1L))
attr(cohort, "cohort_codelist")

CDMConnector::cdmDisconnect(cdm = cdm)
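The relationship between 'codelist' and 'cohortName' can be illustrated with a minimal base R sketch. It does not call PhenotypeR and does not reproduce how the package stores the attribute; 'flattenCodelist()' and its output columns are hypothetical names chosen for illustration. It only shows that each element of the named list is paired, by position, with one cohort name, and that the names of the list are used when 'cohortName' is not given.

```r
# Hypothetical helper, for illustration only (not part of PhenotypeR):
# flatten a named codelist into one row per (cohort name, concept id) pair.
flattenCodelist <- function(codelist, cohortName = names(codelist)) {
  stopifnot(
    is.list(codelist),
    !is.null(names(codelist)),
    length(cohortName) == length(codelist)
  )
  data.frame(
    cohort_name = rep(cohortName, lengths(codelist)),
    codelist_name = rep(names(codelist), lengths(codelist)),
    concept_id = as.integer(unlist(codelist, use.names = FALSE)),
    stringsAsFactors = FALSE
  )
}

# Two codelists; by default each one refers to the cohort with the same name
codelist <- list(cohort_1 = c(1L, 2L), cohort_2 = 3L)
flat <- flattenCodelist(codelist)
print(flat)
```

Passing an explicit 'cohortName' of the same length as 'codelist' would link each codelist to a differently named cohort.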
'codelistDiagnostics()' runs PhenotypeR diagnostics on the cohort_codelist attribute of the cohort, so this attribute must be populated. If it is missing, it can be populated using the 'addCodelistAttribute()' function.
'codelistDiagnostics()' also requires achilles tables to be present in the cdm so that concept counts can be derived.
codelistDiagnostics(cohort)
cohort | A cohort table in a cdm reference. The cohort_codelist attribute must be populated. The cdm reference must contain achilles tables as these will be used for deriving concept counts. |
A summarised result
library(CohortConstructor)
library(PhenotypeR)

cdm <- mockPhenotypeR()

# Build a concept-based cohort
cdm$arthropathies <- conceptCohort(cdm,
  conceptSet = list("arthropathies" = c(40475132)),
  name = "arthropathies"
)

result <- codelistDiagnostics(cdm$arthropathies)

CDMConnector::cdmDisconnect(cdm = cdm)
Runs PhenotypeR diagnostics on the cohort. The diagnostics include:
* A summary of age groups and sex.
* A summary of visits for everyone in the cohort, using the visit_occurrence table.
* A summary of the age and sex density of the cohort.
* Attrition of the cohorts.
* Overlap between cohorts (if more than one cohort is being used).
cohortDiagnostics(cohort)
cohort | Cohort table in a cdm reference. |
A summarised result
library(PhenotypeR)

cdm <- mockPhenotypeR()

result <- cohortDiagnostics(cdm$my_cohort)

CDMConnector::cdmDisconnect(cdm = cdm)
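To make the cohort overlap diagnostic of 'cohortDiagnostics()' more concrete, the following base R sketch counts subjects that appear in only one of two cohorts or in both. It is a toy illustration under stated assumptions: the subject ids are made up, the variable names are hypothetical, and the package's own overlap summary may be computed and reported differently.

```r
# Made-up subject ids for two cohorts (illustration only)
cohortA <- c(1, 2, 3, 4)
cohortB <- c(3, 4, 5)

# Count subjects unique to each cohort and subjects shared by both
overlap <- c(
  only_a = length(setdiff(cohortA, cohortB)),
  both = length(intersect(cohortA, cohortB)),
  only_b = length(setdiff(cohortB, cohortA))
)
print(overlap)
```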
Runs PhenotypeR diagnostics on the cdm object. Diagnostics include:
* A snapshot of the cdm_reference object, containing its metadata.
* A summary of the observation period table, with overall statistics returned in a summarised_result object.
databaseDiagnostics(cdm)
cdm | CDM reference. |
A summarised result
library(PhenotypeR)

cdm <- mockPhenotypeR()

result <- databaseDiagnostics(cdm)

CDMConnector::cdmDisconnect(cdm = cdm)
Summarises a cohort that is matched to the original cohort supplied by the user. The summary covers basic cohort characteristics, including the number of visits in the year before cohort_start_date, as well as large-scale characteristics drawn from the following OMOP CDM domains:
* condition_occurrence
* visit_occurrence
* measurement
* procedure_occurrence
* observation
* drug_exposure
matchedDiagnostics(cohort, matchedSample = 1000)
cohort | Cohort table in a cdm reference. |
matchedSample | The number of people to take a random sample for matching. If NULL, no sampling will be performed. |
A summarised result
library(PhenotypeR)

cdm <- mockPhenotypeR()

result <- matchedDiagnostics(cdm$my_cohort)

CDMConnector::cdmDisconnect(cdm = cdm)
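The behaviour described for 'matchedSample' (take a random sample of people, or skip sampling when NULL) can be sketched in base R as follows. This is not the package's implementation: 'sampleSubjects()' is a hypothetical helper, and returning everyone when the cohort is smaller than the requested sample is an assumption made for this sketch.

```r
# Hypothetical helper, for illustration only (not part of PhenotypeR)
sampleSubjects <- function(subjectIds, n = 1000) {
  ids <- unique(subjectIds)
  # NULL means no sampling; also nothing to do if the cohort is small enough
  if (is.null(n) || n >= length(ids)) {
    return(ids)
  }
  ids[sample.int(length(ids), n)]
}

set.seed(111) # fix the seed so the sample is reproducible
sampled <- sampleSubjects(1:5000, n = 1000)
everyone <- sampleSubjects(1:10, n = NULL)
length(sampled)
```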
'mockPhenotypeR()' creates an example dataset that can be used to show how the package works
mockPhenotypeR(
  nPerson = 100,
  con = DBI::dbConnect(duckdb::duckdb()),
  writeSchema = "main",
  seed = 111
)
nPerson | Number of people in the cdm. |
con | A DBI connection used to create the mock cdm object. |
writeSchema | Name of a schema on the same connection with write permissions. |
seed | Seed to use when creating the mock data. |
cdm object
library(PhenotypeR)

cdm <- mockPhenotypeR()

cdm
Runs all the diagnostics offered in this package:
* Diagnostics on the database via 'databaseDiagnostics'.
* Diagnostics on the cohort_codelist attribute of the cohort via 'codelistDiagnostics'.
* Diagnostics on the cohort via 'cohortDiagnostics'.
* Diagnostics on the population via 'populationDiagnostics'.
* Diagnostics on the matched cohort via 'matchedDiagnostics'.
phenotypeDiagnostics(
  cohort,
  databaseDiagnostics = TRUE,
  codelistDiagnostics = TRUE,
  cohortDiagnostics = TRUE,
  populationDiagnostics = TRUE,
  populationSample = 1e+06,
  populationDateRange = as.Date(c(NA, NA)),
  matchedDiagnostics = TRUE,
  matchedSample = 1000
)
cohort | Cohort table in a cdm reference. |
databaseDiagnostics | If TRUE, database diagnostics will be run. |
codelistDiagnostics | If TRUE, codelist diagnostics will be run. |
cohortDiagnostics | If TRUE, cohort diagnostics will be run. |
populationDiagnostics | If TRUE, population diagnostics will be run. |
populationSample | Number of people from the cdm to sample. If NULL, no sampling will be performed. |
populationDateRange | Two dates. The first indicates the earliest cohort start date and the second the latest possible cohort end date. If NULL or the first date is missing, the earliest observation_period_start_date in the observation_period table is used for the former. If NULL or the second date is missing, the latest observation_period_end_date in the observation_period table is used for the latter. |
matchedDiagnostics | If TRUE, cohort to population diagnostics will be run. |
matchedSample | The number of people to take a random sample for matching. If NULL, no sampling will be performed. |
A summarised result
library(PhenotypeR)

cdm <- mockPhenotypeR()

result <- phenotypeDiagnostics(cdm$my_cohort)

CDMConnector::cdmDisconnect(cdm = cdm)
Runs PhenotypeR diagnostics on the input cohort in relation to a denominator population. Diagnostics include:
* Incidence
* Prevalence
populationDiagnostics(
  cohort,
  populationSample = 1e+06,
  populationDateRange = as.Date(c(NA, NA))
)
cohort | Cohort table in a cdm reference. |
populationSample | Number of people from the cdm to sample. If NULL, no sampling will be performed. |
populationDateRange | Two dates. The first indicates the earliest cohort start date and the second the latest possible cohort end date. If NULL or the first date is missing, the earliest observation_period_start_date in the observation_period table is used for the former. If NULL or the second date is missing, the latest observation_period_end_date in the observation_period table is used for the latter. |
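The fallback rule described for 'populationDateRange' can be sketched in base R. This is an illustration of the documented rule only, not the package's code: 'resolveDateRange()' is a hypothetical helper, and the observation period dates are made up.

```r
# Hypothetical helper, for illustration only (not part of PhenotypeR):
# replace a NULL range or missing bounds with the observation period limits.
resolveDateRange <- function(dateRange, obsStart, obsEnd) {
  if (is.null(dateRange)) {
    dateRange <- as.Date(c(NA, NA))
  }
  stopifnot(inherits(dateRange, "Date"), length(dateRange) == 2)
  if (is.na(dateRange[1])) {
    dateRange[1] <- min(obsStart, na.rm = TRUE) # earliest observation start
  }
  if (is.na(dateRange[2])) {
    dateRange[2] <- max(obsEnd, na.rm = TRUE) # latest observation end
  }
  dateRange
}

# Made-up observation period start and end dates
obsStart <- as.Date(c("2000-01-01", "2005-06-15"))
obsEnd <- as.Date(c("2010-12-31", "2020-03-01"))

# Only the upper bound is supplied; the lower bound falls back to obsStart
resolved <- resolveDateRange(as.Date(c(NA, "2015-01-01")), obsStart, obsEnd)
print(resolved)
```

With the default 'as.Date(c(NA, NA))', both bounds would fall back to the observation period limits.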
A summarised result
library(PhenotypeR)
library(dplyr)

cdm <- mockPhenotypeR()

# Use the earliest and latest cohort start dates as the date range
dateStart <- cdm$my_cohort |>
  summarise(start = min(cohort_start_date, na.rm = TRUE)) |>
  pull("start")
dateEnd <- cdm$my_cohort |>
  summarise(start = max(cohort_start_date, na.rm = TRUE)) |>
  pull("start")

result <- cdm$my_cohort |>
  populationDiagnostics(populationDateRange = c(dateStart, dateEnd))

CDMConnector::cdmDisconnect(cdm = cdm)
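For readers less familiar with the two measures reported by 'populationDiagnostics()', the following base R sketch works through the arithmetic on toy data. It is a simplified illustration, not the package's estimation procedure: the data frame, its column names, and the choice of person-years and a single index date are all assumptions made for this example.

```r
# Made-up follow-up data for four people (illustration only)
followUp <- data.frame(
  person_id = 1:4,
  days_at_risk = c(365, 730, 200, 165),
  had_event = c(TRUE, FALSE, TRUE, FALSE),
  case_on_index_date = c(FALSE, FALSE, TRUE, FALSE)
)

# Incidence rate: new events divided by person-time at risk,
# expressed here per 100,000 person-years
personYears <- sum(followUp$days_at_risk) / 365.25
incidencePer100kPy <- sum(followUp$had_event) / personYears * 1e5

# Point prevalence: people who are cases on the index date
# divided by everyone in the denominator
pointPrevalence <- mean(followUp$case_on_index_date)

print(c(incidence = incidencePer100kPy, prevalence = pointPrevalence))
```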
Creates a shiny app for exploring diagnostics results from PhenotypeR, including:
* Diagnostics on the database via 'databaseDiagnostics'.
* Diagnostics on the cohort_codelist attribute of the cohort via 'codelistDiagnostics'.
* Diagnostics on the cohort via 'cohortDiagnostics'.
* Diagnostics on the population via 'populationDiagnostics'.
* Diagnostics on the matched cohort via 'matchedDiagnostics'.
shinyDiagnostics(result, directory, open = rlang::is_interactive())
result | A summarised result. |
directory | Directory where the report will be saved. |
open | If TRUE, the shiny app will be launched in a new session. If FALSE, the shiny app will be created but not launched. |
A shiny app
library(PhenotypeR)

cdm <- mockPhenotypeR()

result <- phenotypeDiagnostics(cdm$my_cohort)

# Write the shiny app to a temporary directory
shinyDiagnostics(result, tempdir())

CDMConnector::cdmDisconnect(cdm = cdm)