Package 'PhenotypeR'

Title: Assess Study Cohorts Using a Common Data Model
Description: Phenotype study cohorts in data mapped to the Observational Medical Outcomes Partnership Common Data Model. Diagnostics are run at the database, code list, cohort, and population level to assess whether study cohorts are ready for research.
Authors: Edward Burn [aut, cre], Marti Catala [aut], Xihang Chen [aut], Marta Alcalde-Herraiz [aut], Albert Prats-Uribe [aut]
Maintainer: Edward Burn <[email protected]>
License: Apache License (>= 2)
Version: 0.1.0
Built: 2024-12-14 12:39:33 UTC
Source: https://github.com/ohdsi/phenotyper

Help Index


Adds the cohort_codelist attribute to a cohort

Description

'addCodelistAttribute()' lets users attach a codelist to a cohort in an OMOP CDM reference.

This is particularly important for 'codelistDiagnostics()', which assumes that the cohort it receives has a cohort_codelist attribute attached to it.

Usage

addCodelistAttribute(cohort, codelist, cohortName = names(codelist))

Arguments

cohort

Cohort table in a cdm reference

codelist

Named list of concepts

cohortName

For each element of the codelist, the name of the cohort in 'cohort' to which the codelist refers

Value

A cohort

Examples

library(PhenotypeR)

cdm <- mockPhenotypeR()

cohort <- addCodelistAttribute(cohort = cdm$my_cohort, codelist = list("cohort_1" = 1L))
attr(cohort, "cohort_codelist")

CDMConnector::cdmDisconnect(cdm = cdm)
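
When the names of the codelist do not match the cohort names, 'cohortName' states which cohort each codelist element refers to. The following is a minimal sketch rather than a definitive recipe: "my_codes" and the concept ids are hypothetical, and it assumes the mock cohort table contains a cohort called "cohort_1", as the example above suggests.

```r
library(PhenotypeR)

cdm <- mockPhenotypeR()

# "my_codes" is a hypothetical codelist name; cohortName links it to the
# cohort named "cohort_1" in cdm$my_cohort (assumed to exist in the mock data).
cohort <- addCodelistAttribute(
  cohort = cdm$my_cohort,
  codelist = list("my_codes" = c(1L, 2L)),
  cohortName = "cohort_1"
)

# Inspect the attribute that was attached
attr(cohort, "cohort_codelist")

CDMConnector::cdmDisconnect(cdm = cdm)
```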

Run codelist-level diagnostics

Description

'codelistDiagnostics()' runs phenotypeR diagnostics on the cohort_codelist attribute of the cohort, so this attribute must be populated. If it is missing, it can be populated using 'addCodelistAttribute()'.

'codelistDiagnostics()' also requires achilles tables to be present in the cdm so that concept counts can be derived.

Usage

codelistDiagnostics(cohort)

Arguments

cohort

A cohort table in a cdm reference. The cohort_codelist attribute must be populated. The cdm reference must contain achilles tables as these will be used for deriving concept counts.

Value

A summarised result

Examples

library(CohortConstructor)
library(PhenotypeR)

cdm <- mockPhenotypeR()

cdm$arthropathies <- conceptCohort(cdm,
                                   conceptSet = list("arthropathies" = c(40475132)),
                                   name = "arthropathies")

result <- codelistDiagnostics(cdm$arthropathies)

CDMConnector::cdmDisconnect(cdm = cdm)
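
For a cohort that was not built from a concept set, the codelist can be attached first and the diagnostics run afterwards. This is a sketch under assumptions: the concept id 1L is only a placeholder taken from the 'addCodelistAttribute()' example, and the mock cdm is assumed to include the achilles tables that 'codelistDiagnostics()' needs.

```r
library(PhenotypeR)

cdm <- mockPhenotypeR()

# Attach a placeholder codelist to the cohort named "cohort_1"
cohort <- addCodelistAttribute(
  cohort = cdm$my_cohort,
  codelist = list("cohort_1" = 1L)
)

# Run the codelist diagnostics on the cohort that now carries the attribute
result <- codelistDiagnostics(cohort)

CDMConnector::cdmDisconnect(cdm = cdm)
```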

Run cohort-level diagnostics

Description

Runs phenotypeR diagnostics on the cohort. The diagnostics include:

* A summary of age groups and sex.
* A summary of the visits of everyone in the cohort, using the visit_occurrence table.
* A summary of the age and sex density of the cohort.
* Attrition of the cohorts.
* Overlap between cohorts (if more than one cohort is being used).

Usage

cohortDiagnostics(cohort)

Arguments

cohort

Cohort table in a cdm reference

Value

A summarised result

Examples

library(PhenotypeR)

cdm <- mockPhenotypeR()

result <- cohortDiagnostics(cdm$my_cohort)

CDMConnector::cdmDisconnect(cdm = cdm)
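
The returned summarised result can be inspected like any other table. A minimal sketch, assuming the omopgenerics and dplyr packages are installed; which result types appear depends on the package version.

```r
library(PhenotypeR)

cdm <- mockPhenotypeR()

result <- cohortDiagnostics(cdm$my_cohort)

# Settings of the result, one row per result_id, including its result_type
omopgenerics::settings(result)

# Column overview of the summarised result itself
dplyr::glimpse(result)

CDMConnector::cdmDisconnect(cdm = cdm)
```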

Database diagnostics

Description

Runs phenotypeR diagnostics on the cdm object.

Diagnostics include:

* A snapshot of the cdm_reference object, summarising its metadata.
* A summary of the observation period table, giving overall statistics in a summarised_result object.

Usage

databaseDiagnostics(cdm)

Arguments

cdm

CDM reference

Value

A summarised result

Examples

library(PhenotypeR)

cdm <- mockPhenotypeR()

result <- databaseDiagnostics(cdm)

CDMConnector::cdmDisconnect(cdm = cdm)
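
Results from individual diagnostics can be combined into one object. The sketch below assumes that 'omopgenerics::bind()' is available for summarised results; it is an illustration, not necessarily how 'phenotypeDiagnostics()' combines its results internally.

```r
library(PhenotypeR)

cdm <- mockPhenotypeR()

databaseResult <- databaseDiagnostics(cdm)
cohortResult <- cohortDiagnostics(cdm$my_cohort)

# Combine both summarised results into a single summarised result
result <- omopgenerics::bind(databaseResult, cohortResult)

CDMConnector::cdmDisconnect(cdm = cdm)
```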

Compare characteristics of cohort matched to database population

Description

Summarises the user-supplied cohort alongside a cohort matched to it. The summary includes basic cohort characteristics, such as the number of visits in the year before cohort_start_date, as well as large-scale characteristics using the following domains of the OMOP CDM:

* condition_occurrence
* visit_occurrence
* measurement
* procedure_occurrence
* observation
* drug_exposure

Usage

matchedDiagnostics(cohort, matchedSample = 1000)

Arguments

cohort

Cohort table in a cdm reference

matchedSample

Number of people to randomly sample for matching. If NULL, no sampling will be performed.

Value

A summarised result

Examples

library(PhenotypeR)

cdm <- mockPhenotypeR()

result <- matchedDiagnostics(cdm$my_cohort)

CDMConnector::cdmDisconnect(cdm = cdm)
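
Sampling can be switched off by setting 'matchedSample' to NULL, in which case everyone in the cohort is used for matching. A minimal sketch on the mock data, which is small enough for this to be practical:

```r
library(PhenotypeR)

cdm <- mockPhenotypeR()

# matchedSample = NULL: no random sample is taken before matching
result <- matchedDiagnostics(cdm$my_cohort, matchedSample = NULL)

CDMConnector::cdmDisconnect(cdm = cdm)
```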

Create a mock cdm reference for PhenotypeR

Description

'mockPhenotypeR()' creates an example dataset that can be used to show how the package works.

Usage

mockPhenotypeR(
  nPerson = 100,
  con = DBI::dbConnect(duckdb::duckdb()),
  writeSchema = "main",
  seed = 111
)

Arguments

nPerson

Number of people in the cdm.

con

A DBI connection to create the cdm mock object.

writeSchema

Name of a schema on the same connection with write permissions.

seed

Seed to use when creating the mock data.

Value

cdm object

Examples

library(PhenotypeR)

cdm <- mockPhenotypeR()

cdm
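
The size of the mock data and the random seed can be changed through the documented arguments. A short sketch; the values 50 and 42 are arbitrary choices for illustration.

```r
library(PhenotypeR)

# A smaller mock cdm with a different seed (both values are arbitrary)
cdm <- mockPhenotypeR(nPerson = 50, seed = 42)

# Count the rows of the person table in the mock cdm
cdm$person |>
  dplyr::tally()

CDMConnector::cdmDisconnect(cdm = cdm)
```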

Phenotype a cohort

Description

Runs all the diagnostics offered in this package:

* Diagnostics on the database via 'databaseDiagnostics'.
* Diagnostics on the cohort_codelist attribute of the cohort via 'codelistDiagnostics'.
* Diagnostics on the cohort via 'cohortDiagnostics'.
* Diagnostics on the population via 'populationDiagnostics'.
* Diagnostics on the matched cohort via 'matchedDiagnostics'.

Usage

phenotypeDiagnostics(
  cohort,
  databaseDiagnostics = TRUE,
  codelistDiagnostics = TRUE,
  cohortDiagnostics = TRUE,
  populationDiagnostics = TRUE,
  populationSample = 1e+06,
  populationDateRange = as.Date(c(NA, NA)),
  matchedDiagnostics = TRUE,
  matchedSample = 1000
)

Arguments

cohort

Cohort table in a cdm reference

databaseDiagnostics

If TRUE, database diagnostics will be run.

codelistDiagnostics

If TRUE, codelist diagnostics will be run.

cohortDiagnostics

If TRUE, cohort diagnostics will be run.

populationDiagnostics

If TRUE, population diagnostics will be run.

populationSample

Number of people from the cdm to sample. If NULL, no sampling will be performed.

populationDateRange

Two dates. The first indicating the earliest cohort start date and the second indicating the latest possible cohort end date. If NULL or the first date is set as missing, the earliest observation_start_date in the observation_period table will be used for the former. If NULL or the second date is set as missing, the latest observation_end_date in the observation_period table will be used for the latter.

matchedDiagnostics

If TRUE, matched diagnostics (comparing the cohort to a matched sample of the database population) will be run.

matchedSample

Number of people to randomly sample for matching. If NULL, no sampling will be performed.

Value

A summarised result

Examples

library(PhenotypeR)

cdm <- mockPhenotypeR()

result <- phenotypeDiagnostics(cdm$my_cohort)

CDMConnector::cdmDisconnect(cdm = cdm)
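
Individual diagnostics can be skipped with the logical arguments. The sketch below runs only the database, codelist, and cohort diagnostics; which ones to skip is an arbitrary choice for illustration.

```r
library(PhenotypeR)

cdm <- mockPhenotypeR()

# Skip the population and matched diagnostics; the others keep their defaults
result <- phenotypeDiagnostics(
  cdm$my_cohort,
  populationDiagnostics = FALSE,
  matchedDiagnostics = FALSE
)

CDMConnector::cdmDisconnect(cdm = cdm)
```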

Population-level diagnostics

Description

Runs phenotypeR diagnostics on the input cohort in relation to a denominator population. Diagnostics include:

* Incidence
* Prevalence

Usage

populationDiagnostics(
  cohort,
  populationSample = 1e+06,
  populationDateRange = as.Date(c(NA, NA))
)

Arguments

cohort

Cohort table in a cdm reference

populationSample

Number of people from the cdm to sample. If NULL, no sampling will be performed.

populationDateRange

Two dates. The first indicating the earliest cohort start date and the second indicating the latest possible cohort end date. If NULL or the first date is set as missing, the earliest observation_start_date in the observation_period table will be used for the former. If NULL or the second date is set as missing, the latest observation_end_date in the observation_period table will be used for the latter.

Value

A summarised result

Examples

library(PhenotypeR)
library(dplyr)

cdm <- mockPhenotypeR()

dateStart <- cdm$my_cohort |>
  summarise(start = min(cohort_start_date, na.rm = TRUE)) |>
  pull("start")
dateEnd   <- cdm$my_cohort |>
  summarise(end = max(cohort_start_date, na.rm = TRUE)) |>
  pull("end")

result <- cdm$my_cohort |>
  populationDiagnostics(populationDateRange = c(dateStart, dateEnd))

CDMConnector::cdmDisconnect(cdm = cdm)
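
Either end of 'populationDateRange' can be left missing so that the corresponding observation period date is used. A minimal sketch that fixes only the start date and disables sampling; the choice of start date is an assumption made for illustration.

```r
library(PhenotypeR)
library(dplyr)

cdm <- mockPhenotypeR()

# Earliest cohort start date, used as the lower bound of the date range
dateStart <- cdm$my_cohort |>
  summarise(start = min(cohort_start_date, na.rm = TRUE)) |>
  pull("start")

# The second date is missing, so the latest observation_end_date is used;
# populationSample = NULL means no sampling is performed
result <- cdm$my_cohort |>
  populationDiagnostics(
    populationSample = NULL,
    populationDateRange = c(dateStart, as.Date(NA))
  )

CDMConnector::cdmDisconnect(cdm = cdm)
```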

Create a shiny app summarising your phenotyping results

Description

Creates a shiny app for any diagnostics results from phenotypeR, including:

* Diagnostics on the database via 'databaseDiagnostics'.
* Diagnostics on the cohort_codelist attribute of the cohort via 'codelistDiagnostics'.
* Diagnostics on the cohort via 'cohortDiagnostics'.
* Diagnostics on the population via 'populationDiagnostics'.
* Diagnostics on the matched cohort via 'matchedDiagnostics'.

Usage

shinyDiagnostics(result, directory, open = rlang::is_interactive())

Arguments

result

A summarised result

directory

Directory in which to save the shiny app.

open

If TRUE, the shiny app will be launched in a new session. If FALSE, the shiny app will be created but not launched.

Value

A shiny app

Examples

library(PhenotypeR)

cdm <- mockPhenotypeR()

result <- phenotypeDiagnostics(cdm$my_cohort)

shinyDiagnostics(result, tempdir())

CDMConnector::cdmDisconnect(cdm = cdm)
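
To create the app without launching it, for example in a non-interactive script, 'open' can be set to FALSE explicitly. A minimal sketch; running only the cohort diagnostics here is an arbitrary choice to keep the example short.

```r
library(PhenotypeR)

cdm <- mockPhenotypeR()

result <- cohortDiagnostics(cdm$my_cohort)

# Create the shiny app in a temporary directory without launching it
shinyDiagnostics(result, directory = tempdir(), open = FALSE)

CDMConnector::cdmDisconnect(cdm = cdm)
```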