Behind the scenes

In previous vignettes we have seen numerous R functions that can help us to add a cohort of interest to a cdm reference. When our cdm reference is to tables in a database, as is often the code, our R code will have been translated to SQL that is run against tables in the databases (for more details on how this is all implemented see https://oxford-pharmacoepi.github.io/Tidy-R-programming-with-OMOP/).

Let’s again work with Eunomia and get the codes needed to create a set of drug cohorts.

library(CDMConnector)
library(CodelistGenerator)
library(PatientProfiles)
library(CohortConstructor)
library(dplyr)

con <- DBI::dbConnect(duckdb::duckdb(), 
                      dbdir = eunomiaDir())
cdm <- cdmFromCon(con, cdmSchema = "main", writeSchema = "main", 
                  writePrefix = "my_study_")
drug_codes <- getDrugIngredientCodes(cdm, 
                                     name = c("acetaminophen",
                                              "amoxicillin", 
                                              "diclofenac", 
                                              "simvastatin",
                                              "warfarin"))

To capture all the SQL executed as we use CohortConstructor functions we can set a global option. For this, we just need to point to a directory in which we’ll save each SQL statement run behind the scenes. Note that in this example we’re using duckdb so the SQL is for this database management system. If you were running on a different type of database the SQL would be adapted accordingly.

dir_sql <- file.path(tempdir(), "sql_folder")
dir.create(dir_sql)
options("omopgenerics.log_sql_path" = dir_sql)

cdm$drugs <- conceptCohort(cdm, 
                           conceptSet = drug_codes,
                           exit = "event_end_date",
                           name = "drugs")

# print sql in order they were saved
files <- file.info(list.files(dir_sql, full.names = TRUE))
sorted_files <- rownames(files[order(files$ctime),])
for(i in seq_along(sorted_files)) {
  cat(paste0("### ", sorted_files[i], "\n\n"))
  sql_with_quotes <- paste0('"', paste(readLines(sorted_files[i]), collapse = '\n'), '"')
  cat(sql_with_quotes, "\n```\n\n")
}

If we want even more detail, we also have the option to see the execution plan along with the SQL.

dir_explain <- file.path(tempdir(), "explain_folder")
dir.create(dir_explain)
options("omopgenerics.log_sql_explain_path" = dir_explain)

cdm$drugs <- cdm$drugs |> 
  requireIsFirstEntry()

files <- list.files(dir_explain, full.names = TRUE)
file_names <- list.files(dir_explain, full.names = FALSE)

for(i in seq_along(files)) {
  cat(paste0("### ", file_names[i], "\n\n"))
  sql_with_quotes <- paste0('"', paste(readLines(files[i]), collapse = '\n'), '"')
  cat(sql_with_quotes, "\n```\n\n")
}

mirror server hosted at Truenetwork, Russian Federation.