gwas2crispr: From GWAS to CRISPR-ready Files

Overview

gwas2crispr prepares genome-wide association study (GWAS) results for downstream clustered regularly interspaced short palindromic repeats (CRISPR) workflows.

The package retrieves significant single-nucleotide polymorphisms (SNPs) for an Experimental Factor Ontology (EFO) trait from the EMBL-EBI GWAS Catalog REST API v2 and returns CRISPR-ready outputs for the GRCh38/hg38 human genome build.

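The main outputs are a CSV table of the retrieved SNPs, a BED file of their GRCh38/hg38 positions, and, when the optional genome packages are installed, a FASTA file of flanking sequences.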

Installation

Install from CRAN:

install.packages("gwas2crispr")

Optional packages for FASTA output:

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

BiocManager::install(c(
  "Biostrings",
  "GenomeInfoDb",
  "BSgenome.Hsapiens.UCSC.hg38"
))
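Before requesting FASTA output, it can help to confirm that the three optional packages are actually installed. The following is a minimal sketch using base R only; the variable names (fasta_pkgs, available, fasta_ready) are illustrative and not part of gwas2crispr.

```r
# Optional Bioconductor packages listed above for FASTA output.
fasta_pkgs <- c("Biostrings", "GenomeInfoDb", "BSgenome.Hsapiens.UCSC.hg38")

# requireNamespace() returns TRUE/FALSE without attaching the package.
available <- vapply(fasta_pkgs, requireNamespace, logical(1), quietly = TRUE)

# TRUE only when every optional package can be loaded.
fasta_ready <- all(available)

if (!fasta_ready) {
  message("Missing optional packages: ",
          paste(names(available)[!available], collapse = ", "))
}
```

If fasta_ready is FALSE, the CSV and BED outputs should still be usable; only the FASTA step depends on these packages.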

Development version:

if (!requireNamespace("devtools", quietly = TRUE))
  install.packages("devtools")

devtools::install_github("leopard0ly/gwas2crispr")

Fetch GWAS associations

library(gwas2crispr)

gwas_data <- fetch_gwas(
  efo_id  = "EFO_0000707",
  p_cut   = 1e-6,
  verbose = FALSE
)

names(gwas_data)
head(gwas_data$associations)
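To illustrate what a p-value cut-off such as p_cut = 1e-6 does, here is a small sketch on a toy table. The column names (rsid, p_value) and the values are made up for illustration and are not the package's actual schema; whether fetch_gwas() compares with < or <= is likewise an assumption.

```r
# Toy association table; hypothetical columns, not the gwas2crispr schema.
toy <- data.frame(
  rsid    = c("rs1", "rs2", "rs3", "rs4"),
  p_value = c(5e-9, 2e-7, 3e-6, 0.01)
)

p_cut <- 1e-6

# Keep associations at or below the threshold (assumed comparison).
kept <- toy[toy$p_value <= p_cut, ]

kept$rsid
```

A stricter threshold such as 5e-8, the conventional genome-wide significance level, would keep fewer rows.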

Run without writing files

By default, no files are written; run_gwas2crispr() returns its results in memory.

res <- run_gwas2crispr(
  efo_id     = "EFO_0000707",
  p_cut      = 1e-6,
  flank_bp   = 300,
  out_prefix = NULL,
  verbose    = FALSE
)

res$summary
head(res$snps_full)
head(res$bed)
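The coordinate arithmetic behind a BED record and a flank_bp window can be sketched as follows. BED intervals are 0-based and half-open, so a single base at 1-based position pos spans [pos - 1, pos). The SNP below is a toy value, and the assumption that the window is symmetric around the SNP is mine rather than something documented here.

```r
# Hypothetical SNP at a 1-based position (toy values, not real data).
snp <- data.frame(chrom = "chr1", pos = 1000000L, rsid = "rs_example")
flank_bp <- 300L

# Single-base BED interval: 0-based start, exclusive end.
bed <- data.frame(
  chrom = snp$chrom,
  start = snp$pos - 1L,
  end   = snp$pos,
  name  = snp$rsid
)

# Assumed symmetric window in 1-based inclusive coordinates,
# clamped so it never starts before the first base.
win_start <- max(1L, snp$pos - flank_bp)
win_end   <- snp$pos + flank_bp
win_width <- win_end - win_start + 1L
```

Under these assumptions, flank_bp = 300 gives a 601 bp window: 300 bases on each side plus the SNP itself.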

Write files safely

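To write output files, pass a path prefix as out_prefix. In examples and tests, build the prefix under tempdir() so that nothing is written to the user's working directory.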

out_prefix <- file.path(tempdir(), "lung")

res <- run_gwas2crispr(
  efo_id     = "EFO_0000707",
  p_cut      = 1e-6,
  flank_bp   = 300,
  out_prefix = out_prefix,
  verbose    = FALSE
)

res$written

Expected output paths:

paste0(out_prefix, "_snps_full.csv")
paste0(out_prefix, "_snps_hg38.bed")
paste0(out_prefix, "_snps_flank300.fa")

The FASTA file is created only when the optional genome packages are available.
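Because the FASTA file is optional, a quick check of which expected paths exist can be useful after a run. This is a minimal sketch with base R; the suffixes are copied from the paths listed above, and the names expected and status are illustrative.

```r
out_prefix <- file.path(tempdir(), "lung")

# Suffixes taken from the expected output paths for flank_bp = 300.
expected <- paste0(
  out_prefix,
  c("_snps_full.csv", "_snps_hg38.bed", "_snps_flank300.fa")
)

# One row per expected file, with a flag showing whether it was written.
status <- data.frame(
  file   = basename(expected),
  exists = file.exists(expected)
)

status
```

A FALSE in the FASTA row alone most likely means the optional genome packages were not available.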

Output structure

names(res)

Common outputs:

res$summary
res$snps_full
res$bed
res$fasta
res$written

Session information

sessionInfo()
#> R version 4.4.3 (2025-02-28 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 22621)
#> 
#> Matrix products: default
#> 
#> 
#> locale:
#> [1] LC_COLLATE=C                  LC_CTYPE=Arabic_Libya.utf8   
#> [3] LC_MONETARY=Arabic_Libya.utf8 LC_NUMERIC=C                 
#> [5] LC_TIME=Arabic_Libya.utf8    
#> 
#> time zone: Africa/Tripoli
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.39     R6_2.6.1          fastmap_1.2.0     xfun_0.56        
#>  [5] cachem_1.1.0      knitr_1.51        htmltools_0.5.9   rmarkdown_2.30   
#>  [9] lifecycle_1.0.5   cli_3.6.5         sass_0.4.10       jquerylib_0.1.4  
#> [13] compiler_4.4.3    rstudioapi_0.18.0 tools_4.4.3       evaluate_1.0.5   
#> [17] bslib_0.10.0      yaml_2.3.10       otel_0.2.0        jsonlite_2.0.0   
#> [21] rlang_1.1.6
