gwas2crispr: From GWAS to CRISPR-ready files (hg38)

Overview

gwas2crispr retrieves significant genome-wide association study (GWAS) SNPs for an Experimental Factor Ontology (EFO) trait, aggregates variant/gene/study metadata, and optionally exports CSV, BED, and FASTA files for downstream functional genomics and CRISPR guide design. The package targets GRCh38/hg38.
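
To make the export formats more concrete, the base-R sketch below shows how a 1-based SNP position relates to a 0-based, half-open BED interval and to a symmetric flanking window. The helper snp_to_bed(), the toy data frame, and its column names are hypothetical illustrations, not gwas2crispr internals; the package's actual BED layout may differ.

```r
# Hypothetical helper (not part of gwas2crispr): convert 1-based SNP
# positions to BED intervals. BED is 0-based and half-open: [start, end).
snp_to_bed <- function(snps, flank_bp = 0L) {
  stopifnot(all(c("chr", "pos", "rsid") %in% names(snps)), flank_bp >= 0)
  data.frame(
    chrom = snps$chr,
    start = pmax(snps$pos - 1L - flank_bp, 0L),  # clamp at chromosome start
    end   = snps$pos + flank_bp,
    name  = snps$rsid,
    stringsAsFactors = FALSE
  )
}

# Toy input with made-up coordinates, for illustration only
snps <- data.frame(
  chr  = c("chr1", "chr2"),
  pos  = c(1000L, 150L),
  rsid = c("rs_example1", "rs_example2"),
  stringsAsFactors = FALSE
)

bed_point <- snp_to_bed(snps)                   # single-base intervals
bed_flank <- snp_to_bed(snps, flank_bp = 300L)  # 300 bp on each side
```

With a flank of 300 bp, an interval that is not clamped spans 601 bases: the SNP itself plus 300 on each side.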

Key design for CRAN compliance: functions do not write files by default. Files are written only when you set out_prefix. In examples, tests, and vignettes, point out_prefix at a path under tempdir().
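
The "write only on request" behaviour can be sketched as follows. This is a minimal illustration of the pattern, not the package's implementation: export_snps() and the toy data frame are hypothetical, and only the _snps_full.csv suffix is borrowed from the file names shown later in this vignette.

```r
# Hypothetical function illustrating "no file output unless asked".
export_snps <- function(snps, out_prefix = NULL) {
  written <- character(0)
  if (!is.null(out_prefix)) {
    csv_path <- paste0(out_prefix, "_snps_full.csv")
    utils::write.csv(snps, csv_path, row.names = FALSE)
    written <- csv_path
  }
  # Return the data together with any paths that were written
  list(data = snps, written = written)
}

toy <- data.frame(rsid = "rs_example1", p_value = 1e-8)

res_quiet <- export_snps(toy)  # out_prefix is NULL: nothing touches the disk
res_tmp   <- export_snps(toy, out_prefix = file.path(tempdir(), "demo"))
file.exists(res_tmp$written)
```

Defaulting out_prefix to NULL means a caller has to opt in to file output explicitly, which is what makes the function safe to call in checks.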

Runtime prerequisites: the GWAS Catalog client gwasrapidd is required for data retrieval; Biostrings + BSgenome.Hsapiens.UCSC.hg38 are required only if you want FASTA output.
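
Before running a retrieval, it can help to check which of these packages are available. The snippet below is a small sketch using base R's requireNamespace(); the helper is_installed() is a hypothetical convenience wrapper. Note that requireNamespace() loads a package's namespace when it is found.

```r
# Packages named above: one required, two needed only for FASTA output.
required <- "gwasrapidd"
optional <- c("Biostrings", "BSgenome.Hsapiens.UCSC.hg38")

# Hypothetical helper: named logical vector, TRUE where a package is installed.
is_installed <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

have_required <- is_installed(required)
have_optional <- is_installed(optional)

if (!all(have_required)) {
  message("Missing (needed for retrieval): ",
          paste(names(have_required)[!have_required], collapse = ", "))
}
if (!all(have_optional)) {
  message("Missing (needed only for FASTA): ",
          paste(names(have_optional)[!have_optional], collapse = ", "))
}
```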

Core functions

This vignette does not run network calls or write files (global eval = FALSE) to keep CRAN checks deterministic.

Installation

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(c("Biostrings", "BSgenome.Hsapiens.UCSC.hg38"))

install.packages("gwasrapidd")  # required for GWAS retrieval

if (!requireNamespace("devtools", quietly = TRUE))
  install.packages("devtools")
devtools::install_github("leopard0ly/gwas2crispr")

Quick examples (primary + CRAN-safe)

A) Primary workflow — write outputs to the current working directory

library(gwas2crispr)

# Lung disease (EFO_0000707), GRCh38/hg38
run_gwas2crispr(
  efo_id     = "EFO_0000707",
  p_cut      = 1e-6,
  flank_bp   = 300,
  out_prefix = "lung"   # produces: lung_snps_full.csv / lung_snps_hg38.bed / lung_snps_flank300.fa
)

B) CRAN-safe — write into a temporary directory

library(gwas2crispr)

tmp <- tempdir()  # CRAN-safe target
res <- run_gwas2crispr(
  efo_id     = "EFO_0000707",
  p_cut      = 1e-6,
  flank_bp   = 300,
  out_prefix = file.path(tmp, "lung"),  # writes here, not to user's home
  verbose    = FALSE
)

# Files written (list components or vector of paths, depending on return structure):
res$csv
res$bed
res$fasta  # present only if BSgenome/Biostrings are installed
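
Because the exact return structure may vary, another way to confirm what was written is to list the files that share the chosen prefix. The helper list_outputs() below is a hypothetical base-R sketch; it assumes the prefix contains no regular-expression metacharacters.

```r
# Hypothetical helper: list files in the prefix's directory whose names
# start with "<prefix>_".
list_outputs <- function(out_prefix) {
  list.files(
    path       = dirname(out_prefix),
    pattern    = paste0("^", basename(out_prefix), "_"),
    full.names = TRUE
  )
}

# After example B, this would list any lung_* files that were written
list_outputs(file.path(tempdir(), "lung"))
```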

CLI usage (optional)

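Rscript "$(Rscript -e "cat(system.file('scripts','gwas2crispr.R', package='gwas2crispr'))")" \
  -e EFO_0000707 -p 1e-6 -f 300 -o "${TMPDIR:-/tmp}/lung"

The -o path in the CLI should point to a temporary or user-chosen directory; the example above uses the system temporary directory ($TMPDIR, falling back to /tmp). A path obtained from tempdir() in a separate Rscript call is not suitable, because R deletes its per-session temporary directory when that process exits. Avoid writing to the package root when reproducing examples under CRAN-like conditions.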

Session info

sessionInfo()
#> R version 4.4.3 (2025-02-28 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 22621)
#> 
#> Matrix products: default
#> 
#> 
#> locale:
#> [1] LC_COLLATE=C                  LC_CTYPE=Arabic_Libya.utf8   
#> [3] LC_MONETARY=Arabic_Libya.utf8 LC_NUMERIC=C                 
#> [5] LC_TIME=Arabic_Libya.utf8    
#> 
#> time zone: Africa/Tripoli
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.37     R6_2.6.1          fastmap_1.2.0     xfun_0.52        
#>  [5] cachem_1.1.0      knitr_1.50        htmltools_0.5.8.1 rmarkdown_2.29   
#>  [9] lifecycle_1.0.4   cli_3.6.5         sass_0.4.10       jquerylib_0.1.4  
#> [13] compiler_4.4.3    rstudioapi_0.17.1 tools_4.4.3       evaluate_1.0.4   
#> [17] bslib_0.9.0       yaml_2.3.10       rlang_1.1.6       jsonlite_2.0.0
