Parallelize 'GSVA' functions
Henrik Bengtsson
Source: vignettes/futurize-81-GSVA.md
futurize-81-GSVA.Rmd

The futurize package allows you to easily turn
sequential code into parallel code by piping the sequential code to the
futurize() function. Easy!
Introduction
This vignette demonstrates how to use this approach to parallelize the GSVA functions.
The GSVA
Bioconductor package implements gene set variation analysis, a
non-parametric, unsupervised method for estimating variation of gene set
enrichment through the samples of an expression data set. The main
function gsva() computes enrichment scores for each gene
set and sample, which can be parallelized across gene sets.
Example: Running gsva() in parallel
The gsva() function computes gene set enrichment scores
using different methods depending on the parameter object passed to
it:
library(GSVA)
# Create example data
set.seed(42)
n_genes <- 200L
n_samples <- 120L
expr <- matrix(rnorm(n_genes * n_samples), nrow = n_genes, ncol = n_samples)
rownames(expr) <- paste0("gene", seq_len(n_genes))
colnames(expr) <- paste0("sample", seq_len(n_samples))
geneSets <- list(
  geneSet1 = paste0("gene", sample(n_genes, 30L)),
  geneSet2 = paste0("gene", sample(n_genes, 50L)),
  geneSet3 = paste0("gene", sample(n_genes, 40L))
)
param <- gsvaParam(expr, geneSets)
es <- gsva(param)
Here gsva() runs sequentially, but we can easily make it
run in parallel by piping to futurize():
library(futurize)
es <- gsva(param) |> futurize()
This distributes the work across the available parallel workers, provided that we have set them up first, e.g.
plan(multisession)
The built-in multisession backend parallelizes on your
local computer and works on all operating systems. There are other parallel
backends to choose from, including alternatives that parallelize
locally as well as ones that distribute work across remote machines, e.g.
plan(future.mirai::mirai_multisession)
and
plan(future.batchtools::batchtools_slurm)
Other enrichment methods
GSVA supports multiple enrichment methods through different parameter
objects. All of them can be parallelized with
futurize():
## ssGSEA method
es <- gsva(ssgseaParam(expr, geneSets)) |> futurize()
## PLAGE method
es <- gsva(plageParam(expr, geneSets)) |> futurize()
## Combined z-score method
es <- gsva(zscoreParam(expr, geneSets)) |> futurize()
Supported Functions
The following GSVA functions are supported by
futurize():
- gsva() - requires GSVA (>= 2.4.2 or >= 2.5.7)
- gsvaRanks()
- gsvaScores()
- spatCor()
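The worker setup described earlier can be sketched roughly as follows. This is a minimal illustration that uses only plan() and nbrOfWorkers() from the future package. The choice of two workers and the variable names old_plan and n_workers are assumptions made for the example, not recommendations from this vignette. The sketch registers a multisession backend, checks how many workers are available, and then restores whatever plan was active before.

```r
library(future)

# Switch to the multisession backend; plan() returns the previous plan,
# which we keep so that it can be restored later.
# Two workers is an illustrative choice, not a recommendation.
old_plan <- plan(multisession, workers = 2L)

# Number of parallel workers available to futurize()-piped calls
n_workers <- nbrOfWorkers()
print(n_workers)

# ... run the parallelized code here, e.g.
# es <- gsva(param) |> futurize()

# Restore the previous plan, which also shuts down the background workers
plan(old_plan)
```

Restoring the previous plan at the end keeps the example from leaving background R sessions running once the parallel work is done.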