Skip to contents

Run a function on the CpGs in bedfiles across genomic regions. Currently supported functions are aggregate and average. Parallelized across files using threads from the "iscream.threads" option.

Usage

region_map(
  bedfiles,
  regions,
  fun = "aggregate",
  bismark = FALSE,
  mval = TRUE,
  set_region_rownames = FALSE,
  nthreads = NULL
)

Arguments

bedfiles

A vector of bedfile paths

regions

A vector of genomic regions strings. If a named vector is provided, the names will be used in the feature column instead of the genomic regions string

fun

Function to apply over the region. See details.

bismark

hether the input is a bismark coverage (or BSbolt Bedgraph) file

mval

Whether to calculate the M value (coverage \(\times \beta\)) or use the beta value when applying the function.

set_region_rownames

Use the region strings as the returned data frame's rownames. Can be useful if you have a named vector of regions and want both the rownames and the feature names.

nthreads

Set number of threads to use overriding the "iscream.threads" option. See ?set_threads for more information.

Value

A data.frame

Details

Available functions:

  • "aggregate" sums the values in the region with aggregated beta values if mval = FALSE and aggregated M values if mval = TRUE

  • "average" averages the values in the region with average beta values if mval = FALSE and average M values if mval = TRUE

Examples

bedfiles <- system.file("extdata", package = "iscream") |>
  list.files(pattern = "[a|b|c|d].bed.gz$", full.names = TRUE)
# examine the bedfiles
colnames <- c("chr", "start", "end", "beta", "coverage")
lapply(bedfiles, function(i) knitr::kable(read.table(i, col.names = colnames)))
#> [[1]]
#> 
#> 
#> |chr  | start| end| beta| coverage|
#> |:----|-----:|---:|----:|--------:|
#> |chr1 |     1|   2|  1.0|        1|
#> |chr1 |     3|   4|  1.0|        1|
#> |chr1 |     5|   6|  0.0|        2|
#> |chr1 |     7|   8|  0.0|        1|
#> |chr1 |     9|  10|  0.5|        2|
#> |chr1 |    11|  12|  1.0|        2|
#> |chr1 |    13|  14|  1.0|        3|
#> 
#> [[2]]
#> 
#> 
#> |chr  | start| end| beta| coverage|
#> |:----|-----:|---:|----:|--------:|
#> |chr1 |     1|   2|    0|        2|
#> |chr1 |     5|   6|    1|        2|
#> |chr1 |     7|   8|    1|        1|
#> |chr1 |    11|  12|    0|        2|
#> |chr1 |    13|  14|    1|        1|
#> 
#> [[3]]
#> 
#> 
#> |chr  | start| end| beta| coverage|
#> |:----|-----:|---:|----:|--------:|
#> |chr1 |     3|   4|    1|        2|
#> |chr1 |     7|   8|    0|        2|
#> |chr1 |     9|  10|    1|        1|
#> 
#> [[4]]
#> 
#> 
#> |chr  | start| end| beta| coverage|
#> |:----|-----:|---:|----:|--------:|
#> |chr1 |     1|   2|  1.0|        1|
#> |chr1 |     3|   4|  1.0|        2|
#> |chr1 |     7|   8|  0.0|        1|
#> |chr1 |     9|  10|  0.5|        2|
#> |chr1 |    13|  14|  1.0|        1|
#> 

# make a vector of regions
regions <- c("chr1:1-6", "chr1:7-10", "chr1:11-14")
region_map(bedfiles, regions)
#> [16:50:07.825190] [iscream::region_map] [info] Aggregating 3 regions from 4 bedfiles
#> 
#>       Feature Cell coverage M
#> 1    chr1:1-6    a        4 2
#> 2   chr1:7-10    a        3 1
#> 3  chr1:11-14    a        5 5
#> 4    chr1:1-6    b        4 2
#> 5   chr1:7-10    b        1 1
#> 6  chr1:11-14    b        3 1
#> 7    chr1:1-6    c        2 2
#> 8   chr1:7-10    c        3 1
#> 9  chr1:11-14    c        0 0
#> 10   chr1:1-6    d        3 3
#> 11  chr1:7-10    d        3 1
#> 12 chr1:11-14    d        1 1
region_map(bedfiles, regions, mval = FALSE)
#> [16:50:07.827519] [iscream::region_map] [info] Aggregating 3 regions from 4 bedfiles
#> 
#>       Feature Cell coverage beta
#> 1    chr1:1-6    a        4  2.0
#> 2   chr1:7-10    a        3  0.5
#> 3  chr1:11-14    a        5  2.0
#> 4    chr1:1-6    b        4  1.0
#> 5   chr1:7-10    b        1  1.0
#> 6  chr1:11-14    b        3  1.0
#> 7    chr1:1-6    c        2  1.0
#> 8   chr1:7-10    c        3  1.0
#> 9  chr1:11-14    c        0  0.0
#> 10   chr1:1-6    d        3  2.0
#> 11  chr1:7-10    d        3  0.5
#> 12 chr1:11-14    d        1  1.0
region_map(bedfiles, regions, fun = "average")
#> [16:50:07.829643] [iscream::region_map] [info] Averaging 3 regions from 4 bedfiles
#> 
#>       Feature Cell coverage         M
#> 1    chr1:1-6    a 1.333333 0.6666667
#> 2   chr1:7-10    a 1.500000 0.5000000
#> 3  chr1:11-14    a 2.500000 2.5000000
#> 4    chr1:1-6    b 2.000000 1.0000000
#> 5   chr1:7-10    b 1.000000 1.0000000
#> 6  chr1:11-14    b 1.500000 0.5000000
#> 7    chr1:1-6    c 2.000000 2.0000000
#> 8   chr1:7-10    c 1.500000 0.5000000
#> 9  chr1:11-14    c 0.000000 0.0000000
#> 10   chr1:1-6    d 1.500000 1.5000000
#> 11  chr1:7-10    d 1.500000 0.5000000
#> 12 chr1:11-14    d 1.000000 1.0000000