| Title: | Read bigWig and bigBed Files |
|---|---|
| Description: | Read bigWig and bigBed files using "libBigWig" <https://github.com/dpryan79/libBigWig>. Provides lightweight access to the binary bigWig and bigBed formats developed by the UCSC Genome Browser group. |
| Authors: | Jay Hesselberth [aut, cre], RNA Bioscience Initiative [fnd, cph], Devon Ryan [cph] |
| Maintainer: | Jay Hesselberth <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.3.1.9000 |
| Built: | 2026-06-26 13:16:13 UTC |
| Source: | https://github.com/rnabioco/cpp11bigwig |
Reads the bigBed header without loading any intervals. This is useful for
identifying the BED variant a file holds before reading it: a genuine BED12
has defined_field_count == 12, whereas a bed9+3 file (9 standard BED
columns plus 3 custom fields) has defined_field_count == 9 and
field_count == 12.
    bigbed_info(bbfile)
bbfile |
path or URL for a bigBed file. Remote files
( |
A named list with elements version, n_chroms, field_count,
defined_field_count, n_bases_covered, and autosql (the embedded
autoSql schema string, or "" when the file has none).
    bb <- system.file("extdata", "test.bb", package = "cpp11bigwig")
    info <- bigbed_info(bb)
    info$defined_field_count
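The BED-variant check described for `bigbed_info()` can be sketched in base R. The helper name `describe_bed_variant()` is hypothetical and not part of cpp11bigwig, and the two lists are hand-built stand-ins that only mimic the `field_count` and `defined_field_count` elements the function is documented to return.

```r
# Hypothetical helper (not part of cpp11bigwig): label a file as "bedN" or
# "bedN+M" from the two field counts reported in the bigBed header.
describe_bed_variant <- function(info) {
  defined <- as.integer(info$defined_field_count)
  extra <- as.integer(info$field_count) - defined
  if (extra > 0L) {
    sprintf("bed%d+%d", defined, extra)
  } else {
    sprintf("bed%d", defined)
  }
}

# Hand-built stand-ins for bigbed_info() results (assumed values).
bed12_info <- list(field_count = 12L, defined_field_count = 12L)
bed9_plus3_info <- list(field_count = 12L, defined_field_count = 9L)

describe_bed_variant(bed12_info)
describe_bed_variant(bed9_plus3_info)
```

With a real file, the list returned by `bigbed_info()` would presumably be passed in place of the stand-ins.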
Reads the bigWig header without loading any intervals. The summary
statistics (min, max, mean, std) are the file-level values stored in
the header and computed over all covered bases.
    bigwig_info(bwfile)
bwfile |
path or URL for a bigWig file. Remote files
( |
A named list with elements version, n_levels, n_chroms,
n_bases_covered, min, max, mean, and std.
    bw <- system.file("extdata", "test.bw", package = "cpp11bigwig")
    bigwig_info(bw)
Columns are automatically typed based on the autoSql schema embedded
in the bigBed file. Integer types (uint, int) become R integers,
floating point types (float, double) become R doubles, and all
other types (including array types like int[blockCount]) remain
as character strings.
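As a rough illustration of this typing rule, the mapping can be written as a small base R function. `autosql_to_r_type()` is a hypothetical name and only mirrors the rule as stated here; how the package implements the conversion is not shown in this manual.

```r
# Hypothetical helper mirroring the documented rule: uint/int -> integer,
# float/double -> double, everything else (including arrays) -> character.
autosql_to_r_type <- function(types) {
  ifelse(
    types %in% c("uint", "int"), "integer",
    ifelse(types %in% c("float", "double"), "double", "character")
  )
}

# A few autoSql type names chosen for illustration.
autosql_types <- c("string", "uint", "float", "int[blockCount]", "char[1]")
data.frame(autosql = autosql_types, r_type = autosql_to_r_type(autosql_types))
```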
    read_bigbed(bbfile, chrom = NULL, start = NULL, end = NULL)
bbfile |
path or URL for a bigBed file. Remote files
( |
chrom |
chromosome(s) to read. Either a character vector of
chromosome names, or a GenomicRanges::GRanges of query regions (in
which case |
start |
start position(s) for data. May be a vector describing
several ranges, recycled against |
end |
end position(s) for data. May be a vector describing
several ranges, recycled against |
When a bigBed file has no embedded autoSql schema (for example one
produced by bedToBigBed without -as), columns are still recovered
using the standard BED field names (name, score, strand,
thickStart, thickEnd, itemRgb, blockCount, blockSizes,
blockStarts) derived from the file's field counts. Any additional
(bedN+) fields beyond the standard BED columns are returned as
generic fieldN character columns. Because those names are inferred
rather than declared by the file, a message() is emitted in this
case; silence it with base::suppressMessages().
A tibble.
https://github.com/dpryan79/libBigWig
https://github.com/brentp/bw-python
    bb <- system.file("extdata", "test.bb", package = "cpp11bigwig")
    read_bigbed(bb)
    read_bigbed(bb, chrom = "chr10")
    # query several chromosomes in one call
    read_bigbed(bb, chrom = c("chr1", "chr10"))
    # restrict each query to a window
    read_bigbed(bb, chrom = c("chr1", "chr10"), start = c(0, 0), end = c(5e6, 5e6))
    # pass a GRanges of regions; 1-based coords are converted automatically
    gr <- GenomicRanges::GRanges(
      c("chr1", "chr10"),
      IRanges::IRanges(start = 1, width = 1e7)
    )
    read_bigbed(bb, chrom = gr)
Read data from bigWig files.
    read_bigwig(bwfile, chrom = NULL, start = NULL, end = NULL, as = NULL, fill = 0)
bwfile |
path or URL for a bigWig file. Remote files
( |
chrom |
chromosome(s) to read. Either a character vector of
chromosome names, or a GenomicRanges::GRanges of query regions (in
which case |
start |
start position(s) for data. May be a vector, recycled
against |
end |
end position(s) for data. May be a vector, recycled
against |
as |
return data as a specific type. One of |
fill |
value used for bases with no data when |
Multiple ranges can be queried in one call by passing equal-length
(or length-1, recycled) chrom, start, and end vectors, where
range i is (chrom[i], start[i], end[i]). Alternatively, pass a
GenomicRanges::GRanges as chrom; its regions are used directly.
Because GRanges is 1-based and inclusive while bigWig is 0-based and
half-open, a region is converted as start(gr) - 1 to end(gr).
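The recycling and coordinate conversion described above can be sketched in base R, without GenomicRanges. Both helper names, `recycle_ranges()` and `to_zero_based()`, are hypothetical and exist only for this illustration; they restate the rule in the text rather than reproduce the package's code.

```r
# Hypothetical helper: recycle chrom/start/end to a common length, allowing
# only length-1 or full-length inputs as described in the text.
recycle_ranges <- function(chrom, start, end) {
  lens <- c(length(chrom), length(start), length(end))
  n <- max(lens)
  stopifnot(all(lens %in% c(1L, n)))
  data.frame(
    chrom = rep_len(chrom, n),
    start = rep_len(start, n),
    end = rep_len(end, n)
  )
}

# Hypothetical helper: convert 1-based inclusive coordinates (GRanges style)
# to 0-based half-open coordinates (bigWig style).
to_zero_based <- function(start1, end1) {
  data.frame(start = start1 - 1, end = end1)
}

# Two windows on one chromosome; chrom is recycled to length 2.
ranges <- recycle_ranges("1", start = c(0, 100), end = c(50, 130))
ranges

# 1-based regions 1..50 and 101..130 become [0, 50) and [100, 130).
converted <- to_zero_based(start1 = c(1, 101), end1 = c(50, 130))
converted
```

Note that the conversion leaves the number of bases unchanged: a 1-based region of 50 bases is still 50 bases wide after subtracting 1 from its start.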
When as = "Rle", the result is an S4Vectors::Rle whose expanded
length equals the queried range, i.e. end - start when both are
supplied, otherwise the extent of the returned data for each
chromosome. Bases with no data in the file are set to fill. bigWig
coordinates are 0-based and half-open, so element i corresponds to
genomic position start + i - 1. A single-range query returns a bare
Rle; a multi-range (or multi-chromosome) query returns a named
IRanges::RleList with one element per range.
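To make the length, fill, and indexing rules concrete, here is a minimal base R sketch that expands a few intervals over a queried range. It uses the base `rle()` function as a stand-in for `S4Vectors::Rle`, and the helper `expand_range()` and the interval values are hypothetical, made up for illustration.

```r
# Hypothetical helper: expand 0-based half-open intervals into one value per
# base over [start, end), using `fill` where no interval covers a base.
expand_range <- function(starts, ends, values, start, end, fill = 0) {
  out <- rep(fill, end - start)
  for (k in seq_along(starts)) {
    lo <- max(starts[k], start)
    hi <- min(ends[k], end)
    if (hi > lo) {
      # element i of `out` is genomic position start + i - 1
      out[(lo - start + 1):(hi - start)] <- values[k]
    }
  }
  out
}

# Made-up intervals [100, 110) and [120, 125) queried over [100, 130).
x <- expand_range(
  starts = c(100, 120), ends = c(110, 125), values = c(1.5, 3),
  start = 100, end = 130
)

length(x)                   # equals end - start
rle(x)                      # run-length view of the same data
100 + which(x == 3)[1] - 1  # 0-based position of the first base with value 3
```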
A tibble, GRanges, or Rle/RleList depending on as.
https://github.com/dpryan79/libBigWig
https://github.com/brentp/bw-python
    bw <- system.file("extdata", "test.bw", package = "cpp11bigwig")
    read_bigwig(bw)
    read_bigwig(bw, chrom = "10")
    read_bigwig(bw, chrom = "1", start = 100, end = 130)
    read_bigwig(bw, as = "GRanges")
    read_bigwig(bw, chrom = "1", start = 100, end = 130, as = "Rle")
    # query several ranges in one call with equal-length vectors
    read_bigwig(bw, chrom = c("1", "10"), start = c(0, 0), end = c(50, 50))
    # multiple windows on the same chromosome (chrom recycles)
    read_bigwig(bw, chrom = "1", start = c(0, 100), end = c(50, 130))
    # a multi-range "Rle" query returns a named RleList, one element per range
    read_bigwig(bw, chrom = "1", start = c(0, 100), end = c(50, 130), as = "Rle")
    # pass a GRanges of regions; 1-based coords are converted automatically
    gr <- GenomicRanges::GRanges(
      c("1", "10"),
      IRanges::IRanges(start = c(1, 1), end = c(50, 50))
    )
    read_bigwig(bw, chrom = gr)