Skip to contents

Setup and differential analysis for occupancy/binding experiments using limma. Accepts output from `load_data_peaks` or `load_data_genes`, prepares experiment matrix, fits linear models, and returns DE loci.

Usage

differential_binding(data_list, cond, cond_names = NULL, fdr = 0.05)

Arguments

data_list

List. Output from load_data_peaks or load_data_genes.

cond

Vector (character). Two strings identifying the two conditions to compare. The order matters: `cond[1]` is used as Condition 1, `cond[2]` as Condition 2.

cond_names

Vector (character, optional). Custom display names for the two conditions in outputs/plots. Order maps to `cond`.

fdr

Numeric. FDR threshold for significance (default 0.05).

Value

A `DamIDResults` object containing the results. Access slots using the `@` accessor (e.g., `results@analysis`). The object includes:

upCond1

data.frame of regions enriched in condition 1

upCond2

data.frame of regions enriched in condition 2

analysis

data.frame of full results for all tested regions

cond

A named character vector mapping display names to internal condition names

data

The original `data_list` input

Examples

# Create a mock GRanges object for gene annotations
# This object, based on the package's unit tests, avoids network access.
mock_genes_gr <- GenomicRanges::GRanges(
    seqnames = S4Vectors::Rle("2L", 7),
    ranges = IRanges::IRanges(
        start = c(1000, 2000, 3000, 5000, 6000, 7000, 8000),
        end = c(1500, 2500, 3500, 5500, 6500, 7500, 20000000)
    ),
    strand = S4Vectors::Rle(GenomicRanges::strand(c("+", "-", "+", "+", "-", "-", "+"))),
    gene_id = c("FBgn001", "FBgn002", "FBgn003", "FBgn004", "FBgn005", "FBgn006", "FBgn007"),
    gene_name = c("geneA", "geneB", "geneC", "geneD", "geneE", "geneF", "LargeTestGene")
)

# Get path to sample data files included with the package
data_dir <- system.file("extdata", package = "damidBind")

# Load data
loaded_data <- load_data_peaks(
    binding_profiles_path = data_dir,
    peaks_path = data_dir,
    ensdb_genes = mock_genes_gr,
    quantile_norm = TRUE
)
#> Locating binding profile files
#> Building binding profile dataframe from input files ...
#>  - Loaded: Bsh_Dam_L4_r1-ext300-vs-Dam.kde-norm
#>  - Loaded: Bsh_Dam_L4_r2-ext300-vs-Dam.kde-norm
#>  - Loaded: Bsh_Dam_L5_r1-ext300-vs-Dam.kde-norm
#>  - Loaded: Bsh_Dam_L5_r2-ext300-vs-Dam.kde-norm
#> Applying quantile normalisation
#> Locating peak files
#> Calculating occupancy over peaks
#> Calculating average occupancy per region ...

# Run differential binding analysis
diff_results <- differential_binding(
    loaded_data,
    cond = c("L4", "L5"),
    cond_names = c("L4 Neurons", "L5 Neurons")
)
#> Differential analysis setup:
#> Condition 1: 'L4' (display as 'L4 Neurons')
#>   Found 2 replicates:
#>     Bsh_Dam_L4_r1-ext300-vs-Dam.kde-norm_qnorm
#>     Bsh_Dam_L4_r2-ext300-vs-Dam.kde-norm_qnorm
#> Condition 2: 'L5' (display as 'L5 Neurons')
#>   Found 2 replicates:
#>     Bsh_Dam_L5_r1-ext300-vs-Dam.kde-norm_qnorm
#>     Bsh_Dam_L5_r2-ext300-vs-Dam.kde-norm_qnorm
#> limma contrasts: L4-L5
#> 
#> 277 loci enriched in L4 Neurons
#> Highest-ranked genes:
#> LargeTestGene, LargeTestGene, LargeTestGene, LargeTestGene, LargeTestGene, LargeTestGene, LargeTestGene, LargeTestGene, LargeTestGene, LargeTestGene
#> 
#> 173 loci enriched in L5 Neurons
#> Highest-ranked genes:
#> LargeTestGene, LargeTestGene, LargeTestGene, LargeTestGene, LargeTestGene, LargeTestGene, LargeTestGene, LargeTestGene, LargeTestGene, LargeTestGene

# View the results summary
diff_results
#> An object of class 'DamIDResults'
#> Differentially bound regions
#> Comparison: 'L4 Neurons' vs 'L5 Neurons'
#> - 277 regions enriched in L4 Neurons
#> - 173 regions enriched in L5 Neurons
#> - 1208 total regions tested
#> 
#> Access results with accessor functions like analysisTable(object).