Differential accessibility analysis for CATaDa (`NOISeq`-based): `differential_accessibility`

Sets up and runs a differential analysis for CATaDa chromatin accessibility experiments using `NOISeq`. The function accepts the output of `load_data_peaks`, prepares a count matrix, runs the `NOISeq` analysis, and returns the differentially accessible loci.

Usage

differential_accessibility(data_list, cond, regex = FALSE, norm = "n", q = 0.8)

Arguments

data_list: List. Output from `load_data_peaks`.
cond: A named or unnamed character vector of length two. The values are strings or regular expressions used to identify the samples belonging to each condition. If the vector is named, the names are used as user-friendly display names for the conditions in plots and outputs. If it is unnamed, the match strings themselves are used as display names. The order determines the contrast, i.e., `cond[1]` vs `cond[2]`.
regex: Logical. If `TRUE`, the strings in `cond` are treated as regular expressions for matching sample names. If `FALSE` (the default), fixed string matching is used.
norm: Character. Normalisation method passed to `NOISeq`. Defaults to `"n"` (no normalisation); `"uqua"` (upper quartile) and `"tmm"` (trimmed mean of M-values) are available if needed.
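q: Numeric. Threshold passed to `NOISeq` for calling significance (default 0.8). In `NOISeq`, `q` is a probability threshold rather than a p-value cutoff, so higher values are more stringent.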
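
The difference between fixed-string and regular-expression matching in `cond` can be illustrated with base R alone. This is a minimal sketch, not the package's implementation: `match_samples` is a hypothetical helper, and it assumes that sample selection behaves like `grepl()` applied to the sample column names.

```r
# Hypothetical sample names, mirroring the mock data in the Examples section
sample_names <- c("GroupA_rep1", "GroupA_rep2", "GroupB_rep1", "GroupB_rep2")

# Hypothetical helper (not part of the package): select the samples for one
# condition by fixed-string or regular-expression matching
match_samples <- function(pattern, sample_names, regex = FALSE) {
    sample_names[grepl(pattern, sample_names, fixed = !regex)]
}

# Fixed-string matching (regex = FALSE, the documented default)
fixed_hits <- match_samples("GroupA", sample_names)

# Regular-expression matching (regex = TRUE): anchor on the prefix and
# require a numeric replicate suffix
regex_hits <- match_samples("^GroupB_rep[0-9]+$", sample_names, regex = TRUE)

# Under fixed matching, regex metacharacters are taken literally, so a
# pattern containing "^" matches none of these names
literal_hits <- match_samples("^GroupB", sample_names)

print(fixed_hits)
print(regex_hits)
print(literal_hits)
```

If this assumption holds, a pattern written as a regular expression will silently match nothing unless `regex = TRUE` is set, so it is worth checking the replicates reported in the setup messages.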

Value

A `DamIDResults` object containing the results. Access slots using accessors (e.g., `analysisTable(results)`). The object includes:

upCond1: data.frame of regions enriched in condition 1
upCond2: data.frame of regions enriched in condition 2
analysis: data.frame of full results for all tested regions
cond: A named character vector mapping display names to internal condition names
data: The original `data_list` input

Examples

# NOTE: This example uses mock count data, because the package's sample
# data is in log2-ratio format rather than counts.

# Create a mock data_list with plausible count data
mock_occupancy_counts <- data.frame(
    name = c("peak1", "peak2", "peak3"),
    gene_name = c("GeneA", "GeneB", "GeneC"),
    gene_id = c("ID_A", "ID_B", "ID_C"),
    GroupA_rep1 = c(100, 20, 50), GroupA_rep2 = c(110, 25, 45),
    GroupB_rep1 = c(10, 200, 55), GroupB_rep2 = c(15, 220, 60),
    row.names = c("peak1", "peak2", "peak3")
)

mock_data_list <- list(
    occupancy = mock_occupancy_counts,
    test_category = "accessible"
)

# Run differential accessibility analysis
diff_access_results <- differential_accessibility(
    mock_data_list,
    cond = c("Group A" = "GroupA", "Group B" = "GroupB")
)
#> Condition display names were sanitized for internal data:
#>   'Group A' -> 'Group.A'
#>   'Group B' -> 'Group.B'
#> Differential analysis setup:
#> Condition 1 Display Name: 'Group A' (Internal: 'Group.A', Match Pattern: 'GroupA')
#>   Found 2 replicates:
#>     GroupA_rep1
#>     GroupA_rep2
#> Condition 2 Display Name: 'Group B' (Internal: 'Group.B', Match Pattern: 'GroupB')
#>   Found 2 replicates:
#>     GroupB_rep1
#>     GroupB_rep2
#> [1] "1 differentially expressed features (up in first condition)"
#> [1] "1 differentially expressed features (down in first condition)"
#> 
#> 1 loci enriched in Group A
#> Highest-ranked genes:
#> GeneA
#> 
#> 1 loci enriched in Group B
#> Highest-ranked genes:
#> GeneB

# View the results summary
diff_access_results
#> An object of class 'DamIDResults'
#> Differentially accessible regions
#> Comparison: 'Group A' vs 'Group B'
#> - 1 regions enriched in Group A
#> - 1 regions enriched in Group B
#> - 3 total regions tested
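
The sanitisation reported in the messages above ('Group A' -> 'Group.A') is consistent with what base R's `make.names()` produces. The sketch below reproduces that mapping and the named/unnamed display-name rule described under Arguments. It is an illustration only: `display_names_for` is a hypothetical helper, and the use of `make.names()` is an assumption rather than a statement about the package's internals.

```r
# Named condition vector, as used in the example above
cond <- c("Group A" = "GroupA", "Group B" = "GroupB")

# Hypothetical helper (not part of the package): use the names as display
# names when present, otherwise fall back to the match strings
display_names_for <- function(cond) {
    if (is.null(names(cond))) unname(cond) else names(cond)
}

display_names <- display_names_for(cond)

# Assumption: internal names are syntactically valid R names, which is
# what make.names() generates (spaces become dots)
internal_names <- make.names(display_names)

# Mapping from display names to internal names
name_map <- setNames(internal_names, display_names)
print(name_map)

# With an unnamed vector, the match strings double as display names
unnamed_display <- display_names_for(c("GroupA", "GroupB"))
print(unnamed_display)
```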