Gene Quantification

The featurecounts command runs featureCounts (from the Subread package) over the per-cell BAM files produced by split-bam and merges the results into a single gene-by-cell counts matrix. Its primary purpose is quick, lightweight QC: the counts matrix feeds directly into qc-metrics for per-cell gene detection counts, mitochondrial/ribosomal fractions, library complexity plots, and featureCounts assignment summaries, without requiring a separate quantification pipeline.

featureCounts Availability

featureCounts (Subread v2.0.6) is bundled with Tranquillyzer, so no additional installation or container setup is needed. The version is pinned because certain other featureCounts releases exhibit unexpected behavior during quantification.

When Tranquillyzer detects that featureCounts is already available on $PATH (as it will be in both conda-based and container-based installations), it runs the binary directly without any container overhead. If featureCounts is not found on $PATH, Tranquillyzer falls back to pulling a container image (varishenlab/featurecounts:subread2.0.6_py3.10.12 from Docker Hub) via the first available container runtime (Apptainer, Singularity, or Docker). This fallback exists for edge cases where Subread was not installed alongside Tranquillyzer.
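The selection order described above can be illustrated with a minimal Python sketch. This is not Tranquillyzer's actual code: the function name `choose_runtime` and the constant `CONTAINER_RUNTIMES` are hypothetical, and the sketch assumes the runtimes are probed in the order listed in the text.

```python
import shutil

# Hypothetical constant: container runtimes in the order named in the text.
CONTAINER_RUNTIMES = ("apptainer", "singularity", "docker")


def choose_runtime(which=shutil.which):
    """Return "native" if featureCounts is on PATH, else the first runtime found.

    `which` is injectable so the lookup can be replaced in tests.
    """
    # Prefer the binary on PATH: no container overhead.
    if which("featureCounts"):
        return "native"
    # Otherwise fall back to the first available container runtime.
    for runtime in CONTAINER_RUNTIMES:
        if which(runtime):
            return runtime
    raise RuntimeError(
        "featureCounts is not on PATH and no container runtime was found"
    )
```

Calling `choose_runtime()` with no arguments probes the real `PATH` through `shutil.which`.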

For advanced use cases requiring a different featureCounts version, the --container-runtime and --container-image options allow running an alternative image explicitly.

Batched Execution

Large single-cell experiments can produce thousands of per-cell BAM files. Running featureCounts on all of them in a single invocation can fail or become unwieldy, so Tranquillyzer splits the BAMs into batches (default 200 BAMs per batch, configurable with --batch-size). If a batch fails, it is automatically bisected and retried with progressively smaller groups until the failing BAMs are isolated. After all batches complete, the per-batch count tables are merged into a single gene-by-cell counts matrix.
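The batch-and-bisect strategy can be sketched roughly as follows. This is an illustration only, not the tool's implementation: `chunk`, `run_with_bisection`, and `process_all` are hypothetical names, and `run_batch` stands in for whatever actually invokes featureCounts (assumed here to return True on success and False on failure).

```python
from typing import Callable, List, Sequence, Tuple


def chunk(items: Sequence[str], size: int) -> List[List[str]]:
    """Split items into consecutive groups of at most `size` elements."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def run_with_bisection(
    bams: Sequence[str], run_batch: Callable[[Sequence[str]], bool]
) -> Tuple[List[List[str]], List[str]]:
    """Run one batch; on failure, halve it and retry each half.

    Returns (groups that succeeded, BAMs that fail even when run alone).
    """
    if not bams:
        return [], []
    if run_batch(bams):
        return [list(bams)], []
    if len(bams) == 1:
        # A single BAM that still fails has been isolated.
        return [], list(bams)
    mid = len(bams) // 2
    ok_left, bad_left = run_with_bisection(bams[:mid], run_batch)
    ok_right, bad_right = run_with_bisection(bams[mid:], run_batch)
    return ok_left + ok_right, bad_left + bad_right


def process_all(bams, run_batch, batch_size=200):
    """Process every batch and collect successful groups and isolated failures."""
    succeeded, failed = [], []
    for batch in chunk(bams, batch_size):
        ok, bad = run_with_bisection(batch, run_batch)
        succeeded.extend(ok)
        failed.extend(bad)
    return succeeded, failed
```

With a batch of n BAMs and a single bad file, the bisection needs on the order of log2(n) extra rounds to isolate it, which is why retrying by halves is cheaper than rerunning each BAM individually.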

Usage

tranquillyzer featurecounts \
    --threads 8 \
    SPLIT_BAM_DIR \
    ANNOTATION.gtf \
    OUTPUT_DIR

Command Line Options

Option	Default	Description	When to change
`BAM_DIR`	required	Directory containing per-cell BAMs (output of `split-bam`)
`GTF`	required	GTF annotation file passed to featureCounts (`-a`)
`OUT_DIR`	required	Output directory for batch results and merged matrix
`--container-runtime`	`auto`	Container runtime: `auto`, `apptainer`, `singularity`, `docker`, or `native`	Normally not needed; `auto` detects the bundled binary. Set explicitly to force a specific runtime.
`--container-image`	`varishenlab/featurecounts:subread2.0.6_py3.10.12`	Container image (used only when running via a container runtime)	Override to use a different featureCounts version
`--image-cache`	`<package_dir>/container_images/`	Directory to cache pulled SIF images	Set to a writable path if the default location is read-only
`--bind`	None	Comma-separated extra bind paths for the container	Add paths the container needs beyond the input/output directories
`--batch-size`	200	Number of BAMs per featureCounts call	Decrease if featureCounts runs out of memory
`--threads`	8	Total threads (split across workers)	Match your available cores
`--workers`	1	Parallel featureCounts batches	Increase to run multiple batches concurrently
`--extra`	`-t exon -g gene_id -O`	Extra arguments passed to featureCounts	Add `-s 1` or `-s 2` for stranded libraries
`--matrix-name`	`counts_matrix.tsv`	Filename for the merged counts matrix
`--no-run`	off	Skip running featureCounts; merge existing batch outputs only	Use to re-merge after manual batch fixes
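As an illustration of how these options combine, the following Python sketch assembles a command line for a reversely stranded library run with two parallel workers. The helper `build_featurecounts_command` is hypothetical, and the sketch assumes that `--extra` takes a single string and that the supplied value replaces the default rather than being appended to it; check `tranquillyzer featurecounts --help` before relying on either assumption.

```python
import shlex


def build_featurecounts_command(bam_dir, gtf, out_dir, threads=8, workers=1,
                                batch_size=200, extra="-t exon -g gene_id -O"):
    """Build an argv list using only the options documented in the table above."""
    return [
        "tranquillyzer", "featurecounts",
        "--threads", str(threads),
        "--workers", str(workers),
        "--batch-size", str(batch_size),
        # Assumption: the whole featureCounts argument string is passed as one value.
        "--extra", extra,
        bam_dir, gtf, out_dir,
    ]


cmd = build_featurecounts_command(
    "SPLIT_BAM_DIR", "ANNOTATION.gtf", "OUTPUT_DIR",
    workers=2, batch_size=100,
    extra="-t exon -g gene_id -O -s 2",  # -s 2: reversely stranded library
)
# shlex.join quotes the --extra value so it can be pasted into a shell.
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` unchanged, which avoids shell quoting issues with the multi-word `--extra` value.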

Output

<OUT_DIR>/counts_matrix.tsv: merged gene-by-cell counts matrix (rows = genes, columns = cells)
<OUT_DIR>/batches/featurecounts_batch*.txt: individual batch outputs from featureCounts

The counts matrix can be passed directly to qc-metrics --counts-matrix for gene-level QC plots.
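For a quick look at the matrix outside of qc-metrics, a sketch like the one below counts how many genes are detected in each cell. It assumes the file is tab-separated with a header row whose first column holds the gene identifier and whose remaining columns are cells; that layout is an assumption, so inspect the header of your own `counts_matrix.tsv` first. The function name `genes_detected_per_cell` is hypothetical.

```python
import csv
import io


def genes_detected_per_cell(handle):
    """Count, for each cell column, the genes with a count greater than zero.

    Assumes: tab-separated text, header row, first column = gene identifier.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    cells = header[1:]
    detected = dict.fromkeys(cells, 0)
    for row in reader:
        for cell, value in zip(cells, row[1:]):
            if float(value) > 0:
                detected[cell] += 1
    return detected


# Small in-memory example in the assumed layout.
demo = io.StringIO("gene\tcellA\tcellB\nG1\t3\t0\nG2\t0\t0\nG3\t1\t5\n")
print(genes_detected_per_cell(demo))  # {'cellA': 2, 'cellB': 1}
```

For a real run, open the file instead of the in-memory example, for instance `with open("OUTPUT_DIR/counts_matrix.tsv", newline="") as fh: genes_detected_per_cell(fh)`.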