QC Metrics
The qc-metrics command generates a comprehensive quality control report from annotation outputs and, optionally, from aligned BAM files and gene count matrices. The output is an interactive HTML dashboard with all metrics and plots, plus individual TSV files compatible with MultiQC.
Available Metrics
The metrics fall into three tiers, depending on what input files you provide:
Tier 1: Annotation-Based Metrics (Always Available)
These require only the annotation parquet files:
- Read counts: total, valid, invalid, and validity rate
- Demux summary: demultiplexed vs ambiguous read counts (if barcode-corrected)
- Read-length distributions: for all, valid, invalid, demuxed, and ambiguous reads
- Segment length distributions: per-segment box plots
- PolyA/T tail-length distribution
- Read orientation balance: forward vs reverse complement
- Cell-barcode knee plot: log-log rank-count curve for cell barcodes
- Per-cell read counts
- Barcode edit distance distributions: per barcode type
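To make the knee plot concrete, here is a minimal sketch of how a rank-count curve can be derived from per-read cell barcodes. It illustrates the general technique only and is not tranquillyzer's implementation; the function name `knee_curve` and the toy barcodes are hypothetical.

```python
from collections import Counter

def knee_curve(barcodes):
    """Return (rank, read_count) pairs, with rank 1 = most abundant barcode."""
    counts = sorted(Counter(barcodes).values(), reverse=True)
    return [(rank, n) for rank, n in enumerate(counts, start=1)]

# Toy input: one cell barcode per read (hypothetical values).
reads = ["AAAC"] * 5 + ["CCGT"] * 3 + ["GGTA"] * 1
curve = knee_curve(reads)
print(curve)  # [(1, 5), (2, 3), (3, 1)]
```

Plotting these pairs with both axes on a log scale gives the familiar knee shape, where the sharp drop typically separates real cells from background barcodes.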
Tier 2: BAM-Based Metrics (Requires --bam)
Providing an aligned BAM file (with CB and UB tags) enables:
- Sequencing saturation curve: UMI saturation as a function of sequencing depth
- Unique UMIs per cell
- Mapping rate per cell
- Duplicate rate per cell
- Gene body coverage (also requires --gtf)
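As an illustration of what a saturation curve measures, the sketch below uses one common definition: saturation = 1 - (unique cell barcode/UMI pairs) / (total reads), evaluated on random subsamples of the reads. This definition is an assumption, since the exact formula the tool uses is not stated here, and the function names and toy data are hypothetical.

```python
import random

def saturation(pairs):
    """1 - unique (CB, UB) pairs / total reads; 0.0 for an empty input."""
    if not pairs:
        return 0.0
    return 1.0 - len(set(pairs)) / len(pairs)

def saturation_curve(pairs, fractions=(0.25, 0.5, 0.75, 1.0), seed=0):
    """Saturation at increasing subsampling depths of a shuffled read list."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    curve = []
    for frac in fractions:
        n = int(len(shuffled) * frac)
        curve.append((n, saturation(shuffled[:n])))
    return curve

# Toy input: one (CB, UB) pair per read, as would be read from BAM tags.
pairs = [("AAAC", "UMI1")] * 3 + [("AAAC", "UMI2")] + [("CCGT", "UMI1")] * 2
print(saturation(pairs))  # 0.5 (3 unique pairs out of 6 reads)
full_depth = saturation_curve(pairs)[-1]
```

A curve that flattens near 1.0 suggests additional sequencing would mostly yield duplicate molecules; a curve still rising steeply suggests the library has not been sampled exhaustively.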
Tier 3: Gene Quantification Metrics (Requires --counts-matrix and --gtf)
Providing a featureCounts counts matrix and GTF annotation enables:
- Genes detected per cell
- UMIs per cell
- Mitochondrial read fraction per cell
- Ribosomal read fraction per cell
- Library complexity (genes vs UMIs colored by mitochondrial %)
- Top expressed genes
- Gene biotype breakdown
- featureCounts assignment summary
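For readers unfamiliar with the per-cell gene metrics, here is a minimal sketch of how genes detected and mitochondrial fraction can be computed from a cell-by-gene count table. It assumes mitochondrial genes are identified by an "MT-" name prefix (the human gene-naming convention); how the tool itself identifies them is not stated here, and the function names and data layout are hypothetical.

```python
def genes_detected(counts_by_cell):
    """Number of genes with a nonzero count in each cell."""
    return {
        cell: sum(1 for n in gene_counts.values() if n > 0)
        for cell, gene_counts in counts_by_cell.items()
    }

def mito_fraction(counts_by_cell, prefix="MT-"):
    """Fraction of each cell's counts assigned to genes starting with prefix."""
    fractions = {}
    for cell, gene_counts in counts_by_cell.items():
        total = sum(gene_counts.values())
        mito = sum(n for gene, n in gene_counts.items() if gene.startswith(prefix))
        fractions[cell] = mito / total if total else 0.0
    return fractions

# Toy counts: {cell barcode: {gene name: count}} (hypothetical values).
counts = {"AAAC": {"MT-CO1": 2, "ACTB": 6}, "CCGT": {"GAPDH": 4, "ACTB": 0}}
print(genes_detected(counts))  # {'AAAC': 2, 'CCGT': 1}
print(mito_fraction(counts))   # {'AAAC': 0.25, 'CCGT': 0.0}
```

Cells with an unusually high mitochondrial fraction and few detected genes are often damaged or dying, which is why these two metrics are usually inspected together.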
Usage
Basic (Annotation Only)
tranquillyzer qc-metrics \
--threads 4 \
INPUT_DIR
With BAM and Gene Counts
tranquillyzer qc-metrics \
--threads 4 \
--bam aligned_files/dup_marked.bam \
--counts-matrix counts_matrix.tsv \
--gtf gencode.v44.annotation.gtf \
INPUT_DIR
Command Line Options
| Option | Default | Description | When to change |
|---|---|---|---|
| INPUT_DIR | required | Directory containing annotation_metadata/ | |
| --output-dir | INPUT_DIR/qc_metrics | Where to write the report | |
| --threads | 4 | Threads for parallel metric computation | Increase for faster report generation |
| --sample-name | directory name | Label used in the report | Set for clearer report titles |
| --valid-file | auto-detect | Path to valid annotations parquet | Only if non-standard location |
| --invalid-file | auto-detect | Path to invalid annotations parquet | Only if non-standard location |
| --bam | None | Coordinate-sorted BAM with CB/UB tags | Provide for saturation and alignment metrics |
| --counts-matrix | None | featureCounts counts matrix TSV | Provide for gene-level QC |
| --gtf | None | GTF annotation file | Required with --counts-matrix; also used for gene body coverage |
| --read-len-bin-width | 100 | Bin width for read-length histograms | Decrease for finer resolution |
Output
All outputs are written to the QC metrics directory:
- report.html: self-contained interactive HTML dashboard (Plotly-based; zoomable, pannable, with hover tooltips). This is the primary QC artifact.
- plot_data/*.tsv: individual metric TSV files for MultiQC integration:
  - barcode_assignment_mqc.tsv
  - read_architecture_mqc.tsv
  - edit_distance_mqc.tsv
  - read_length_dist_mqc.tsv
  - knee_plot_mqc.tsv
  - saturation_curve_mqc.tsv (if BAM provided)
  - genes_per_cell_mqc.tsv (if counts matrix provided)
  - and others
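If you want to inspect one of the TSV files outside MultiQC, a small reader like the sketch below may be enough. It assumes the file is tab-separated with a header row and may start with `#`-prefixed configuration lines, as MultiQC custom-content files can; the column names and values in the example are hypothetical and not taken from actual tool output.

```python
import csv
import io

def read_mqc_tsv(handle):
    """Parse a tab-separated table, skipping '#' comment/config lines."""
    rows = (line for line in handle if not line.startswith("#"))
    return list(csv.DictReader(rows, delimiter="\t"))

# Stand-in for an open file such as plot_data/knee_plot_mqc.tsv;
# the header names and numbers here are made up for illustration.
example = io.StringIO("# plot_type: 'linegraph'\nrank\tcount\n1\t500\n2\t320\n")
table = read_mqc_tsv(example)
print(table[0])  # {'rank': '1', 'count': '500'}
```

Values come back as strings, so convert them with `int()` or `float()` before doing arithmetic on them.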
Recommendations
- The QC report can be generated at any point after annotation. You do not need to wait for alignment or deduplication. Run it early with just annotations, then re-run with --bam, --counts-matrix, and --gtf for the full picture.
- Metrics are computed independently in parallel, so increasing --threads directly speeds up report generation.
- The HTML report is fully self-contained and can be shared or viewed without a web server.